The Prague Stringology Conference 2004

Heikki Hyyrö

A Note on Bit-Parallel Alignment Computation

Abstract:
The edit distance between strings A and B is defined as the minimum number of edit operations needed in converting A into B or vice versa. Typically the allowed edit operations are one or more of the following: an insertion, a deletion or a substitution of a character, or a transposition between two adjacent characters. Simple edit distance allows the first two operation types, Levenshtein edit distance the first three, and Damerau distance all four. There exist very efficient O(⌈ m/w ⌉ n) bit-parallel algorithms for computing each of these three distances, where m is the length of A, n is the length of B, and w is the computed word size. In this paper we discuss augmenting the bit-parallel algorithms to recover an optimal alignment between A and B. Such an alignment depicts how to transform A into B by using ed(A,B) operations, where ed(A,B) is the used edit distance (one of the three mentioned above). Previously Iliopoulos and Pinzon have given such an algorithm for the longest common subsequence, which in effect corresponds to the simple edit distance. We propose a simpler method, which is faster and also more general in that our method can be used with any of the above three distances.

Download paper: Article in PostScript Article in PDF
 PostScript   PDF