The Prague Stringology Conference 2005

Heikki Hyyrö and Gonzalo Navarro

Bit-Parallel Computation of Local Similarity Score Matrices with Unitary Weights

Local similarity computation between two sequences permits detecting all the relevant alignments present between subsequences thereof. A well-known dynamic programming algorithm works in time $O(mn)$, $m$ and $n$ being the lengths of the subsequences. The algorithm is rather slow when applied over many sequence pairs.

In this paper we present the first bit-parallel computation of the score matrix, for a simplified choice of scores. If the computer word has $w$ bits, then the resulting algorithm works in O(mn log(min(m,n,w)/w)) time, achieving up to 8-fold speedups in practice. Some DNA comparison applications use precisely the simplified scores we handle, and thus our algorithm is directly applicable. In others, our method can be used as a raw filter to discard most of the strings, so the classical algorithm can be focused only on the substring pairs that can yield relevant results.

Download paper: Article in PostScript Article in PDF
 PostScript   PDF