Self-indexes are largely studied and widely applied structures in string matching.
However, the exact matching of multiple patterns using self-indexes is a topic that has not been the subject of concentrated study although it is an area that may have direct and indirect applications and uses in fields such as bioinformatics.
This paper presents a method of improving the exact search of multiple patterns from a compressed suffix array.
The proposed method is able to cut down run-times for the handled patterns by as much as 71.6%. A set of 1000 patterns of length 1000 nucleotides each is found from a text of 50 MB in size 14.0% faster than by searching the patterns using the locate functionality of the compressed suffix array.
|