Shunsuke Inenaga, Ayumi Shinohara and Masayuki Takeda
A Fully Compressed Pattern Matching Algorithm for Simple Collage Systems
Abstract: |
We study the fully compressed pattern matching problem (FCPM problem): Given T and P which are descriptions of text T and pattern P respectively, find the occurrences of P in T without decompressing T or P. This problem is rather challenging since patterns are also given in a compressed form. In this paper we present an FCPM algorithm for simple collage systems. Collage systems are a general framework that can represent various kinds of dictionary-based compressions, and simple collage systems are a subclass that includes LZW and LZ78 compressions. Collage systems are of the form 〈 D, S 〉, where D is a dictionary and S is a sequence of variables from D. Our FCPM algorithm performs in O(||D||2 + mn log |S|) time, where n = |T| = || D || + |S| and m = |P|. This is faster than the previous best result of O(m2n2) time. |
Download paper: | ||
PostScript |