The Prague Stringology Conference 2003

Loek Cleophas, Gerard Zwaan and Bruce W. Watson

Constructing Factor Oracles

Abstract:
A factor oracle is a data structure for weak factor recognition. It is an automaton built on a string p of length m that is acyclic, recognizes at least all factors of p, has m+1 states which are all final, and has m to 2m-1 transitions. In this paper, we give two alternative algorithms for its construction and prove that the automata they construct are equivalent to those constructed by the algorithms in [ACR01]. Although these new O(m^2) algorithms are inefficient in practice compared to the O(m) algorithm given in [ACR01], they give more insight into factor oracles. Our first algorithm constructs a factor oracle based on the suffixes of p in a way that is more intuitive. Some of the crucial properties of factor oracles, which in [ACR01] need several lemmas to be proven, are immediately obvious. Another important property, however, becomes less obvious. A second algorithm gives a clear insight into the relationship between the trie or dawg recognizing the factors of p and the factor oracle recognizing a superset thereof. We conjecture that an O(m) version of this trie-based algorithm exists.
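
To make the data structure described above concrete, the following is a minimal Python sketch of the well-known linear-time on-line construction usually attributed to [ACR01]. It is not either of the two O(m^2) algorithms proposed in this paper, whose details are not given in the abstract. The names `build_factor_oracle`, `accepts`, `delta` and `supply` are illustrative choices made here, not identifiers taken from the paper.

```python
def build_factor_oracle(p):
    """Return the transition table of a factor oracle for string p.

    delta[s] maps a character to the target state of the transition
    leaving state s. States are numbered 0..len(p); all are final.
    Sketch of the on-line construction, assuming the usual
    supply-function formulation.
    """
    m = len(p)
    delta = [dict() for _ in range(m + 1)]
    supply = [-1] * (m + 1)  # supply[0] = -1 marks "no supply state"

    for i in range(1, m + 1):
        c = p[i - 1]
        delta[i - 1][c] = i          # internal transition i-1 -> i
        k = supply[i - 1]
        # Walk down the supply chain, adding external transitions to i
        # from every state that has no transition on c yet.
        while k > -1 and c not in delta[k]:
            delta[k][c] = i
            k = supply[k]
        supply[i] = 0 if k == -1 else delta[k][c]
    return delta


def accepts(delta, w):
    """Return True if the oracle reads w from state 0 without getting stuck."""
    state = 0
    for c in w:
        if c not in delta[state]:
            return False
        state = delta[state][c]
    return True  # every state is final
```

For p = abbbaab this sketch yields 8 states and 11 transitions, which lies within the stated range of m to 2m-1 for m = 7. The resulting automaton accepts every factor of p, and it also accepts aba, which is not a factor of p. This illustrates why the abstract speaks of weak factor recognition and of a superset of the factors.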
