This decade has witnessed the raise of what I consider the most important
breakthrough of modern times in text compression and indexed string matching.
Self-indexing is the mechanism by which a text is simultaneously
compressed and indexed, so that the self-index occupies space close to that
of the compressed text, provides random access to any part of it, and in
addition supports efficient indexed pattern matching. Thus a self-index can
replace the text by a compressed version with enhanced search functionalities.
Self-indexing builds on a large base of compressed data structures,
which is another fascinating algorithmic area that has appeared two decades
ago with the aim of obtaining compact representations of classical data
structures. Although they usually require more instructions than their
classical counterparts to operate, they can benefit from the memory hierarchy.
This is particularly noticeable when they can operate in main memory in cases
where the classical structures require disk storage.
|