The Succinct Format with Direct Accessibility (SFDC) is an encoding scheme
originally designed for efficient data compression and quick access to
elements within compressed sequences. While SFDC performs well when character
frequencies are stable, its efficacy diminishes on text corpora whose
character frequencies vary widely, as is typical of natural language.
Addressing this limitation, this paper presents three variants of SFDC based
on block segmentation methods, each offering unique enhancements over the
original SFDC representation. By tailoring the segmentation process to the
distribution of characters within the text, these methods aim to optimize
compression efficiency and decoding performance. Experimental results show
that these approaches improve upon the original scheme in several scenarios,
indicating that distribution-aware segmentation can provide better
compression and decoding performance across a range of text datasets.