Enno Adler, Stefan Böttcher and Rita Hartel
String Partition for Building Long Burrows-Wheeler Transforms
Abstract: |
Constructing the Burrows-Wheeler transform (BWT) for long strings poses significant challenges regarding construction time and memory usage. We use a prefix of the suffix array to partition a long string into shorter substrings, thereby enabling the use of multi-string BWT construction algorithms to process these partitions fast. We provide an implementation, partDNA, for DNA sequences. Through comparison with state-of-the-art BWT construction algorithms, we show that partDNA with the BWT construction algorithm IBB by Adler et al.\cite{adler2025} offers a novel trade-off for construction time and memory usage for BWT construction on real genome datasets. Beyond this, the proposed partitioning strategy is applicable to strings of any alphabet. |
Download paper: | ![]() |
![]() |
![]() |
PostScript | BibTeX reference |