Cedric Chauve, Marni Mishna and France Paquet-Nadeau
Refined Upper Bounds on the Size of the Condensed Neighbourhood of Sequences
Abstract: |
The d-neighbourhood of a sequence s is the set of sequences that are at distance at most d from s, for a given measure of distance and parameter d. The condensed d-neighbourhood is obtained by removing from the neighbourhood any sequence having a prefix also in the neighbourhood. Estimating the maximum size of the condensed neighbourhood over all DNA sequences of a given length k for the Levenshtein distance is a problem related, among others, to the analysis of the BLAST (Basic Local Alignment Search Tool, Altschul et al., 1990). In this work, we analyse recurrences for computing an upper bound to the maximum size of the condensed d-neighbourhood for sequences of length k and provide a simpler asymptotic expression that we conjecture results in a dramatically improved upper bound. |
Download paper: | |||
PostScript | BibTeX reference |
Download presentation: |