Prague Stringology Conference 2018

B. Riva Shalom

Parameterized Dictionary Matching with One Gap

Abstract:
Dictionary Matching is a variant of the Pattern Matching problem where multiple patterns are simultaneously matched to a single text. In case the patterns contain sequences of don't care symbols, the problem is called Dictionary Matching with Gaps. Another famous variant of Pattern matching is the Parameterized Matching, where two equal-length strings are a parameterized match if there exists a bijection on the alphabets such that one string matches the other under the bijection. In this paper we suggest the problem of Parameterized Dictionary Matching with one Gap, stemming from cyber security, where the patterns are the malware sequences we want to detect in the text, and the necessity of a parameterized match is due to their encryption. We present two algorithms solving the Prameterized Dictionary Matching with one Gap. The first solves the problem for dictionaries with variable length gaps and has query time of O(nmax - αmin) log2d + occ), where n is the size of the text, d is the number of gapped patterns in the dictionary, βmax - αmin is the maximal size of gap and occ is the number of the gapped patterns reported as output. The second solution considers dictionaries with a single set of gap boundaries and has query time of O(n(β - α) + occ), where n is the size of the text, β - α is the size of the gap and occ is the number of the gapped patterns reported as output.

Download paper: Article in PostScript Article in PDF BibTeX Reference
 PostScript   PDF   BibTeX reference