A new taxonomy of sublinear keyword pattern matching algorithms

Autor: Loek Cleophas, Watson, B. W., Zwaan, G.
Přispěvatelé: Software Engineering and Technology - W, Software Engineering
Jazyk: angličtina
Rok vydání: 2004
Zdroj: Pure TUe
Popis: This paper presents a new taxonomy of sublinear (multiple) keyword pattern matching algorithms. Based on an earlier taxonomy by Watson and Zwaan [WZ96, WZ95], this new taxonomy includes not only suffix-based algorithms related to the Boyer-Moore, Commentz-Walter and Fan-Su algorithms, but factor- and factor oracle-based algorithms such as Backward DAWG Matching and Backward Oracle Matching as well. In particular, we show how suffix-based (Commentz-Walter like), factor- and factor oracle-based sublinear keyword pattern matching algorithms can all be seen as instantiations of a general sublinear algorithm skeleton. In addition, we show all shift functions defined for the suffix-based algorithms to be in principle reusable for factor- and factor oracle-based algorithms. The taxonomy is based on deriving the algorithms from a common starting point by adding algorithm and problem details, in order to arrive at efficient or well-known algorithms. Such a presentation provides correctness arguments for the algorithms as well as clarity on how the algorithms are related to one another. In addition, it is helpful in the construction of a toolkit of the algorithms.
Databáze: OpenAIRE