In-place Update of Suffix Array while Recoding Words
Autor: | Matthias Gallé, François Coste, Pierre Peterlongo |
---|---|
Přispěvatelé: | Biological systems and models, bioinformatics and sequences (SYMBIOSE), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria), Jan Holub and Jan {\v{Z}}{\v{d}}{\'{a}}rek, Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Inria Rennes – Bretagne Atlantique |
Jazyk: | angličtina |
Rok vydání: | 2008 |
Předmět: |
Compressed suffix array
Speedup word-interval Computer science [INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS] Generalized suffix tree suffix array dynamic indexing 0102 computer and information sciences 02 engineering and technology 01 natural sciences law.invention law 0202 electrical engineering electronic engineering information engineering Computer Science (miscellaneous) in-place update Suffix array [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM] Grammar induction 010201 computation theory & mathematics 020201 artificial intelligence & image processing Suffix [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] Algorithm Word (computer architecture) FM-index |
Zdroj: | Prague Stringology Conference 2008 Prague Stringology Conference 2008, Sep 2008, Prague, Czech Republic. pp.54--67 International Journal of Foundations of Computer Science International Journal of Foundations of Computer Science, World Scientific Publishing, 2009, 20 (6), pp.1025-1045. ⟨10.1142/S0129054109007029⟩ International Journal of Foundations of Computer Science, 2009, 20 (6), pp.1025-1045. ⟨10.1142/S0129054109007029⟩ HAL |
ISSN: | 0129-0541 |
DOI: | 10.1142/S0129054109007029⟩ |
Popis: | International audience; Motivated by grammatical inference and data compression applications, we propose an algorithm to update a suffix array after the substitution, in the indexed text, of some occurrences of a given word by a new character. Compared to other published index update methods, the problem addressed here may require the modification of a large number of distinct positions over the original text. The proposed algorithm uses the specific internal order of suffix arrays in order to update simultaneously groups of entries, and ensures that only entries to be modified are visited. Experiments confirm a significant execution time speed-up compared to the construction of suffix array from scratch at each step of the application. |
Databáze: | OpenAIRE |
Externí odkaz: |