Computing NP-hard Repetitiveness Measures via MAX-SAT
Autor: | Bannai, Hideo, Goto, Keisuke, Ishihata, Masakazu, Kanda, Shunsuke, Köppl, Dominik, Nishimoto, Takaaki |
---|---|
Rok vydání: | 2022 |
Předmět: | |
Druh dokumentu: | Working Paper |
Popis: | Repetitiveness measures reveal profound characteristics of datasets, and give rise to compressed data structures and algorithms working in compressed space. Alas, the computation of some of these measures is NP-hard, and straight-forward computation is infeasible for datasets of even small sizes. Three such measures are the smallest size of a string attractor, the smallest size of a bidirectional macro scheme, and the smallest size of a straight-line program. While a vast variety of implementations for heuristically computing approximations exist, exact computation of these measures has received little to no attention. In this paper, we present MAX-SAT formulations that provide the first non-trivial implementations for exact computation of smallest string attractors, smallest bidirectional macro schemes, and smallest straight-line programs. Computational experiments show that our implementations work for texts of length up to a few hundred for straight-line programs and bidirectional macro schemes, and texts even over a million for string attractors. Comment: paper accepted to ESA 2022 (plus Appendix); corrected attribution of Python program for computing https://oeis.org/A339391 |
Databáze: | arXiv |
Externí odkaz: |