Approximation Ratios of $$\mathsf {RePair}$$ , $$\mathsf {LongestMatch}$$ and $$\mathsf {Greedy}$$ on Unary Strings

Autor: Danny Hucke
Rok vydání: 2019
Předmět:
Zdroj: String Processing and Information Retrieval ISBN: 9783030326852
DOI: 10.1007/978-3-030-32686-9_1
Popis: A grammar-based compressor computes for a given input w a context-free grammar that produces only w. So-called global grammar-based compressors (\(\mathsf {RePair}\), \(\mathsf {LongestMatch}\) and \(\mathsf {Greedy}\)) achieve impressive practical compression results, but the recursive character of those algorithms makes it hard to achieve strong theoretical results. To this end, this paper studies the approximation ratio of those algorithms for unary input strings, which is strongly related to the field of addition chains. We show that in this setting, \(\mathsf {RePair}\) and \(\mathsf {LongestMatch}\) produce equal size grammars that are by a factor of at most \(\log _2(3)\) larger than a smallest grammar. We also provide a matching lower bound. The main result of this paper is a new lower bound for \(\mathsf {Greedy}\) of 1.348..., which improves the best known lower bound for arbitrary (not necessarily unary) input strings.
Databáze: OpenAIRE