On prefix normal words and prefix normal forms

Autor: Joe Sawada, Péter Burcsi, Zsuzsanna Lipták, Gabriele Fici, Frank Ruskey
Přispěvatelé: Burcsi, P., Fici, G., Lipták, Z., Ruskey, F., Sawada, J.
Rok vydání: 2017
Předmět:
FOS: Computer and information sciences
Prefix code
Prefix normal word
Pre-necklace
Discrete Mathematics (cs.DM)
General Computer Science
Formal Languages and Automata Theory (cs.FL)
Binary number
Computer Science - Formal Languages and Automata Theory
Context (language use)
Binary language
Lyndon words
0102 computer and information sciences
02 engineering and technology
Prefix grammar
prefix normal forms
Kraft's inequality
Characterization (mathematics)
Lyndon word
01 natural sciences
Prefix normal form
enumeration
Theoretical Computer Science
FOS: Mathematics
0202 electrical engineering
electronic engineering
information engineering

Mathematics - Combinatorics
Mathematics
Discrete mathematics
prefix normal words
prefix normal forms
binary languages
binary jumbled pattern matching
pre-necklaces
Lyndon words
enumeration

binary jumbled pattern matching
Settore INF/01 - Informatica
Computer Science (all)
pre-necklaces
Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
prefix normal words
Prefix
010201 computation theory & mathematics
020201 artificial intelligence & image processing
Combinatorics (math.CO)
binary languages
Computer Science::Formal Languages and Automata Theory
Word (group theory)
Computer Science - Discrete Mathematics
Zdroj: Theoretical Computer Science. 659:1-13
ISSN: 0304-3975
Popis: A $1$-prefix normal word is a binary word with the property that no factor has more $1$s than the prefix of the same length; a $0$-prefix normal word is defined analogously. These words arise in the context of indexed binary jumbled pattern matching, where the aim is to decide whether a word has a factor with a given number of $1$s and $0$s (a given Parikh vector). Each binary word has an associated set of Parikh vectors of the factors of the word. Using prefix normal words, we provide a characterization of the equivalence class of binary words having the same set of Parikh vectors of their factors. We prove that the language of prefix normal words is not context-free and is strictly contained in the language of pre-necklaces, which are prefixes of powers of Lyndon words. We give enumeration results on $\textit{pnw}(n)$, the number of prefix normal words of length $n$, showing that, for sufficiently large $n$, \[ 2^{n-4 \sqrt{n \lg n}} \le \textit{pnw}(n) \le 2^{n - \lg n + 1}. \] For fixed density (number of $1$s), we show that the ordinary generating function of the number of prefix normal words of length $n$ and density $d$ is a rational function. Finally, we give experimental results on $\textit{pnw}(n)$, discuss further properties, and state open problems.
To appear in Theoretical Computer Science
Databáze: OpenAIRE