Computational and Structural Approaches to Periodicities in Strings

Autor: Baker, Andrew R.
Rok vydání: 2012
Předmět:
Druh dokumentu: Diplomová práce
Popis: We investigate the function ρd(n) = max { r(x) | x is a (d, n)-string } where r(x) is the number of runs in the string x, and a (d, n)-string is a string with length n and exactly d distinct symbols. Our investigation is motivated by the conjecture that ρd(n) ≤ n-d. We present and discuss fundamental properties of the ρd(n) function. The values of ρd(n) are presented in the (d, n-d)-table with rows indexed by d and columns indexed by n-d which reveals the regularities of the function. We introduce the concepts of the r-cover and core vector of a string, yielding a novel computational framework for determining ρd(n) values. The computation of the previously intractable instances is achieved via first computing a lower bound, and then using the structural properties to limit our exhaustive search only to strings that can possibly exceed this number of runs. Using this approach, we extended the known maximum number of runs in binary string from 60 to 74. In doing so, we find the first examples of run-maximal strings containing four consecutive identical symbols. Our framework is also applied for an arbitrary number of distinct symbols, d. For example, we are able to determine that the maximum number of runs in a string with 23 distinct symbols and length 46 is 23. Further, we discuss the structural properties of a shortest (d, n)-string x such that r(x) > n-d, should such a string exist.
Doctor of Philosophy (PhD)
Databáze: Networked Digital Library of Theses & Dissertations