Showing 1 - 10 of 61 for search: '"Louza, Felipe A."'
A standard format used for storing the output of high-throughput sequencing experiments is the FASTQ format. It comprises three main components: (i) headers, (ii) bases (nucleotide sequences), and (iii) quality scores. FASTQ files are widely used for
External link:
http://arxiv.org/abs/2304.08534
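The three components named in the abstract above (header, bases, quality scores) can be illustrated with a small parser. This is only a minimal sketch, assuming the common four-line record layout (a header line starting with `@`, one line of bases, a separator line starting with `+`, and one line of quality characters). The names `FastqRecord` and `parse_fastq` are hypothetical and do not come from the paper.

```python
from typing import Iterator, NamedTuple


class FastqRecord(NamedTuple):
    header: str   # text after the leading '@'
    bases: str    # nucleotide sequence
    quality: str  # one quality character per base


def parse_fastq(text: str) -> Iterator[FastqRecord]:
    """Yield records from FASTQ text, assuming four lines per record."""
    lines = [ln for ln in text.splitlines() if ln]
    if len(lines) % 4 != 0:
        raise ValueError("expected a multiple of four non-empty lines")
    for i in range(0, len(lines), 4):
        header, bases, plus, quality = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record at line {i + 1}")
        if len(bases) != len(quality):
            raise ValueError(f"bases/quality length mismatch at line {i + 1}")
        yield FastqRecord(header[1:], bases, quality)


sample = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!!!\n"
records = list(parse_fastq(sample))
```

Sequences wrapped over several lines, which some FASTQ files contain, are not handled by this sketch.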
We evaluate the influence of different alphabet orderings on the Lyndon factorization of a string. Experiments with Pizza & Chili datasets show that for most alphabet reorderings, the number of Lyndon factors is usually small, and the length of the l
External link:
http://arxiv.org/abs/2108.04988
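To see how an alphabet ordering can change a Lyndon factorization, the following sketch uses Duval's classical algorithm with a configurable symbol order. It is an illustration under stated assumptions, not the experimental setup of the paper: the function name `lyndon_factorization` and its `order` parameter are hypothetical, and `order` is assumed to list every symbol of the input from smallest to largest.

```python
from typing import List, Optional


def lyndon_factorization(s: str, order: Optional[str] = None) -> List[str]:
    """Duval's algorithm; `order` lists the symbols from smallest to largest."""
    if order is not None:
        rank = {c: i for i, c in enumerate(order)}
        key = rank.__getitem__
    else:
        key = ord  # default: order symbols by code point
    n = len(s)
    factors: List[str] = []
    i = 0
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it remains a prefix of a power of a Lyndon word.
        while j < n and key(s[k]) <= key(s[j]):
            k = i if key(s[k]) < key(s[j]) else k + 1
            j += 1
        # Emit the repeated Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


default_factors = lyndon_factorization("banana")
reordered_factors = lyndon_factorization("banana", order="nba")
```

On this small input, the default order yields the factors `b`, `an`, `an`, `a`, while the order n < b < a yields `ba`, `na`, `na`.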
In this paper we propose a new, more appropriate definition of regular and indeterminate strings. A regular string is one that is "isomorphic" to a string whose entries all consist of a single letter, but which nevertheless may itself include entries
External link:
http://arxiv.org/abs/2012.07892
A grammar compression algorithm, called GCIS, is introduced in this work. GCIS is based on the induced suffix sorting algorithm SAIS, presented by Nong et al. in 2009. The proposed solution builds on the factorization performed by SAIS during suffix
External link:
http://arxiv.org/abs/2011.12898
The merging of succinct data structures is a well established technique for the space efficient construction of large succinct indexes. In the first part of the paper we propose a new algorithm for merging succinct representations of de Bruijn graphs
External link:
http://arxiv.org/abs/2009.03675
Author:
Louza, Felipe A., Mantaci, Sabrina, Manzini, Giovanni, Sciortino, Marinella, Telles, Guilherme P.
In this paper we propose a variant of the induced suffix sorting algorithm by Nong (TOIS, 2013) that computes simultaneously the Lyndon array and the suffix array of a text in $O(n)$ time using $\sigma + O(1)$ words of working space, where $n$ is the
External link:
http://arxiv.org/abs/1905.12987
The Burrows-Wheeler transform (BWT) is a well studied text transformation widely used in data compression and text indexing. The BWT of two strings can also provide similarity measures between them, based on the observation that the more their symbol
External link:
http://arxiv.org/abs/1903.10583
We propose a new algorithm for merging succinct representations of de Bruijn graphs introduced in [Bowe et al. WABI 2012]. Our algorithm is based on the lightweight BWT merging approach by Holt and McMillan [Bioinformatics 2014, ACM-BCB 2014]. Our alg
External link:
http://arxiv.org/abs/1902.02889
Author:
Louza, Felipe A.
We present a simple algorithm for computing the document array given a string collection and its suffix array as input. Our algorithm runs in linear time using constant additional space for strings from constant alphabets.
External link:
http://arxiv.org/abs/1812.09094
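For readers unfamiliar with the document array, the sketch below shows what it contains: for each suffix array entry, the index of the string in which that suffix starts. This is not the constant-space algorithm of the paper. It is a deliberately simple version that uses an auxiliary table of size n, assumes the collection is concatenated with a `$` separator after every string, and builds the suffix array naively for demonstration only. The names `naive_suffix_array` and `document_array` are hypothetical.

```python
from typing import List


def naive_suffix_array(text: str) -> List[int]:
    """Sort suffix start positions by comparing suffixes directly (demo only)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def document_array(sa: List[int], docs: List[str]) -> List[int]:
    """Map each suffix array entry to the index of the document it starts in.

    Assumes the text is docs[0] + '$' + docs[1] + '$' + ... (one separator
    per document), so document d covers len(docs[d]) + 1 positions.
    """
    doc_of_pos: List[int] = []
    for d, doc in enumerate(docs):
        doc_of_pos.extend([d] * (len(doc) + 1))
    return [doc_of_pos[p] for p in sa]


docs = ["ab", "a"]
text = "".join(d + "$" for d in docs)  # "ab$a$"
sa = naive_suffix_array(text)
da = document_array(sa, docs)
```

The lookup table makes the idea easy to follow at the cost of linear extra space, which is exactly the overhead the abstract says its algorithm avoids.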