Telescoper: de novo assembly of highly repetitive regions
Author: | Ma'ayan Bresler, Sara Sheehan, Yun S. Song, Andrew H. Chan |
---|---|
Year of publication: | 2012 |
Subject: |
Statistics and Probability; Sequencing and Sequence Analysis; Sequence assembly; Genomics; Hybrid genome assembly; Saccharomyces cerevisiae; Computational biology; Biology; Biochemistry; Genome; Field (computer science); 03 medical and health sciences; 0302 clinical medicine; Cot analysis; Molecular Biology; Repetitive Sequences, Nucleic Acid; 030304 developmental biology; Genetics; 0303 health sciences; High-Throughput Nucleotide Sequencing; DNA; Sequence Analysis, DNA; Repetitive Regions; Original Papers; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Simulated data; Algorithms; 030217 neurology & neurosurgery |
Source: | Bioinformatics |
ISSN: | 1367-4811 1367-4803 |
DOI: | 10.1093/bioinformatics/bts399 |
Description: | Motivation: With advances in sequencing technology, it has become faster and cheaper to obtain short-read data from which to assemble genomes. Although there has been considerable progress in the field of genome assembly, producing high-quality de novo assemblies from short reads remains challenging, primarily because of the complex repeat structures found in the genomes of most higher organisms. The telomeric regions of many genomes are particularly difficult to assemble, though much could be gained from the study of these regions, as their evolution has not been fully characterized and they have been linked to aging. Results: In this article, we tackle the problem of assembling highly repetitive regions by developing a novel algorithm that iteratively extends long paths through a series of read-overlap graphs and evaluates them based on a statistical framework. Our algorithm, Telescoper, uses short- and long-insert libraries in an integrated way throughout the assembly process. Results on real and simulated data demonstrate that our approach can effectively resolve much of the complex repeat structure found in the telomeres of yeast genomes, especially when longer long-insert libraries are used. Availability: Telescoper is publicly available for download at sourceforge.net/p/telescoper. Contact: yss@eecs.berkeley.edu Supplementary Information: Supplementary data are available at Bioinformatics online. |
Database: | OpenAIRE |
External link: |