Haplotype assembly in polyploid genomes and identical by descent shared tracts
Author: | Sorin Istrail, Derek Aguiar |
---|---|
Language: | English |
Year of publication: | 2013 |
Subject: |
Statistics and Probability;
Population Genomics; Computational biology; Biology; Biochemistry; Identity by descent; Genome; Polyploidy; 03 medical and health sciences; 0302 clinical medicine; Humans; 1000 Genomes Project; education; Molecular Biology; 030304 developmental biology; Genetics; 0303 health sciences; education.field_of_study; Ismb/Eccb 2013 Proceedings Papers Committee July 21 to July 23 2013 Berlin Germany; Genome, Human; Haplotype; Sequence Analysis, DNA; Original Papers; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Haplotypes; Human genome; Haplotype estimation; Sequence Analysis; 030217 neurology & neurosurgery; Algorithms |
Source: | Bioinformatics |
ISSN: | 1367-4811 1367-4803 |
Description: | Motivation: Genome-wide haplotype reconstruction from sequence data, or haplotype assembly, is at the center of major challenges in molecular biology and life sciences. For complex eukaryotic organisms like humans, the genome is vast and the population samples are growing so rapidly that algorithms processing high-throughput sequencing data must scale favorably in terms of both accuracy and computational efficiency. Furthermore, current models and methodologies for haplotype assembly (i) do not consider individuals sharing haplotypes jointly, which reduces the size and accuracy of assembled haplotypes, and (ii) are unable to model genomes having more than two sets of homologous chromosomes (polyploidy). Polyploid organisms are increasingly becoming the target of many research groups interested in the genomics of disease, phylogenetics, botany and evolution but there is an absence of theory and methods for polyploid haplotype reconstruction. Results: In this work, we present a number of results, extensions and generalizations of compass graphs and our HapCompass framework. We prove the theoretical complexity of two haplotype assembly optimizations, thereby motivating the use of heuristics. Furthermore, we present graph theory–based algorithms for the problem of haplotype assembly using our previously developed HapCompass framework for (i) novel implementations of haplotype assembly optimizations (minimum error correction), (ii) assembly of a pair of individuals sharing a haplotype tract identical by descent and (iii) assembly of polyploid genomes. We evaluate our methods on 1000 Genomes Project, Pacific Biosciences and simulated sequence data. Availability and Implementation: HapCompass is available for download at http://www.brown.edu/Research/Istrail_Lab/. Contact: Sorin_Istrail@brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. |
Database: | OpenAIRE |