Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss

Authors: Eric J. Alm, Manolis Kellis, Mukul S. Bansal
Contributors: Massachusetts Institute of Technology. Department of Biological Engineering; Massachusetts Institute of Technology. Department of Civil and Environmental Engineering; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Kellis, Manolis; Alm, Eric J.
Publication year: 2012
Subject:
Statistics and Probability
Theoretical computer science
Gene Transfer, Horizontal
0206 medical engineering
ISMB 2012 Proceedings Papers Committee, July 15 to July 19, 2012, Long Beach, CA, USA
Genomics
02 engineering and technology
Biology
Biochemistry
Evolution, Molecular
03 medical and health sciences
Software
Gene Duplication
Genetic algorithm
Gene family
Molecular Biology
Phylogeny
030304 developmental biology
Genetics
0303 health sciences
Efficient algorithm
business.industry
Original Papers
3. Good health
Computer Science Applications
Computational Mathematics
Tree (data structure)
ComputingMethodologies_PATTERNRECOGNITION
Computational Theory and Mathematics
Multigene Family
Evolution and Comparative Genomics
Horizontal gene transfer
ComputingMethodologies_GENERAL
business
Algorithms
Gene Deletion
020602 bioinformatics
Source: Bioinformatics
ISSN: 1367-4811, 1367-4803
DOI: 10.1093/bioinformatics/bts225
Description:
Motivation: Gene family evolution is driven by evolutionary events such as speciation, gene duplication, horizontal gene transfer and gene loss. Inferring these events in the evolutionary history of a given gene family is a fundamental problem in comparative and evolutionary genomics, with numerous important applications. Solving this problem requires a reconciliation framework, where the input consists of a gene family phylogeny and the corresponding species phylogeny, and the goal is to reconcile the two by postulating speciation, gene duplication, horizontal gene transfer and gene loss events. This reconciliation problem is referred to as duplication-transfer-loss (DTL) reconciliation and has been extensively studied in the literature. Yet even the fastest existing algorithms for DTL reconciliation are too slow for reconciling large gene families and for use in more sophisticated applications such as gene tree or species tree reconstruction.
Results: We present two new algorithms for the DTL reconciliation problem that are dramatically faster than existing algorithms, both asymptotically and in practice. We also extend the standard DTL reconciliation model by considering distance-dependent transfer costs, which allow for more accurate reconciliation, and give an efficient algorithm for DTL reconciliation under this extended model. We implemented our new algorithms and demonstrated up to 100 000-fold speed-up over existing methods, using both simulated and biological datasets. This dramatic improvement makes it possible to use DTL reconciliation for performing rigorous evolutionary analyses of large gene families and enables its use in advanced reconciliation-based gene and species tree reconstruction methods.
National Science Foundation (U.S.) (Career award 0644282)
National Institutes of Health (U.S.) (RC2 HG005639)
National Science Foundation (U.S.) (AToL 0936234)
Database: OpenAIRE