Combining DNA and protein alignments to improve genome annotation with LiftOn.

Authors: Chao KH; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.; Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA., Heinz JM; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA., Hoh C; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.; Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA., Mao A; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.; Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA.; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA., Shumate A; Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA.; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA., Pertea M; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.; Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA.; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA., Salzberg SL; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.; Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA.; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA.; Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21211, USA.
Language: English
Source: bioRxiv: the preprint server for biology [bioRxiv] 2024 May 17. Date of Electronic Publication: 2024 May 17.
DOI: 10.1101/2024.05.16.593026
Abstract: As the number and variety of assembled genomes continue to grow, the number of annotated genomes is falling behind, particularly for eukaryotes. DNA-based mapping tools help to address this challenge, but they are only able to transfer annotation between closely related species. Here we introduce LiftOn, a homology-based software tool that integrates DNA and protein alignments to enhance the accuracy of genome-scale annotation and to allow mapping between relatively distant species. LiftOn's protein-centric algorithm considers both types of alignments, chooses optimal open reading frames, resolves overlapping gene loci, and finds additional gene copies where they exist. LiftOn can reliably transfer annotation between genomes representing members of the same species, as we demonstrate on human, mouse, honey bee, rice, and Arabidopsis thaliana. It can further map annotation effectively across species pairs as far apart as mouse and rat or Drosophila melanogaster and D. erecta.
Database: MEDLINE
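
Illustrative note: the abstract mentions that LiftOn "chooses optimal open reading frames" but does not say how. The Python sketch below shows one simple way an open reading frame could be picked from a transcript sequence. It assumes that "optimal" means "longest", scans only the forward strand, and uses the standard stop codons. The function name `longest_orf` is hypothetical. It is not LiftOn's code or its API, and the real tool may use a different criterion.

```python
# Standard genetic code stop codons (assumption: nuclear, standard table).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG..stop ORF on the forward strand.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    Returns None when no complete ORF is found.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                # Keep the candidate only if it is longer than the best so far.
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                start = None  # look for the next ORF in this frame
    return best
```

For example, `longest_orf("ATGTAACATGAAAAAATAG")` finds two complete ORFs in different frames and returns the coordinates of the longer one. A protein-centric tool would presumably also compare the translated ORF against the reference protein, which this sketch leaves out.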