metaDMG – A Fast and Accurate Ancient DNA Damage Toolkit for Metagenomic Data

Authors: Christian Michelsen, Mikkel Winther Pedersen, Antonio Fernandez-Guerra, Lei Zhao, Troels C. Petersen, Thorfinn Sand Korneliussen
Year of publication: 2022
Description:
Motivation: Under favourable conditions, DNA molecules can persist for hundreds of thousands of years. Such genetic remains are invaluable resources for studying past assemblages, populations, and even the evolution of species. However, DNA degrades over time, decreasing to ultra-low concentrations that make it highly prone to contamination by modern sources. Strict precautions are therefore necessary to ensure that DNA from modern sources does not appear in the final data and that the data are authenticated as ancient. The most generally accepted and widely applied authenticity criterion in ancient DNA studies is to test for elevated levels of deaminated cytosine residues towards the termini of the molecules, that is, DNA damage. To date, this has primarily been used for single organisms and recently for read assemblies; however, these methods are not applicable for estimating DNA damage in ancient metagenomes with tens or even hundreds of thousands of species.
Methods: We present metaDMG, a novel framework and toolkit that allows for the estimation, quantification and visualization of postmortem damage for single reads, single genomes and even metagenomic environmental DNA by utilizing the taxonomic branching structure. It bypasses any need for initial classification, splitting reads by individual organisms, and realignment. We have implemented a Bayesian approach that combines a modified geometric damage profile with a beta-binomial model to fit the entire model to the individual misincorporations at all taxonomic levels.
Results: We evaluated the performance using both simulated and published environmental DNA datasets and compared it to existing methods where relevant. We find metaDMG to be an order of magnitude faster than previous methods and more accurate, even for complex metagenomes. Our simulations show that metaDMG can estimate DNA damage at taxonomic levels with as few as 100 reads, that the estimated uncertainties decrease as the number of reads increases, and that the estimates become more significant as the number of C-to-T misincorporations increases.
Conclusion: metaDMG is a state-of-the-art program for aDNA damage estimation and allows for the computation of nucleotide misincorporation, GC-content, and DNA fragmentation for both simple and complex ancient genomic datasets, making it a complete package for ancient DNA damage authentication.
Database: OpenAIRE