Multi-factorial examination of amplicon sequencing workflows from sample preparation to bioinformatic analysis.

Authors: De Wolfe TJ; Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 450 Technology Drive Rm. 426, Pittsburgh, PA, 15219, USA.; Department of Pediatrics, BC Children's Hospital Research Institute, University of British Columbia, 4480 Oak Street Rm. 208B, Vancouver, BC, V6H 4E4, Canada.; Gut4Health, BC Children's Hospital Research Institute, University of British Columbia, 950 West 28th Avenue Rm. 211, Vancouver, BC, V5Z 4H4, Canada., Wright ES; Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 450 Technology Drive Rm. 426, Pittsburgh, PA, 15219, USA. eswright@pitt.edu.
Language: English
Source: BMC microbiology [BMC Microbiol] 2023 Apr 19; Vol. 23 (1), pp. 107. Date of Electronic Publication: 2023 Apr 19.
DOI: 10.1186/s12866-023-02851-8
Abstract: Background: The development of sequencing technologies to evaluate bacterial microbiota composition has allowed new insights into the importance of microbial ecology. However, the variety of methodologies used among amplicon sequencing workflows leads to uncertainty about best practices as well as reproducibility and replicability among microbiome studies. Using a bacterial mock community composed of 37 soil isolates, we performed a comprehensive methodological evaluation of workflows, each with a different combination of methodological factors spanning sample preparation to bioinformatic analysis to define sources of artifacts that affect coverage, accuracy, and biases in the resulting compositional profiles.
Results: Of the workflows examined, those using the V4-V4 primer set enabled the highest level of concordance between the original mock community and resulting microbiome sequence composition. Use of a high-fidelity polymerase, or a lower-fidelity polymerase with an increased PCR elongation time, limited chimera formation. Bioinformatic pipelines presented a trade-off between the fraction of distinct community members identified (coverage) and fraction of correct sequences (accuracy). DADA2 and QIIME2 assembled V4-V4 reads amplified by Taq polymerase resulted in the highest accuracy (100%) but had a coverage of only 52%. Using mothur to assemble and denoise V4-V4 reads resulted in a coverage of 75%, albeit with marginally lower accuracy (99.5%).
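The two metrics named above can be illustrated with a minimal sketch. It assumes both are computed over sets of distinct sequences (the abstract does not say whether accuracy is weighted by read abundance), and the function names and toy sequence labels are hypothetical, not taken from the study.

```python
def coverage(expected, observed):
    """Fraction of distinct expected community members that were recovered.

    Assumption: members and sequences are compared by exact identity.
    """
    expected = set(expected)
    if not expected:
        raise ValueError("expected community is empty")
    return len(expected & set(observed)) / len(expected)


def accuracy(expected, observed):
    """Fraction of distinct observed sequences that match an expected member.

    Assumption: unweighted by read counts; the study may define this differently.
    """
    observed = set(observed)
    if not observed:
        raise ValueError("no observed sequences")
    return len(observed & set(expected)) / len(observed)


# Toy data (hypothetical labels standing in for reference sequences).
mock_community = ["A", "B", "C", "D"]
pipeline_output = ["A", "B", "X"]  # "X" stands in for a spurious sequence

toy_coverage = coverage(mock_community, pipeline_output)
toy_accuracy = accuracy(mock_community, pipeline_output)
```

In this toy case the pipeline recovers two of four members and reports one spurious sequence, which shows how a conservative pipeline can raise accuracy while lowering coverage.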
Conclusions: Optimization of microbiome workflows is critical for accuracy and to support reproducibility and replicability among microbiome studies. These considerations will help reveal the guiding principles of microbial ecology and impact the translation of microbiome research to human and environmental health.
(© 2023. The Author(s).)
Database: MEDLINE