Impact of DNA extraction, PCR amplification, sequencing, and bioinformatic analysis on food-associated mock communities using PacBio long-read amplicon sequencing.

Authors: Baer M; Institute of Nutritional and Food Sciences, Food Microbiology and Hygiene, University of Bonn, Friedrich-Hirzebruch-Allee 7, 53115, Bonn, Germany. mabaer@uni-bonn.de., Höppe L; Institute of Nutritional and Food Sciences, Food Microbiology and Hygiene, University of Bonn, Friedrich-Hirzebruch-Allee 7, 53115, Bonn, Germany., Seel W; Institute of Nutritional and Food Sciences, Nutrition and Microbiota, University of Bonn, Katzenburgweg 7, 53115, Bonn, Germany., Lipski A; Institute of Nutritional and Food Sciences, Food Microbiology and Hygiene, University of Bonn, Friedrich-Hirzebruch-Allee 7, 53115, Bonn, Germany.
Language: English
Source: BMC microbiology [BMC Microbiol] 2024 Dec 06; Vol. 24 (1), pp. 521. Date of Electronic Publication: 2024 Dec 06.
DOI: 10.1186/s12866-024-03677-8
Abstract: Background: Long-read 16S rRNA gene amplicon sequencing has high potential for characterizing food-associated microbiomes. Its advantage comes from sequencing the full-length (1,500 bp) gene, which enables taxonomic resolution at the species level. Here we present a benchmarking study using mock communities representative of milking machine biofilms and raw meat, revealing challenges relevant to food-associated habitats. These challenges were varying species abundances, reliable intra-genus differentiation of species, and detection of novel species with < 98.7% sequence identity to type strains. By using mock communities at different levels of preparation (as mixed whole cells, mixed extracted DNA, and mixed PCR products), we systematically investigated the influence of DNA extraction using two different kits, PCR amplification of 16S rRNA genes, sequencing, and bioinformatic analysis, including the reference database and gene copy number normalization, on bacterial composition and alpha diversity.
Results: We demonstrated that PacBio CCS reads allowed for correct taxonomic assignment of all species present within the mock communities using a custom RefSeq database. However, the choice of percent identity values for taxonomic assignment had a strong influence on the identification and processing of reads from novel species. PCR amplification of 16S rRNA genes introduced the strongest bias in the observed community composition, while sequencing alone reproduced the preset composition well. The PCR bias can in part be attributed to differences in the mol% G + C content of 16S rRNA genes, resulting in preferential amplification of taxa with low mol% G + C content.
Conclusions: This study underlines the importance of benchmarking studies with mock communities representing the habitat of interest to evaluate the methodology prior to analyzing real samples of unknown composition. It demonstrates the advantage of long-read sequencing over short-read sequencing, as species-level identification enables in-depth characterization of the habitat. One benefit is improved risk assessment, because pathogenic and apathogenic species of the same genus can be differentiated.
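Illustration (not part of the published abstract): the gene copy number normalization and alpha diversity mentioned in the Background can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, taxon labels, read counts, and copy numbers are all hypothetical, and the Shannon index is used here only as one common alpha diversity measure, since the abstract does not say which measure was applied.

```python
import math


def normalize_by_copy_number(read_counts, copy_numbers):
    """Divide each taxon's read count by its 16S rRNA gene copy number,
    then rescale so the corrected abundances sum to 1.

    Both arguments are dicts keyed by taxon name (hypothetical inputs).
    """
    adjusted = {taxon: read_counts[taxon] / copy_numbers[taxon]
                for taxon in read_counts}
    total = sum(adjusted.values())
    return {taxon: value / total for taxon, value in adjusted.items()}


def shannon_index(relative_abundances):
    """Shannon diversity H' = -sum(p * ln p) over taxa with p > 0."""
    return -sum(p * math.log(p)
                for p in relative_abundances.values() if p > 0)


# Hypothetical two-taxon community: raw reads suggest a 3:1 ratio,
# but the taxa carry 6 and 2 gene copies per genome respectively.
reads = {"taxon_a": 600, "taxon_b": 200}
copies = {"taxon_a": 6, "taxon_b": 2}

corrected = normalize_by_copy_number(reads, copies)
diversity = shannon_index(corrected)
```

In this toy case the copy number correction turns an apparent 3:1 read ratio into equal cell-level abundances, which shows why the normalization step can change both the reported composition and the alpha diversity.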
Competing Interests: Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.
(© 2024. The Author(s).)
Database: MEDLINE