Assessment of urban microbiome assemblies with the help of targeted in silico gold standards
Author: | Thomas Rattei, Samuel M. Gerner, Alexandra B. Graf |
---|---|
Language: | English |
Year of publication: | 2018 |
Subject: |
0301 basic medicine; Bioinformatics; In silico; Immunology; Assembly; Bacterial genome size; Computational biology; Biology; In silico gold standard; Genome; General Biochemistry, Genetics and Molecular Biology; Deep sequencing; 03 medical and health sciences; Computer Simulation; Microbiome; Cities; lcsh:QH301-705.5; Ecology, Evolution, Behavior and Systematics; Selection (genetic algorithm); Bacteria; Shotgun sequencing; Microbiota; Research; Applied Mathematics; CAMDA challenge; High-Throughput Nucleotide Sequencing; Binning; 030104 developmental biology; lcsh:Biology (General); Metagenomics; Modeling and Simulation; Metagenome; General Agricultural and Biological Sciences; Genome, Bacterial; Software |
Source: | Biology Direct, Vol 13, Iss 1, Pp 1-21 (2018) |
Description: | Background: Microbial communities play a crucial role in our environment and may strongly influence human health. Although cities are where human interaction is most abundant, we still know little about the urban microbiome. This is highlighted by the large number of unclassified DNA reads found in urban metagenome samples. The only in silico approach that allows us to find unknown species is the assembly and classification of draft genomes from a metagenomic dataset. In this study we (1) investigate the applicability of an assembly and binning approach to urban metagenome datasets, and (2) develop a new method for generating in silico gold standards to better understand the specific challenges of such datasets and to guide the selection of available software. Results: We applied combinations of three assembly tools (Megahit, SPAdes and MetaSPAdes) and three binning tools (MaxBin, MetaBAT and CONCOCT) to whole-genome shotgun datasets from the CAMDA 2017 Challenge. Complex in silico gold standards with a simulated bacterial fraction were generated for representative samples of each surface type and city. Using these gold standards, we found the combination of SPAdes and MetaBAT to be optimal for urban metagenome datasets, as it provided the best trade-off between the number of high-quality genome draft bins (MIMAG standards) retrieved and the amount of misassemblies and contamination. The assembled draft genomes included known species such as Propionibacterium acnes, but also novel species according to their respective ANI values. Conclusions: In our work, we showed that, even for datasets with high diversity and low sequencing depth from urban environments, assembly and binning-based methods can provide high-quality genome drafts. Sequencing depth is vital for retrieving high-quality genome drafts, but even more important is a high proportion of bacterial sequences, which is needed to achieve high coverage of bacterial genomes.
In contrast to read-based methods, which rely on database knowledge, genome-centric methods as applied in this study can provide valuable information about unknown species and strains as well as the functional contributions of individual community members within a sample. Furthermore, we present a method for generating sample-specific, highly complex in silico gold standards. Reviewers: This article was reviewed by Craig Herbold, Serghei Mangul and Yana Bromberg. Electronic supplementary material: The online version of this article (10.1186/s13062-018-0225-6) contains supplementary material, which is available to authorized users. |
Database: | OpenAIRE |
External link: |
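
The abstract describes evaluating combinations of three assemblers and three binning tools. As a minimal sketch of that evaluation grid, the snippet below only enumerates the nine assembler/binner pairs named in the abstract. It does not invoke any of the tools, and the names `ASSEMBLERS`, `BINNERS` and `tool_combinations` are illustrative assumptions, not taken from the article.

```python
from itertools import product

# Tool names as listed in the abstract; the variable names are hypothetical.
ASSEMBLERS = ["Megahit", "SPAdes", "MetaSPAdes"]
BINNERS = ["MaxBin", "MetaBAT", "CONCOCT"]


def tool_combinations(assemblers, binners):
    """Return every (assembler, binner) pair as a list of tuples."""
    return list(product(assemblers, binners))


combinations = tool_combinations(ASSEMBLERS, BINNERS)

# List the pairs that such an evaluation would need to cover.
for assembler, binner in combinations:
    print(f"{assembler} + {binner}")
```

Each pair would then be scored against the gold standard; the abstract reports SPAdes with MetaBAT as the best trade-off for these datasets.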