pime: A package for discovery of novel differences among microbial communities
Authors: | Eric W. Triplett, Priscila Caroline Thiago Dobbler, Bryan Kolaczkowski, Luiz Fernando Wurdig Roesch, Jennifer C. Drew, Victor Satler Pylro |
---|---|
Publication year: | 2019 |
Subject: |
0106 biological sciences; 0301 basic medicine; DNA, Bacterial; Bacteria; Microbiota; Decision tree; Word error rate; Computational Biology; Microbial biomarkers; Biology; 010603 evolutionary biology; 01 natural sciences; Data set; 03 medical and health sciences; 030104 developmental biology; RNA, Ribosomal, 16S; Statistics; Genetics; Independent data; Relative species abundance; Ecology, Evolution, Behavior and Systematics; Phylogeny; Biotechnology; Type I and type II errors |
Source: | Molecular Ecology Resources. 20(2) |
ISSN: | 1755-0998 |
Description: | The data used for profiling microbial communities are usually sparse, with some microbes having high abundance in a few samples and being nearly absent in others. However, bioinformatics tools able to deal with this sparsity are lacking. pime (Prevalence Interval for Microbiome Evaluation) was designed to remove taxa that may be high in relative abundance in just a few samples but have a low prevalence overall. The reliability and robustness of pime were compared against existing methods and tested using independent 16S rRNA data sets. pime filters out microbial taxa that are not shared within a per-treatment prevalence interval, starting at 5% prevalence and increasing in increments of 5% at each filtering step. For each prevalence interval, hundreds of decision trees were calculated to predict the likelihood of detecting differences between treatments. The best prevalence-filtered data set was selected by the user by choosing the prevalence interval that kept a large portion of the 16S rRNA sequences in the data set while also showing the lowest error rate. To estimate the likelihood of introducing type I error while building prevalence-filtered data sets, an error detection step was also included. A pime reanalysis of published data sets uncovered expected microbial associations other than those previously reported, which may have been masked when only relative abundance was considered. |
Database: | OpenAIRE |
External link: |
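
The per-treatment prevalence filtering described in the abstract can be illustrated with a minimal Python sketch. This is not the pime package's code or API (pime itself is an R package); the function names `prevalence_filter` and `sweep`, the toy count table, and the rule that a taxon is kept when it reaches the prevalence cut-off in at least one treatment are all assumptions made for illustration. The decision-tree and error-detection steps are omitted.

```python
from collections import Counter


def prevalence_filter(counts, groups, percent):
    """Keep taxa present in at least `percent` % of the samples of some treatment.

    counts  -- hypothetical table: {taxon name: [read count per sample]}
    groups  -- treatment label of each sample, in the same order as the counts
    percent -- prevalence cut-off as an integer percentage (5, 10, ...)

    Assumption: reaching the cut-off in any single treatment is enough to keep
    the taxon. Integer arithmetic avoids floating-point comparison issues.
    """
    group_sizes = Counter(groups)
    kept = {}
    for taxon, row in counts.items():
        # Number of samples per treatment in which the taxon was detected.
        present = Counter(g for value, g in zip(row, groups) if value > 0)
        if any(present[g] * 100 >= percent * n for g, n in group_sizes.items()):
            kept[taxon] = row
    return kept


def sweep(counts, groups, step=5):
    """Filter at step, 2*step, ... (below 100 %) and summarise each interval.

    Returns a list of (percent, taxa kept, fraction of all reads retained).
    """
    total_reads = sum(sum(row) for row in counts.values())
    summary = []
    for percent in range(step, 100, step):
        kept = prevalence_filter(counts, groups, percent)
        kept_reads = sum(sum(row) for row in kept.values())
        summary.append((percent, len(kept), kept_reads / total_reads))
    return summary


# Toy data (invented): taxon_B is abundant in one sample but has low prevalence.
counts = {
    "taxon_A": [10, 12, 9, 11, 0, 0, 0, 0],
    "taxon_B": [500, 0, 0, 0, 0, 0, 0, 0],
    "taxon_C": [3, 4, 0, 5, 6, 2, 7, 0],
}
groups = ["control"] * 4 + ["treated"] * 4

if __name__ == "__main__":
    for percent, n_taxa, fraction in sweep(counts, groups):
        print(f"{percent:3d}%  taxa kept: {n_taxa}  reads retained: {fraction:.3f}")
```

In this toy table, `taxon_B` dominates the read count yet occurs in only one of four control samples, so it drops out once the cut-off exceeds 25%. The retained-read fraction per interval is the quantity the abstract says the user weighs against the error rate when choosing an interval.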