Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification
Author: | Ru-Dong Li, Yuhua Zhou, Lu Xie, Lei Liu, Zongxin Ling, Xiaokui Guo, Yin Wang |
---|---|
Language: | English |
Year of publication: | 2016 |
Subject: |
0301 basic medicine; Article Subject; Computer science; 030106 microbiology; 0206 medical engineering; lcsh:Medicine; 02 engineering and technology; Dental Caries; Bioinformatics; General Biochemistry, Genetics and Molecular Biology; 03 medical and health sciences; Text mining; RNA, Ribosomal, 16S; Profiling (information science); Cluster Analysis; Data Mining; Humans; Phylogeny; General Immunology and Microbiology; Phylogenetic tree; business.industry; Supervised learning; lcsh:R; Disease classification; Pattern recognition; General Medicine; Bacterial Infections; Pneumonia; ComputingMethodologies_PATTERNRECOGNITION; Metagenomics; Classification methods; Metagenome; Artificial intelligence; Motif (music); Supervised Machine Learning; business; 020602 bioinformatics; Algorithms; Research Article |
Source: | BioMed Research International, Vol 2016 (2016) |
ISSN: | 2314-6133 |
DOI: | 10.1155/2016/6598307 |
Description: | Background. Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Results. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. Conclusions. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data. |
Database: | OpenAIRE |