Comparison of beta diversity measures in clustering the high-dimensional microbial data
Author: | Xueyi He, Biyuan Chen, Xiaobing Zou, Bangquan Pan, Na You |
---|---|
Year of publication: | 2020 |
Subject: |
Pervasive Developmental Disorders; Autism Spectrum Disorder; Autism; Vector Spaces; Beta diversity; Social Sciences; Biochemistry; 0302 clinical medicine; Medical Conditions; RNA Ribosomal 16S; Medicine and Health Sciences; Cluster Analysis; Psychology; Bacteroides; Phylogeny; Data Management; 0303 health sciences; education.field_of_study; Multidisciplinary; Ecology; Microbiota; Applied Mathematics; Simulation and Modeling; Genomics; Hypersphere; Nucleic acids; Neurology; Ribosomal RNA; Medical Microbiology; Physical Sciences; Medicine; Algorithms; Research Article; Microbial Taxonomy; Cell biology; Computer and Information Sciences; Cellular structures and organelles; Ecological Metrics; Science; Population; Computational biology; Microbial Genomics; Biology; Research and Analysis Methods; Microbiology; 03 medical and health sciences; Clustering Algorithms; Developmental Neuroscience; Genetics; Bhattacharyya distance; Humans; Computer Simulation; Microbiome; education; Cluster analysis; Divergence (statistics); Non-coding RNA; 030304 developmental biology; Taxonomy; Models Genetic; Bacteria; Ecology and Environmental Sciences; Gut Bacteria; Organisms; Genetic Variation; Biology and Life Sciences; Species Diversity; Gastrointestinal Microbiome; Algebra; Linear Algebra; Neurodevelopmental Disorders; Developmental Psychology; RNA; Compositional data; Ribosomes; 030217 neurology & neurosurgery; Mathematics; Neuroscience |
Source: | PLoS ONE, Vol 16, Iss 2, p e0246893 (2021) |
ISSN: | 1932-6203 |
Description: | The heterogeneity of disease is a major concern in medical research and is commonly characterized as subtypes with different pathogeneses exhibiting distinct prognoses and treatment effects. The classification of a population into homogeneous subgroups is challenging, especially for complex diseases. Recent studies show that gut microbiome compositions play a vital role in disease development, and it is of great interest to cluster patients according to their microbial profiles. A variety of beta diversity measures can quantify the dissimilarity between the compositions of different samples for clustering. However, different beta diversity measures produce different clusters, and it is difficult to choose among them. Microbial compositions from 16S rRNA sequencing are presented as high-dimensional vectors with a large proportion of extremely small or even zero-valued elements. We set up three simulation experiments to mimic such microbial compositional data and evaluate the performance of different beta diversity measures in clustering. It is shown that the Kullback-Leibler divergence-based beta diversity measures, including the Jensen-Shannon divergence and its square root, and the hypersphere-based beta diversity measures, including the Bhattacharyya and Hellinger distances, capture compositional changes in low-abundance elements more efficiently and perform stably. Their performance on two real datasets demonstrates the validity of the simulation experiments. |
Database: | OpenAIRE |
External link: |
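
The measures named in the description can be illustrated with a minimal sketch. It uses common textbook definitions of the Jensen-Shannon divergence, the Hellinger distance and the Bhattacharyya distance, which may differ in detail from the definitions used in the paper. The function names and the toy count vectors are assumptions made for illustration only and are not taken from the article.

```python
import numpy as np


def normalize(counts):
    """Turn a vector of taxon counts into a composition that sums to 1."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()


def kl_divergence(p, q):
    """Kullback-Leibler divergence; terms with p_i = 0 contribute 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (natural log, so the maximum is log 2)."""
    m = 0.5 * (p + q)  # m_i > 0 wherever p_i > 0 or q_i > 0
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def bhattacharyya_coefficient(p, q):
    """Overlap of two compositions, clipped to [0, 1] against rounding."""
    return min(1.0, float(np.sum(np.sqrt(p * q))))


def hellinger(p, q):
    """Hellinger distance, written in terms of the Bhattacharyya coefficient."""
    return float(np.sqrt(1.0 - bhattacharyya_coefficient(p, q)))


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance; infinite when the compositions do not overlap."""
    bc = bhattacharyya_coefficient(p, q)
    return float("inf") if bc == 0.0 else float(-np.log(bc))


# Toy example (hypothetical counts): two samples that differ only in
# low-abundance taxa, with zeros as in sparse 16S rRNA profiles.
sample_a = normalize([10, 0, 5, 985])
sample_b = normalize([0, 10, 5, 985])

print("JSD:          ", jensen_shannon(sample_a, sample_b))
print("sqrt(JSD):    ", np.sqrt(jensen_shannon(sample_a, sample_b)))
print("Hellinger:    ", hellinger(sample_a, sample_b))
print("Bhattacharyya:", bhattacharyya_distance(sample_a, sample_b))
```

A pairwise matrix of any of these values over all samples could then be passed to a distance-based clustering method. Which clustering method the authors used is not stated in this record.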