Comparison of beta diversity measures in clustering the high-dimensional microbial data

Authors: Xueyi He, Biyuan Chen, Xiaobing Zou, Bangquan Pan, Na You
Publication year: 2020
Subject:
Pervasive Developmental Disorders
Autism Spectrum Disorder
Autism
Vector Spaces
Beta diversity
Social Sciences
Biochemistry
0302 clinical medicine
Medical Conditions
RNA, Ribosomal, 16S
Medicine and Health Sciences
Cluster Analysis
Psychology
Bacteroides
Phylogeny
Data Management
0303 health sciences
education.field_of_study
Multidisciplinary
Ecology
Microbiota
Applied Mathematics
Simulation and Modeling
Genomics
Hypersphere
Nucleic acids
Neurology
Ribosomal RNA
Medical Microbiology
Physical Sciences
Medicine
Algorithms
Research Article
Microbial Taxonomy
Cell biology
Computer and Information Sciences
Cellular structures and organelles
Ecological Metrics
Science
Population
Computational biology
Microbial Genomics
Biology
Research and Analysis Methods
Microbiology
03 medical and health sciences
Clustering Algorithms
Developmental Neuroscience
Genetics
Bhattacharyya distance
Humans
Computer Simulation
Microbiome
education
Cluster analysis
Divergence (statistics)
Non-coding RNA
030304 developmental biology
Taxonomy
Models, Genetic
Bacteria
Ecology and Environmental Sciences
Gut Bacteria
Organisms
Genetic Variation
Biology and Life Sciences
Species Diversity
Gastrointestinal Microbiome
Algebra
Linear Algebra
Neurodevelopmental Disorders
Developmental Psychology
RNA
Compositional data
Ribosomes
030217 neurology & neurosurgery
Mathematics
Neuroscience
Source: PLoS ONE
PLoS ONE, Vol 16, Iss 2, p e0246893 (2021)
ISSN: 1932-6203
Description: The heterogeneity of disease is a major concern in medical research and is commonly characterized as subtypes with different pathogeneses exhibiting distinct prognoses and treatment effects. Classifying a population into homogeneous subgroups is challenging, especially for complex diseases. Recent studies show that gut microbiome composition plays a vital role in disease development, and it is of great interest to cluster patients according to their microbial profiles. A variety of beta diversity measures can quantify the dissimilarity between the compositions of different samples for clustering. However, different beta diversity measures yield different clusters, and choosing among them is difficult. Considering microbial compositions from 16S rRNA sequencing, which are presented as high-dimensional vectors with a large proportion of extremely small or even zero-valued elements, we set up three simulation experiments to mimic microbial compositional data and evaluate the performance of different beta diversity measures in clustering. The results show that the Kullback-Leibler divergence-based beta diversity measures, including the Jensen-Shannon divergence and its square root, and the hypersphere-based beta diversity measures, including the Bhattacharyya and Hellinger distances, capture compositional changes in low-abundance elements more efficiently and perform stably. Their performance on two real datasets demonstrates the validity of the simulation experiments.
Database: OpenAIRE
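
The four measures named in the description can be illustrated with a minimal Python sketch. This is not the authors' code: it uses the textbook definitions of the Jensen-Shannon divergence, its square root, and the Bhattacharyya and Hellinger distances, and all function names (normalize, pairwise_matrix, and so on) and the toy count vectors are hypothetical, chosen only for illustration. The log base 2 in the Jensen-Shannon divergence is likewise an assumption; the record does not state which base the paper uses.

```python
import numpy as np


def normalize(counts):
    """Turn a vector of non-negative counts into relative abundances."""
    x = np.asarray(counts, dtype=float)
    return x / x.sum()


def _kl(a, b):
    """Kullback-Leibler divergence in bits, skipping zero entries of a."""
    mask = a > 0
    return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))


def jensen_shannon_divergence(p, q):
    """Symmetrized KL divergence against the midpoint m = (p + q) / 2."""
    m = 0.5 * (p + q)  # m > 0 wherever p > 0 or q > 0, so _kl stays finite
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence."""
    return float(np.sqrt(max(0.0, jensen_shannon_divergence(p, q))))


def bhattacharyya_coefficient(p, q):
    """Inner product of the square-root vectors, which lie on a hypersphere."""
    return float(np.sum(np.sqrt(p * q)))


def bhattacharyya_distance(p, q):
    """Negative log of the coefficient; infinite when the samples share no taxa."""
    bc = bhattacharyya_coefficient(p, q)
    return float("inf") if bc == 0.0 else float(-np.log(bc))


def hellinger_distance(p, q):
    """sqrt(1 - BC), clipped at zero to absorb floating-point rounding."""
    return float(np.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q))))


def pairwise_matrix(samples, measure):
    """Symmetric dissimilarity matrix for a list of composition vectors."""
    n = len(samples)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = measure(samples[i], samples[j])
    return d


# Toy counts (hypothetical): samples differ only in low-abundance taxa.
samples = [normalize(c) for c in ([900, 95, 5, 0], [900, 95, 0, 5], [500, 495, 5, 0])]
print(pairwise_matrix(samples, jensen_shannon_distance))
print(pairwise_matrix(samples, hellinger_distance))
```

A matrix produced this way could be passed to any clustering routine that accepts precomputed dissimilarities. Note that the Bhattacharyya distance becomes infinite for samples with no shared taxa, which such a routine would need to handle.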