Robust, scalable, and informative clustering for diverse biological networks.

Authors: Gaiteri C; Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA. gaiteri@gmail.com.; Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA. gaiteri@gmail.com.; Department of Neurological Sciences, Rush University Medical Center, Chicago, IL, USA. gaiteri@gmail.com., Connell DR; Rush University Graduate College, Rush University Medical Center, Chicago, IL, USA., Sultan FA; Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA., Iatrou A; Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA.; Department of Psychiatry, McLean Hospital, Harvard Medical School, Harvard University, Belmont, MA, USA., Ng B; Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA.; Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA., Szymanski BK; Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA.; Network Science and Technology Center, Rensselaer Polytechnic Institute, Troy, NY, USA.; Academy of Social Sciences, Łódź, Poland., Zhang A; Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA.; Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA., Tasaki S; Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA.; Department of Neurological Sciences, Rush University Medical Center, Chicago, IL, USA.
Language: English
Source: Genome biology [Genome Biol] 2023 Oct 12; Vol. 24 (1), pp. 228. Date of Electronic Publication: 2023 Oct 12.
DOI: 10.1186/s13059-023-03062-0
Abstract: Clustering molecular data into informative groups is a primary step in extracting robust conclusions from big data. However, due to foundational issues in how clusters are defined and detected, they are not always reliable, leading to unstable conclusions. We compare popular clustering algorithms across thousands of synthetic and real biological datasets, including a new consensus clustering algorithm, SpeakEasy2: Champagne. These tests identify trends in performance, show no single method is universally optimal, and allow us to examine factors behind variation in performance. Multiple metrics indicate SpeakEasy2 generally provides robust, scalable, and informative clusters for a range of applications.
(© 2023. BioMed Central Ltd., part of Springer Nature.)
Database: MEDLINE