Joint dimension reduction and clustering analysis of single-cell RNA-seq and spatial transcriptomics data.

Authors: Liu W; Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, 200062, China.; Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore., Liao X; Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore., Yang Y; Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore., Lin H; Center of Statistical Research and School of Statistics, Southwestern University of Finance and Economics, Chengdu, 611130, China., Yeong J; Institute of Molecular and Cell Biology (IMCB), Agency of Science, Technology and Research (A*STAR), 138673, Singapore.; Department of Anatomical Pathology, Singapore General Hospital, 169856, Singapore., Zhou X; Department of Biostatistics, University of Michigan, Ann Arbor, 48109, USA., Shi X; Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, 200062, China.; Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, 200062, China., Liu J; Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore.
Language: English
Source: Nucleic acids research [Nucleic Acids Res] 2022 Jul 08; Vol. 50 (12), pp. e72.
DOI: 10.1093/nar/gkac219
Abstract: Dimension reduction and (spatial) clustering are usually performed sequentially; however, the low-dimensional embeddings estimated in the dimension-reduction step may not be relevant to the class labels inferred in the clustering step. We therefore developed a computational method, Dimension-Reduction Spatial-Clustering (DR-SC), that can simultaneously perform dimension reduction and (spatial) clustering within a unified framework. Joint analysis by DR-SC produces accurate (spatial) clustering results and ensures the effective extraction of biologically informative low-dimensional features. DR-SC is applicable to spatial clustering in spatial transcriptomics, which characterizes the spatial organization of the tissue by segregating it into multiple tissue structures. Here, DR-SC relies on a latent hidden Markov random field model to encourage the spatial smoothness of the detected spatial cluster boundaries. Underlying DR-SC is an efficient expectation-maximization algorithm based on an iterative conditional mode. As such, DR-SC is scalable to large sample sizes and can optimize the spatial smoothness parameter in a data-driven manner. With comprehensive simulations and real data applications, we show that DR-SC outperforms existing clustering and spatial clustering methods: it extracts more biologically relevant features than conventional dimension reduction methods, improves clustering performance, and offers improved visualization and downstream trajectory inference.
(© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.)
Database: MEDLINE