Truncated Robust Principal Component Analysis and Noise Reduction for Single Cell RNA-seq Data
Authors: | Maciej Sykulski, Krzysztof Gogolewski, Neo Christopher Chung, Anna Gambin |
---|---|
Year of publication: | 2018 |
Subject: |
0301 basic medicine;
Noise (signal processing); business.industry; Computer science; Noise reduction; Pattern recognition; RNA-Seq; Genomics; Regularization (mathematics); 03 medical and health sciences; Matrix (mathematics); 030104 developmental biology; Artificial intelligence; business; Cluster analysis; Robust principal component analysis |
Source: | Bioinformatics Research and Applications (ISBRA), ISBN: 9783319949673 |
DOI: | 10.1007/978-3-319-94968-0_32 |
Description: | The development of single cell RNA sequencing (scRNA-seq) has enabled innovative approaches to investigating mRNA abundances. In our study, we are interested in extracting the systematic patterns of scRNA-seq data in an unsupervised manner; to this end, we have developed two extensions of robust principal component analysis (RPCA). First, we present a truncated version of RPCA (tRPCA) that is much faster and more memory-efficient. Second, we introduce noise reduction in tRPCA with \(L_2\) regularization (tRPCAL2). Unlike RPCA, which considers only a low-rank matrix L and a sparse matrix S, the proposed method also extracts a noise matrix E inherent in modern genomic data. We demonstrate its usefulness by applying our methods to peripheral blood mononuclear cell (PBMC) scRNA-seq data. In particular, clustering the low-rank matrix L yields better classification of unlabeled single cells. Overall, the proposed variants are well-suited for the high-dimensional and noisy data that are routinely generated in genomics. |
Database: | OpenAIRE |
External link: |
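
To make the three-part decomposition mentioned in the abstract more concrete, the following is a minimal sketch of splitting a data matrix M into a low-rank part L, a sparse part S, and a dense residual E. It is not the authors' tRPCA or tRPCAL2 algorithm, whose details are not given in this record. It assumes a generic alternating scheme that minimizes 0.5 * ||M - L - S||_F^2 + lam * ||S||_1 subject to rank(L) <= k, using a truncated SVD for L and soft-thresholding for S. The function names and the parameters `k`, `lam`, and `n_iter` are hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise soft-thresholding: shrink each value toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def truncated_svd(x, k):
    """Best rank-k approximation of x in the Frobenius norm."""
    u, sing, vt = np.linalg.svd(x, full_matrices=False)
    return (u[:, :k] * sing[:k]) @ vt[:k, :]


def decompose(m, k, lam, n_iter=50):
    """Illustrative split M = L + S + E (an assumed scheme, not the paper's).

    L: rank <= k, S: sparse (L1-penalized), E: residual penalized by its
    squared Frobenius (L2) norm. Returns L, S, E and the objective history.
    """
    m = np.asarray(m, dtype=float)
    sparse = np.zeros_like(m)
    history = []
    for _ in range(n_iter):
        # Exact minimizer over L with S fixed: truncated SVD of M - S.
        low_rank = truncated_svd(m - sparse, k)
        # Exact minimizer over S with L fixed: soft-threshold of M - L.
        sparse = soft_threshold(m - low_rank, lam)
        # Whatever is left over is treated as dense noise.
        noise = m - low_rank - sparse
        history.append(0.5 * np.sum(noise**2) + lam * np.abs(sparse).sum())
    return low_rank, sparse, noise, history
```

Because each step exactly minimizes the objective over one block of variables, the recorded objective values do not increase from one iteration to the next. On real scRNA-seq matrices a full SVD per iteration would be expensive; a partial SVD routine would be the natural substitute, which is presumably the kind of saving the truncated variant in the abstract is aiming at.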