Truncated Robust Principal Component Analysis and Noise Reduction for Single Cell RNA-seq Data

Authors: Maciej Sykulski, Krzysztof Gogolewski, Neo Christopher Chung, Anna Gambin
Year of publication: 2018
Subject:
Source: Bioinformatics Research and Applications ISBN: 9783319949673
ISBRA
DOI: 10.1007/978-3-319-94968-0_32
Description: The development of single cell RNA sequencing (scRNA-seq) has enabled innovative approaches to investigating mRNA abundances. In our study, we are interested in extracting the systematic patterns of scRNA-seq data in an unsupervised manner, so we have developed two extensions of robust principal component analysis (RPCA). First, we present a truncated version of RPCA (tRPCA), which is much faster and more memory-efficient. Second, we introduce noise reduction into tRPCA through \(L_2\) regularization (tRPCAL2). Unlike RPCA, which considers only a low-rank matrix L and a sparse matrix S, the proposed method can also extract a noise matrix E inherent in modern genomic data. We demonstrate its usefulness by applying our methods to peripheral blood mononuclear cell (PBMC) scRNA-seq data. In particular, clustering the low-rank matrix L yields better classification of unlabeled single cells. Overall, the proposed variants are well suited to the high-dimensional and noisy data that are routinely generated in genomics.
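To make the three-part decomposition described above concrete, here is a minimal sketch in Python with NumPy. It is not the authors' tRPCA or tRPCAL2 algorithm, which this record does not specify. It is a generic alternating scheme, assumed here for illustration only, that fits a rank-k matrix L by truncated SVD, obtains a sparse matrix S by soft-thresholding, and treats the remainder E = M - L - S as noise. The function names and the parameters k, lam and n_iter are hypothetical.

```python
import numpy as np


def soft_threshold(X, tau):
    """Shrink every entry of X toward zero by tau (entries below tau become 0)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def rank_k_approx(X, k):
    """Best rank-k approximation of X, keeping the k largest singular values."""
    # A thin SVD is computed in full here for simplicity; a truncated solver
    # that only computes k components would be the faster choice on large data.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


def decompose(M, k, lam, n_iter=50):
    """Split M into low-rank L, sparse S and residual noise E (illustrative only)."""
    M = np.asarray(M, dtype=float)
    S = np.zeros_like(M)
    L = np.zeros_like(M)
    for _ in range(n_iter):
        L = rank_k_approx(M - S, k)      # low-rank step with S held fixed
        S = soft_threshold(M - L, lam)   # sparse step with L held fixed
    E = M - L - S                        # what neither L nor S explains
    return L, S, E


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))  # rank-2 signal
    M[3, 4] += 10.0                                          # one sparse outlier
    L, S, E = decompose(M, k=2, lam=0.5)
    print(np.linalg.matrix_rank(L), np.count_nonzero(S), np.abs(E).max())
```

In this sketch every entry of E is bounded in magnitude by lam, because soft-thresholding moves anything larger into S. A clustering step of the kind the abstract mentions would then be run on L rather than on M.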
Database: OpenAIRE