Sparse functional data analysis accounts for missing information in single-cell epigenomics

Author: Xiongtao Dai, Pantelis Z. Hadjipantelis, Pedro Madrigal
Language: English
Year of publication: 2018
Subject:
DOI: 10.1101/504365
Description: Single-cell epigenome assays produce sparsely sampled data, so coverage is often pooled across cells to increase resolution. Imputation of missing data using deep learning is available but computationally intensive, and it has been applied only to DNA methylation obtained by single-cell bisulfite sequencing. Here, sparsity in chromatin accessibility obtained by scNMT-seq is addressed using functional data analysis: sparsely sampled GpC coverage profiles of individual cells are fitted taking into account all cells of the same cell type or condition. To this end, sparse functional principal component analysis (S-FPCA) is applied, and the principal components are used to estimate chromatin accessibility coverage in individual cells. This methodology can potentially be used with other single-cell assays with missing data, such as scBS-seq, scNOME-seq, or scATAC-seq. The R package fdapace is available on CRAN, and the R code used in this manuscript can be found at: http://github.com/pmb59/sparseSingleCell.
Database: OpenAIRE