CIPR: a web-based R/shiny app and R package to annotate cell clusters in single cell RNA sequencing experiments

Authors: W. Zac Stephens, H. Atakan Ekiz, Ryan M. O'Connell, Christopher J. Conley
Language: English
Year of publication: 2020
Subject:
Identity prediction
Computer science
computer.software_genre
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
Similarity
Mice
03 medical and health sciences
0302 clinical medicine
Cluster analysis
Structural Biology
Databases, Genetic
Gene expression
Animals
Humans
Web application
Molecular Biology
Gene
lcsh:QH301-705.5
Cell Aggregation
030304 developmental biology
Internet
0303 health sciences
Base Sequence
Sequence Analysis, RNA
business.industry
Applied Mathematics
Single cell RNA-sequencing
Immune cells
RNA
Molecular Sequence Annotation
Pipeline (software)
Expression (mathematics)
Gene expression profiling
Computer Science Applications
lcsh:Biology (General)
Filter (video)
Identity (object-oriented programming)
lcsh:R858-859.7
Data mining
Single-Cell Analysis
DNA microarray
business
computer
Algorithms
Software
030217 neurology & neurosurgery
Source: BMC Bioinformatics, Vol 21, Iss 1, Pp 1-15 (2020)
BMC Bioinformatics
ISSN: 1471-2105
DOI: 10.1186/s12859-020-3538-2
Description: Background: Single cell RNA sequencing (scRNAseq) has provided invaluable insights into cellular heterogeneity and functional states in health and disease. During the analysis of scRNAseq data, annotating the biological identity of cell clusters is an important step before downstream analyses, and it remains technically challenging. Current solutions for annotating single cell clusters generally lack a graphical user interface, can be computationally intensive, or have a limited scope. On the other hand, manually annotating single cell clusters by examining the expression of marker genes can be subjective and labor-intensive. To improve the quality and efficiency of annotating cell clusters in scRNAseq data, we present a web-based R/Shiny app and R package, Cluster Identity PRedictor (CIPR), which provides a graphical user interface to quickly score gene expression profiles of unknown cell clusters against mouse or human references, or against a custom dataset provided by the user. CIPR can be easily integrated into current pipelines to facilitate scRNAseq data analysis.
Results: CIPR employs multiple approaches for calculating the identity score at the cluster level and can accept inputs generated by popular scRNAseq analysis software. CIPR provides 2 mouse and 5 human reference datasets, and its pipeline allows inter-species comparisons and the upload of a custom reference dataset for specialized studies. The options to filter out lowly variable genes and to exclude irrelevant reference cell subsets from the analysis can improve the discriminatory power of CIPR, suggesting that it can be tailored to different experimental contexts. Benchmarking CIPR against existing functionally similar software revealed that our algorithm is less computationally demanding, runs significantly faster, and provides accurate predictions for multiple cell clusters in a scRNAseq experiment involving tumor-infiltrating immune cells.
Conclusions: CIPR facilitates scRNAseq data analysis by annotating unknown cell clusters in an objective and efficient manner. Platform independence, owing to the Shiny framework, and the minimal programming experience required allow this software to be used by researchers from different backgrounds. CIPR can accurately predict the identity of a variety of cell clusters and can be used in various experimental contexts across a broad spectrum of research areas.
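To make the idea of cluster-level identity scoring more concrete, the following is a minimal Python sketch of one plausible approach: correlating a cluster's expression profile with each reference cell type after optionally discarding the least variable genes. It is not CIPR's implementation (CIPR is an R package, and the abstract does not specify its scoring methods). The function names, the `keep_fraction` parameter, and the toy expression values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def select_variable_genes(reference_profiles, genes, keep_fraction):
    """Keep the most variable fraction of genes across the reference cell types."""
    def variance(gene):
        values = [profile[gene] for profile in reference_profiles.values()]
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    # Always keep at least two genes so a correlation can be computed.
    n_keep = max(2, int(len(genes) * keep_fraction))
    return sorted(genes, key=variance, reverse=True)[:n_keep]


def score_cluster(cluster_profile, reference_profiles, keep_fraction=1.0):
    """Rank reference cell types by correlation with one cluster's profile."""
    # Only genes measured in the cluster and in every reference are comparable.
    shared = sorted(
        g for g in cluster_profile
        if all(g in profile for profile in reference_profiles.values())
    )
    genes = select_variable_genes(reference_profiles, shared, keep_fraction)
    cluster_values = [cluster_profile[g] for g in genes]
    scores = {
        label: pearson(cluster_values, [profile[g] for g in genes])
        for label, profile in reference_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data invented for illustration; not taken from any CIPR reference dataset.
REFERENCES = {
    "T cell": {"Cd3e": 8.0, "Cd19": 0.5, "Lyz2": 0.3, "Actb": 10.0},
    "B cell": {"Cd3e": 0.4, "Cd19": 7.5, "Lyz2": 0.2, "Actb": 10.0},
    "Macrophage": {"Cd3e": 0.2, "Cd19": 0.3, "Lyz2": 9.0, "Actb": 10.1},
}
CLUSTER = {"Cd3e": 7.0, "Cd19": 0.6, "Lyz2": 0.5, "Actb": 9.8}

if __name__ == "__main__":
    for label, score in score_cluster(CLUSTER, REFERENCES, keep_fraction=0.75):
        print(f"{label}: {score:.3f}")
```

In this toy example, setting `keep_fraction=0.75` drops the housekeeping-like gene `Actb`, which barely varies across the references and therefore contributes little to telling cell types apart. This mirrors, in spirit only, the abstract's point that filtering lowly variable genes can improve discriminatory power.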
Database: OpenAIRE