CellMixS: quantifying and visualizing batch effects in single-cell RNA-seq data.

Authors: Lütge A; Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland.; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland., Zyprych-Walczak J; Department of Mathematical and Statistical Methods, Poznan University of Life Sciences, Poznań, Poland., Brykczynska Kunzmann U; Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland., Crowell HL; Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland.; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland., Calini D; F. Hoffmann-La Roche Ltd, Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, Roche Innovation Center Basel, Basel, Switzerland., Malhotra D; F. Hoffmann-La Roche Ltd, Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, Roche Innovation Center Basel, Basel, Switzerland., Soneson C; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland.; Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland., Robinson MD; Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland, mark.robinson@mls.uzh.ch.; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland.
Language: English
Source: Life Science Alliance [Life Sci Alliance] 2021 Mar 23; Vol. 4 (6). Date of Electronic Publication: 2021 Mar 23 (Print Publication: 2021).
DOI: 10.26508/lsa.202001004
Abstract: A key challenge in single-cell RNA-sequencing (scRNA-seq) data analysis is batch effects, which can obscure the biological signal of interest. Although there are various tools and methods to correct for batch effects, their performance can vary. Therefore, it is important to understand how batch effects manifest in order to adjust for them. Here, we systematically explore batch effects across various scRNA-seq datasets according to magnitude, cell type specificity, and complexity. We developed a cell-specific mixing score (cms) that quantifies the mixing of cells from multiple batches. By considering distance distributions, the score is able to detect local batch bias as well as differentiate between unbalanced batches and systematic differences between cells of the same cell type. We compare metrics in scRNA-seq data using real and synthetic datasets. Although these metrics target the same question and are used interchangeably, we find differences in scalability, sensitivity, and ability to handle differentially abundant cell types. We find that cell-specific metrics outperform cell type-specific and global metrics and recommend them for both method benchmarks and batch exploration.
(© 2021 Lütge et al.)
Database: MEDLINE