Removal of batch effects using distribution-matching residual networks
Author: | Kelly P. Stanton, Huamin Li, Jun Zhao, Yuval Kluger, Ruth R. Montgomery, Khadir Raddassi, Uri Shaham |
---|---|
Publication year: | 2017 |
Subject: |
FOS: Computer and information sciences
0301 basic medicine; Statistics and Probability; Systematic error; Multivariate statistics; Matching (statistics); Computer science; Statistics as Topic; Machine Learning (stat.ML); Residual; computer.software_genre; Biochemistry; Machine Learning; 03 medical and health sciences; 0302 clinical medicine; Statistics - Machine Learning; Humans; Molecular Biology; Observational error; Sequence Analysis RNA; Computational Biology; Original Papers; Data Accuracy; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Distribution (mathematics); Computational Theory and Mathematics; 030220 oncology & carcinogenesis; Measuring instrument; Cytophotometry; Data mining; Single-Cell Analysis; computer |
Source: | Bioinformatics. 33:2539-2546 |
ISSN: | 1367-4811 1367-4803 |
Description: | Sources of variability in experimentally derived data include measurement error in addition to the physical phenomena of interest. This measurement error is a combination of systematic components, originating from the measuring instrument, and random measurement errors. Several novel biological technologies, such as mass cytometry and single-cell RNA-seq, are plagued with systematic errors that may severely affect statistical analysis if the data are not properly calibrated. We propose a novel deep learning approach for removing systematic batch effects. Our method is based on a residual network, trained to minimize the Maximum Mean Discrepancy (MMD) between the multivariate distributions of two replicates, measured in different batches. We apply our method to mass cytometry and single-cell RNA-seq datasets, and demonstrate that it effectively attenuates batch effects. |
Database: | OpenAIRE |
External link: |
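
The description names two ingredients: a residual mapping applied to one batch, and the Maximum Mean Discrepancy (MMD) between the two batches as the quantity to minimize. The following is a minimal NumPy sketch of those two pieces, not the authors' implementation. It assumes a Gaussian kernel with a single fixed bandwidth `sigma`, a biased MMD estimator, and one residual block; the function names, layer sizes, and synthetic data are all hypothetical, and no training loop is shown.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances via broadcasting.
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    sq = np.maximum(sq, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq / (2.0 * sigma**2))


def mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    kxx = gaussian_kernel(x, x, sigma).mean()
    kyy = gaussian_kernel(y, y, sigma).mean()
    kxy = gaussian_kernel(x, y, sigma).mean()
    return kxx + kyy - 2.0 * kxy


def residual_map(x, w1, b1, w2, b2):
    """One residual block: the input plus a small ReLU correction."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return x + hidden @ w2 + b2


# Synthetic illustration (assumed data, not from the paper):
# batch_b is batch_a's distribution plus a constant systematic shift.
rng = np.random.default_rng(0)
batch_a = rng.normal(size=(200, 5))
batch_a_rep = rng.normal(size=(200, 5))        # same distribution as batch_a
batch_b = rng.normal(size=(200, 5)) + 2.0      # shifted "batch effect"

mmd_same = mmd2(batch_a, batch_a_rep)
mmd_shifted = mmd2(batch_a, batch_b)

# With all-zero weights the residual block is exactly the identity,
# so an untrained block starts from "no correction".
w1, b1 = np.zeros((5, 8)), np.zeros(8)
w2, b2 = np.zeros((8, 5)), np.zeros(5)
untrained = residual_map(batch_b, w1, b1, w2, b2)

print(f"MMD^2, same distribution: {mmd_same:.4f}")
print(f"MMD^2, shifted batch:     {mmd_shifted:.4f}")
```

In this sketch the squared MMD is clearly larger for the shifted batch than for two draws from the same distribution, which is the signal a training procedure would drive down by adjusting the weights of `residual_map`. Starting the block at the identity reflects the assumption that a batch effect is a moderate perturbation of the data rather than a wholesale change.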