BatchQC: interactive software for evaluating sample and batch effects in genomic data.

Authors: Manimaran S; Department of Biostatistics, Boston University, Boston, MA.; Division of Computational Biomedicine, Boston University School of Medicine, Boston, MA., Selby HM; Bioinformatics Program, Boston University, Boston, MA., Okrah K; gRED Oncology Biostatistics, Genentech, South San Francisco, CA., Ruberman C; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD., Leek JT; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD., Quackenbush J; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA.; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA., Haibe-Kains B; Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, Ontario, Canada.; Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada.; Ontario Institute of Cancer Research, Toronto, Ontario, Canada., Bravo HC; Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD., Johnson WE; Department of Biostatistics, Boston University, Boston, MA.; Division of Computational Biomedicine, Boston University School of Medicine, Boston, MA.; Bioinformatics Program, Boston University, Boston, MA.
Language: English
Source: Bioinformatics (Oxford, England) [Bioinformatics] 2016 Dec 15; Vol. 32 (24), pp. 3836-3838. Date of Electronic Publication: 2016 Aug 18.
DOI: 10.1093/bioinformatics/btw538
Abstract: Sequencing and microarray samples are often collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. There are several existing batch adjustment tools for '-omics' data, but they do not indicate a priori whether adjustment needs to be conducted or how correction should be applied. We present a software pipeline, BatchQC, which addresses these issues using interactive visualizations and statistics that evaluate the impact of batch effects in a genomic dataset. BatchQC can also apply existing adjustment tools and allow users to evaluate their benefits interactively. We used the BatchQC pipeline on both simulated and real data to demonstrate the effectiveness of this software toolkit.
Availability and Implementation: BatchQC is available through Bioconductor: http://bioconductor.org/packages/BatchQC and GitHub: https://github.com/mani2012/BatchQC
Contact: wej@bu.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
(© The Author 2016. Published by Oxford University Press.)
Database: MEDLINE