Scalable Genomics with R and Bioconductor

Authors: Lawrence, Michael; Morgan, Martin
Year of publication: 2014
Subject:
Source: Statistical Science 2014, Vol. 29, No. 2, 214-226
Document type: Working Paper
DOI: 10.1214/14-STS476
Description: This paper reviews strategies for solving problems encountered when analyzing large genomic data sets and describes the implementation of those strategies in R by packages from the Bioconductor project. We treat the scalable processing, summarization and visualization of big genomic data. The general ideas are well established and include restrictive queries, compression, iteration and parallel computing. We demonstrate the strategies by applying Bioconductor packages to the detection and analysis of genetic variants from a whole genome sequencing experiment.
Comment: Published in Statistical Science (http://www.imstat.org/sts/) at http://dx.doi.org/10.1214/14-STS476 by the Institute of Mathematical Statistics (http://www.imstat.org)
Database: arXiv