A workflow for parallel and distributed computing of large-scale genomic data

Authors: Hyun-Hwa Choi, Byoung-Seob Kim, Seungjo Bae, Shin-Young Ahn
Publication year: 2013
Subject:
Source: ICITST
Description: Workflow management systems are emerging as a dominant solution in bioinformatics because they enable researchers to analyze the huge amounts of data generated by modern laboratory equipment. The growth of genomic data generated by next-generation sequencing (NGS) results in an increasing need to analyze data on distributed computer clusters. In this paper, we construct a semi-automated workflow system for the analysis of large-scale sequence data sets, describe a pipeline designed with parallel computation to perform the optimal computational steps required to analyze whole-genome sequence data, and report the overall execution time of the pipeline using cores on multiple machines.
Database: OpenAIRE