Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants
Author: | Teresita Díaz de Ståhl, Maxime Garcia, Valtteri Wirta, Sebastian DiLorenzo, Björn Nystedt, Max Käller, Pall I Olason, Marcel Martin, Jesper Eisfeldt, Szilveszter Juhos, Johanna Sandgren, Philip Ewels, Malin Larsson, Monica Nistér |
---|---|
Language: | English |
Year of publication: | 2020 |
Subject: | Source code; Computer science; Germline variants; Computational biology; General Biochemistry, Genetics and Molecular Biology; Workflow; Software portability; Annotation; Documentation; Software; Genetics; Humans; Analysis workflow; General Pharmacology, Toxicology and Pharmaceutics; Cancer; Whole genome sequencing; General Immunology and Microbiology; Software Tool Article; Somatic variants; General Medicine; Articles; Identification (information); Germ Cells |
Source: | F1000Research |
ISSN: | 2046-1402 |
Description: | Whole-genome sequencing (WGS) is a fundamental technology for research to advance precision medicine, but the limited availability of portable and user-friendly workflows for WGS analyses poses a major challenge for many research groups and hampers scientific progress. Here we present Sarek, an open-source workflow to detect germline variants and somatic mutations based on sequencing data from WGS, whole-exome sequencing (WES), or gene panels. Sarek features (i) easy installation, (ii) robust portability across different computer environments, (iii) comprehensive documentation, (iv) transparent and easy-to-read code, and (v) extensive quality metrics reporting. Sarek is implemented in the Nextflow workflow language and supports both Docker and Singularity containers as well as Conda environments, making it ideal for easy deployment on any POSIX-compatible computer or cloud compute environment. Sarek follows the GATK best-practice recommendations for read alignment and pre-processing, and includes a wide range of software for the identification and annotation of germline and somatic single-nucleotide variants, insertion and deletion variants, structural variants, tumour sample purity, and variations in ploidy and copy number. Sarek offers easy, efficient, and reproducible WGS analyses, and can readily be used both as a production workflow at sequencing facilities and as a powerful stand-alone tool for individual research groups. The Sarek source code, documentation and installation instructions are freely available at https://github.com/nf-core/sarek and at https://nf-co.re/sarek/. |
Database: | OpenAIRE |
External link: |