HaTSPiL: A modular pipeline for high throughput sequencing data analysis

Authors: Lisa Marie Simon, Edoardo Morandi, Isabelle Laurence Polignano, Silvia Deaglio, Giulia Basile, Andrea Lauria, Danny Incarnato, Caterina Parlato, Salvatore Oliviero, Elisa Tirtei, Matteo Cereda, Francesca Arruga, Francesca Anselmi, Franca Fagioli
Contributors: Molecular Genetics
Language: English
Publication year: 2019
Subject:
Data Analysis
Research Validity
Computer science
Molecular biology
Mutagenesis and Gene Deletion Techniques
DNA barcoding
Computer Architecture
Workflow
0302 clinical medicine
Software
Sequencing techniques
Software Design
DNA sequencing
0303 health sciences
Multidisciplinary
Software Engineering
High-Throughput Nucleotide Sequencing
Genomics
Research Assessment
DNA Barcoding
Medicine
Engineering and Technology
Transcriptome Analysis
Sequence Analysis
Research Article
Next-Generation Sequencing
Computer and Information Sciences
Sequence analysis
Reliability (computer networking)
Science
Sample (statistics)
DNA
DNA Barcoding, Taxonomic
Humans
Reproducibility of Results
Sequence Analysis, DNA
Computer Software
03 medical and health sciences
Genetics
030304 developmental biology
Biology and life sciences
business.industry
Software Tools
Computational Biology
Naming convention (programming)
Modular design
Genome Analysis
Pipeline (software)
Research and analysis methods
Molecular biology techniques
Mutational Analysis
Programming Languages
Software engineering
business
030217 neurology & neurosurgery
User Interfaces
Source: PLoS ONE
PLoS ONE, 14(10):e0222512. PUBLIC LIBRARY SCIENCE
PLoS ONE, Vol 14, Iss 10, p e0222512 (2019)
ISSN: 1932-6203
Description: BACKGROUND: Next-generation sequencing (NGS) methods are widely adopted for a broad range of scientific purposes, from basic research to health-related studies. Decreasing costs per analysis have led to large volumes of generated data and, in turn, to improved software for the respective analyses. As a consequence, many approaches have been developed to chain different software tools into reliable and reproducible workflows. However, the wide range of NGS applications makes it challenging to manage many different workflows without losing reliability. METHODS: We present HaTSPiL, a high-throughput sequencing pipeline: a Python-powered CLI tool designed to handle different data analysis approaches with a high level of reliability. The software relies on barcoding filenames with a human-readable naming convention that encodes all the sample information the software needs to automatically choose workflows and parameters. HaTSPiL is highly modular and customisable, allowing users to extend its features for any specific need. CONCLUSIONS: HaTSPiL is licensed as Free Software under the MIT license and is available at https://github.com/dodomorandi/hatspil.
Database: OpenAIRE
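
The filename-barcoding idea described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration only: the field layout (`<project>-<patient>-<tissue>-<analyte>.<ext>`), the analyte codes, the workflow names, and the helpers `parse_barcode` and `choose_workflow` are hypothetical and are not taken from HaTSPiL's actual naming convention or API.

```python
import os
import re
from typing import NamedTuple

# HYPOTHETICAL convention, not HaTSPiL's real format:
#   <project>-<patient>-<two-digit tissue code>-<analyte letter>.<extension>
BARCODE_PATTERN = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)-"
    r"(?P<patient>[A-Za-z0-9]+)-"
    r"(?P<tissue>\d{2})-"
    r"(?P<analyte>[A-Z])"
    r"\.(?P<ext>.+)$"
)

# HYPOTHETICAL mapping from analyte code to a workflow name.
WORKFLOWS = {
    "D": "dna_variant_calling",
    "R": "rna_expression",
}


class Barcode(NamedTuple):
    """Sample information decoded from a barcoded filename."""
    project: str
    patient: str
    tissue: str
    analyte: str


def parse_barcode(path: str) -> Barcode:
    """Decode the sample fields from the file name part of `path`."""
    filename = os.path.basename(path)
    match = BARCODE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"filename does not follow the convention: {filename!r}")
    return Barcode(
        project=match.group("project"),
        patient=match.group("patient"),
        tissue=match.group("tissue"),
        analyte=match.group("analyte"),
    )


def choose_workflow(barcode: Barcode) -> str:
    """Pick a workflow name from the analyte code stored in the barcode."""
    try:
        return WORKFLOWS[barcode.analyte]
    except KeyError:
        raise ValueError(f"no workflow for analyte code {barcode.analyte!r}") from None


if __name__ == "__main__":
    sample = parse_barcode("/data/run1/PRJ1-P001-01-D.fastq.gz")
    print(sample, choose_workflow(sample))
```

In this sketch the filename is the only source of sample metadata, so a file that does not match the convention is rejected early instead of being silently routed to a default workflow.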