HaTSPiL: A modular pipeline for high throughput sequencing data analysis
Authors: | Lisa Marie Simon, Edoardo Morandi, Isabelle Laurence Polignano, Silvia Deaglio, Giulia Basile, Andrea Lauria, Danny Incarnato, Caterina Parlato, Salvatore Oliviero, Elisa Tirtei, Matteo Cereda, Francesca Arruga, Francesca Anselmi, Franca Fagioli |
---|---|
Contributors: | Molecular Genetics |
Language: | English |
Year of publication: | 2019 |
Subject: |
Data Analysis; Research Validity; Computer science; Molecular biology; Mutagenesis and Gene Deletion Techniques; DNA barcoding; Computer Architecture; Workflow; 0302 clinical medicine; Software; Sequencing techniques; Software Design; DNA sequencing; 0303 health sciences; Multidisciplinary; Software Engineering; High-Throughput Nucleotide Sequencing; Genomics; Research Assessment; Medicine; Engineering and Technology; Transcriptome Analysis; Sequence Analysis; Research Article; Next-Generation Sequencing; Computer and Information Sciences; Reliability (computer networking); Science; Sample (statistics); DNA Barcoding, Taxonomic; Humans; Reproducibility of Results; Sequence Analysis, DNA; Computer Software; 03 medical and health sciences; Genetics; 030304 developmental biology; Biology and life sciences; Software Tools; Computational Biology; Naming convention (programming); Modular design; Genome Analysis; Pipeline (software); Research and analysis methods; Molecular biology techniques; Mutational Analysis; Programming Languages; 030217 neurology & neurosurgery; User Interfaces |
Source: | PLoS ONE, Vol 14, Iss 10, p e0222512 (2019). Public Library of Science |
ISSN: | 1932-6203 |
Description: | BACKGROUND: Next-generation sequencing (NGS) methods are widely adopted for a broad range of scientific purposes, from pure research to health-related studies. The decreasing cost per analysis has led to large amounts of generated data and, in turn, to improved software for the respective analyses. As a consequence, many approaches have been developed to chain different software tools in order to obtain reliable and reproducible workflows. However, the wide range of applications for NGS entails the challenge of managing many different workflows without losing reliability. METHODS: We here present HaTSPiL, a high-throughput sequencing pipeline: a Python-powered CLI tool designed to handle different approaches to data analysis with a high level of reliability. The software relies on the barcoding of filenames using a human-readable naming convention that encodes all the information about the sample that the software needs to automatically choose between different workflows and parameters. HaTSPiL is highly modular and customisable, allowing users to extend its features for any specific need. CONCLUSIONS: HaTSPiL is licensed as Free Software under the MIT license and is available at https://github.com/dodomorandi/hatspil. |
Database: | OpenAIRE |
External link: |
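
The filename barcoding idea mentioned in the description can be illustrated with a minimal Python sketch. It is not HaTSPiL's actual naming convention or API: the barcode layout (`<project>-<sample>-<tissue>-<analyte>.fastq`), the names `parse_barcode` and `choose_workflow`, and the workflow labels are all hypothetical, chosen only to show how a human-readable filename could carry enough information to select a workflow automatically.

```python
import re
from typing import NamedTuple

# Hypothetical barcode layout (an assumption, NOT HaTSPiL's real convention):
#   <project>-<sample>-<tissue>-<analyte>.fastq
# for example "PRJ1-S042-TUMOR-RNA.fastq"
BARCODE_RE = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)-(?P<sample>[A-Za-z0-9]+)-"
    r"(?P<tissue>TUMOR|NORMAL)-(?P<analyte>DNA|RNA)\.fastq$"
)


class Barcode(NamedTuple):
    """Sample information decoded from a barcoded filename."""
    project: str
    sample: str
    tissue: str
    analyte: str


# Hypothetical mapping from analyte type to a workflow label.
WORKFLOWS = {
    "DNA": "variant_calling",
    "RNA": "expression_quantification",
}


def parse_barcode(filename: str) -> Barcode:
    """Decode the barcode fields, rejecting names that break the convention."""
    match = BARCODE_RE.match(filename)
    if match is None:
        raise ValueError(
            f"filename does not follow the naming convention: {filename!r}"
        )
    return Barcode(**match.groupdict())


def choose_workflow(filename: str) -> str:
    """Pick a workflow label from the analyte encoded in the filename."""
    return WORKFLOWS[parse_barcode(filename).analyte]
```

Under these assumptions, `choose_workflow("PRJ1-S042-TUMOR-RNA.fastq")` selects the RNA workflow label, while a file whose name does not match the convention is rejected up front rather than processed with guessed parameters.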