ToTem: a tool for variant calling pipeline optimization
Authors: | Jitka Malčíková, Vojtech Bystry, Nikola Tom, Tobias Rausch, Miroslav Kolarik, Vladimir Benes, Šárka Pavlová, Ondrej Tom, Blanka Kubešová, Šárka Pospíšilová |
---|---|
Language: | English |
Publication year: | 2018 |
Subject: | 0301 basic medicine; Computer science; computer.software_genre; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; DNA sequencing; Germline; 03 medical and health sciences; Structural Biology; Next generation sequencing; Variant calling; Molecular Biology; lcsh:QH301-705.5; Whole genome sequencing; Programming language; Applied Mathematics; Totem; Process (computing); Computational Biology; High-Throughput Nucleotide Sequencing; Reproducibility of Results; Pipeline (software); Computer Science Applications; Benchmarking; 030104 developmental biology; Parameter optimization; lcsh:Biology (General); Research Design; Benchmark (computing); lcsh:R858-859.7; DNA microarray; computer; Software |
Source: | BMC Bioinformatics, Vol 19, Iss 1, Pp 1-9 (2018) |
ISSN: | 1471-2105 |
Description: | Background: High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters for optimal precision and recall. Results: Here we introduce ToTem, a tool for automated pipeline optimization. ToTem is a stand-alone web application with a comprehensive graphical user interface (GUI). ToTem is written in Java and PHP with an underlying connection to a MySQL database. Its primary role is to automatically generate, execute and benchmark different variant calling pipeline settings. Our tool allows an analysis to be started from any level of the process, with the possibility of plugging in almost any tool or code. To prevent over-fitting of pipeline parameters, ToTem ensures their reproducibility by using cross-validation techniques that penalize the final precision, recall and F-measure. The results are presented as interactive graphs and tables, allowing an optimal pipeline to be selected based on the user's priorities. Using ToTem, we were able to optimize somatic variant calling from ultra-deep targeted gene sequencing (TGS) data and germline variant detection in whole genome sequencing (WGS) data. Conclusions: ToTem is a tool for automated pipeline optimization which is freely available as a web application at https://totem.software. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2227-x) contains supplementary material, which is available to authorized users. |
Database: | OpenAIRE |
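The abstract names precision, recall and F-measure as the metrics used to benchmark pipeline settings. The sketch below shows how these three values are commonly computed when a set of called variants is compared against a truth set. It is an illustration only and not ToTem's code: the `Variant` tuple layout, the `benchmark` function and the example variants are all assumptions made here, and the cross-validation penalty mentioned in the abstract is not modeled because the record does not describe how it is applied.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant representation: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def benchmark(called: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare called variants with a truth set by exact match."""
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but not called

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Made-up example data, for illustration only.
truth_set = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr3", 42, "T", "C"),
}
called_set = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr5", 999, "A", "T"),  # not in the truth set: a false positive
}

result = benchmark(called_set, truth_set)
print(result)
```

In this example three of the four called variants match the truth set and one true variant is missed, so all three metrics come out at 0.75. Running the same comparison for each candidate pipeline setting yields the kind of table a user could rank according to whether precision or recall matters more to them.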