Objective review of de novo stand-alone error correction methods for NGS data

Author: Alic, Andrei Stefan; Ruzafa, David; Dopazo, Joaquin; Blanquer Espert, Ignacio
Year of publication: 2016
Source: WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE
r-IIS La Fe. Repositorio Institucional de Producción Científica del Instituto de Investigación Sanitaria La Fe
r-CIPF. Repositorio Institucional Producción Científica del Centro de Investigación Principe Felipe (CIPF)
Centro de Investigación Principe Felipe (CIPF)
RiuNet. Repositorio Institucional de la Universitat Politècnica de València
ISSN: 1759-0876
Description: [EN] The sequencing market has increased steadily over the last few years, with different approaches to read DNA information prone to different types of errors. Multiple studies demonstrated the impact of sequencing errors on different applications of next-generation sequencing (NGS), making error correction a fundamental initial step. Different methods in the literature use different approaches and fit different types of problems. We analyzed 50 methods divided into five main approaches (k-spectrum, suffix arrays, multiple-sequence alignment, read clustering, and probabilistic models). They are not published as a part of a suite (stand-alone), and target raw, unprocessed data without an existing reference genome (de novo). These correctors handle one or more sequencing technologies using the same or different approaches. They face general challenges (sometimes with specific traits for specific technologies) such as repetitive regions, uncalled bases, and ploidy. Even assessing their performance is a challenge in itself because of the approach taken by various authors, the unknown factor (de novo), and the behavior of the third-party tools employed in the benchmarks. This study aims to help the researcher in the field to advance the field of error correction, the educator to have a brief but comprehensive companion, and the bioinformatician to choose the right tool for the right job. © 2016 John Wiley & Sons, Ltd
We want to thank our colleague Eloy Romero Alcale who has provided valuable advice regarding the structure of the document. This work was supported by Generalitat Valenciana [GRISOLIA/2013/013 to A.A.].
Database: OpenAIRE