Analysis of quality raw data of second generation sequencers with Quality Assessment Software.

Authors: Ramos RT (Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém-PA, Brazil; asilva@ufpa.br), Carneiro AR, Baumbach J, Azevedo V, Schneider MP, Silva A
Language: English
Source: BMC research notes [BMC Res Notes] 2011 Apr 18; Vol. 4, pp. 130. Date of Electronic Publication: 2011 Apr 18.
DOI: 10.1186/1756-0500-4-130
Abstract: Background: Second generation technologies have advantages over Sanger; however, they have resulted in new challenges for the genome construction process, especially because of the small size of the reads, despite the high degree of coverage. Independent of the program chosen for the construction process, DNA sequences are superimposed, based on identity, to extend the reads, generating contigs; mismatches indicate a lack of homology and are not included. This process improves our confidence in the sequences that are generated.
Findings: We developed Quality Assessment Software, with which one can review graphs showing the distribution of quality values from the sequencing reads. This software allows us to adopt more stringent quality standards for sequence data, based on quality-graph analysis and estimated coverage after applying the quality filter, providing acceptable sequence coverage for genome construction from short reads.
Conclusions: Quality filtering is a fundamental step in the process of constructing genomes, as it reduces the frequency of incorrect alignments that are caused by measuring errors, which can occur during the construction process due to the size of the reads, provoking misassemblies. Application of quality filters to sequence data, using the software Quality Assessment, along with graphing analyses, provided greater precision in the definition of cutoff parameters, which increased the accuracy of genome construction.
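Illustration: The trade-off described in the abstract, a stricter quality cutoff against the coverage that remains after filtering, can be sketched as follows. This is only a minimal sketch and not the implementation of Quality Assessment Software: the function names, the mean-Phred filtering criterion, the cutoff of 20, the toy reads, and the genome size are all assumptions made for illustration.

```python
from statistics import mean


def filter_reads(reads, min_mean_quality):
    """Keep reads whose mean Phred quality is at least min_mean_quality.

    Assumption: each read is a (sequence, list_of_phred_scores) pair and
    the filter criterion is the per-read mean quality.
    """
    return [(seq, quals) for seq, quals in reads
            if mean(quals) >= min_mean_quality]


def estimated_coverage(reads, genome_size):
    """Estimate coverage as total bases in the reads / expected genome size."""
    return sum(len(seq) for seq, _ in reads) / genome_size


# Toy data (hypothetical): three 10-base reads with per-base Phred scores.
reads = [
    ("ACGTACGTAC", [30, 32, 31, 29, 30, 33, 30, 31, 30, 32]),
    ("ACGTTGCAAC", [12, 10, 15, 9, 11, 14, 10, 12, 13, 11]),
    ("TTGACCGTAA", [25, 27, 22, 24, 26, 23, 25, 24, 26, 25]),
]
GENOME_SIZE = 10  # hypothetical expected genome size, in bases

kept = filter_reads(reads, min_mean_quality=20)
coverage_before = estimated_coverage(reads, GENOME_SIZE)
coverage_after = estimated_coverage(kept, GENOME_SIZE)

print(f"reads kept: {len(kept)} of {len(reads)}")
print(f"estimated coverage: {coverage_before:.1f}x -> {coverage_after:.1f}x")
```

Raising the cutoff removes more low-quality reads but lowers the estimated coverage, so in practice the cutoff would be chosen so that the coverage left after filtering is still acceptable for assembly.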
Database: MEDLINE