Three Rounds of Read Correction Significantly Improve Eukaryotic Protein Detection in ONT Reads

Authors: Hussain A. Safar, Fatemah Alatar, Abu Salim Mustafa
Language: English
Year of publication: 2024
Subject:
Source: Microorganisms, Vol 12, Iss 2, p 247 (2024)
Document type: article
ISSN: 2076-2607
DOI: 10.3390/microorganisms12020247
Description: Background: Whole-genome sequencing of eukaryotes is crucial for species identification, gene detection, and protein annotation. Oxford Nanopore Technologies (ONT) offers an affordable and rapid platform for sequencing eukaryotes; however, its relatively high error rates require computational and bioinformatic effort to produce accurate genome assemblies. Here, we evaluated the effect of read correction tools on eukaryote genome completeness, gene detection, and protein annotation. Methods: ONT reads from four eukaryotes, C. albicans, C. gattii, S. cerevisiae, and P. falciparum, were assembled using minimap2 and underwent three rounds of read correction using flye, medaka, and racon. The generated consensus FASTA files were compared for total length (bp), genome completeness, gene detection, and protein annotation using QUAST, BUSCO, BRAKER1, and InterProScan, respectively. Results: Genome completeness depended on the assembly method rather than on the read correction tool; however, medaka performed better than flye and racon. Racon performed significantly better than flye and medaka in gene detection, while both racon and medaka performed significantly better than flye in protein annotation. Conclusion: We show that three rounds of read correction significantly affect gene detection and protein annotation, which depend on assembly quality rather than on assembly completeness.
Database: Directory of Open Access Journals