Genome Compression: An Image-Based Approach
Author: | Roberto Hiroshi Herai, Juliano V. Martins, Kelvin V. Kredens, Edson Emílio Scalabrin, Osmar Betazzi Dordal, Bráulio Coelho Ávila |
---|---|
Publication year: | 2018 |
Subject: | 0301 basic medicine; Lossless compression; Pixel; Computer science; business.industry; 0206 medical engineering; Pattern recognition; 02 engineering and technology; computer.file_format; Genome; DNA sequencing; 03 medical and health sciences; 030104 developmental biology; WebP; RGB color model; Artificial intelligence; Image file formats; business; computer; 020602 bioinformatics; Data compression |
Source: | Artificial Intelligence and Soft Computing; ISBN: 9783319912615; ICAISC (2) |
DOI: | 10.1007/978-3-319-91262-2_22 |
Description: | With the advent of Next Generation Sequencing technologies, it has become possible to reduce the cost and time of genome sequencing. Consequently, demand for genome assembly has increased significantly, with genomes now assembled daily. This demand requires more efficient techniques for storing and transmitting genomic data. In this research, we discuss the lossless horizontal compression of genomic sequences using two image formats, WEBP and FLIF. To do this, the genomic sequence is transformed into a matrix of colored pixels, where an RGB color is assigned to each symbol of the alphabet {A, T, C, G} at a position (x, y). The WEBP format showed the best data-rate saving (76.15%, SD = 0.84) when compared to FLIF. In addition, we compared the data-rate savings of two specialized genomic data compression tools, DELIMINATE and MFCompress, with those of WEBP. The results show that WEBP is close to DELIMINATE (76.03%, SD = 2.54%) and MFCompress (76.97%, SD = 1.36%). Finally, we suggest using WEBP for genomic data compression. |
Database: | OpenAIRE |
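
The description mentions turning a genomic sequence into a matrix of colored pixels and reporting a data-rate saving. The following is a minimal Python sketch of that idea, not the authors' implementation. The palette, the padding color, the row-by-row layout, and the function names are all assumptions made for illustration; the record does not state them. The saving is computed with the common definition (1 - compressed size / original size) x 100, which is also assumed here. Encoding the matrix with a lossless image codec such as WEBP or FLIF is left out.

```python
import math

# Hypothetical palette: the record does not say which RGB colors were used.
PALETTE = {
    "A": (255, 0, 0),
    "T": (0, 255, 0),
    "C": (0, 0, 255),
    "G": (255, 255, 0),
}
PAD = (0, 0, 0)  # assumed filler color for unused cells in the last row


def sequence_to_pixels(sequence, width):
    """Lay the sequence out row by row as a matrix of RGB tuples."""
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = [PALETTE[base] for base in sequence.upper()]
    height = math.ceil(len(pixels) / width)
    # Pad the final row so that every row has exactly `width` pixels.
    pixels += [PAD] * (height * width - len(pixels))
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


def pixels_to_sequence(matrix, length):
    """Invert the mapping; `length` drops the padding added above."""
    inverse = {rgb: base for base, rgb in PALETTE.items()}
    flat = [rgb for row in matrix for rgb in row]
    return "".join(inverse[rgb] for rgb in flat[:length])


def data_rate_saving(original_bytes, compressed_bytes):
    """Percentage of space saved relative to the original size."""
    return (1 - compressed_bytes / original_bytes) * 100
```

Because each symbol maps to exactly one color and the original length is kept alongside the matrix, the round trip is lossless as long as the image codec itself is lossless. For example, `sequence_to_pixels("ATCGA", 2)` gives a 3 x 2 matrix whose last cell is padding, and `pixels_to_sequence(matrix, 5)` recovers the input.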