Bacteria classification using minimal absent words

Authors: Riccardo Rizzo, Alessio Langiu, Giosuè Lo Bosco, Gabriele Fici
Contributors: Fici, G., Langiu, A., Lo Bosco, G., Rizzo, R.
Language: English
Year of publication: 2017
Subject:
Source: AIMS Medical Science, Vol 5, Iss 1, Pp 23-32 (2017)
AIMS journal 5 (2018): 23–32. doi:10.3934/medsci.2018.1.23
info:cnr-pdr/source/autori:Fici, Gabriele; Langiu, Alessio; Lo Bosco, Giosue; Rizzo, Riccardo/titolo:Bacteria classification using minimal absent words/doi:10.3934%2Fmedsci.2018.1.23/rivista:AIMS journal/anno:2018/pagina_da:23/pagina_a:32/intervallo_pagine:23–32/volume:5
ISSN: 2375-1576
DOI: 10.3934/medsci.2018.1.23
Description: Bacteria classification has been deeply investigated with different tools for many purposes, such as early diagnosis, metagenomics, and phylogenetics. Classification methods based on ribosomal DNA sequences are considered a reference in this area. We present a new classifier for bacteria species based on a dissimilarity measure of purely combinatorial nature. This measure is based on the notion of Minimal Absent Words, a combinatorial definition that recently found applications in bioinformatics. We can therefore incorporate this measure into a probabilistic neural network in order to classify bacteria species. Our approach is motivated by the fact that there is a vast literature on the combinatorics of Minimal Absent Words in relation to the degree of repetitiveness of a sequence. We ran our experiments on a public dataset of ribosomal RNA sequences from the complex 16S. Our approach achieved very high classification accuracy, showing that our method is comparable with the standard tools available for the automatic classification of bacteria species.
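
To make the central notion concrete: a word w is a minimal absent word of a sequence x if w does not occur in x but every proper factor of w does. The sketch below is a brute-force illustration only. The function names, the `max_len` cap, and the Jaccard-style distance are assumptions made for this example; the record does not specify the paper's actual dissimilarity measure or algorithm, and practical tools use suffix-based data structures rather than enumerating all factors.

```python
def factors(x, max_len):
    """All factors (substrings) of x of length 0..max_len, including the empty word."""
    return {x[i:i + k] for k in range(max_len + 1) for i in range(len(x) - k + 1)}


def minimal_absent_words(x, alphabet="ACGT", max_len=8):
    """Minimal absent words of x up to length max_len (hypothetical helper).

    w = u + a is minimal absent when u and w[1:] both occur in x but w does not.
    """
    facts = factors(x, max_len)
    maws = set()
    for u in facts:
        if len(u) == max_len:
            continue  # extending u would exceed the length cap
        for a in alphabet:
            w = u + a
            if w not in facts and w[1:] in facts:
                maws.add(w)
    return maws


def maw_jaccard_distance(x, y, alphabet="ACGT", max_len=8):
    """Illustrative dissimilarity: Jaccard distance between the two MAW sets.

    This is a stand-in chosen for the example, not the measure used in the paper.
    """
    mx = minimal_absent_words(x, alphabet, max_len)
    my = minimal_absent_words(y, alphabet, max_len)
    union = mx | my
    if not union:
        return 0.0
    return len(mx ^ my) / len(union)


print(sorted(minimal_absent_words("abaab", alphabet="ab", max_len=5)))
# -> ['aaa', 'aaba', 'bab', 'bb']
```

A pairwise matrix of such distances could then be fed to a distance-based classifier; the record states that the authors use a probabilistic neural network for that step.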
Database: OpenAIRE