Automated Chagas Disease Vectors Identification using Data Mining Techniques

Authors: Shadi Banitaan, Zeinab Ghasemi, Ghaith Al-Refai
Publication year: 2020
Subject:
Source: EIT
Description: Chagas disease (CD) is a vector-borne zoonotic disease affecting large parts of the world. It imposes a tremendous burden on public health and ranks as one of the most severe threats to human health. CD is often transmitted to humans through the feces of triatomine insects, also known as kissing bugs. The diagnosis of CD can be performed at any stage of the disease and involves the analysis of clinical, epidemiological, and laboratory data. CD has two phases: an acute phase and a chronic phase. Since controlling and treating CD is easier in the early stages, detecting it in the acute phase plays an essential role in overcoming it. Many clinical trials are dedicated to this problem, but progress in applied research (automatic identification) has been slower. Given this shortcoming and the importance of the problem, this work presents two automatic CD vector identification systems that classify several species of kissing bugs with a promising identification rate. The proposed methods consist of preprocessing, feature extraction, and classification phases. Principal component analysis (PCA) is used for feature extraction, and Random Forest (RF) and Support Vector Machine (SVM) are employed in the classification stage. A dataset of more than two thousand kissing bug images serves as input to the methods. The accuracy of the first proposed approach, PCA-SVM, is 87.62% on 410 images of 12 Mexican species and 75.26% on 1620 images of 39 Brazilian species. The second proposed approach, PCA-RF, reaches an accuracy of 100% for both Brazilian and Mexican species. These results are promising and outperform those of other available automatic identification systems for CD vectors.
Database: OpenAIRE
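
The description above names two pipelines, PCA followed by SVM and PCA followed by RF. The following is a minimal sketch of how such pipelines might be assembled with scikit-learn, not the authors' implementation. The record gives no details on preprocessing, the number of principal components, the classifier hyperparameters, or the evaluation protocol, so the standardization step, `n_components=10`, the RBF kernel, the 100 trees, the 75/25 stratified split, and the synthetic data generator `make_synthetic_images` are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_images(n_classes=3, per_class=40, side=16, seed=0):
    """Hypothetical stand-in for the kissing bug images.

    Each class is one random template; samples are noisy copies of it,
    returned as flattened grayscale vectors of length side * side.
    """
    rng = np.random.default_rng(seed)
    templates = rng.random((n_classes, side * side))
    X = np.vstack([
        t + 0.1 * rng.standard_normal((per_class, side * side))
        for t in templates
    ])
    y = np.repeat(np.arange(n_classes), per_class)
    return X, y


def build_pipelines(n_components=10, seed=0):
    """Return the two pipelines: scaling (assumed preprocessing), PCA, classifier."""
    pca_svm = make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),
        SVC(kernel="rbf"),  # kernel choice is an assumption
    )
    pca_rf = make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )
    return {"PCA-SVM": pca_svm, "PCA-RF": pca_rf}


def evaluate(X, y, seed=0):
    """Fit both pipelines on a training split and score them on held-out data."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    scores = {}
    for name, model in build_pipelines(seed=seed).items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_images()
    for name, acc in evaluate(X, y).items():
        print(f"{name}: held-out accuracy = {acc:.3f}")
```

Wrapping the scaler and PCA in the same pipeline as the classifier means both are fitted on the training split only, so the held-out accuracy is not influenced by statistics of the test images.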