Unbalanced Learning for Early Automatic Diagnosis of Diabetes Based on Enhanced Resampling Technique and Stacking Classifier

Authors: Zemmal, Nawel; Benzebouchi, Nacer; Azizi, Nabiha; Schwab, Didier; Belhaouari, Brahim
Contributors: Schwab, Didier
Language: English
Year of publication: 2022
Subject:
Description: Diabetes is characterized by an abnormally elevated concentration of glucose in the blood serum. It damages several vital body systems, mainly the cardiovascular, renal, and visual systems. Automated screening allows early diagnosis of certain illnesses (such as diabetes), which generally increases the chances of successful treatment. Machine learning has developed considerably in the domain of medical diagnosis, especially diabetes diagnosis, thanks to the integration of unbalanced learning, which considerably reduces erroneous classification results. This general concept is addressed from two perspectives: at the data level, through modification or balancing of the learning data set, and at the algorithm level. The present paper takes a hybrid approach to imbalanced learning, proposing an enhanced multimodal meta-learning method called IRESAMPLE+St to distinguish between normal and diabetic patients. The approach relies on the Stacking paradigm, exploiting the complementarity that may exist between classifiers. In addition, a modified RESAMPLE-based technique referred to as IRESAMPLE+ and the SMOTE method are integrated as a preliminary resampling step to address the problem of unbalanced data. The imbalanced Pima Indian Diabetes (PID) data set is optimized through the proposed IRESAMPLE+ method, which operates as both an oversampling and an undersampling technique, thereby reinforcing the diagnostic accuracy established by the Stacking classifier. The suggested IRESAMPLE+St provides a computerized diabetes diagnostic system with an Accuracy of 99.87%, Sensitivity of 100%, Specificity of 99.70%, and AUROC of 99.90%, which are compared with those of the principal related studies. These results reflect the design and engineering successes achieved with the IRESAMPLE+St system for the classification of diabetes.
Database: OpenAIRE
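
The accuracy, sensitivity, and specificity figures quoted in the description follow the standard confusion-matrix definitions. The sketch below illustrates those definitions only; it is not the paper's code, the function names `confusion_counts` and `diagnostic_metrics` are hypothetical, and the labels are made-up toy values rather than data from the Pima Indian Diabetes set.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # diabetic patient correctly flagged
            else:
                fp += 1  # normal patient wrongly flagged
        else:
            if truth == positive:
                fn += 1  # diabetic patient missed
            else:
                tn += 1  # normal patient correctly cleared
    return tp, fp, tn, fn


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity as fractions in [0, 1]."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        # Share of all patients classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Share of diabetic patients that were detected (recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of normal patients that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy, illustrative labels only (1 = diabetic, 0 = normal).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

On an imbalanced data set, accuracy alone can look high even when the minority class is poorly detected, which is why sensitivity and specificity are reported alongside it.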