Development of predictive model of diabetic using supervised machine learning classification algorithm of ensemble voting

Authors: Datta, Debabrata; Bhattacharya, Madhubrata; Rajest, S. Suman; Shynu, T.; Regin, R.; Priscila, S. Silvia
Source: International Journal of Bioinformatics Research and Applications; 2023, Vol. 19, Issue 3, pp. 151-169 (19 pages)
Abstract: Predicting the health status of patients with diabetes is an important task in the health sector, because the medical history of the disease shows that it is a slow killer. If the collected data are sufficient, suitable, and noise-free, such outcomes can be predicted accurately. AI-based machine learning algorithms can predict diabetes, but overfitting and underfitting impair the accuracy of classification models. Individual machine learning models are weak learners, so there is a need to build a strong overall model by combining the weak learners to improve accuracy. Voting creates a robust and accurate model, and it is classified as either soft or hard. Ensemble machine learning models such as RF, AdaBoost, and Gboost are integrated with LR, DT, and KNN models: our ensemble voting classifier combines RF, AdaBoost, Gboost, LR, DT, and KNN. This voting model predicts diabetes with over 97% accuracy. Precision, recall, and F1 are estimated for the LR, DT, and KNN models. We tested our proposed models on two input datasets with numerical and categorical features and found that categorical features improve prediction accuracy.
Database: Supplemental Index
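
Note: The abstract distinguishes hard voting from soft voting without showing how they differ. The following is a minimal sketch in plain Python, not the authors' implementation: the function names and the probability values are hypothetical and chosen only for illustration. Hard voting takes the majority of the predicted class labels, while soft voting averages the predicted class probabilities and picks the most probable class.

```python
from collections import Counter


def hard_vote(labels):
    """Return the class label predicted by the most base models."""
    # most_common orders equal counts by first occurrence, so ties go to
    # the label that appears first in the list.
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probas):
    """Average per-class probabilities across models and return the top class index."""
    n_classes = len(probas[0])
    avg = [sum(p[c] for p in probas) / len(probas) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])


# Hypothetical outputs of three base models for one patient:
# each entry is [P(non-diabetic), P(diabetic)]. These numbers are invented.
probas = [[0.55, 0.45], [0.60, 0.40], [0.10, 0.90]]

# Each model's hard label is the class with the highest probability.
labels = [max(range(2), key=lambda c: p[c]) for p in probas]

hard = hard_vote(labels)   # majority of labels
soft = soft_vote(probas)   # class with highest mean probability

print("hard vote:", hard)
print("soft vote:", soft)
```

In this invented case the two schemes disagree: two of the three models lean weakly toward class 0, while the third is highly confident in class 1, which dominates the averaged probabilities. Libraries such as scikit-learn expose both modes through the `voting` parameter of `VotingClassifier`; the record does not state which toolkit the authors used.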