Popis: |
Accurate, personalized diagnosis is key to effective diabetes management. This study develops an ensemble machine learning approach that combines XGBoost, Random Forest, and Support Vector Machine classifiers for multi-class prediction of diabetes type. The dataset comprises 280 patients from Nigeria with type 1 diabetes, type 2 diabetes, or no diabetes. After data preprocessing, the hyperparameters of each base classifier were tuned using grid search, and soft voting was then applied to combine the tuned classifiers into the ensemble. The ensemble model achieved an accuracy of 90.48%, outperforming the individual base classifiers and previously reported classifiers. Detailed feature importance analysis identified age, HbA1c, weight, and fasting plasma glucose as the top predictors of type 1 diabetes, while HbA1c, 2-hour plasma glucose, and fasting plasma glucose were most indicative of type 2 diabetes. The ensemble framework and the per-type feature analysis support personalized diagnosis by providing insight into the attributes that distinguish the diabetes types. The approach demonstrates potential to improve clinical decision-making through robust, personalized predictions. Future work will incorporate additional risk factors and advanced feature selection techniques. The study has significant implications for advancing personalized medicine for diabetes. |