Outlier Management and its Impact on Diabetes Prediction: A Voting Ensemble Study.

Authors: Praveen, S. Phani; Sandeep, Kotte; Sai, N. Raghavendra; Sharma, Aditi; Pandey, Jitendra; Chouhan, Vikas
Subject:
Source: Journal of Intelligent Systems & Internet of Things; 2024, Vol. 12 Issue 1, p8-19, 12p
Abstract: Diabetes mellitus, a chronic metabolic disorder defined by hyperglycemia, poses a significant threat to health worldwide. It is classified into two primary types, Type 1 and Type 2, each with its own causes and approaches to treatment. Effective management of the disease depends on prompt detection and accurate prediction of outcomes, and machine learning and data mining are becoming increasingly important tools for both. This study analyzes the use of machine learning models, specifically Voting Ensembles, for predicting diabetes, with a focus on their accuracy. The Voting Ensemble, which consists of LightGBM, XGBoost, and AdaBoost, is fine-tuned using GridSearchCV and evaluated with and without Interquartile Range (IQR) pre-processing for outlier management. A comparative performance analysis illustrates the benefits of outlier management. According to the findings, the Voting Ensemble model paired with IQR pre-processing achieves greater accuracy, precision, and AUC score, which makes it better suited to predicting diabetes. The approach without IQR nevertheless remains a workable and reasonable alternative. The study emphasizes both the significance of outlier management in healthcare analytics and the effect of data preparation procedures on the accuracy of prediction models. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
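
Note: The abstract names Interquartile Range (IQR) pre-processing but does not say how it was applied. The following is only a minimal sketch of one common variant, assuming the conventional 1.5 x IQR fences and row-wise removal; the function names, the multiplier k, and the toy values are hypothetical and do not come from the study.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences for one feature column.

    k=1.5 is the conventional multiplier; it is an assumption here,
    not a value reported in the abstract.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_rows(rows, k=1.5):
    """Keep only rows in which every feature lies within its column's fences."""
    columns = list(zip(*rows))  # transpose rows into per-feature columns
    bounds = [iqr_bounds(col, k) for col in columns]
    return [
        row
        for row in rows
        if all(lo <= x <= hi for x, (lo, hi) in zip(row, bounds))
    ]


# Hypothetical toy data with two features per record (not from the study).
rows = [
    (1, 10), (2, 11), (3, 12), (4, 13), (5, 14),
    (6, 15), (7, 16), (8, 17), (100, 18),
]
cleaned = drop_outlier_rows(rows)
```

In this toy data the first feature has quartiles 3 and 7, so its fences are -3 and 13 and only the record containing 100 is removed. In a setup like the one the abstract describes, the filtered records would then be passed to the classifier, and the same model would be trained on the unfiltered records for comparison.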