Is handling unbalanced datasets for machine learning uplifts system performance?: A case of diabetic prediction.

Authors: Narwane SV; Department of Computer Engineering, Datta Meghe College of Engineering, Navi Mumbai, Pin Code: 400 708, India. Electronic address: svnarwane@gmail.com., Sawarkar SD; Department of Computer Engineering, Datta Meghe College of Engineering, Navi Mumbai, Pin Code: 400 708, India. Electronic address: sudhir_sawarkar@yahoo.com.
Language: English
Source: Diabetes & metabolic syndrome [Diabetes Metab Syndr] 2022 Sep; Vol. 16 (9), pp. 102609. Date of Electronic Publication: 2022 Sep 05.
DOI: 10.1016/j.dsx.2022.102609
Abstract: Background and Aims: Healthcare is a sensitive sector, and addressing the class imbalance in the healthcare domain is a time-consuming task for machine learning-based systems due to the vast amount of data. This study looks into the impact of socioeconomic disparities on the healthcare data of diabetic patients to make accurate disease predictions.
Methods: This study proposed a systematic approach of Closest Distance Ranking and Principal Component Analysis to deal with the unbalanced dataset. A typical machine learning technique was used to analyze the proposed approach. The data set of pregnant diabetic women is analysed for accurate detection.
Results: The results of the case are analysed using sensitivity, which demonstrates that the minority class's lack of information makes it impossible to forecast the results. On the other hand, the unbalanced dataset was treated using the proposed technique and evaluated with the machine learning algorithm which significantly increased the performance of the system.
Conclusion: The performance of the machine learning-based system was significantly enhanced by the unbalanced dataset which was processed with the proposed technique and evaluated with the machine learning algorithm. For the first time, an unbalanced dataset was treated with a combination of Closest Distance Ranking and Principal Component Analysis.
Competing Interests: Declaration of competing interest The authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
(Copyright © 2022 Diabetes India. Published by Elsevier Ltd. All rights reserved.)
Database: MEDLINE