Identification of the risk factors of type 2 diabetes and its prediction using machine learning techniques.

Authors: Islam MM (Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh; Department of Statistics, Jatiya Kabi Kazi Nazrul Islam University, Mymensingh, Bangladesh); Rahman MJ (Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh); Menhazul Abedin M (Statistics Discipline, Khulna University, Khulna, Bangladesh); Ahammed B (Statistics Discipline, Khulna University, Khulna, Bangladesh); Ali M (Statistics Discipline, Khulna University, Khulna, Bangladesh); Ahmed NAMF (Institute of Education and Research, University of Rajshahi, Rajshahi, Bangladesh); Maniruzzaman M (Statistics Discipline, Khulna University, Khulna, Bangladesh).
Language: English
Source: Health systems (Basingstoke, England) [Health Syst (Basingstoke)] 2022 Nov 05; Vol. 12 (2), pp. 243-254. Date of Electronic Publication: 2022 Nov 05 (Print Publication: 2023).
DOI: 10.1080/20476965.2022.2141141
Abstract: This study identified the risk factors for type 2 diabetes (T2D) and proposed a machine learning (ML) technique for predicting T2D. Risk factors were identified by multiple logistic regression (MLR), retaining variables with p < 0.05. Five ML-based classifiers, namely logistic regression, naïve Bayes, J48, multilayer perceptron, and random forest (RF), were then employed to predict T2D. The study used two publicly available datasets derived from the National Health and Nutrition Examination Survey, 2009-2010 and 2011-2012. The 2009-2010 dataset included 4922 respondents, of whom 387 had T2D; the 2011-2012 dataset included 4936 respondents, of whom 373 had T2D. Six risk factors (age, education, marital status, systolic blood pressure (SBP), smoking, and body mass index (BMI)) were identified for 2009-2010, and nine (age, race, marital status, SBP, diastolic blood pressure (DBP), direct cholesterol, physical activity, smoking, and BMI) for 2011-2012. The RF-based classifier obtained 95.9% accuracy, 95.7% sensitivity, 95.3% F-measure, and an area under the curve of 0.946.
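The metrics named in the abstract (accuracy, sensitivity, F-measure) can be illustrated with a short Python sketch. It is not the authors' code: the class `Confusion` and the helper `majority_baseline` are hypothetical names introduced here, and the only figures taken from the abstract are the respondent and T2D counts for the two survey cycles. The sketch computes the T2D prevalence in each cycle and, for comparison, the scores of a trivial classifier that always predicts "no T2D", which shows why accuracy alone is hard to interpret when the positive class is this small.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion matrix; T2D is treated as the positive class."""
    tp: int
    fp: int
    fn: int
    tn: int

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total

    def sensitivity(self) -> float:
        # Also called recall: share of true T2D cases that are detected.
        positives = self.tp + self.fn
        return self.tp / positives if positives else 0.0

    def precision(self) -> float:
        predicted = self.tp + self.fp
        return self.tp / predicted if predicted else 0.0

    def f_measure(self) -> float:
        # Harmonic mean of precision and sensitivity.
        p, r = self.precision(), self.sensitivity()
        return 2 * p * r / (p + r) if (p + r) else 0.0


# (T2D cases, respondents) per survey cycle, as reported in the abstract.
COHORTS = {"2009-2010": (387, 4922), "2011-2012": (373, 4936)}


def prevalence(cases: int, respondents: int) -> float:
    return cases / respondents


def majority_baseline(cases: int, respondents: int) -> Confusion:
    """Hypothetical classifier that labels every respondent as 'no T2D'."""
    return Confusion(tp=0, fp=0, fn=cases, tn=respondents - cases)


if __name__ == "__main__":
    for cycle, (cases, n) in COHORTS.items():
        base = majority_baseline(cases, n)
        print(
            f"{cycle}: prevalence={prevalence(cases, n):.3f}, "
            f"baseline accuracy={base.accuracy():.3f}, "
            f"baseline sensitivity={base.sensitivity():.3f}"
        )
```

Under these counts the prevalence is roughly 8% in both cycles, so the always-negative baseline already reaches about 92% accuracy while its sensitivity and F-measure are zero. This is the reason sensitivity and F-measure are worth reading alongside accuracy.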
Competing Interests: No potential conflict of interest was reported by the author(s).
(© 2022 The Operational Research Society.)
Database: MEDLINE