Predicting Type 2 Diabetes Using Logistic Regression and Machine Learning Approaches
Authors: | Chandra K. Dhakal, Ram D. Joshi |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: | Type 2 diabetes; Diabetes mellitus; Diabetes Mellitus, Type 2; Disease; Diabetes risk factors; Risk Factors; Body mass index; Pregnancy; Female; Humans; Logistic regression; Logistic Models; Decision tree; Decision tree learning; Machine learning; Artificial intelligence; Prediction accuracy; Medicine; Basic medicine; Medical and health sciences; Clinical medicine; General & internal medicine; Developmental biology; Public Health, Environmental and Occupational Health; Health, Toxicology and Mutagenesis |
Source: | International Journal of Environmental Research and Public Health, Vol. 18, Iss. 14, article 7346 (2021) |
ISSN: | 1661-7827 1660-4601 |
Description: | Diabetes mellitus is one of the most common human diseases worldwide and may cause several health-related complications. It is responsible for considerable morbidity, mortality, and economic loss. Timely diagnosis and prediction of this disease could give patients an opportunity to adopt appropriate preventive and treatment strategies. To improve the understanding of risk factors, we predict type 2 diabetes for Pima Indian women using a logistic regression model and a decision tree, a machine learning algorithm. Our analysis finds five main predictors of type 2 diabetes: glucose, pregnancy, body mass index (BMI), diabetes pedigree function, and age. We further explore a classification tree to complement and validate our analysis. The six-fold classification tree indicates that glucose, BMI, and age are important factors, while the ten-node tree identifies glucose, BMI, pregnancy, diabetes pedigree function, and age as the significant predictors. Our preferred specification yields a prediction accuracy of 78.26% and a cross-validation error rate of 21.74%. We argue that our model can be applied to make a reasonable prediction of type 2 diabetes, and could potentially be used to complement existing preventive measures to curb the incidence of diabetes and reduce associated costs. |
Database: | OpenAIRE |
External link: |
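
The description reports a prediction accuracy alongside a cross-validation error rate, and the two figures sum to 100%, since the error rate is the complement of accuracy. The sketch below illustrates how such a pair of numbers could be obtained for a logistic regression classifier. It is not the authors' code: the data are synthetic, the two standardized features named `glucose` and `bmi` are stand-ins chosen for illustration, and the helper names (`fit_logistic`, `cross_val_accuracy`, `make_synthetic`), the learning rate, the epoch count, and the choice of ten folds are all assumptions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = sigmoid(z) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    # Classify as positive when the estimated probability is at least 0.5.
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0


def cross_val_accuracy(X, y, k=10, seed=0):
    """Share of observations classified correctly when each is held out once."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(w, X[i]) == y[i] for i in fold)
    return correct / len(X)


def make_synthetic(n=200, seed=1):
    # Hypothetical data: two standardized predictors and a binary outcome.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        glucose = rng.gauss(0, 1)
        bmi = rng.gauss(0, 1)
        p = sigmoid(1.5 * glucose + 0.8 * bmi - 0.5)
        X.append([glucose, bmi])
        y.append(1 if rng.random() < p else 0)
    return X, y


X, y = make_synthetic()
accuracy = cross_val_accuracy(X, y, k=10)
error_rate = 1.0 - accuracy  # complement of accuracy
print(f"CV accuracy: {accuracy:.4f}, CV error rate: {error_rate:.4f}")
```

Holding each observation out exactly once keeps the accuracy estimate from being inflated by evaluating on data the model was fitted to. The numbers printed for the synthetic data have no bearing on the 78.26% and 21.74% reported in the description.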