Description: |
Diabetes mellitus is one of the deadliest incurable diseases globally, and the number of cases continues to rise. Early identification of the disease helps fight it; however, blood tests can be considered invasive, which discourages people from taking them. In this vein, this work aims to build a model that serves as an alternative to traditional exams for identifying the disease. Statistical learning algorithms such as logistic regression, K-nearest neighbors, decision trees, random forest, and support vector machines were used for diabetes classification. These models were considered separately and combined via hard and soft voting classifiers. The methods were applied to a widely known dataset of 768 individuals and nine variables, compared using several accuracy metrics based on the confusion matrix, and used to estimate the probability of diabetes for a given profile. |
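
The difference between hard and soft voting, and the confusion-matrix metrics used to compare models, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`hard_vote`, `soft_vote`, `confusion_metrics`), the 0.5 threshold, and all numbers are hypothetical and chosen only for illustration. Each entry in `probas` stands for the probability of diabetes that one of the five base classifiers might assign to a single profile.

```python
def hard_vote(probas, threshold=0.5):
    """Hard voting: each model casts one class label; the majority wins."""
    labels = [int(p >= threshold) for p in probas]
    return int(sum(labels) * 2 > len(labels))


def soft_vote(probas, threshold=0.5):
    """Soft voting: average the predicted probabilities, then threshold."""
    mean = sum(probas) / len(probas)
    return int(mean >= threshold)


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from the 2x2 confusion matrix."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Illustrative probabilities from five hypothetical base models for one profile.
probas = [0.90, 0.45, 0.40, 0.45, 0.95]
hard = hard_vote(probas)  # three of five models vote "no diabetes"
soft = soft_vote(probas)  # two confident models pull the mean above 0.5

# Illustrative true and predicted labels for six hypothetical individuals.
metrics = confusion_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

In this made-up case the two schemes disagree: hard voting counts only labels, while soft voting lets confident models outweigh uncertain ones. Soft voting also yields an averaged probability, which is one way a combined model could report the probability of diabetes for a given profile.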