Description: |
Objective. Improved health literacy in infectious diseases is a direct reflection of solid progress in disease control and prevention. This study explores the application of the synthetic minority oversampling technique (SMOTE) to predicting whether residents and enterprise employees have infectious disease health literacy. Methods. The Chinese resident infectious disease health literacy evaluation scale was used to investigate the associated variables. The screened variables served as input variables, and the presence or absence of infectious disease health literacy served as the outcome variable. Logistic regression, random forest, and support vector machine (SVM) models were built on the data sets before and after processing with the SMOTE algorithm, and model performance was evaluated using receiver operating characteristic (ROC) curves. Results. Before SMOTE processing, logistic regression, random forest, and SVM achieved accuracies of 0.828, 0.612, and 0.654, and areas under the ROC curve (AUCs) of 0.754, 0.817, and 0.759, respectively. After SMOTE processing, the accuracies were 0.938, 0.911, and 0.894, and the AUCs were 0.913, 0.925, and 0.910, respectively. Conclusions. The random forest model based on SMOTE has high application value in assessing whether residents and enterprise employees have infectious disease health literacy. |
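
The oversampling step named in the abstract can be illustrated with a minimal sketch. The abstract does not state which software or parameters the study used, so everything below is an assumption for illustration only: the function name `smote`, the neighbour count `k`, the fixed seed, and the toy feature vectors are hypothetical. The sketch shows only the core SMOTE idea: each synthetic sample is an interpolation between a minority-class sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation.

    Hypothetical helper for illustration; not the study's implementation.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up feature vectors (assumption: two numeric features per respondent).
majority = [[float(i), float(i % 3)] for i in range(20)]
minority = [[1.0, 5.0], [2.0, 6.0], [1.5, 5.5], [2.5, 6.5]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In a workflow like the one described, oversampling of this kind would typically be applied to the training data before fitting the logistic regression, random forest, and SVM models, so that the classifiers see a balanced outcome variable.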