Popis: |
One of the hardest problems in classification studies is achieving high accuracy in the presence of imbalanced data sets. When one class has many more samples than the others, the model learns that class better, so a high overall accuracy says little about how well the remaining classes are predicted and can lead to misleading conclusions. For example, a model may assign the great majority of students to the correct category (pass/fail, at-risk/not at-risk, etc.) while still performing poorly on the minority class. For imbalanced data, the F1-score and the ROC AUC score give a more reliable picture of overall model performance than accuracy does. Recall and precision, in turn, report performance for each class and indicate how well the model has learned that class. If a study's findings rest on the accuracy metric alone, they may be difficult to apply in practice. |
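
The gap between accuracy and the per-class metrics can be illustrated with a minimal sketch. The data below is a made-up toy example, not taken from the study: 95 "pass" students (label 0) and 5 "fail" students (label 1), with a hypothetical model that finds only one of the five failing students. The helper name `binary_metrics` is likewise an assumption introduced for illustration. ROC AUC is left out because it needs predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy imbalanced data (assumed): 95 "pass" (0) and 5 "fail" (1).
y_true = [0] * 95 + [1] * 5
# Hypothetical model: every "pass" is correct, only 1 of 5 "fail" is detected.
y_pred = [0] * 95 + [1] + [0] * 4

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

On this toy data the accuracy is 0.96, yet recall for the "fail" class is only 0.20 and its F1-score is about 0.33, which is the kind of discrepancy the description warns about.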