A model for predicting academic performance on standardised tests for lagging regions based on machine learning and Shapley additive explanations

Authors: Mario Suaza-Medina, Rita Peñabaena-Niebles, Maria Jubiz-Diaz
Language: English
Publication year: 2024
Subject:
Source: Scientific Reports, Vol 14, Iss 1, Pp 1-17 (2024)
Document type: article
ISSN: 2045-2322
DOI: 10.1038/s41598-024-76596-3
Description: Abstract: Data are becoming more important in education because they allow for the analysis and prediction of future behaviour to improve academic performance and quality at educational institutions. However, academic performance is affected by regional conditions, such as demographic, psychographic, socioeconomic and behavioural variables, especially in lagging regions. This paper presents a methodology that applies nine classification algorithms and Shapley values to identify the variables that influence performance on the Colombian standardised test, the Saber 11 exam. This study is innovative because, unlike others, it focuses on lagging regions and combines educational data mining (EDM) with Shapley values to predict students’ academic performance and analyse the influence of each variable on it. The results show that the most accurate algorithms are Extreme Gradient Boosting Machine, Light Gradient Boosting Machine, and Gradient Boosting Machine. According to the Shapley values, the most influential variables are the socioeconomic level index, gender, region, location of the educational institution, and age. For Colombia, the results showed that male students over 18 years of age from urban educational institutions have the best academic performance. Moreover, there are differences in educational quality among the lagging regions. Students from Nariño have advantages over those from other departments. The proposed methodology supports the design of public policies that are better aligned with the reality of lagging regions and that promote equity in access to education.
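Illustrative note: the abstract relies on Shapley values to attribute a model's prediction to individual variables. The sketch below shows the general idea with an exact Shapley computation on a tiny hand-written scoring function. It is not the authors' pipeline, model, or data: the function `toy_model`, the feature names `ses_index`, `urban` and `age`, and all numbers are hypothetical, and features "absent" from a coalition are simply assumed to take a baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are set to their baseline value
    (an assumption made for this sketch).
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Coalition members use the instance's value, the rest use the baseline.
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of `name` when added to coalition s.
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(x):
    # Hypothetical score with an interaction term; `age` is deliberately unused.
    return 2.0 * x["ses_index"] + 1.0 * x["urban"] + 0.5 * x["ses_index"] * x["urban"]


instance = {"ses_index": 3.0, "urban": 1.0, "age": 17.0}
baseline = {"ses_index": 1.0, "urban": 0.0, "age": 17.0}

phi = shapley_values(toy_model, instance, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:.3f}")
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the unused feature receives a contribution of zero. Exact enumeration grows exponentially with the number of features, so practical studies with many variables typically use approximations tailored to the model class.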
Database: Directory of Open Access Journals