Multi-split Optimized Bagging Ensemble Model Selection for Multi-class Educational Data Mining

Authors: Injadat, MohammadNoor; Moubayed, Abdallah; Nassif, Ali Bou; Shami, Abdallah
Year of publication: 2020
Subject:
Document type: Working Paper
DOI: 10.1007/s10489-020-01776-3
Description: Predicting students' academic performance has been a research area of interest in recent years, with many institutions focusing on improving students' performance and education quality. The analysis and prediction of students' performance can be achieved using various data mining techniques. Moreover, such techniques allow instructors to determine possible factors that may affect the students' final marks. To that end, this work analyzes two different undergraduate datasets from two different universities. Furthermore, this work aims to predict the students' performance at two stages of course delivery (20% and 50%, respectively). This analysis allows for properly choosing the appropriate machine learning algorithms to use as well as optimizing the algorithms' parameters. Furthermore, this work adopts a systematic multi-split approach based on the Gini index and p-value. This is done by optimizing a suitable bagging ensemble learner that is built from any combination of six potential base machine learning algorithms. It is shown through experimental results that the posited bagging ensemble models achieve high accuracy for the target group for both datasets.
Comment: 29 Pages, 13 Figures, 19 Tables, Accepted in Springer's Applied Intelligence
Database: arXiv