An Apache Spark-Based Platform for Predicting the Performance of Undergraduate Students

Authors: Thong Le Mai, Nam Thoai, Phat Thanh Do, Minh Thanh Chung
Year of publication: 2019
Subject:
Source: HPCC/SmartCity/DSS
DOI: 10.1109/hpcc/smartcity/dss.2019.00041
Description: Educational Data Mining (EDM) plays an important role in higher education institutions. Many algorithms have been employed to predict students' GPA in the next semester's courses. The results can be used to identify, at an early stage, students who are likely to drop out, or to help students choose elective courses that suit them. The most widely used methods are based on machine learning; however, their accuracy can vary from dataset to dataset. More importantly, the performance of a prediction model can be affected by the characteristics of the dataset to which the model is applied. In this paper, we build a distributed platform on Spark to predict missing grades of elective courses for undergraduate students. The paper compares several methods that combine Collaborative Filtering and Matrix Factorization (namely Alternating Least Squares, ALS). We evaluate the performance of these algorithms using a dataset provided by Ho Chi Minh University of Technology (HCMUT). The dataset consists of information about undergraduate students from 2006 to 2017. Given the characteristics of our dataset, the paper highlights that Alternating Least Squares with a non-negativity constraint achieves better results than the other methods compared.
Database: OpenAIRE
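
The record does not include the authors' implementation, so the following is only a minimal sketch of the idea named in the description: matrix factorization of a sparse student-by-course grade matrix by Alternating Least Squares, with factors kept non-negative. It is written in plain Python rather than on Spark, and every name in it (`solve`, `update`, `als_nonnegative`, `predict`, the toy grades) is hypothetical. Non-negativity is enforced here by clipping each solved factor vector at zero, which is a simplification and not a true non-negative least-squares solve. On Spark itself, MLlib's `ALS` estimator exposes a `nonnegative` parameter for the same purpose.

```python
import random


def solve(A, b):
    """Solve the small dense system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def update(fixed, observed_by_row, rank, reg):
    """Re-solve each row's factor vector while the other side's factors stay fixed."""
    out = []
    for observed in observed_by_row:  # observed: list of (index, grade)
        # Normal equations: (sum f f^T + reg * I) x = sum grade * f
        A = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for idx, grade in observed:
            f = fixed[idx]
            for i in range(rank):
                b[i] += grade * f[i]
                for j in range(rank):
                    A[i][j] += f[i] * f[j]
        # Assumption: clip at zero as a simple stand-in for a non-negative solver.
        out.append([max(0.0, v) for v in solve(A, b)])
    return out


def als_nonnegative(grades, n_students, n_courses, rank=2, reg=0.1, iters=20, seed=0):
    """Alternate between updating student factors and course factors."""
    rng = random.Random(seed)
    students = [[rng.random() for _ in range(rank)] for _ in range(n_students)]
    courses = [[rng.random() for _ in range(rank)] for _ in range(n_courses)]
    by_student = [[] for _ in range(n_students)]
    by_course = [[] for _ in range(n_courses)]
    for s, c, g in grades:
        by_student[s].append((c, g))
        by_course[c].append((s, g))
    for _ in range(iters):
        students = update(courses, by_student, rank, reg)
        courses = update(students, by_course, rank, reg)
    return students, courses


def predict(students, courses, s, c):
    """Predicted grade is the dot product of the two factor vectors."""
    return sum(a * b for a, b in zip(students[s], courses[c]))


# Hypothetical toy data: (student, course, grade); grades for (2, 2) and (3, 2) are missing.
grades = [
    (0, 0, 8.0), (0, 1, 7.0), (0, 2, 9.0),
    (1, 0, 6.0), (1, 1, 5.5), (1, 2, 7.0),
    (2, 0, 9.0), (2, 1, 8.0),
    (3, 0, 5.0), (3, 1, 4.5),
]
students, courses = als_nonnegative(grades, n_students=4, n_courses=3)
print("predicted grade for student 3, course 2:", predict(students, courses, 3, 2))
```

The regularization term `reg` keeps each small linear system solvable even for a student or course with no observed grades. A production setup would also hold out known grades to measure prediction error, which this sketch omits.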