Abstract: |
Predictive modelling in the education domain can be used to significantly improve teaching and learning experiences. Massive Open Online Courses (MOOCs) generate a large volume of data that can be exploited to predict and evaluate student performance based on various factors. This paper has two broad aims. The first is to develop and tune several Machine Learning (ML) models, including Linear Regression, Logistic Regression, Random Forests, and K-Nearest Neighbours, among others, to perform classification tasks on the dataset and predict student performance. The second is to evaluate the efficacy of these ML models and identify those best suited to this task. The categories of data used to achieve these aims include (i) demographic information, (ii) academic background, and (iii) interaction with MOOC course materials. The research procedure comprises five phases: data exploration to analyse the dataset; feature engineering, which involves identifying the most important features and converting them into a format usable by the ML models; model building; model evaluation by measuring accuracy; and comparative evaluation of the different models. The results of this study are expected to have implications for how MOOC platforms use data to improve user experience. The findings indicate that the data collected by these platforms can be used to predict performance with an accuracy of over 77%. This information can be exploited to enhance educational theory or practice in the context of MOOCs, for instance by adopting different teaching methodologies or providing different types of resources based on predicted performance. |