Relative Performance of Machine Learning and Linear Regression in Predicting Quality of Life and Academic Performance of School Children in Norway: Data Analysis of a Quasi-Experimental Study

Authors: Robert Froud, Solveig Hakestad Hansen, Hans Kristian Ruud, Jonathan Foss, Leila Ferguson, Per Morten Fredriksen
Language: English
Year of publication: 2021
Subject:
Source: Journal of Medical Internet Research, Vol 23, Iss 7, p e22021 (2021)
Document type: article
ISSN: 1438-8871
DOI: 10.2196/22021
Description: Background: Machine learning techniques are increasingly being applied in health research. It is not clear how useful these approaches are for modeling continuous outcomes. Child quality of life is associated with parental socioeconomic status and physical activity and may be associated with aerobic fitness and strength. It is unclear whether diet or academic performance is associated with quality of life. Objective: The purpose of this study was to compare the predictive performance of machine learning techniques with that of linear regression in examining the extent to which continuous outcomes (physical activity, aerobic fitness, muscular strength, diet, and parental education) are predictive of academic performance and quality of life and whether academic performance and quality of life are associated. Methods: We modeled data from children attending 9 schools in a quasi-experimental study. We split data randomly into training and validation sets. Curvilinear, nonlinear, and heteroscedastic variables were simulated to examine the performance of machine learning techniques compared to that of linear models, with and without imputation. Results: We included data for 1711 children. Regression models explained 24% of academic performance variance in the real complete-case validation set, and up to 15% in quality of life. While machine learning techniques explained high proportions of variance in training sets, in validation, machine learning techniques explained approximately 0% of academic performance and 3% to 8% of quality of life. With imputation, machine learning techniques improved to 15% for academic performance. Machine learning outperformed regression for simulated nonlinear and heteroscedastic variables. The best predictors of academic performance in adjusted models were the child’s mother having a master-level education (P
Database: Directory of Open Access Journals
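Note: The comparison described in the abstract (random split into training and validation sets, then variance explained by a linear model versus a machine learning technique on simulated linear and curvilinear variables) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the machine learning techniques used, so k-nearest-neighbour regression stands in for them, and the simulated data, sample size, and function names are hypothetical rather than taken from the study.

```python
import random


def r_squared(y_true, y_pred):
    """Proportion of variance explained: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def fit_linear(x, y):
    """Ordinary least squares with one predictor; returns a predict function."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda xs: [intercept + slope * xi for xi in xs]


def fit_knn(x, y, k=15):
    """k-nearest-neighbour regression (a stand-in for 'machine learning'):
    predict the mean outcome of the k closest training points."""
    train = list(zip(x, y))

    def predict(xs):
        preds = []
        for xi in xs:
            nearest = sorted(train, key=lambda p: abs(p[0] - xi))[:k]
            preds.append(sum(p[1] for p in nearest) / k)
        return preds

    return predict


def simulate(n, rng, curvilinear):
    """Hypothetical data: y is linear or quadratic in x, plus Gaussian noise."""
    x = [rng.uniform(-3, 3) for _ in range(n)]
    y = [(xi ** 2 if curvilinear else xi) + rng.gauss(0, 1) for xi in x]
    return x, y


def compare(curvilinear, n=400, k=15, seed=0):
    """Split randomly into training and validation halves and report R^2."""
    rng = random.Random(seed)
    x, y = simulate(n, rng, curvilinear)
    idx = list(range(n))
    rng.shuffle(idx)
    tr, va = idx[: n // 2], idx[n // 2:]
    x_tr, y_tr = [x[i] for i in tr], [y[i] for i in tr]
    x_va, y_va = [x[i] for i in va], [y[i] for i in va]
    models = {"linear": fit_linear(x_tr, y_tr), "knn": fit_knn(x_tr, y_tr, k)}
    return {
        name: {
            "train": r_squared(y_tr, model(x_tr)),
            "validation": r_squared(y_va, model(x_va)),
        }
        for name, model in models.items()
    }


if __name__ == "__main__":
    for label, flag in (("linear signal", False), ("curvilinear signal", True)):
        print(label, compare(flag))
```

Setting k=1 makes the nearest-neighbour model reproduce its training outcomes exactly, which is one simple way to see how a flexible model can explain a high proportion of variance in training data without that carrying over to validation data.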