30-Month revalidation to determine the temporal stability of machine learning models for detecting online gambling-related harms

Authors: W. Spencer Murch, Sylvia Kairouz, Martin French
Year of publication: 2023
Description: Background and Aims: Machine learning algorithms can detect at-risk online gamblers by analyzing patterns in betting behaviour. Example models have been tested in several jurisdictions, but their performance over time has not been assessed. We investigated the temporal stability of two existing models after 30 months (2019–2022). We further aimed to identify potential sources of model degradation, and strategies to restore prior classification performance. Design: Revalidation of a large-scale study linking participants’ self-reported gambling problems to their online gambling behaviours. Setting: Online gambling website operated by a Canadian provincial gambling operator. Participants: Adults aged 18+ (N = 11,258) who completed a survey and participated in online gambling. Measurements: Two binary dependent variables based on validated risk thresholds for the Problem Gambling Severity Index (PGSI) identified participants who reported a high (PGSI 8+) or moderate-to-high (PGSI 5+) risk of past-year gambling problems. Previously developed machine learning models made predictions about these dependent variables based on 10 inputs derived from participants’ deposits, betting behaviour and account-level data on the site. Findings: Significant changes between the prior validation study and the current revalidation analysis were evident in threshold-dependent metrics (sensitivity and specificity), as well as in the models’ Area Under the Precision-Recall Curve (AUPRC; PGSI 5+ ΔAUPRC = +2.87%; t(20401) = 2.83, p = .004, 95% CI [60.59, 61.57]; PGSI 8+ ΔAUPRC = +7.06%; t(20401) = 7.21, p < .001, 95% CI [44.47, 45.62]). These changes may be attributable to observed drifts in the distributions of the input and dependent variables. Redeveloping the models’ decision thresholds restored previously observed levels of classification performance for threshold-dependent metrics only. Conclusion: Machine learning models predicting PGSI risk categories via indicators of online gambling behaviour may continue to function adequately 30 months after validation. These results provide preliminary support for their utility in real-world detection tasks, and speak to the temporal stability of the behavioural profile of problematic gambling.
Database: OpenAIRE
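
Note: The abstract reports that redeveloping the models' decision thresholds restored threshold-dependent metrics (sensitivity and specificity). The record does not say how the thresholds were redeveloped, so the following Python sketch is only one plausible illustration, not the authors' procedure. It assumes a hypothetical rule: on the newer data, choose the highest score cut-off that still reaches a target sensitivity. The function names, the target value, and the toy labels and scores are all invented for illustration and are not taken from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are flagged."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if label == 1:
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def retune_threshold(
    y_true: Sequence[int], scores: Sequence[float], target_sensitivity: float
) -> float:
    """Hypothetical re-tuning rule (an assumption, not the study's method):
    pick the highest observed score whose sensitivity meets the target.
    Sensitivity can only rise as the threshold falls, so the first hit
    keeps specificity as high as possible for that target."""
    for candidate in sorted(set(scores), reverse=True):
        sensitivity, _ = sensitivity_specificity(y_true, scores, candidate)
        if sensitivity >= target_sensitivity:
            return candidate
    return min(scores)  # target unreachable (e.g. no positive cases)


# Toy data, invented for illustration: 1 = at-risk by self-report, 0 = not.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.35, 0.1]

old_threshold = 0.5  # assumed cut-off carried over from an earlier validation
old_metrics = sensitivity_specificity(y_true, scores, old_threshold)

new_threshold = retune_threshold(y_true, scores, target_sensitivity=0.75)
new_metrics = sensitivity_specificity(y_true, scores, new_threshold)
```

Moving the cut-off changes only which cases are flagged, not how the model ranks them. That is consistent with the abstract's statement that threshold redevelopment restored threshold-dependent metrics only: a ranking-based measure such as AUPRC is unaffected by the choice of threshold.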