Machine learning using genetic and clinical data identifies a signature that robustly predicts methotrexate response in rheumatoid arthritis

Authors: Lee Jin Lim; Ashley J W Lim; Brandon N S Ooi; Justina Wei Lynn Tan; Ee Tzun Koh; Samuel S Chong; Chiea Chuen Khor; Lisa Tucker-Kellogg; Caroline G Lee; Chuanhui Xu
Year of publication: 2022
Subject:
Source: Rheumatology. 61:4175-4186
ISSN: 1462-0332
1462-0324
DOI: 10.1093/rheumatology/keac032
Description:
Objective: To develop a hypothesis-free model that best predicts response to MTX in RA patients, combining biologically meaningful selection of potentially functional single nucleotide polymorphisms (pfSNPs) with robust machine learning (ML) feature selection methods.
Methods: MTX-treated RA patients with known response were divided in a 4:1 ratio into training and test sets. From the patients’ exomes, candidate features for the classifiers were identified from pfSNPs and non-genetic factors using recursive feature elimination with cross-validation, with a random forest as the underlying classifier. Feature selection was repeated on random subsets of the training cohort, and consensus features were assembled into the final feature set. This feature set was evaluated for predictive potential using six ML classifiers, first by cross-validation within the training set and then on the unseen test set.
Results: The final feature set contains 56 pfSNPs and five non-genetic factors. The majority of these pfSNPs are located in pathways related to RA pathogenesis or MTX action and are predicted to modulate gene expression. Across the six ML classifiers trained on this feature set, performance was good in both the training set (area under the curve: 0.855–0.916; sensitivity: 0.715–0.892; specificity: 0.733–0.862) and the unseen test set (area under the curve: 0.751–0.826; sensitivity: 0.581–0.839; specificity: 0.641–0.923).
Conclusion: Sensitive and specific predictors of MTX response in RA patients were identified through a novel strategy that combines biologically meaningful feature selection with machine learning feature selection and training. These predictors may facilitate better treatment decision-making in RA management.
Database: OpenAIRE
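
The workflow summarized in the abstract (a 4:1 train/test split, recursive feature elimination with cross-validation around a random forest, repetition on random subsets of the training cohort, a consensus feature set, and evaluation by area under the curve, sensitivity and specificity) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of pfSNP genotypes and clinical factors, and every parameter value (number of repeats, subset fraction, vote threshold, tree counts) as well as the helper names `consensus_features` and `evaluate` are hypothetical. It also trains a single classifier rather than the six reported in the record.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold, train_test_split


def consensus_features(X, y, n_repeats=3, subset_frac=0.8, min_votes=2, seed=0):
    """Run RFECV on random subsets and keep features chosen in >= min_votes runs.

    All defaults here are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    votes = np.zeros(X.shape[1], dtype=int)
    for r in range(n_repeats):
        # Draw a random subset of the training cohort without replacement.
        idx = rng.choice(len(y), size=int(subset_frac * len(y)), replace=False)
        selector = RFECV(
            RandomForestClassifier(n_estimators=50, random_state=r),
            step=5,
            cv=StratifiedKFold(n_splits=3),
            scoring="roc_auc",
        )
        selector.fit(X[idx], y[idx])
        votes += selector.support_.astype(int)
    selected = np.flatnonzero(votes >= min_votes)
    if selected.size == 0:
        # Fallback: keep the most frequently selected features.
        selected = np.flatnonzero(votes == votes.max())
    return selected


def evaluate(clf, X_test, y_test):
    """Return AUC, sensitivity and specificity for a fitted binary classifier."""
    prob = clf.predict_proba(X_test)[:, 1]
    pred = clf.predict(X_test)
    tn, fp, fn, tp = confusion_matrix(y_test, pred, labels=[0, 1]).ravel()
    return {
        "auc": roc_auc_score(y_test, prob),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Synthetic stand-in for the genetic and clinical feature matrix (assumption).
X, y = make_classification(
    n_samples=300, n_features=30, n_informative=5, n_redundant=0, random_state=0
)

# 4:1 split into training and unseen test sets, stratified by response.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

features = consensus_features(X_train, y_train)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train[:, features], y_train)
metrics = evaluate(clf, X_test[:, features], y_test)
print(len(features), metrics)
```

Feature selection touches only the training split, so the test-set metrics are not influenced by the choice of features; the vote threshold controls how strict the consensus is.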