Multicohort study testing the generalisability of the SASKit-ML stroke and PDAC prognostic model pipeline to other chronic diseases.

Authors: Palmer D; Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany., Henze L; Department of Medicine, Clinic III, Hematology, Oncology, Palliative Medicine, Rostock University Medical Center, Rostock, Germany.; Department of Internal Medicine II - Hematology, Oncology and Palliative Medicine, Asklepios Hospital Group Harz Mountains, Goslar, Germany., Murua Escobar H; Department of Medicine, Clinic III, Hematology, Oncology, Palliative Medicine, Rostock University Medical Center, Rostock, Germany., Walter U; Department of Neurology, Rostock University Medical Center, Rostock, Germany., Kowald A; Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany., Fuellen G; Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany fuellen@uni-rostock.de.
Language: English
Source: BMJ open [BMJ Open] 2024 Sep 30; Vol. 14 (9), pp. e088181. Date of Electronic Publication: 2024 Sep 30.
DOI: 10.1136/bmjopen-2024-088181
Abstract: Objectives: To validate and test the generalisability of the SASKit-ML pipeline, a prepublished feature selection and machine learning pipeline for the prediction of health deterioration after a stroke or pancreatic adenocarcinoma event, by using it to identify biomarkers of health deterioration in chronic disease.
Design: This is a validation study using a predefined protocol applied to multiple publicly available datasets, including longitudinal data from cohorts with type 2 diabetes (T2D), inflammatory bowel disease (IBD), rheumatoid arthritis (RA) and various cancers. The datasets were chosen to mimic as closely as possible the SASKit cohort, a prospective, longitudinal cohort study.
Data Sources: Public data were used from the T2D (77 patients with potential pre-diabetes and 18 controls) and IBD (49 patients with IBD and 12 controls) branches of the Human Microbiome Project (HMP), RA Map (RA-MAP, 92 patients with RA, 22 controls) and The Cancer Genome Atlas (TCGA, 16 cancers).
Methods: Data integration steps were performed in accordance with the prepublished study protocol, generating features to predict disease outcomes using 10-fold cross-validated random survival forests.
Outcome Measures: Health deterioration was assessed using disease-specific clinical markers and endpoints across different cohorts. In the HMP-T2D cohort, worsening of glycated haemoglobin (HbA1c) levels (5.7% or more HbA1c in the blood), fasting plasma glucose (at least 100 mg/dL) and oral glucose tolerance test (at least 140) results was considered. For the HMP-IBD cohort, a worsening by at least 3 points of a disease-specific severity measure (the "Simple Clinical Colitis Activity Index" or the "Harvey-Bradshaw Index") indicated an event. For the RA-MAP cohort, the outcome was defined as a worsening of the "Disease Activity Score 28" or "Simple Disease Activity Index" by at least five points, a worsening of the "Health Assessment Questionnaire" score, or an increase in the number of swollen/tender joints. Finally, the outcome for all TCGA datasets was the progression-free interval.
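The HMP-T2D cutoffs listed above can be illustrated with a minimal sketch. It only reports which markers in a single measurement sit at or above the stated cutoffs. The field names and the function `markers_at_or_above_cutoff` are hypothetical, and the abstract does not state how the three markers are combined into an event or what unit the oral glucose tolerance cutoff uses, so neither is assumed here.

```python
# Cutoffs as quoted in the abstract. Key names are illustrative assumptions.
# The unit of the OGTT cutoff is not stated in the abstract.
T2D_CUTOFFS = {
    "hba1c_percent": 5.7,
    "fasting_glucose_mg_dl": 100.0,
    "ogtt": 140.0,
}


def markers_at_or_above_cutoff(measurement):
    """Return the sorted names of markers that meet or exceed their cutoff.

    `measurement` maps marker names to numeric values; markers that are
    missing from the mapping are ignored.
    """
    return sorted(
        name
        for name, cutoff in T2D_CUTOFFS.items()
        if name in measurement and measurement[name] >= cutoff
    )


visit = {"hba1c_percent": 5.9, "fasting_glucose_mg_dl": 95.0, "ogtt": 140.0}
flagged = markers_at_or_above_cutoff(visit)
print(flagged)
```

How a flagged marker translates into a "worsening" event over time (for example, relative to a baseline visit) would have to be taken from the study protocol rather than from this sketch.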
Results: Models for the prediction of health deterioration in T2D, IBD, RA and 16 cancers were produced. The T2D (C-index of 0.633 and Integrated Brier Score (IBS) of 0.107) and the RA (C-index of 0.654 and IBS of 0.150) models were modestly predictive. The IBD model was uninformative. TCGA models tended towards modest predictive power.
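For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the pipeline's own implementation. The function name `concordance_index`, the toy data, and the choice to skip pairs with tied times are assumptions made for illustration. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risks.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times: observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk (earlier event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject i has the shorter observed time.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the shorter time is an observed event.
        # Pairs with tied times are skipped in this simplified version.
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risk_scores[i] > risk_scores[j]:
            concordant += 1.0
        elif risk_scores[i] == risk_scores[j]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, not taken from the study.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risk)
print(round(c_index, 3))
```

In the toy data, five pairs are comparable and four of them are ordered correctly, which is why the sketch treats a censored subject as uninformative whenever it has the shorter follow-up time.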
Conclusions: The SASKit-ML pipeline produces informative and useful features with the power to predict health deterioration in a variety of diseases and cancers; however, this performance is disease-dependent.
Competing Interests: UW reports grants and personal fees from Merz Pharma, personal fees from Amarin, personal fees from Bristol-Myers Squibb, personal fees from Canon Medical Systems, personal fees from Daiichi Sankyo, personal fees from Ipsen Pharma, personal fees from Pfizer, personal fees from Thieme and personal fees from Elsevier Press, all outside the submitted work.
(© Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.)
Database: MEDLINE