Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist With Eligibility Screening.

Authors: Cai T; Brigham and Women's Hospital, Boston, Massachusetts, United States., Cai F; Massachusetts Institute of Technology, Cambridge, Massachusetts, United States., Dahal KP; Brigham and Women's Hospital, Boston, Massachusetts, United States., Cremone G; Brigham and Women's Hospital, Boston, Massachusetts, United States., Lam E; Brigham and Women's Hospital, Boston, Massachusetts, United States., Golnik C; Brigham and Women's Hospital, Boston, Massachusetts, United States., Seyok T; Brigham and Women's Hospital, Boston, Massachusetts, United States., Hong C; Harvard University, Boston, Massachusetts, United States., Cai T; Harvard University, Boston, Massachusetts, United States., Liao KP; Brigham and Women's Hospital, Harvard University, and Veterans Affairs Boston Healthcare System, Boston, Massachusetts, United States.
Language: English
Source: ACR open rheumatology [ACR Open Rheumatol] 2021 Sep; Vol. 3 (9), pp. 593-600. Date of Electronic Publication: 2021 Jul 23.
DOI: 10.1002/acr2.11289
Abstract: Objective: Efficiently identifying eligible patients is a crucial first step for a successful clinical trial. The objective of this study was to test whether an approach using electronic health record (EHR) data and an ensemble machine learning algorithm incorporating billing codes and data from clinical notes processed by natural language processing (NLP) can improve the efficiency of eligibility screening.
Methods: We studied patients screened for a clinical trial of rheumatoid arthritis (RA) with one or more International Classification of Diseases (ICD) codes for RA and age greater than 35 years, from a tertiary care center and a community hospital. The following three groups of EHR features were considered for the algorithm: 1) structured features, 2) the counts of NLP concepts from notes, and 3) health care utilization. All features were linked to dates. We compared random forest and logistic regression with a least absolute shrinkage and selection operator penalty against the following two standard approaches: 1) one or more RA ICD codes and no ICD codes related to exclusion criteria (Screen RAICD1+EX) and 2) two or more RA ICD codes (Screen RAICD2). To test portability, we trained the algorithm at one institution and tested it at the other.
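The two standard rule-based screens described in the Methods can be sketched as follows. This is only an illustration under stated assumptions: the `Patient` class, the function names, and the contents of `RA_CODES` and `EXCLUSION_CODES` are hypothetical placeholders, not the code lists or implementation used in the study.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical code sets for illustration only; the study's actual
# RA and exclusion code lists are not given in this record.
RA_CODES = {"714.0", "M05.79", "M06.9"}
EXCLUSION_CODES = {"C50.919"}  # placeholder for any exclusion-related code


@dataclass
class Patient:
    patient_id: str
    # One entry per coded encounter, so repeated codes are counted.
    icd_codes: List[str] = field(default_factory=list)


def count_ra_codes(patient: Patient) -> int:
    """Number of coded encounters carrying an RA ICD code."""
    return sum(1 for code in patient.icd_codes if code in RA_CODES)


def screen_raicd1_ex(patient: Patient) -> bool:
    """Screen RAICD1+EX: one or more RA codes and no exclusion codes."""
    has_exclusion = any(code in EXCLUSION_CODES for code in patient.icd_codes)
    return count_ra_codes(patient) >= 1 and not has_exclusion


def screen_raicd2(patient: Patient) -> bool:
    """Screen RAICD2: two or more RA codes."""
    return count_ra_codes(patient) >= 2
```

A patient for whom a screen returns `True` would be kept for manual chart review; everyone else would be dropped from the review queue.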
Results: In total, 3359 patients at Brigham and Women's Hospital (BWH) and 642 patients at Faulkner Hospital (FH) were studied, with 461 (13.7%) eligible patients at BWH and 84 (13.4%) at FH. The application of the algorithm reduced the number of ineligible patients requiring chart review by 40.5% at the tertiary care center and by 57.0% at the community hospital. In contrast, Screen RAICD2 reduced the number of patients requiring chart review by 2.7% to 11.3%; Screen RAICD1+EX reduced the number of patients requiring chart review by 63% to 65% but excluded 22% to 27% of eligible patients.
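The two quantities the Results trade off against each other, the share of ineligible patients removed from chart review and the share of eligible patients wrongly excluded, can be computed as in the sketch below. The function name `screening_summary` and the toy data in the usage note are assumptions made for illustration; they are not taken from the study.

```python
from typing import Dict, Sequence


def screening_summary(eligible: Sequence[bool], kept: Sequence[bool]) -> Dict[str, float]:
    """Summarise a screen against chart-review ground truth.

    eligible[i] is True if patient i is truly eligible.
    kept[i] is True if the screen keeps patient i for chart review.
    """
    if len(eligible) != len(kept):
        raise ValueError("eligible and kept must have the same length")
    n_eligible = sum(1 for e in eligible if e)
    n_ineligible = len(eligible) - n_eligible
    if n_eligible == 0 or n_ineligible == 0:
        raise ValueError("need at least one eligible and one ineligible patient")

    # Ineligible patients the screen removes: review work saved.
    ineligible_removed = sum(1 for e, k in zip(eligible, kept) if not e and not k)
    # Eligible patients the screen removes: recruitment candidates lost.
    eligible_excluded = sum(1 for e, k in zip(eligible, kept) if e and not k)

    return {
        "ineligible_removed_pct": 100.0 * ineligible_removed / n_ineligible,
        "eligible_excluded_pct": 100.0 * eligible_excluded / n_eligible,
        "charts_to_review": float(sum(1 for k in kept if k)),
    }
```

For example, with ten invented patients of whom two are eligible, a screen that keeps both eligible patients and drops four of the eight ineligible ones gives `ineligible_removed_pct` of 50.0 and `eligible_excluded_pct` of 0.0, leaving six charts to review.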
Conclusion: The ensemble machine learning algorithm incorporating billing codes and NLP data increased the efficiency of eligibility screening by reducing the number of patients requiring chart review while not excluding eligible patients. Moreover, this approach can be trained at one institution and applied at another for multicenter clinical trials.
(© 2021 The Authors. ACR Open Rheumatology published by Wiley Periodicals LLC on behalf of American College of Rheumatology.)
Database: MEDLINE