A comparison of machine learning methods to find clinical trials for inclusion in new systematic reviews from their PROSPERO registrations prior to searching and screening.

Authors: Liu S (Biomedical Informatics and Digital Health, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia); Bourgeois FT (Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA; Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA); Narang C (Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA); Dunn AG (Biomedical Informatics and Digital Health, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia; Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA)
Language: English
Source: Research Synthesis Methods [Res Synth Methods] 2024 Jan; Vol. 15 (1), pp. 73-85. Date of Electronic Publication: 2023 Sep 25.
DOI: 10.1002/jrsm.1672
Abstract: Searching for trials is a key task in systematic reviews and a focus of automation. Previous approaches required knowing examples of relevant trials in advance, and most methods focus on published trial articles. To complement existing tools, we compared methods for finding relevant trial registrations given an International Prospective Register of Systematic Reviews (PROSPERO) entry and where no relevant trials have been screened for inclusion in advance. We compared PICO extraction based on SciBERT (an extension of Bidirectional Encoder Representations from Transformers), MetaMap, and term-based representations using an imperfect dataset mined from 3632 PROSPERO entries connected to a subset of 65,662 trial registrations and 65,834 trial articles known to be included in systematic reviews. Performance was measured by the median rank and recall by rank of trials that were eventually included in the published systematic reviews. When ranking trial registrations relative to PROSPERO entries, 296 trial registrations needed to be screened to identify half of the relevant trials, and the best-performing approach used a basic term-based representation. When ranking trial articles relative to PROSPERO entries, 162 trial articles needed to be screened to identify half of the relevant trials, and the best-performing approach used a term-based representation. The results show that MetaMap and term-based representations outperformed approaches that included PICO extraction for this use case. The results suggest that when starting with a PROSPERO entry and where no trials have been screened for inclusion, automated methods can reduce workload, but additional processes are still needed to efficiently identify trial registrations or trial articles that meet the inclusion criteria of a systematic review.
(© 2023 The Authors. Research Synthesis Methods published by John Wiley & Sons Ltd.)
Database: MEDLINE
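
Illustrative note: the abstract evaluates rankings by the median rank and the recall by rank of trials that were eventually included. The sketch below shows one way those two measures could be computed for a single review. It is not the authors' implementation: the abstract does not say how the term-based representation was built, so TF-IDF with cosine similarity is assumed here as a stand-in, and the protocol text, trial identifiers (`T1` to `T4`), and function names are all hypothetical toy values.

```python
import math
import re
from collections import Counter
from statistics import median


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; an assumed weighting, not taken from the paper.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(query_text, candidates):
    """Return candidate ids ordered from most to least similar to the query."""
    ids = list(candidates)
    docs = [tokenize(query_text)] + [tokenize(candidates[i]) for i in ids]
    vectors = tfidf_vectors(docs)
    query = vectors[0]
    scored = [(cosine(query, v), i) for i, v in zip(ids, vectors[1:])]
    scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken by id
    return [i for _, i in scored]


def median_rank(ranking, relevant):
    """Median 1-based position of the relevant items in the ranking."""
    return median(pos for pos, i in enumerate(ranking, start=1) if i in relevant)


def recall_at(ranking, relevant, k):
    """Fraction of relevant items found within the top k positions."""
    return sum(1 for i in ranking[:k] if i in relevant) / len(relevant)


# Hypothetical review protocol and candidate trial registrations.
protocol = "Statins for prevention of cardiovascular events in adults with diabetes"
registrations = {
    "T1": "Atorvastatin versus placebo for cardiovascular events in type 2 diabetes",
    "T2": "Cognitive behavioural therapy for insomnia in adolescents",
    "T3": "Statins and cardiovascular prevention in adults with diabetes mellitus",
    "T4": "Physiotherapy after knee replacement surgery",
}
relevant = {"T1", "T3"}  # trials assumed to be included in the final review

ranking = rank_candidates(protocol, registrations)
print(ranking)
print(median_rank(ranking, relevant))
print(recall_at(ranking, relevant, 1))
```

Read this way, the figure of 296 trial registrations in the abstract corresponds to the depth of the ranked list at which recall reaches one half; the toy data here are far too small to say anything about the reported results.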