Unsupervised title and abstract screening for systematic review: a topic modelling approach

Authors: Agnes Natukunda, Leacky Muchene
Year of publication: 2022
DOI: 10.21203/rs.3.rs-1302520/v1
Description: Background: Systematic reviews are essential for collating and summarising the available research on a particular topic. However, the initial screening of retrieved literature is highly time- and labour-intensive. Attempts to automate parts of the systematic review process have met with varying degrees of success, partly because the resulting tools are domain-specific or require vendor-specific software or manually labelled training data. We propose an unsupervised learning approach to screening document titles and abstracts during systematic reviews. Methods: In the first step, we implemented a Latent Dirichlet Allocation-based topic model to derive representative topics from the titles and abstracts of the retrieved documents. The second step involves defining a score threshold for classifying each document as relevant for full-text review or not. The score is derived from a set of search keywords (often the search terms used for database retrieval). Two systematic review studies were used retrospectively to illustrate the methodology. Results: In the first case study (helminth dataset), the method achieved 69.83% sensitivity relative to manual title and abstract screening, against a false positive rate of 22.63%. In the second case study (Wilson disease dataset), a sensitivity of 54.02% and a specificity of 67.03% were achieved. Conclusions: Unsupervised title and abstract screening has the potential to reduce the workload involved in conducting a systematic review. Although the sensitivity of the methodology on the tested data is low, approximately 70% specificity was achieved. Users still have the option to manually review a subset of the documents flagged as irrelevant by the automated system.
Database: OpenAIRE
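
The second step described in the abstract (a keyword-derived score, a threshold, and a comparison against manual screening) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the score formula, so the sketch assumes a score equal to each document's topic proportions weighted by the probability mass that each topic assigns to the search keywords. The function names, the threshold of 0.3, and all numbers are hypothetical toy values, not data or results from the study, and the topic model output is written by hand rather than fitted.

```python
from typing import Dict, List, Sequence, Tuple


def keyword_scores(
    doc_topic: Sequence[Sequence[float]],
    topic_word: Sequence[Dict[str, float]],
    keywords: Sequence[str],
) -> List[float]:
    """Assumed score: sum over topics of P(topic | doc) * P(keywords | topic)."""
    # Probability mass each topic places on the search keywords.
    topic_mass = [sum(words.get(w, 0.0) for w in keywords) for words in topic_word]
    return [sum(t * m for t, m in zip(theta, topic_mass)) for theta in doc_topic]


def classify(scores: Sequence[float], threshold: float) -> List[bool]:
    """Flag a document for full-text review when its score reaches the threshold."""
    return [s >= threshold for s in scores]


def sensitivity_specificity(
    predicted: Sequence[bool], manual: Sequence[bool]
) -> Tuple[float, float]:
    """Compare automated flags with manual screening decisions."""
    tp = sum(p and m for p, m in zip(predicted, manual))
    tn = sum((not p) and (not m) for p, m in zip(predicted, manual))
    fp = sum(p and (not m) for p, m in zip(predicted, manual))
    fn = sum((not p) and m for p, m in zip(predicted, manual))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy stand-in for a fitted two-topic model (hand-written, not estimated).
topic_word = [
    {"helminth": 0.30, "infection": 0.25, "vaccine": 0.05},
    {"helminth": 0.02, "infection": 0.03, "market": 0.50},
]
doc_topic = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]
keywords = ["helminth", "infection"]

# Hypothetical manual screening decisions for the four toy documents.
manual = [True, False, False, True]

scores = keyword_scores(doc_topic, topic_word, keywords)
predicted = classify(scores, threshold=0.3)
sens, spec = sensitivity_specificity(predicted, manual)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

The false positive rate quoted for the helminth dataset is the complement of specificity (1 minus specificity), so both can be read off the same comparison. In practice the threshold would be tuned to trade missed relevant documents against the number of documents sent for full-text review.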