COVID-19: Comparative Analysis of Methods for Identifying Articles Related to Therapeutics and Vaccines without Using Labeled Data

Author: Parmar, Mihir; Ambalavanan, Ashwin Karthik; Guan, Hong; Banerjee, Rishab; Pabla, Jitesh; Devarakonda, Murthy
Year of publication: 2021
Subject:
Document type: Working Paper
Description: We propose an approach to analyzing text classification methods based on the presence or absence of task-specific terms (and their synonyms) in the text. We applied this approach to study six different transfer-learning and unsupervised methods for screening articles relevant to COVID-19 vaccines and therapeutics. The analysis revealed that while a BERT model trained on search-engine results generally performed well, it misclassified relevant abstracts that did not contain task-specific terms. We used this insight to create a more effective unsupervised ensemble.
Comment: 6 pages, 3 Tables, Appendix
Database: arXiv
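
The description above mentions analyzing classifiers by whether task-specific terms (or their synonyms) appear in the text. A minimal sketch of that idea follows. It is not the authors' implementation: the term list `TASK_TERMS`, the function names, and the choice of accuracy as the metric are all assumptions made for illustration, since the record does not give the actual vocabulary or metrics.

```python
import re
from collections import defaultdict

# Hypothetical task-specific terms and synonyms (assumption: the record
# does not list the vocabulary actually used in the paper).
TASK_TERMS = ["vaccine", "vaccination", "immunization",
              "therapeutic", "treatment", "drug"]


def contains_task_term(text, terms=TASK_TERMS):
    """Return True if any term (optionally with a plural 's') occurs in text."""
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")s?\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def accuracy_by_term_presence(examples, terms=TASK_TERMS):
    """Split examples by term presence and compute accuracy in each group.

    examples: iterable of (text, true_label, predicted_label) tuples.
    Returns a dict mapping "with_terms" / "without_terms" to accuracy;
    a group with no examples is omitted.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [correct, total]
    for text, gold, pred in examples:
        group = "with_terms" if contains_task_term(text, terms) else "without_terms"
        counts[group][0] += int(gold == pred)
        counts[group][1] += 1
    return {group: correct / total for group, (correct, total) in counts.items()}
```

Comparing the two accuracies for a given classifier would show whether it relies on the explicit terms, which is the kind of gap the description reports for the BERT model on relevant abstracts lacking such terms.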