COVID-19 Literature Topic-Based Search via Hierarchical NMF
Author: | Alona Kryshchenko, Yihuan Huang, Kyung Ha, Elizaveta Rebrova, Longxiu Huang, Deanna Needell, Xia Li, Rachel Grotheer, Oleksandr Kryshchenko, Pengyu Li |
---|---|
Year of publication: | 2020 |
Subject: |
Structure (mathematical logic); FOS: Computer and information sciences; Computer Science - Machine Learning; Information retrieval; Coronavirus disease 2019 (COVID-19); Computer science; Computer Science - Digital Libraries; Literature based; Machine Learning (stat.ML); Scientific literature; Computer Science - Information Retrieval; Non-negative matrix factorization; Machine Learning (cs.LG); Tree structure; Statistics - Machine Learning; Digital Libraries (cs.DL); Information Retrieval (cs.IR) |
Source: | NLP4COVID@EMNLP |
DOI: | 10.48550/arxiv.2009.09074 |
Description: | A dataset of COVID-19-related scientific literature is compiled by combining articles from several online libraries and selecting those with open access and full text available. Hierarchical nonnegative matrix factorization is then used to organize literature related to the novel coronavirus into a tree structure that allows researchers to search for relevant literature based on detected topics. We discover eight major latent topics and 52 granular subtopics in the body of literature, related to vaccines, genetic structure and modeling of the disease, and patient studies, as well as related diseases and virology. To make the tool useful to current researchers, an interactive website is created that organizes the available literature using this hierarchical structure. |
Database: | OpenAIRE |
External link: |
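
The description mentions organizing documents into a topic tree with hierarchical nonnegative matrix factorization. The record gives no implementation details, so the following is only a minimal sketch of one common way to do this: factor a document-term matrix with NMF, assign each document to its strongest topic, and repeat on each group. The toy matrix, the vocabulary, the function names `nmf` and `build_topic_tree`, the multiplicative-update solver, and the branching factor are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate nonnegative X (docs x terms) as W @ H with k topics,
    using Lee-Seung multiplicative updates (an illustrative solver choice)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps   # document-topic weights
    H = rng.random((k, X.shape[1])) + eps   # topic-term weights
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def build_topic_tree(X, doc_ids, vocab, branching, top_terms=2, seed=0):
    """Recursively split documents into topics, then subtopics.
    `branching` lists the number of topics to request at each depth."""
    node = {"docs": list(doc_ids), "children": []}
    if not branching or len(doc_ids) <= branching[0]:
        return node                          # leaf: nothing left to split
    W, H = nmf(X, branching[0], seed=seed)
    assignment = W.argmax(axis=1)            # strongest topic per document
    for t in range(branching[0]):
        rows = np.flatnonzero(assignment == t)
        if rows.size == 0:
            continue                         # no document chose this topic
        child = build_topic_tree(X[rows], [doc_ids[i] for i in rows],
                                 vocab, branching[1:], top_terms, seed)
        # Label the child with the highest-weight terms of its topic.
        child["terms"] = [vocab[j] for j in np.argsort(H[t])[::-1][:top_terms]]
        node["children"].append(child)
    return node

# Hypothetical toy corpus: 6 documents, 6 terms, made-up term counts.
vocab = ["vaccine", "antibody", "genome", "sequence", "patient", "symptom"]
doc_ids = ["d0", "d1", "d2", "d3", "d4", "d5"]
X = np.array([[3, 2, 0, 0, 0, 0],
              [2, 3, 0, 0, 0, 0],
              [0, 0, 4, 2, 0, 0],
              [0, 0, 2, 4, 0, 0],
              [0, 0, 0, 0, 3, 3],
              [0, 0, 0, 0, 2, 4]], dtype=float)

tree = build_topic_tree(X, doc_ids, vocab, branching=(3,))
for child in tree["children"]:
    print(child["terms"], child["docs"])
```

Passing a longer `branching` tuple, such as `(8, 7)`, would request a second level of subtopics under each topic. On a real corpus the input would typically be a TF-IDF matrix rather than raw counts, and the number of topics per level would need to be tuned.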