Abstract: |
The scientific community has responded to the COVID-19 outbreak by producing a large number of publications that help us understand a variety of pandemic-related topics from different perspectives. Dealing with this volume of information can be challenging, especially when researchers need to find answers to complex questions about specific topics. We present an information retrieval system that uses latent information to select relevant works related to specific concepts. By applying Latent Dirichlet Allocation (LDA) models to documents, we identify key concepts related to a specific query and a corpus. Our method is iterative: starting from an initial query defined by the user, the query is expanded at each subsequent iteration. In addition, our method can work with a limited amount of information per article. We tested the performance of our proposal using human validation and two evaluation strategies, achieving good results in both. For the first strategy, we conducted two surveys to assess the model's performance: for all the categories studied, precision was always greater than 0.6 and accuracy was always greater than 0.8. The second strategy also showed good results, achieving a precision of 1.0 for one category and scoring above 0.7 overall. [ABSTRACT FROM AUTHOR] |