A Purely Entity-Based Semantic Search Approach for Document Retrieval

Authors: Mohamed Lemine Sidi, Serkan Gunal
Language: English
Year of publication: 2023
Subject:
Source: Applied Sciences, Vol 13, Iss 18, p 10285 (2023)
Document type: article
ISSN: 2076-3417
DOI: 10.3390/app131810285
Description: Over the past decade, knowledge bases (KBs) have been increasingly used to complete and enrich the representation of queries and documents in order to improve document retrieval. Although many approaches have used KBs for this purpose, the problem of how to leverage entity-based representations effectively remains unresolved. This paper proposes a Purely Entity-based Semantic Search Approach for Information Retrieval (PESS4IR) as a novel solution. The approach comprises (i) its own entity linking method, (ii) an inverted indexing method, and (iii) a ranking method for document retrieval that is designed to exploit the strengths of the approach. We report the performance of our approach on queries annotated by two well-known entity linking tools, REL and DBpedia-Spotlight. The experiments are performed on the standard TREC 2004 Robust and MSMARCO collections. With REL annotations on the Robust collection, our approach achieves the maximum nDCG@5 score (1.00) for the queries whose terms are all annotated and whose average annotation score is greater than or equal to 0.75. It is also shown that using PESS4IR alongside another document retrieval method improves performance, unless that method alone already achieves the maximum nDCG@5 score for those highly annotated queries.
Database: Directory of Open Access Journals
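
Illustrative sketch: the description above names an entity-based inverted index, a ranking step, a query filter (all terms annotated, average annotation score of at least 0.75), and nDCG@5 as the metric. The Python below is a minimal sketch of how such pieces could fit together, not the PESS4IR implementation. The function names, the data layout, the additive confidence weighting, and the linear-gain form of DCG are all assumptions made for illustration; the record does not specify any of them.

```python
import math
from collections import defaultdict


def build_entity_index(doc_entities):
    """Build a hypothetical inverted index: entity -> {doc_id: weight}.

    doc_entities maps a doc_id to a list of (entity, confidence) pairs.
    Summing confidences as the weight is an assumption, not the paper's method.
    """
    index = defaultdict(dict)
    for doc_id, annotations in doc_entities.items():
        for entity, confidence in annotations:
            index[entity][doc_id] = index[entity].get(doc_id, 0.0) + confidence
    return index


def is_highly_annotated(query_terms, threshold=0.75):
    """Check the filter described in the abstract.

    query_terms is a list of (term, entity_or_None, score) triples.
    Returns True only if every term has an entity and the mean score
    is greater than or equal to the threshold.
    """
    if not query_terms or any(entity is None for _, entity, _ in query_terms):
        return False
    average = sum(score for _, _, score in query_terms) / len(query_terms)
    return average >= threshold


def rank(index, query_entities):
    """Rank documents by the summed weights of the query entities they contain."""
    scores = defaultdict(float)
    for entity in query_entities:
        for doc_id, weight in index.get(entity, {}).items():
            scores[doc_id] += weight
    # Sort by descending score, breaking ties by doc_id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def ndcg_at_k(ranked_doc_ids, relevance, k=5):
    """Compute nDCG@k with linear gain (one common variant, assumed here)."""
    dcg = sum(
        relevance.get(doc_id, 0) / math.log2(position + 2)
        for position, doc_id in enumerate(ranked_doc_ids[:k])
    )
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(position + 2) for position, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

In this sketch a query would first pass through `is_highly_annotated`; only then are its entities handed to `rank`, and the resulting list is scored with `ndcg_at_k` against relevance judgments. The toy entities and scores used to exercise it are made up and carry no experimental meaning.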