Improving Information Retrieval Through a Global Term Weighting Scheme

Authors: Elva Díaz, Daniel Cuellar, Eunice Esther Ponce-de-Leon-Senti
Year of publication: 2015
Subject:
Source: Lecture Notes in Computer Science ISBN: 9783319192635
MCPR
DOI: 10.1007/978-3-319-19264-2_24
Description: The output of an information retrieval system is an ordered list of documents corresponding to the user query, which is represented by an input list of terms. This output relies on the estimated similarity between each document and the query, and that similarity depends in turn on the weighting scheme used for the terms of the document index. Term weighting therefore plays a major role in estimating this similarity. This paper proposes a new term weighting approach for information retrieval based on marginal frequencies, that is, the global count of term frequencies over the corpus of documents, whereas conventional term weighting schemes such as the normalized term frequency take into account the term frequencies for particular documents. The presented experiment shows the advantages and disadvantages of the proposed retrieval scheme. Performance measures such as precision, recall, and F-score are used over classical benchmarks such as CACM to validate the experimental results.
Database: OpenAIRE
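
Illustration: the description contrasts per-document term frequencies with global (marginal) counts over the corpus, and names precision, recall, and F-score as evaluation measures. The sketch below only computes those basic quantities on a toy corpus. It does not reproduce the paper's weighting formula, which the record does not give. The function names and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def normalized_tf(doc_tokens):
    """Per-document term frequency divided by document length (assumed normalization)."""
    total = len(doc_tokens)
    if total == 0:
        return {}
    counts = Counter(doc_tokens)
    return {term: count / total for term, count in counts.items()}


def marginal_frequencies(corpus):
    """Global count of each term, summed over every document in the corpus."""
    totals = Counter()
    for doc_tokens in corpus:
        totals.update(doc_tokens)
    return totals


def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and balanced F-score for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy corpus (hypothetical): each document is a list of already-tokenized terms.
corpus = [
    ["term", "weighting", "retrieval", "term"],
    ["query", "retrieval"],
    ["term", "query", "query"],
]

local_weights = normalized_tf(corpus[0])      # depends on one document only
global_counts = marginal_frequencies(corpus)  # depends on the whole corpus
scores = precision_recall_f1(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
```

Here `local_weights` changes from document to document, while `global_counts` is a single table shared by the whole corpus; how the paper turns such global counts into term weights is not stated in this record.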