Implementation of an Information Retrieval System Using the Soft Cosine Measure

Authors: Juan Javier González Barbosa, Lucía Janeth Hernández González, Juan Frausto Solís, Rogelio Florencia-Juárez, Martha B. Mojica Mata, Guadalupe Castilla Valdés, J. David Terán-Villanueva
Year of publication: 2016
Subject:
Source: Nature-Inspired Design of Hybrid Intelligent Systems, ISBN: 9783319470535
Description: Information retrieval models have been an important subject of study since 1992. These models compare a user query against a collection of documents, taking into account the co-occurrence of terms, in order to identify a set of relevant documents and return them to the user according to an evaluation criterion. Several metrics rank a set of documents by their degree of similarity to the query, among them cosine similarity and the soft cosine measure. In this paper, we perform a comparative study of these two similarity metrics. The Vector Space Model (VSM) was implemented for information retrieval. A sample of the Collection of the Association for Computing Machinery (CACM), in the domain of Computer Science, was used in the evaluation. The experimental results show that recall is 96 % for both metrics, but the soft cosine measure achieves a mean average precision that is 2 % higher.
Database: OpenAIRE
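
The two similarity measures compared in the description can be illustrated with a minimal sketch. It assumes the commonly used definition of the soft cosine measure, in which a term-to-term similarity matrix weights every pair of vector components; with the identity matrix it reduces to ordinary cosine similarity. The vocabulary, the 0.8 similarity between "retrieval" and "search", the document vectors, and all function names are hypothetical and are not taken from the paper, whose implementation and term-similarity function are not given in this record.

```python
import math

def soft_dot(a, b, sim):
    """Sum of sim[i][j] * a[i] * b[j] over all pairs of term indices."""
    return sum(sim[i][j] * a[i] * b[j]
               for i in range(len(a)) for j in range(len(b)))

def soft_cosine(a, b, sim):
    """Soft cosine measure; returns 0.0 if either vector has zero norm."""
    denom = math.sqrt(soft_dot(a, a, sim)) * math.sqrt(soft_dot(b, b, sim))
    return soft_dot(a, b, sim) / denom if denom else 0.0

def identity(n):
    """Identity matrix: no similarity between distinct terms (plain cosine)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def rank(query, docs, sim):
    """Document indices sorted by decreasing similarity to the query."""
    scores = [soft_cosine(query, d, sim) for d in docs]
    return sorted(range(len(docs)), key=lambda i: -scores[i])

def average_precision(ranking, relevant):
    """Average of precision@k taken at each rank k holding a relevant document."""
    hits, total = 0, 0.0
    for k, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0

# Hypothetical toy data over the vocabulary ["retrieval", "search", "compiler"].
SIM = [[1.0, 0.8, 0.0],   # "retrieval" is assumed similar to "search"
       [0.8, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
query = [1, 0, 0]          # query mentions only "retrieval"
docs = [[0, 0, 1],         # doc 0 mentions only "compiler"
        [0, 1, 0]]         # doc 1 mentions only "search"
relevant = {1}             # assumed relevance judgment for this query

plain_ranking = rank(query, docs, identity(3))
soft_ranking = rank(query, docs, SIM)
print("cosine AP:", average_precision(plain_ranking, relevant))
print("soft cosine AP:", average_precision(soft_ranking, relevant))
```

In this toy case plain cosine scores both documents at zero because neither shares a term with the query, while the soft cosine measure ranks the document about "search" first. Mean average precision is the mean of `average_precision` over all queries in the evaluation set.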