Implementation of an Information Retrieval System Using the Soft Cosine Measure
Authors: | Juan Javier González Barbosa, Lucía Janeth Hernández González, Juan Frausto Solís, Rogelio Florencia-Juárez, Martha B. Mojica Mata, Guadalupe Castilla Valdés, J. David Terán-Villanueva |
---|---|
Year of publication: | 2016 |
Subject: | 0209 industrial biotechnology; Measure (data warehouse); Information retrieval; Computer science; Computer Science::Information Retrieval; Concurrency; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Cosine similarity; 02 engineering and technology; Domain (software engineering); Set (abstract data type); 020901 industrial engineering & automation; Similarity (network science); Information model; 0202 electrical engineering, electronic engineering, information engineering; Vector space model; 020201 artificial intelligence & image processing |
Source: | Nature-Inspired Design of Hybrid Intelligent Systems, ISBN: 9783319470535 |
Description: | Information retrieval models have been an important subject of study since 1992. These models compare a user query against a collection of documents, taking into account the co-occurrence of terms, with the objective of classifying a set of relevant documents and returning them to the user according to the evaluation criterion. Several metrics classify a set of documents according to their degree of similarity, such as cosine similarity and the soft cosine measure. In this paper, we perform a comparative study of these two similarity metrics. The Vector Space Model (VSM) was implemented for retrieving information. A sample of the Collection of the Association for Computing Machinery (CACM) in the domain of Computer Science was used in the evaluation. The experimental results show that recall is 96 % for both metrics, but the soft cosine measure achieves 2 % more in mean average precision. |
Database: | OpenAIRE |
External link: |
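
The record itself gives no formulas for the two metrics compared in the description. As a rough illustration only, the sketch below uses the commonly cited definition of the soft cosine measure, in which the dot products of ordinary cosine similarity are replaced by bilinear forms weighted by a term-similarity matrix S. The vocabulary, the matrix `TERM_SIM`, and the query and document vectors are hypothetical values chosen for the example; they are not taken from the paper or from the CACM collection.

```python
import math


def bilinear(a, b, s):
    """Return a^T S b for equal-length vectors a, b and a square matrix s."""
    n = len(a)
    return sum(a[i] * s[i][j] * b[j] for i in range(n) for j in range(n))


def soft_cosine(a, b, s):
    """Soft cosine: (a^T S b) / (sqrt(a^T S a) * sqrt(b^T S b))."""
    denom = math.sqrt(bilinear(a, a, s)) * math.sqrt(bilinear(b, b, s))
    return bilinear(a, b, s) / denom if denom else 0.0


def cosine(a, b):
    """Ordinary cosine similarity, i.e. soft cosine with S = identity."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return soft_cosine(a, b, identity)


# Hypothetical vocabulary: ["retrieval", "search", "compiler"].
# Assumption: "retrieval" and "search" are treated as 0.5 similar.
TERM_SIM = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

query = [1.0, 0.0, 0.0]  # contains only "retrieval"
doc_a = [0.0, 1.0, 0.0]  # contains only "search"
doc_b = [0.0, 0.0, 1.0]  # contains only "compiler"

print(cosine(query, doc_a))                 # 0.0: no shared term
print(soft_cosine(query, doc_a, TERM_SIM))  # 0.5: related terms are credited
print(soft_cosine(query, doc_b, TERM_SIM))  # 0.0: unrelated terms stay at zero
```

In this toy setting, plain cosine similarity scores `doc_a` at zero because it shares no term with the query, while the soft cosine measure gives it partial credit through the related term. That kind of difference in ranking is one plausible way the two metrics could reach the same recall yet differ in mean average precision, though the record does not say how the authors built their similarity matrix.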