Geographical queries reformulation using a parallel association rules generator to build spatial taxonomies
Authors: | Omar El Midaoui, Btihal El Ghali, Moulay Driss Rahmani, Abderrahim El Qadi |
---|---|
Year of publication: | 2021 |
Subject: | Structure (mathematical logic); Information retrieval; General Computer Science; Association rule learning; Computer science; business.industry; Big data; Parallel FP-growth algorithm; Spatial entity; Context (language use); Geographical query; Spatial relation; Search engine; Reformulation; Taxonomy (general); Spark (mathematics); Machine learning; Adjacency list; Electrical and Electronic Engineering; business; Database transaction |
Description: | Geographical queries need a special reformulation process in information retrieval systems (IRS) because of their specificities and hierarchical structure. Most web search engines ignore this fact. In this paper, we propose an automatic approach for building a spatial taxonomy that models the notion of adjacency and will be used in reformulating the spatial part of a geographical query. This approach exploits the top-ranked documents retrieved when submitting a spatial entity, which is composed of a spatial relation and the name of a city. Then, a transactional database is constructed, treating each extracted document as a transaction that contains the names of the cities sharing the country of the submitted query's city. The frequent pattern growth (FP-growth) algorithm is applied to this database in its parallel version (parallel FP-growth, PFP) to generate association rules, which form the country's taxonomy in a Big Data context. Experiments were conducted on Spark, and their results show that query reformulation using the taxonomy built with the proposed approach improves the precision and effectiveness of the IRS. |
Database: | OpenAIRE |
External link: |
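
The pipeline summarized in the description (documents turned into transactions of city names, then association rules mined from them) can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces parallel FP-growth on Spark with brute-force pair counting in plain Python, the function names `build_transactions` and `pair_rules` are hypothetical, the thresholds are arbitrary, and the sample documents are invented toy data. It also assumes single-word city names.

```python
from collections import Counter
from itertools import combinations


def build_transactions(documents, country_cities):
    """Turn each document into a transaction: the set of known city names it mentions.

    Assumption: city names are single tokens; multi-word names would need
    a proper gazetteer matcher.
    """
    cities = {c.lower() for c in country_cities}
    transactions = []
    for text in documents:
        tokens = {t.strip(".,;:()").lower() for t in text.split()}
        found = tokens & cities
        if found:
            transactions.append(found)
    return transactions


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Derive rules lhs -> rhs between pairs of cities by brute-force counting.

    A stand-in for FP-growth restricted to 2-itemsets. Returns tuples of
    (lhs, rhs, support, confidence).
    """
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for t in transactions:
        item_count.update(t)
        pair_count.update(combinations(sorted(t), 2))

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Invented toy documents, standing in for top-ranked retrieved pages.
documents = [
    "Hotels near Rabat and Sale",
    "Train from Rabat to Sale and Kenitra",
    "Rabat travel guide",
    "Casablanca beaches",
]
cities = ["Rabat", "Sale", "Kenitra", "Casablanca"]

transactions = build_transactions(documents, cities)
rules = pair_rules(transactions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

Rules that survive both thresholds can be read as candidate adjacency links between cities. At the scale the description mentions, the same transactions would instead be passed to a parallel FP-growth implementation on Spark.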