Semantically enhanced pseudo relevance feedback for Arabic information retrieval
Authors: | Jaffar Atwan, Masnizah Mohd, Hasan Rashaideh, Ghassan Kanaan |
---|---|
Year of publication: | 2015 |
Subject: | Arabic; Computer science; Information storage and retrieval; WordNet; Relevance feedback; Engineering and technology; Library and Information Sciences; Query expansion; Electrical engineering, electronic engineering, information engineering; Semantic relationship; Semantic information; Information retrieval; Recall; Social sciences; Search engine indexing; Artificial intelligence & image processing; Artificial intelligence; Other social sciences; Information & library sciences; Natural language processing; Information Systems |
Source: | Journal of Information Science. 42:246-260 |
ISSN: | 1741-6485 0165-5515 |
DOI: | 10.1177/0165551515594722 |
Description: | The conventional information retrieval (IR) framework consists of four primary phases: pre-processing, indexing, querying and retrieving results. Some phases of the current Arabic IR (AIR) framework have several drawbacks. This research aims to enhance AIR by improving the processes of the conventional IR framework. We introduce an enhanced stop-word list at the pre-processing level and investigate several Arabic stemmers. In addition, an Arabic WordNet is used at the corpus and query expansion levels. We also adopt semantic information for Pseudo Relevance Feedback. The enhanced AIR framework was built and evaluated using TREC 2001 data. Using the Arabic WordNet to build a semantic relationship between query and corpus at two levels, the corpus level and the query level, is a new technique. The enhanced AIR framework demonstrated an improvement of 49% in mean average precision and an increase of 7.3% in recall compared with the baseline framework. |
Database: | OpenAIRE |
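
The description mentions pseudo relevance feedback combined with WordNet-based query expansion. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. It makes several assumptions: the scoring function is a simple term-count match, the `TOY_SYNONYMS` dictionary is a hypothetical stand-in for Arabic WordNet lookups, English tokens are used for readability, and the names `score`, `rank` and `expand_query` are invented for this example.

```python
from collections import Counter

# Hypothetical stand-in for Arabic WordNet synonym lookups (assumption, not the real resource).
TOY_SYNONYMS = {
    "car": ["automobile"],
    "price": ["cost"],
}


def score(query_terms, doc_terms):
    """Toy ranking score: total occurrences of the query terms in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def rank(query_terms, docs):
    """Return document indices sorted by descending toy score (ties keep input order)."""
    return sorted(range(len(docs)),
                  key=lambda i: score(query_terms, docs[i]),
                  reverse=True)


def expand_query(query_terms, docs, top_docs=2, top_terms=2, synonyms=TOY_SYNONYMS):
    """Pseudo relevance feedback followed by synonym expansion.

    1. Rank the documents with the original query.
    2. Treat the top-ranked documents as relevant (the "pseudo" assumption).
    3. Add their most frequent terms that are not already in the query.
    4. Add synonyms of every term in the expanded query.
    """
    feedback = Counter()
    for i in rank(query_terms, docs)[:top_docs]:
        feedback.update(t for t in docs[i] if t not in query_terms)

    expanded = list(query_terms) + [t for t, _ in feedback.most_common(top_terms)]

    # Iterate over a copy so that appended synonyms are not expanded again.
    for term in list(expanded):
        for syn in synonyms.get(term, []):
            if syn not in expanded:
                expanded.append(syn)
    return expanded


# Small usage example with tokenized toy documents.
docs = [
    ["car", "price", "dealer", "dealer"],
    ["car", "dealer", "loan"],
    ["weather", "rain"],
]
expanded_query = expand_query(["car", "price"], docs)
print(expanded_query)
```

In a real system the expanded query would be run against the index a second time, and the toy dictionary would be replaced by lookups in an actual lexical resource.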