Arabic Light Stemmer Based on ISRI Stemmer

Authors: Khudhair Abed Thamer, Dhafar Hamed Abd, Abir Hussain, Wasiq Khan
Publication year: 2021
Subject:
Source: Intelligent Computing Theories and Application ISBN: 9783030845315
ICIC (3)
DOI: 10.1007/978-3-030-84532-2_4
Description: Stemming is considered one of the most essential steps in natural language processing and information retrieval. In Arabic, however, stemming remains a major challenge because the language's particular morphology sets it apart from other languages. Most existing algorithms are limited to a given number of words, create ambiguity between original letters and affixes, and often rely on dictionaries of patterns or words. We therefore present, for the first time, the design and implementation of an Arabic light stemmer based on the Information Science Research Institute (ISRI) algorithm. The algorithm is evaluated empirically on a newly created Arabic dataset compiled from different Arabic websites whose content is written in modern Arabic. The experimental results indicate that the proposed method outperforms the existing methods against which it was benchmarked.
Database: OpenAIRE
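
Illustrative note: the abstract does not give the details of the proposed stemmer, so the following is only a minimal sketch of the general idea of light stemming (removing diacritics, then stripping at most one common prefix and one common suffix without consulting a dictionary). The affix lists, the minimum stem length, and the function name `light_stem` are assumptions chosen for illustration; they are not taken from the paper and do not reproduce the authors' method.

```python
import re

# Arabic diacritics (tashkeel), U+064B to U+0652, removed before affix stripping.
DIACRITICS = re.compile(r"[\u064B-\u0652]")

# Hypothetical affix lists for illustration only, ordered longest first
# so that the longest matching affix is the one removed.
PREFIXES = ["وال", "بال", "كال", "ولل", "ال", "لل"]
SUFFIXES = ["ون", "ات", "ان", "ين", "ها", "ة"]

# Assumed lower bound on stem length; an affix is kept if removing it
# would leave fewer letters than this.
MIN_STEM_LEN = 3


def light_stem(word: str) -> str:
    """Strip diacritics, then at most one prefix and one suffix."""
    word = DIACRITICS.sub("", word)

    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break

    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break

    return word


if __name__ == "__main__":
    for w in ["المعلمون", "والكتاب", "كتاب"]:
        print(w, "->", light_stem(w))
```

The length check is the simplest guard against the ambiguity the abstract mentions between original letters and affixes: a short word that merely begins with letters resembling a prefix is left unchanged. A real stemmer would need considerably richer rules than this sketch.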