Benchmarking and assessing the performance of Arabic stemmers

Author:	Qasem A. Al-Radaideh, Khalid W. Akkawi, Mohammed N. Al-Kabi
Year of publication:	2011
Subject:	Source code; Arabic; Computer science; Benchmarking; Library and Information Sciences; Benchmark (computing); Artificial intelligence; Natural language processing; Information Systems
Source:	Journal of Information Science. 37:111-119
ISSN:	1741-6485 0165-5515
DOI:	10.1177/0165551510392305
Description:	Previous studies on Arabic stemming lack fair evaluation, full descriptions of the algorithms used, or access to the source code of the stemmers and to the datasets used to evaluate them. Releasing source code and datasets is an essential step in enabling researchers to improve the stemmers currently in use and to verify the results of these studies. This study lays the foundation for a benchmark for Arabic stemmers and presents an evaluation of four heavy (root-based) stemmers for the Arabic language. The evaluation aims to assess the accuracy of each of the four stemmers and to show the strength of each. The four algorithms are the Al-Mustafa, Al-Sarhan, Rabab’ah, and Taghva stemmers. The accuracy and strength tests used in this study ranked the Rabab’ah stemmer first, followed by the Al-Sarhan, Al-Mustafa, and Taghva stemmers, in that order.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::015aabf2b67e16878bb854e8e21a5d9f https://doi.org/10.1177/0165551510392305