Comparative Analysis of Nine Arabic Stemmers on Microblog Information Retrieval

Authors:	Amal Almazrua, Manal Almazrua, Hend S. Al-Khalifa
Year of publication:	2020
Subject:	Information retrieval; Machine translation; Grammar; Arabic; Computer science; Microblogging; Document classification; Information science; Task analysis; Social media
Source:	IALP
DOI:	10.1109/ialp51396.2020.9310456
Description:	Stemming has been shown to be effective in many natural language processing (NLP) applications, such as document classification, machine translation, and information retrieval (IR). This paper compares the performance of nine stemmers for the Arabic language on microblog IR: Information Science Research Institute (ISRI), Tashaphyne, Khoja, AL-stem, Light10, Motaz, Assem, Farasa, and ARLStem. Each stemmer was studied independently on the EveTAR dataset in a specific information retrieval task: retrieving the tweets relevant to a query. The performance of the nine stemmers was evaluated using BM25, precision at 30 (P@30), and Mean Average Precision (MAP). The results show that the root-based stemmers (i.e., ISRI and Khoja) outperformed the others.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::1b1420577ae01ce2be8becf86611c100 https://doi.org/10.1109/ialp51396.2020.9310456
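
Note:	The two evaluation measures named in the description, precision at 30 and Mean Average Precision, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the tweet IDs, and the input layout (a ranked list of tweet IDs per query plus a set of relevant IDs per query) are assumptions made for illustration only, and the sketch assumes each ranked list contains no duplicate IDs.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int = 30) -> float:
    """Fraction of the top-k ranked items that are relevant (P@k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    # Dividing by k (not by the number of items returned) penalizes short result lists.
    return hits / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Average of the per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    total = sum(
        average_precision(ranked, qrels.get(query_id, set()))
        for query_id, ranked in runs.items()
    )
    return total / len(runs)


# Hypothetical toy data: two queries with made-up tweet IDs.
example_runs = {"q1": ["t1", "t2", "t3", "t4"], "q2": ["a", "b"]}
example_qrels = {"q1": {"t1", "t3"}, "q2": {"b"}}
example_map = mean_average_precision(example_runs, example_qrels)
```

In a stemmer comparison of this kind, one would presumably compute these scores once per stemmer over the same queries and relevance judgments, so that the only variable is how the tweets and queries were stemmed before ranking.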