JIDR: Towards building hybrid Arabic stemmer
Authors: | Rima Rouibia, Mohamed Amine Cheragui, Imane Belhadj |
---|---|
Year of publication: | 2017 |
Subject: | Arabic; Computer science; Affix; Information technology; Set (abstract data type); Originality; Canonical form; Artificial intelligence; Architecture; Word (computer architecture); Natural language processing |
Source: | 2017 International Conference on Mathematics and Information Technology (ICMIT). |
DOI: | 10.1109/mathit.2017.8259714 |
Description: | Arabic language processing has grown rapidly over the last decades, giving rise to specialized tools such as information retrieval systems, text categorizers, spelling correctors, word generators and automatic summarizers. The development of such tools, however, relies heavily on a number of basic modules such as stemming, which consists of converting each word into its canonical form. The aim of this paper is to present JIDR, our stemmer for Arabic text. The originality of the work lies in combining three different techniques (dictionary lookup, affix removal and morphological analysis), and in the architecture set up to handle weak verbs. |
Database: | OpenAIRE |
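
The abstract names three techniques (dictionary, affix removal and morphological analysis) but this record does not say how JIDR chains them. The following is only a minimal sketch of one plausible ordering, not the authors' implementation. Every name and every data entry here (`EXCEPTIONS`, `PREFIXES`, `SUFFIXES`, `KNOWN_ROOTS`, `strip_affixes`, `match_pattern`, `stem`) is a hypothetical placeholder chosen for illustration.

```python
# Toy resources, assumed for illustration only; not taken from JIDR.
EXCEPTIONS = {"قال": "قول"}             # weak verb resolved by dictionary lookup
PREFIXES = ["وال", "بال", "ال", "و"]    # tried in this order, longest first
SUFFIXES = ["ات", "ون", "ين", "ها", "ة"]
KNOWN_ROOTS = {"كتب", "درس", "قول"}


def strip_affixes(word, min_len=3):
    """Remove at most one prefix and one suffix, keeping >= min_len letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def match_pattern(word):
    """Very small stand-in for morphological analysis: two word patterns."""
    # Pattern C-alef-C-C (active participle): drop the alef.
    if len(word) == 4 and word[1] == "ا":
        return word[0] + word[2:]
    # Pattern meem-C-C-waw-C (passive participle): drop meem and waw.
    if len(word) == 5 and word[0] == "م" and word[3] == "و":
        return word[1] + word[2] + word[4]
    return None


def stem(word):
    # Stage 1: dictionary lookup, used here for irregular (weak) verbs.
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    # Stage 2: affix removal.
    stripped = strip_affixes(word)
    if stripped in KNOWN_ROOTS:
        return stripped
    # Stage 3: pattern matching, accepted only if it yields a known root.
    root = match_pattern(stripped)
    if root in KNOWN_ROOTS:
        return root
    # Fallback: return the affix-stripped form unchanged.
    return stripped
```

In this sketch the dictionary is consulted first so that irregular forms such as weak verbs never reach the rule-based stages, which would otherwise mangle them. Whether JIDR uses this order is an assumption.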