Showing 1 - 10 of 3,804 for search: '"Habash, A."'
This study presents the "Arabic Derivational ChainBank," a novel framework for modeling Arabic derivational morphology. It establishes connections between forms and meanings by constructing a chain of derived words that reflect their derivational s
External link:
http://arxiv.org/abs/2410.20463
This paper presents the foundational framework and initial findings of the Balanced Arabic Readability Evaluation Corpus (BAREC) project, designed to address the need for comprehensive Arabic language resources aligned with diverse readability levels
External link:
http://arxiv.org/abs/2410.08674
Author:
Abassy, Mervat, Elozeiri, Kareem, Aziz, Alexander, Ta, Minh Ngoc, Tomar, Raj Vardhan, Adhikari, Bimarsha, Ahmed, Saad El Dine, Wang, Yuxia, Afzal, Osama Mohammed, Xie, Zhuohan, Mansurov, Jonibek, Artemova, Ekaterina, Mikhailov, Vladislav, Xing, Rui, Geng, Jiahui, Iqbal, Hasan, Mujahid, Zain Muhammad, Mahmoud, Tarek, Tsvigun, Akim, Aji, Alham Fikri, Shelmanov, Artem, Habash, Nizar, Gurevych, Iryna, Nakov, Preslav
The ease of access to large language models (LLMs) has enabled the widespread production of machine-generated texts, and now it is often hard to tell whether a piece of text was human-written or machine-generated. This raises concerns about potential misuse, part
External link:
http://arxiv.org/abs/2408.04284
Author:
Zaghouani, Wajdi, Jarrar, Mustafa, Habash, Nizar, Bouamor, Houda, Zitouni, Imed, Diab, Mona, El-Beltagy, Samhaa R., AbuOdeh, Muhammed
We present an overview of the FIGNEWS shared task, organized as part of the ArabicNLP 2024 conference co-located with ACL 2024. The shared task addresses bias and propaganda annotation in multilingual news posts. We focus on the early days of the Isr
External link:
http://arxiv.org/abs/2407.18147
Author:
Abdul-Mageed, Muhammad, Keleg, Amr, Elmadany, AbdelRahim, Zhang, Chiyu, Hamed, Injy, Magdy, Walid, Bouamor, Houda, Habash, Nizar
We describe the findings of the fifth Nuanced Arabic Dialect Identification Shared Task (NADI 2024). NADI's objective is to help advance SoTA Arabic NLP by providing guidance, datasets, modeling opportunities, and standardized evaluation conditions t
External link:
http://arxiv.org/abs/2407.04910
Automatic readability assessment is relevant to building NLP applications for education, content analysis, and accessibility. However, Arabic readability assessment is a challenging task due to Arabic's morphological richness and limited readability
External link:
http://arxiv.org/abs/2407.03032
Author:
Alhafni, Bashar, Al-Towaity, Sarah, Fawzy, Ziyad, Nassar, Fatema, Eryani, Fadhl, Bouamor, Houda, Habash, Nizar
Dialectal Arabic is the primary spoken language used by native Arabic speakers in daily communication. The rise of social media platforms has notably expanded its use as a written language. However, Arabic dialects do not have standard orthographies.
External link:
http://arxiv.org/abs/2407.03020
The widespread absence of diacritical marks in Arabic text poses a significant challenge for Arabic natural language processing (NLP). This paper explores instances of naturally occurring diacritics, referred to as "diacritics in the wild," to unveil
External link:
http://arxiv.org/abs/2406.05760
We present the SAMER Corpus, the first manually annotated Arabic parallel corpus for text simplification targeting school-aged learners. Our corpus comprises texts of 159K words selected from 15 publicly available Arabic fiction novels, most of which
External link:
http://arxiv.org/abs/2404.18615
Author:
Lynn, Teresa, Altakrori, Malik H., Magdy, Samar Mohamed, Das, Rocktim Jyoti, Lyu, Chenyang, Nasr, Mohamed, Samih, Younes, Aji, Alham Fikri, Nakov, Preslav, Godbole, Shantanu, Roukos, Salim, Florian, Radu, Habash, Nizar
The rapid evolution of Natural Language Processing (NLP) has favored major languages such as English, leaving a significant gap for many others due to limited resources. This is especially evident in the context of data annotation, a task whose impor
External link:
http://arxiv.org/abs/2404.17342