Showing 1 - 10 of 3,723 for search: '"A. Habash"'
Author:
Abassy, Mervat, Elozeiri, Kareem, Aziz, Alexander, Ta, Minh Ngoc, Tomar, Raj Vardhan, Adhikari, Bimarsha, Ahmed, Saad El Dine, Wang, Yuxia, Afzal, Osama Mohammed, Xie, Zhuohan, Mansurov, Jonibek, Artemova, Ekaterina, Mikhailov, Vladislav, Xing, Rui, Geng, Jiahui, Iqbal, Hasan, Mujahid, Zain Muhammad, Mahmoud, Tarek, Tsvigun, Akim, Aji, Alham Fikri, Shelmanov, Artem, Habash, Nizar, Gurevych, Iryna, Nakov, Preslav
The widespread accessibility of large language models (LLMs) to the general public has significantly amplified the dissemination of machine-generated texts (MGTs). Advancements in prompt manipulation have exacerbated the difficulty in discerning the
External link:
http://arxiv.org/abs/2408.04284
Author:
Zaghouani, Wajdi, Jarrar, Mustafa, Habash, Nizar, Bouamor, Houda, Zitouni, Imed, Diab, Mona, El-Beltagy, Samhaa R., AbuOdeh, Muhammed
We present an overview of the FIGNEWS shared task, organized as part of the ArabicNLP 2024 conference co-located with ACL 2024. The shared task addresses bias and propaganda annotation in multilingual news posts. We focus on the early days of the Isr
External link:
http://arxiv.org/abs/2407.18147
Author:
Abdul-Mageed, Muhammad, Keleg, Amr, Elmadany, AbdelRahim, Zhang, Chiyu, Hamed, Injy, Magdy, Walid, Bouamor, Houda, Habash, Nizar
We describe the findings of the fifth Nuanced Arabic Dialect Identification Shared Task (NADI 2024). NADI's objective is to help advance SoTA Arabic NLP by providing guidance, datasets, modeling opportunities, and standardized evaluation conditions t
External link:
http://arxiv.org/abs/2407.04910
Automatic readability assessment is relevant to building NLP applications for education, content analysis, and accessibility. However, Arabic readability assessment is a challenging task due to Arabic's morphological richness and limited readability
External link:
http://arxiv.org/abs/2407.03032
Author:
Alhafni, Bashar, Al-Towaity, Sarah, Fawzy, Ziyad, Nassar, Fatema, Eryani, Fadhl, Bouamor, Houda, Habash, Nizar
Dialectal Arabic is the primary spoken language used by native Arabic speakers in daily communication. The rise of social media platforms has notably expanded its use as a written language. However, Arabic dialects do not have standard orthographies.
External link:
http://arxiv.org/abs/2407.03020
The widespread absence of diacritical marks in Arabic text poses a significant challenge for Arabic natural language processing (NLP). This paper explores instances of naturally occurring diacritics, referred to as "diacritics in the wild," to unveil
External link:
http://arxiv.org/abs/2406.05760
We present the SAMER Corpus, the first manually annotated Arabic parallel corpus for text simplification targeting school-aged learners. Our corpus comprises texts of 159K words selected from 15 publicly available Arabic fiction novels most of which
External link:
http://arxiv.org/abs/2404.18615
Author:
Lynn, Teresa, Altakrori, Malik H., Magdy, Samar Mohamed, Das, Rocktim Jyoti, Lyu, Chenyang, Nasr, Mohamed, Samih, Younes, Aji, Alham Fikri, Nakov, Preslav, Godbole, Shantanu, Roukos, Salim, Florian, Radu, Habash, Nizar
The rapid evolution of Natural Language Processing (NLP) has favored major languages such as English, leaving a significant gap for many others due to limited resources. This is especially evident in the context of data annotation, a task whose impor
External link:
http://arxiv.org/abs/2404.17342
Author:
Wang, Yuxia, Mansurov, Jonibek, Ivanov, Petar, Su, Jinyan, Shelmanov, Artem, Tsvigun, Akim, Afzal, Osama Mohammed, Mahmoud, Tarek, Puccetti, Giovanni, Arnold, Thomas, Whitehouse, Chenxi, Aji, Alham Fikri, Habash, Nizar, Gurevych, Iryna, Nakov, Preslav
Published in:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
We present the results and the main findings of SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection. The task featured three subtasks. Subtask A is a binary classification task determining whether a tex
External link:
http://arxiv.org/abs/2404.14183
We present ZAEBUC-Spoken, a multilingual multidialectal Arabic-English speech corpus. The corpus comprises twelve hours of Zoom meetings involving multiple speakers role-playing a work situation where Students brainstorm ideas for a certain topic and
External link:
http://arxiv.org/abs/2403.18182