Mawdoo3 AI at MADAR Shared Task: Arabic Fine-Grained Dialect Identification with Ensemble Learning
Author: | Ahmad Mustafa, Haitham Seelawi, Mostafa Samir, Bashar Talafha, Abed Alhakim Freihat, Mohammad Zaghloul, Ahmad Ragab, Hesham Al-Bataineh, Abdelrahman Mattar, Hussein T. Al-Natsheh |
---|---|
Year of publication: | 2019 |
Subject: |
Ensemble forecasting; Computer science; business.industry; Arabic; 02 engineering and technology; computer.software_genre; Ensemble learning; language.human_language; Task (project management); Set (abstract data type); Identification (information); Ranking; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; language; Modern Standard Arabic; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing |
Source: | WANLP@ACL 2019 |
DOI: | 10.18653/v1/w19-4630 |
Description: | In this paper we discuss several models we used to classify 25 city-level Arabic dialects, in addition to Modern Standard Arabic (MSA), as part of the MADAR shared task (sub-task 1). We propose an ensemble of experimentally selected, best-performing classifiers trained on a varied set of features. Our system achieves a macro F1-score of 69.3% on the DEV dataset, a 1.4% accuracy improvement over the baseline model. Our best submitted run ranked third out of 19 participating teams on the TEST dataset, only 0.12% macro F1-score behind the top-ranked system. |
Database: | OpenAIRE |
External link: |