Improved Generalization of Arabic Text Classifiers

Authors:	Alaa Khaddaj, Wassim El-Hajj, Hazem Hajj
Publication year:	2019
Subject:	Artificial neural network; Computer science; Arabic; Robustness (computer science); Classifier (linguistics); Artificial intelligence; Transfer of learning; Natural language processing; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 01 natural sciences; 0105 earth and related environmental sciences; 010501 environmental sciences
Source:	WANLP@ACL 2019
Description:	While transfer learning for text has been very active for English, progress in Arabic, including the use of Domain Adaptation (DA), has been slow. Domain adaptation aims to generalize a classifier's performance by balancing its accuracy on a particular task across different text domains. In this paper, we propose and evaluate two variants of a domain adaptation technique: the first is a base model called Domain Adversarial Neural Network (DANN), while the second is a variation that incorporates representational learning. Similar to previous approaches, we propose the use of proxy A-distance as a metric to assess the success of generalization. We use ArSentDLEV, a multi-topic dataset collected from the Levantine countries, to test the performance of the models. We show that the proposed method is superior in accuracy and robustness when dealing with the Arabic language.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::b6f9a5e0d477857810d8dd0320ffa915 https://doi.org/10.18653/v1/w19-4618
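
Note on the proxy A-distance named in the description: in the domain adaptation literature it is commonly computed as 2(1 - 2ε), where ε is the held-out error of a classifier trained to tell source-domain examples from target-domain examples. The sketch below illustrates that common definition only; it is an assumption that the paper uses exactly this form, and the function names are hypothetical, not taken from the paper.

```python
from typing import Sequence


def domain_classifier_error(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of examples whose domain label (0 = source, 1 = target) was mispredicted."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


def proxy_a_distance(domain_error: float) -> float:
    """Proxy A-distance, 2 * (1 - 2 * error), from a domain classifier's error.

    An error of 0.5 (domains indistinguishable) gives 0;
    an error of 0 (domains perfectly separable) gives 2.
    """
    if not 0.0 <= domain_error <= 1.0:
        raise ValueError("domain_error must lie in [0, 1]")
    return 2.0 * (1.0 - 2.0 * domain_error)


if __name__ == "__main__":
    # Toy domain labels, made up for illustration only.
    err = domain_classifier_error([0, 1, 1, 0], [0, 1, 0, 0])
    print(err, proxy_a_distance(err))
```

Under this definition, a lower value means the learned representations of the two domains are harder to tell apart, which is the sense in which the metric is used to assess generalization.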