Arabic Offensive and Hate Speech Detection Using a Cross-Corpora Multi-Task Learning Model

Authors: Wassen Aldjanabi, Abdelghani Dahou, Mohammed A. A. Al-qaness, Mohamed Abd Elaziz, Ahmed Mohamed Helmi, Robertas Damaševičius
Language: English
Publication year: 2021
Subject:
Source: Informatics, Vol 8, Iss 4, p 69 (2021)
Document type: article
ISSN: 2227-9709
DOI: 10.3390/informatics8040069
Description: As social media platforms offer a medium for expressing opinions, social phenomena such as hatred, offensive language, racism, and all forms of verbal violence have increased sharply. These behaviors are not confined to specific countries, groups, or communities; they extend into people’s everyday lives. This study investigates offensive and hate speech on Arab social media in order to build an accurate offensive and hate speech detection system. More precisely, we develop a classification system for detecting offensive and hate speech using a multi-task learning (MTL) model built on top of a pre-trained Arabic language model. We train the MTL model on the same task across several corpora, which vary in their offensive and hate contexts, so that it learns both global and dataset-specific contextual representations. The developed MTL model outperformed existing models in the literature on three out of four datasets for Arabic offensive and hate speech detection tasks.
Database: Directory of Open Access Journals
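
Illustrative sketch: the description mentions a multi-task model that learns global and dataset-specific representations across corpora. The following is a minimal, hypothetical illustration of that general idea (one shared layer plus one classification head per corpus), not the authors' implementation. The toy `featurize` function stands in for the pre-trained Arabic language model, and the class name, corpus names, example texts, and hyperparameters are all assumptions made for illustration.

```python
import math
import random


def featurize(text, dim=16):
    """Toy feature hashing (assumption); a real system would use a pre-trained encoder."""
    vec = [0.0] * dim
    for token in text.split():
        vec[sum(map(ord, token)) % dim] += 1.0
    return vec


class MultiTaskClassifier:
    """Shared hidden layer (global) plus one binary head per corpus (dataset-specific)."""

    def __init__(self, tasks, in_dim=16, hidden=8, seed=0):
        rng = random.Random(seed)
        self.shared = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                       for _ in range(hidden)]
        self.heads = {t: [rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                      for t in tasks}
        self.bias = {t: 0.0 for t in tasks}

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in self.shared]

    def predict_proba(self, task, text):
        h = self._hidden(featurize(text, len(self.shared[0])))
        z = sum(v * hi for v, hi in zip(self.heads[task], h)) + self.bias[task]
        return 1.0 / (1.0 + math.exp(-z))

    def train_step(self, task, text, label, lr=0.1):
        """One SGD step on binary cross-entropy for a single example."""
        x = featurize(text, len(self.shared[0]))
        h = self._hidden(x)
        head = self.heads[task]
        z = sum(v * hi for v, hi in zip(head, h)) + self.bias[task]
        dz = 1.0 / (1.0 + math.exp(-z)) - label  # gradient of the loss w.r.t. z
        for i, hi in enumerate(h):
            # Backpropagate through tanh using the head weight before its update.
            dpre = dz * head[i] * (1.0 - hi * hi)
            head[i] -= lr * dz * hi               # updates this corpus's head only
            for j, xj in enumerate(x):
                self.shared[i][j] -= lr * dpre * xj  # updates the shared layer
        self.bias[task] -= lr * dz


# Placeholder corpora (hypothetical data, not taken from the article).
corpora = {
    "corpus_a": [("placeholder offensive text", 1), ("placeholder neutral text", 0)],
    "corpus_b": [("placeholder hateful post", 1), ("placeholder friendly post", 0)],
}
model = MultiTaskClassifier(tasks=list(corpora))
for _ in range(50):
    # Alternate between corpora so the shared layer sees every dataset.
    for task, examples in corpora.items():
        for text, label in examples:
            model.train_step(task, text, label)
```

In this sketch, every training step updates the shared layer, while only the head belonging to the example's corpus changes. That is one common way to separate global from dataset-specific parameters; the article itself should be consulted for the actual architecture.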