Detecting Arabic sexual harassment using bidirectional long-short-term memory and a temporal convolutional network

Authors: Noor Amer Hamzah, Ban N. Dhannoon
Language: English
Year of publication: 2023
Subject:
Source: Egyptian Informatics Journal, Vol 24, Iss 2, Pp 365-373 (2023)
Document type: article
ISSN: 1110-8665
DOI: 10.1016/j.eij.2023.05.007
Description: Advances in technology have made social media the most popular medium for spreading news, and many messages are published on sites such as Facebook, Twitter, and Instagram. These platforms also provide opportunities to express opinions, and social phenomena such as hate speech, offensive language, racism, sexual content, and all forms of verbal violence have increased sharply on them. These behaviors do not affect only specific countries, groups, or societies; they extend into people's daily lives. This study examines sexual content and harassment discourse in Arabic social media in order to build an accurate system for detecting expressions of sexual harassment. The dataset used for classification was collected from Twitter posts. A deep learning classifier was developed to identify sexual speech using Bidirectional Long Short-Term Memory (BiLSTM) and a Temporal Convolutional Network (TCN), with word embeddings and a FastText model pretrained on Arabic. The proposed TCN-BiLSTM model was compared with Extreme Gradient Boosting (XGBoost). On the CASH dataset, the TCN-BiLSTM model obtained an accuracy of 96.65% and an F0.5 score of 0.969, while XGBoost with word embeddings obtained an accuracy of 92.56% and an F0.5 score of 0.925. The findings and manual interpretation showed that combining different text representation methods with various deep learning algorithms achieves higher classification performance on complex sentences. This strategy is helpful for languages that are morphologically difficult to analyze, such as Arabic, Turkish, and Lithuanian.
Database: Directory of Open Access Journals
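
Note on the F0.5 metric: the abstract reports F0.5 scores alongside accuracy. As a minimal sketch, assuming the standard F-beta definition (the record does not say how the authors computed it), the score is a weighted harmonic mean of precision and recall in which beta = 0.5 weights precision more heavily than recall. The function name `f_beta` and the precision and recall values below are hypothetical illustrations, not figures from the article.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    beta < 1 favors precision, beta > 1 favors recall, beta = 1 gives F1.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; define the score as 0.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# Hypothetical values, chosen only to show the effect of beta.
example_precision = 0.9
example_recall = 0.6
f_half = f_beta(example_precision, example_recall, beta=0.5)
f_one = f_beta(example_precision, example_recall, beta=1.0)
print(f"F0.5 = {f_half:.3f}, F1 = {f_one:.3f}")
```

With precision higher than recall, F0.5 comes out above F1, which is why the metric suits settings where false positives (wrongly flagging a post as harassment) are considered more costly than misses.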