SARA Detection on Social Media Using Deep Learning Algorithm Development

Authors: M. Khairul Anam, Lucky Lhaura Van FC, Hamdani Hamdani, Rahmaddeni Rahmaddeni, Junadhi Junadhi, Muhammad Bambang Firdaus, Irwanda Syahputra, Yuda Irawan
Language: English
Year of publication: 2024
Subject:
Source: Journal of Applied Engineering and Technological Science, Vol 6, Iss 1 (2024)
Document type: article
ISSN: 2715-6087
2715-6079
DOI: 10.37385/jaets.v6i1.5390
Description: Social media has become a key platform for disseminating information and opinions, particularly in Indonesia, where SARA (Ethnicity, Religion, Race, and Intergroup) issues can fuel social tensions. To address this, developing an automated system to detect and classify harmful content is essential. This study develops a deep learning model using Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) to detect SARA-related comments on Twitter. The method involves data collection through web scraping, followed by cleaning, manual labeling, and text preprocessing. To address data imbalance, SMOTE (Synthetic Minority Over-sampling Technique) is applied, while early stopping prevents overfitting. Model performance is evaluated using precision, recall, and F1-score. The results demonstrate that SMOTE significantly improves model performance, particularly in detecting minority-class SARA comments. CNN+SMOTE achieves an accuracy of 93%, and BiLSTM+SMOTE records a recall of 88%, effectively capturing patterns in SARA and non-SARA data. With SMOTE and early stopping, the model successfully manages class imbalance and reduces overfitting. This research supports efforts to curtail hate speech on social media, especially in the Indonesian context, where SARA-related issues often dominate public discourse.
Database: Directory of Open Access Journals
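
Note: The description names SMOTE and the precision, recall, and F1-score metrics without showing how they work. The following is a minimal sketch of both ideas in plain Python, not the authors' implementation. It assumes comments have already been converted to numeric feature vectors (the record does not say which representation the study used), and the function names `smote_oversample` and `precision_recall_f1` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors (assumed input).
    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy 2-D vectors standing in for the minority (SARA) class.
    sara_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(smote_oversample(sara_vectors, n_new=4, k=2))
    print(precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In practice, oversampling of this kind is normally applied only to the training split, so that synthetic points do not leak into the data used to compute the evaluation metrics.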