Sentiment Analysis on a Large Indonesian Product Review Dataset

Authors: Ade Romadhony, Said Al Faraby, Rita Rismala, Untari Novia Wisesty, Anditya Arifianto
Language: English
Year of publication: 2024
Subject:
Source: Journal of Information Systems Engineering and Business Intelligence, Vol 10, Iss 1, Pp 167-178 (2024)
Document type: article
ISSN: 2598-6333, 2443-2555
DOI: 10.20473/jisebi.10.1.167-178
Description: Background: Publicly available large datasets play an important role in the development of natural language processing and computational linguistics research. However, only a few large Indonesian-language datasets are currently accessible for research purposes, and this includes datasets for sentiment analysis, which is considered the most popular task. Objective: This work presents sentiment analysis on a large Indonesian product review dataset, employing various features and methods. Two tasks were implemented: classifying reviews into three classes (positive, negative, neutral) and predicting ratings. Methods: Sentiment analysis was conducted on the FDReview dataset, which comprises over 700,000 reviews. Sentiment was treated as a classification problem, using the following methods: Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM), LSTM, and BiLSTM. Results: Among the conventional methods, MNB outperformed SVM in rating prediction, whereas SVM performed better in review classification. BiLSTM outperformed all other methods in both tasks. The study also includes experiments on small balanced and unbalanced sample datasets. Conclusion: Analysis of the experimental results revealed that deep learning-based methods performed better only in the large dataset setting. On the small balanced dataset, conventional machine learning methods were competitive with deep learning approaches. Keywords: Indonesian review dataset, Large dataset, Rating prediction, Sentiment analysis
Database: Directory of Open Access Journals
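
Illustration: the three-class review classification task named in the abstract can be sketched with a small multinomial Naïve Bayes classifier in plain Python. This is not the authors' implementation. The class name, the whitespace tokenizer, the add-one smoothing, and the six toy reviews are all assumptions made for this example, and none of the data comes from FDReview.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Minimal multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # add-alpha (Laplace) smoothing strength, an assumption

    def fit(self, texts, labels):
        # Count documents per class and token occurrences per class.
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.token_counts[label].update(text.lower().split())
        self.vocab = {tok for counts in self.token_counts.values() for tok in counts}
        return self

    def log_scores(self, text):
        # log P(class) + sum of log P(token | class) for tokens seen in training.
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in text.lower().split():
                if tok in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((counts[tok] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Made-up toy reviews for illustration only (not taken from FDReview).
train_texts = [
    "produk bagus sekali",
    "sangat suka produk ini",
    "produk jelek sekali",
    "tidak suka kecewa",
    "biasa saja",
    "lumayan biasa",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = MultinomialNB().fit(train_texts, train_labels)

for review in ["bagus suka", "jelek kecewa", "biasa saja"]:
    print(review, "->", model.predict(review))
```

The same structure would apply to rating prediction by swapping the three sentiment labels for rating values. A practical setup on a dataset of this size would more likely rely on an established library and a proper tokenizer than on this hand-written version.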