Punctuation and lexicon aid representation: A hybrid model for short text sentiment analysis on social media platform

Authors: Zhenyu Li, Zongfeng Zou
Language: English
Publication year: 2024
Subject:
Source: Journal of King Saud University: Computer and Information Sciences, Vol 36, Iss 3, Pp 102010- (2024)
Document type: article
ISSN: 1319-1578
DOI: 10.1016/j.jksuci.2024.102010
Description: Sentiment analysis measures user experience on social media. With the emergence of pre-trained models, text classification tasks have become homogeneous, without significant improvement in accuracy. We therefore present a hybrid model called PLASA to classify the sentiment polarity of short comments, particularly barrages (bullet-screen comments). PLASA introduces a collaborative attention module that integrates relative position information and knowledge from summarized lexicons to better adjust the relationships between word representations. The model is evaluated on three newly curated sentiment analysis datasets: SentiTikTok-2023 (4613 items), SentiBilibili-2023 (7755 items), and SentiWeibo-2023 (5614 items). Although comment length varies across the datasets, the share of punctuation is consistent at approximately 13% in all of them. With the optimal combination, PLASA shows notable performance improvements over both the baseline and commonly used models, achieving micro-F1 scores of 93.94%, 90.34%, and 88.79% on the respective datasets. We also observed that the representation capacity of the pre-trained model decreases as text length increases. The proposed collaborative attention module effectively addresses this limitation, as confirmed by our ablation study.
Database: Directory of Open Access Journals
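
Note on the metric: the description reports micro-F1 scores. As a minimal sketch of how micro-averaged F1 is generally computed (not the authors' evaluation code), the Python below pools true positives, false positives, and false negatives over all classes before computing F1. The function name `micro_f1` and the labels `pos`, `neg`, and `neu` are hypothetical; the record does not state the label set used in the datasets.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = 0
    # Treat each class in turn as the "positive" class and accumulate counts.
    for label in set(y_true) | set(y_pred):
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1
            elif truth == label:
                fn += 1

    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical labels, for illustration only.
y_true = ["pos", "neg", "neu", "pos", "neg"]
y_pred = ["pos", "neg", "pos", "pos", "neu"]
print(f"micro-F1 = {micro_f1(y_true, y_pred):.2f}")
```

When every item has exactly one true label and one predicted label, each error counts once as a false positive and once as a false negative, so micro-F1 equals plain accuracy.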