Single-Stage Extensive Semantic Fusion for multi-modal sarcasm detection

Authors: Hong Fang, Dahao Liang, Weiyu Xiang
Language: English
Publication year: 2024
Subject:
Source: Array, Vol. 22, Article 100344 (2024)
Document type: article
ISSN: 2590-0056
DOI: 10.1016/j.array.2024.100344
Description: With the rise of social media and online interactions, there is a growing need for analytical models capable of understanding the nuanced, multi-modal communication inherent in these platforms, especially for detecting sarcasm. Existing research employs multi-stage models that combine extensive semantic information extraction with single-modal encoders. These models often struggle to align and fuse multi-modal representations efficiently. To address these shortcomings, we introduce the Single-Stage Extensive Semantic Fusion (SSESF) model, which processes multi-modal inputs concurrently in a unified framework and performs encoding and fusion in the same architecture with shared parameters. A projection mechanism is employed to handle the diversity of inputs and to integrate a wide range of semantic information. Additionally, we design a multi-objective optimization that uses supervised contrastive learning to enhance the model’s ability to learn latent semantic nuances. The unified framework emphasizes the interaction and integration of multi-modal data, while the multi-objective optimization preserves the complexity of semantic nuances for sarcasm detection. Experimental results on a public multi-modal sarcasm dataset show that our model achieves state-of-the-art performance. The findings highlight the model’s capability to integrate extensive semantic information and demonstrate its effectiveness in the simultaneous interpretation and fusion of multi-modal data for sarcasm detection.
Database: Directory of Open Access Journals
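
The abstract mentions a multi-objective optimization that uses supervised contrastive learning but does not give the loss terms. The following is a minimal sketch of one common way to combine a classification loss with a supervised contrastive loss, not the SSESF authors' implementation. The function names, the `weight` and `temperature` parameters, and the choice of cross-entropy as the second objective are all assumptions made for illustration.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy computed from raw logits (stable log-softmax)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of fused features.

    Samples that share a label are treated as positives for each other.
    `temperature` is an assumed hyperparameter, not taken from the paper.
    """
    # L2-normalise so that dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self

    # Log-softmax of each anchor's similarities over all other samples.
    sim = sim - sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))

    # Average log-probability of the positives, for anchors that have any.
    pos_counts = positives.sum(axis=1)
    has_pos = pos_counts > 0
    if not has_pos.any():
        return 0.0
    pos_log_prob = (log_prob * positives).sum(axis=1)[has_pos]
    return -(pos_log_prob / pos_counts[has_pos]).mean()

def combined_loss(logits, features, labels, weight=0.5, temperature=0.1):
    """Hypothetical two-term objective: classification plus contrastive."""
    return (cross_entropy(logits, labels)
            + weight * supervised_contrastive_loss(features, labels, temperature))
```

In this sketch, `features` would be the fused multi-modal representation of each sample and `logits` the output of the sarcasm classifier; the contrastive term pulls same-label representations together while the cross-entropy term drives the prediction itself.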