A smart video analytical framework for sarcasm detection using novel adaptive fusion network and SarcasNet-99 model.

Authors: Murthy, Jamuna S.; Siddesh, G. M.
Subject:
Source: Visual Computer; Nov 2024, Vol. 40, Issue 11, pp. 8085-8097, 13 pp.
Abstract: Sarcasm is often related to something that has created mass confusion among the general uninformed public. It is always associated with a mocking tone, a trenchant facial expression, or weird language. The existing literature on sarcasm detection has mainly focused on text-based input with sarcastic comments or on facial expression-based analysis, i.e., image input. But text and image input alone are not sufficient to analyze the underlying sarcasm behind a scene. This kind of analysis can also be misleading, as the emotional expression can change with social circumstances (i.e., audio tone) over time. To address these challenges, "A Smart Video Analytical framework for Sarcasm Detection using Deep Learning" is introduced, in which sarcasm detection is performed on the video modality. The proposed model extracts three important features from the video: text using the proposed Enhanced-BERT, image using ImageNet, and audio using Librosa. After extraction, each modality is processed individually, and the modalities are finally fused using the proposed adaptive early fusion approach. The final classification is performed by a novel deep neural network called "SarcasNet-99", which detects sarcasm in video over the distributed framework Apache Storm. The TedX and GIF Reply datasets, with around 10,000+ video clips, are used for model training and testing. When compared against existing state-of-the-art techniques such as AlexNet, DenseNet, SqueezeNet and ResNet, the proposed model achieved an accuracy of 99.005% with the LeakyReLU activation function. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
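
Note: The abstract describes fusing text, image and audio features with an adaptive early fusion step before classification with a LeakyReLU-activated network, but it gives no implementation details. The following is only a minimal sketch of one common way such a step can be arranged, not the authors' method. It assumes softmax-weighted scaling of each modality's feature vector followed by concatenation; the function names, gate scores, feature sizes and random placeholder features are all hypothetical.

```python
import math
import random


def leaky_relu(x, slope=0.01):
    # LeakyReLU: identity for positive inputs, small linear slope otherwise.
    return x if x > 0 else slope * x


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_early_fusion(features, gate_scores):
    # ASSUMPTION: "adaptive" is modeled here as one softmax weight per
    # modality; the weighted vectors are concatenated (early fusion).
    weights = softmax(gate_scores)
    fused = []
    for w, vec in zip(weights, features):
        fused.extend(w * v for v in vec)
    return fused, weights


def dense_leaky(x, W, b):
    # One fully connected layer followed by LeakyReLU.
    return [
        leaky_relu(sum(wi * xi for wi, xi in zip(row, x)) + bi)
        for row, bi in zip(W, b)
    ]


# Placeholder features standing in for text, image and audio embeddings
# (hypothetical sizes; real extractors would produce much larger vectors).
rng = random.Random(0)
text_feat = [rng.uniform(-1, 1) for _ in range(4)]
image_feat = [rng.uniform(-1, 1) for _ in range(3)]
audio_feat = [rng.uniform(-1, 1) for _ in range(2)]

# Hypothetical gate scores; in a trained model these would be learned.
fused, weights = adaptive_early_fusion(
    [text_feat, image_feat, audio_feat], [0.5, 0.2, -0.1]
)

# Untrained random layer, shown only to illustrate the data flow.
W = [[rng.uniform(-1, 1) for _ in fused] for _ in range(2)]
b = [0.0, 0.0]
hidden = dense_leaky(fused, W, b)
```

In this sketch the fused vector has one entry per input feature (4 + 3 + 2 = 9), and the modality weights sum to 1, so a modality with a higher gate score contributes proportionally more to the classifier input.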