Description: |
This article introduces a versatile multimodal architecture designed for personality-aware systems, encompassing tasks such as personality trait prediction, sentiment analysis, and emotion recognition. It is a distinctive attempt to develop a general pipeline applicable to personality-related affective computing applications on multimodal data. The proposed model employs task-specific feature extraction models that are trained separately for each application. An intermediate layer that fuses modalities through both inter- and intra-attention mechanisms is presented. This dual-attention mechanism is further refined with a binary search algorithm, which constitutes the key contribution of the work. The fusion model discerns the distinctive features crucial for classification and regression tasks. To evaluate the system's efficacy, short-duration video clips and their corresponding transcriptions from existing databases were used. Low-level acoustic features were derived from the audio signals, while mid- and high-level features were extracted from the audio transcripts with a transformer-based Sentence-RoBERTa model. Visual features were obtained from context and facial images through deep face networks, followed by CNN and LSTM models. Dimensionality reduction and multimodal fusion techniques were applied before the machine learning-based classification and prediction stages. Mean accuracy and the squared correlation coefficient ($R^{2}$) were chosen as performance metrics for the prediction tasks, while accuracy and F1-score were used for the classification tasks. The study explored various fusion techniques and dimensionality-reduction approaches to establish an efficient pipeline, ultimately aiming to reduce uncertainty and enhance robustness. The results indicate that the proposed architecture performs comparably to state-of-the-art systems across all evaluated domains.
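
The dual-attention fusion and the binary search mentioned above can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's method: the abstract does not specify what the binary search optimizes, so this sketch assumes it tunes a scalar mixing weight `alpha` between intra- and inter-attention features; the `attention`, `fuse`, and `binary_search_alpha` helpers and the toy validation loss are all hypothetical.

```python
# Hypothetical sketch of inter-/intra-attention fusion with a binary
# search over a scalar mixing weight; not the paper's exact formulation.
import numpy as np

rng = np.random.default_rng(0)

def attention(q, k, v):
    """Scaled dot-product attention, mean-pooled over time steps."""
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return (w @ v).mean(axis=0)          # (d,) pooled representation

def fuse(audio, text, alpha):
    """Blend intra-attention (within each modality) with
    inter-attention (across modalities) into one fused vector."""
    intra = attention(audio, audio, audio) + attention(text, text, text)
    inter = attention(audio, text, text) + attention(text, audio, audio)
    return alpha * intra + (1.0 - alpha) * inter

def binary_search_alpha(loss_fn, lo=0.0, hi=1.0, iters=30, eps=1e-4):
    """Bisect on the sign of a finite-difference slope to find the
    mixing weight; assumes loss_fn is unimodal on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if loss_fn(mid + eps) > loss_fn(mid - eps):
            hi = mid      # positive slope: the minimum lies to the left
        else:
            lo = mid
    return (lo + hi) / 2.0

# Toy usage with random per-frame audio features and per-token text
# embeddings; `target` stands in for a validation signal.
audio = rng.standard_normal((40, 64))    # 40 audio frames, 64-dim features
text = rng.standard_normal((12, 64))     # 12 tokens, 64-dim embeddings
target = rng.standard_normal(64)

loss = lambda a: np.linalg.norm(fuse(audio, text, a) - target)
best_alpha = binary_search_alpha(loss)
print(f"alpha={best_alpha:.3f}, fused dim={fuse(audio, text, best_alpha).shape[0]}")
```

Note that bisecting on the slope sign converges only if the validation loss is unimodal in the weight; if that assumption fails, a grid or ternary search would be a safer choice.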