Automatic Emotion Recognition Using Temporal Multimodal Deep Learning

Authors: Frederic Maire, Bahareh Nakisa, Vinod Chandran, Mohammad Naim Rastgoo, Andry Rakotonirainy
Year of publication: 2020
Subject:
General Computer Science
Computer science
Speech recognition
Emotion recognition
Emotion classification
Feature extraction
Convolutional neural network
Long short-term memory
Deep learning
Temporal multimodal fusion
Wearable computer
Electroencephalography (EEG)
Blood volume pulse (BVP)
Modality (human–computer interaction)
Artificial intelligence
General Engineering
General Materials Science
Electrical and Electronic Engineering
Source: IEEE Access, Vol. 8, pp. 225463-225474 (2020)
ISSN: 2169-3536
Description: Emotion recognition using miniaturised wearable physiological sensors has emerged as a revolutionary technology in various applications. However, detecting emotions by fusing multiple physiological signals remains a complex and challenging task. When fusing physiological signals, it is essential to consider the ability of different fusion approaches to capture the emotional information contained within and across modalities. Moreover, since physiological signals are time-series data, their temporal structure must be taken into account in the fusion process. In this study, we propose a temporal multimodal fusion approach with a deep learning model to capture the non-linear emotional correlation within and across electroencephalography (EEG) and blood volume pulse (BVP) signals and to improve the performance of emotion classification. The performance of the proposed model is evaluated using two fusion approaches: early fusion and late fusion. Specifically, we use a convolutional neural network (ConvNet) long short-term memory (LSTM) model to fuse the EEG and BVP signals, jointly learning and exploring the highly correlated representation of emotions across modalities after each modality has first been learned by its own deep network. The performance of the temporal multimodal deep learning model is validated on our dataset collected from smart wearable sensors and is also compared with the results of recent studies. The experimental results show that the temporal multimodal deep learning models based on the early and late fusion approaches successfully classify human emotions into one of the four quadrants of the dimensional emotion model, with accuracies of 71.61% and 70.17%, respectively.
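To make the ConvNet-LSTM fusion idea concrete, the following is a minimal PyTorch sketch of a late-fusion variant: each modality is first encoded by its own 1-D ConvNet per time window, an LSTM then models the window sequence of each modality, and the final hidden states are concatenated for 4-quadrant classification. This is not the authors' published implementation; the channel counts (14 EEG channels, 1 BVP channel), window length, layer sizes, and the choice to fuse at the last LSTM hidden state are all illustrative assumptions.

# Minimal sketch of a ConvNet-LSTM late-fusion emotion classifier.
# All dimensions below are illustrative assumptions, not the paper's
# published configuration.
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """1-D ConvNet that maps one windowed physiological signal
    (batch, channels, samples) to a fixed-size feature vector."""
    def __init__(self, in_channels: int, feat_dim: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, feat_dim, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # -> (batch, feat_dim, 1)
        )

    def forward(self, x):
        return self.conv(x).squeeze(-1)  # (batch, feat_dim)

class LateFusionConvLSTM(nn.Module):
    """Encode EEG and BVP windows separately, run one LSTM per
    modality over the window sequence, then concatenate the last
    hidden states for 4-quadrant emotion classification."""
    def __init__(self, eeg_ch=14, bvp_ch=1, feat_dim=64, hidden=128, classes=4):
        super().__init__()
        self.eeg_enc = ModalityEncoder(eeg_ch, feat_dim)
        self.bvp_enc = ModalityEncoder(bvp_ch, feat_dim)
        self.eeg_lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.bvp_lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, classes)

    def forward(self, eeg, bvp):
        # eeg: (batch, T, eeg_ch, samples); bvp: (batch, T, bvp_ch, samples)
        B, T = eeg.shape[:2]
        eeg_f = self.eeg_enc(eeg.flatten(0, 1)).reshape(B, T, -1)
        bvp_f = self.bvp_enc(bvp.flatten(0, 1)).reshape(B, T, -1)
        _, (eeg_h, _) = self.eeg_lstm(eeg_f)
        _, (bvp_h, _) = self.bvp_lstm(bvp_f)
        fused = torch.cat([eeg_h[-1], bvp_h[-1]], dim=-1)  # late fusion
        return self.classifier(fused)

# Smoke test with random data: 8 trials, 10 windows, 128 samples/window.
model = LateFusionConvLSTM()
eeg = torch.randn(8, 10, 14, 128)
bvp = torch.randn(8, 10, 1, 128)
print(model(eeg, bvp).shape)  # torch.Size([8, 4])

An early-fusion variant of this sketch would instead concatenate the per-window EEG and BVP feature vectors and pass the combined sequence through a single shared LSTM, so the cross-modal correlation is learned earlier in the pipeline.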
Database: OpenAIRE