Generating fMRI-Enriched Acoustic Vectors using a Cross-Modality Adversarial Network for Emotion Recognition
Authors: | Jeng-Lin Li, Chi-Chun Lee, Gao-Yi Chao, Chun-Min Chang, Ya-Tse Wu |
Year of publication: | 2018 |
Subject: | Adversarial network; Cross modality; Speech recognition; Emotion recognition; Functional magnetic resonance imaging; Diagnostic test (medicine); Classifier (UML); Computer science; Engineering and technology; Medical and health sciences; Clinical medicine; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Neurology & neurosurgery |
Source: | ICMI |
DOI: | 10.1145/3242969.3242992 |
Description: | Automatic emotion recognition has long been developed by concentrating on modeling human expressive behavior. At the same time, neuroscientific evidence has shown that neural responses (i.e., blood oxygen level-dependent (BOLD) signals measured with functional magnetic resonance imaging (fMRI)) also vary as a function of the type of emotion perceived. While past research has indicated that fusing acoustic features and fMRI improves overall speech emotion recognition performance, obtaining fMRI data is not feasible in real-world applications. In this work, we propose a cross-modality adversarial network that jointly models the bi-directional generative relationship between acoustic features of speech samples and fMRI signals of human perceptual responses by leveraging a parallel dataset. We encode the acoustic descriptors of a speech sample using the learned cross-modality adversarial network to generate fMRI-enriched acoustic vectors for the emotion classifier. The generated fMRI-enriched acoustic vectors are evaluated not only on the parallel dataset but also on an additional dataset without fMRI scanning. Our proposed framework significantly outperforms using acoustic features alone in a four-class emotion recognition task on both datasets, and the use of a cyclic loss in learning the bi-directional mapping is also shown to be crucial in achieving the improved recognition rates. |
Database: | OpenAIRE |
External link: |
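To make the described setup concrete, below is a minimal sketch (not the authors' implementation) of a cycle-consistent cross-modality adversarial mapping between acoustic features and fMRI responses, trained on a parallel dataset and then used to form an "fMRI-enriched" acoustic vector. All feature dimensions, layer sizes, loss weights, and the concatenation used to build the enriched vector are assumptions for illustration only.

```python
# Sketch of a cross-modality adversarial network with cyclic loss.
# Dimensions, architectures, and hyperparameters are assumed, not from the paper.
import torch
import torch.nn as nn

ACOUSTIC_DIM, FMRI_DIM, HIDDEN = 88, 200, 128  # assumed feature sizes


def mlp(d_in, d_out):
    return nn.Sequential(nn.Linear(d_in, HIDDEN), nn.ReLU(), nn.Linear(HIDDEN, d_out))


G_a2f = mlp(ACOUSTIC_DIM, FMRI_DIM)   # acoustic -> fMRI generator
G_f2a = mlp(FMRI_DIM, ACOUSTIC_DIM)   # fMRI -> acoustic generator
D_f = nn.Sequential(mlp(FMRI_DIM, 1), nn.Sigmoid())      # fMRI discriminator
D_a = nn.Sequential(mlp(ACOUSTIC_DIM, 1), nn.Sigmoid())  # acoustic discriminator

bce, l1 = nn.BCELoss(), nn.L1Loss()
opt_g = torch.optim.Adam(list(G_a2f.parameters()) + list(G_f2a.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(list(D_f.parameters()) + list(D_a.parameters()), lr=1e-4)


def train_step(acoustic, fmri, lambda_cyc=10.0):
    """One adversarial + cycle-consistency update on a parallel (acoustic, fMRI) batch."""
    fake_fmri, fake_ac = G_a2f(acoustic), G_f2a(fmri)

    # Discriminators: distinguish real from generated samples in each modality.
    d_loss = (bce(D_f(fmri), torch.ones_like(D_f(fmri))) +
              bce(D_f(fake_fmri.detach()), torch.zeros_like(D_f(fmri))) +
              bce(D_a(acoustic), torch.ones_like(D_a(acoustic))) +
              bce(D_a(fake_ac.detach()), torch.zeros_like(D_a(acoustic))))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generators: fool the discriminators and reconstruct the input (cyclic loss).
    adv = (bce(D_f(fake_fmri), torch.ones_like(D_f(fake_fmri))) +
           bce(D_a(fake_ac), torch.ones_like(D_a(fake_ac))))
    cyc = l1(G_f2a(fake_fmri), acoustic) + l1(G_a2f(fake_ac), fmri)
    g_loss = adv + lambda_cyc * cyc
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()


def fmri_enriched(acoustic):
    """Concatenate acoustic features with their generated fMRI representation."""
    with torch.no_grad():
        return torch.cat([acoustic, G_a2f(acoustic)], dim=-1)
```

At inference time only the acoustic-to-fMRI generator is needed, which is what would allow the enriched vectors to be computed for a second dataset without any fMRI scanning, in line with the evaluation setup the abstract describes.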