Author: Liu Yunxiang, Zhang Kexin
Language: English
Year of publication: 2023
Subject:
Source: IEEE Access, Vol 11, pp. 5528-5537 (2023)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3237268
Description:
Speech emotion recognition (SER) technology comprises feature extraction and classifier construction. However, recognition performance is degraded by noise interference and gender differences. To address this problem, this paper used two multi-task learning models based on adversarial multi-task learning (ASP-MTL). The first model took emotion recognition as the main task and noise recognition as the auxiliary task, and removed the noise segments identified by the auxiliary task. After the non-noise segments were identified, the second model was constructed, taking emotion recognition as the main task and gender classification as the auxiliary task. These two multi-task learning models can not only use shared information to learn the relationships between different tasks, but also identify task-specific information. This paper used the Audio/Visual Emotion Challenge (AVEC) database and the AFEW6.0 database, both of which were recorded in field environments. Considering the problem of data imbalance between the datasets, a data balancing operation was carried out during data preprocessing. The paper shows an increase of around 10% in accuracy and F1 score compared with recent works on the AVEC and AFEW6.0 databases, which indicates substantial progress in SER.
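To illustrate the shared-private adversarial multi-task idea summarized above, the sketch below shows one common way such a model can be wired up. It is not the authors' implementation: the PyTorch framework, layer sizes, task names (emotion as the main task, noise or gender as the auxiliary task), and the gradient-reversal discriminator are all illustrative assumptions. It only demonstrates how a shared encoder can be trained adversarially against a task discriminator while private encoders and task heads capture task-specific information.

```python
# Minimal sketch (assumed PyTorch setup, not the paper's code) of a
# shared-private adversarial multi-task model: a shared encoder feeds a task
# discriminator through a gradient-reversal layer, while each task keeps a
# private encoder and its own classification head.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class SharedPrivateMTL(nn.Module):
    def __init__(self, feat_dim=128, hidden=64, n_emotions=4, n_aux=2):
        super().__init__()
        # Shared encoder: intended to learn task-invariant information.
        self.shared = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
        # Private encoders: one per task, learn task-specific information.
        self.private_main = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
        self.private_aux = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
        # Task heads operate on concatenated [shared ; private] features.
        self.head_main = nn.Linear(2 * hidden, n_emotions)  # emotion (main task)
        self.head_aux = nn.Linear(2 * hidden, n_aux)         # noise or gender (auxiliary)
        # Discriminator tries to guess the task from the shared features;
        # gradient reversal pushes the shared encoder to fool it.
        self.task_disc = nn.Linear(hidden, 2)

    def forward(self, x, lambd=1.0):
        s = self.shared(x)
        main_logits = self.head_main(torch.cat([s, self.private_main(x)], dim=-1))
        aux_logits = self.head_aux(torch.cat([s, self.private_aux(x)], dim=-1))
        disc_logits = self.task_disc(GradReverse.apply(s, lambd))
        return main_logits, aux_logits, disc_logits

# Usage: combine the task losses with the adversarial loss on the shared features.
model = SharedPrivateMTL()
x = torch.randn(8, 128)                    # batch of acoustic feature vectors (illustrative)
emo_y = torch.randint(0, 4, (8,))          # emotion labels (main task)
aux_y = torch.randint(0, 2, (8,))          # noise/gender labels (auxiliary task)
task_y = torch.zeros(8, dtype=torch.long)  # task identity seen by the discriminator
main_logits, aux_logits, disc_logits = model(x)
ce = nn.CrossEntropyLoss()
loss = ce(main_logits, emo_y) + ce(aux_logits, aux_y) + ce(disc_logits, task_y)
loss.backward()
```

The design choice worth noting is the gradient-reversal layer: the discriminator is trained to recognize which task a sample came from, while the reversed gradient trains the shared encoder to make that impossible, so shared features stay task-neutral and task-specific cues remain in the private encoders.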
Database: Directory of Open Access Journals
External link: