Research on Speech Emotion Recognition Method Based A-CapsNet

Authors: Yingmei Qi, Heming Huang, Huiyun Zhang
Language: English
Publication year: 2022
Subject:
Source: Applied Sciences, Vol 12, Iss 24, p 12983 (2022)
Document type: article
ISSN: 2076-3417
DOI: 10.3390/app122412983
Description: Speech emotion recognition is an important research direction in speech recognition. To improve recognition performance, researchers have worked on data augmentation, feature extraction, and pattern formation. To address limited speech data and overfitting during model training, this paper proposes A-CapsNet, a neural network model based on data augmentation. To mitigate data scarcity, noise from the Noisex-92 database is first combined with four data division methods: emotion-independent random division (EIRD), emotion-dependent random division (EDRD), emotion-independent cross-validation (EICV), and emotion-dependent cross-validation (EDCV). The EMODB database is then used to analyze and compare the performance of the proposed model under different signal-to-noise ratios. The results show that both the proposed model and the data augmentation are effective.
Database: Directory of Open Access Journals
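
The abstract mentions adding Noisex-92 noise to speech and evaluating under different signal-to-noise ratios, but it does not say how the mixing is done. The following is a minimal sketch of one common approach (scaling the noise so the mixture reaches a target SNR), not the authors' implementation. The function names `mix_at_snr` and `measured_snr_db` are hypothetical, and the synthetic sine wave and Gaussian noise merely stand in for EMODB utterances and Noisex-92 recordings.

```python
import math
import random


def mix_at_snr(speech, noise, target_snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Assumption: SNR is defined as the ratio of mean signal power to mean
    noise power over the whole utterance.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]  # trim noise to the utterance length

    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")

    # SNR_dB = 10 * log10(p_speech / (gain^2 * p_noise)); solve for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Recover the SNR of a mixture given the clean speech it was built from."""
    residual = [m - s for s, m in zip(speech, mixture)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_residual = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(p_speech / p_residual)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins: a 440 Hz tone sampled at 16 kHz and white noise.
    speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    for snr in (0, 5, 10):
        noisy = mix_at_snr(speech, noise, snr)
        print(snr, round(measured_snr_db(speech, noisy), 3))
```

Each clean utterance can be mixed at several SNR levels to produce additional training samples, which is one way such noise could serve as data augmentation.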