Speech emotion recognition using data augmentation

Author:	V. M. Praseetha, P. P. Joby
Year of publication:	2021
Subject:	Linguistics and Language; Computer science; Speech recognition; Deep learning; Feature vector; SIGNAL (programming language); Feature extraction; Filter bank; Language and Linguistics; Call centre; Human-Computer Interaction; Preprocessor; Robot; Computer Vision and Pattern Recognition; Artificial intelligence; Software
Source:	International Journal of Speech Technology. 25:783-792
ISSN:	1572-8110 1381-2416
DOI:	10.1007/s10772-021-09883-3
Description:	Humans are emotional beings, so uttered speech reflects human emotions. Human-computer interaction can be made more effective by automatically identifying emotions from speech. Automatic speech emotion recognition is applied in many areas, such as computer gaming, call centres, speech therapy, and robot control. Emotion recognition can be viewed as a mapping from feature space to label space. Features are computed from the uttered speech, and the relationship between those features and the emotions is then learned so that emotions can be recognized automatically. The collected training samples are preprocessed, features are extracted from the speech signals, and the extracted feature vectors are stored in a database. When an input signal arrives, it undergoes the same preprocessing and feature extraction, and the extracted features are compared with the feature vectors in the database to determine the emotion in that speech signal. We have developed a deep learning model for speech emotion recognition based on a GRU, which takes the filterbank energies of the speech signals as input. To overcome the limited availability of databases and to increase the number of input samples, we have applied data augmentation.
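The pipeline in the description (augment the samples, then compute filterbank energies as model input) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not say which augmentations were used, so additive noise and time shifting stand in as common choices, and the function names, the 16 kHz sample rate, the 25 ms / 10 ms framing, and the 26 mel filters are illustrative defaults. The GRU itself is omitted; the resulting array has the (samples, frames, filters) shape a recurrent model would typically consume.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Return the signal plus white Gaussian noise at the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def time_shift(signal, shift):
    """Circularly shift the signal by `shift` samples (length is preserved)."""
    return np.roll(signal, shift)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular mel filters with shape (n_filters, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising slope of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def log_filterbank_energies(signal, sample_rate=16000, frame_len=400,
                            hop=160, n_fft=512, n_filters=26):
    """Frame the signal, take the power spectrum, and apply log mel filters."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(energies + 1e-10)        # small offset avoids log(0)

# Hypothetical usage: a synthetic 1-second tone stands in for a speech sample.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
clean = np.sin(2 * np.pi * 220 * t)
augmented = [clean, add_noise(clean, 20, rng), time_shift(clean, 800)]
features = np.stack([log_filterbank_energies(x) for x in augmented])
```

Each original recording yields several training examples here, which is the sense in which augmentation increases the number of input samples; the noise level and shift amount would need tuning against real speech data.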
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::64b5f84000e74e56dbe91dd2ef7f262d https://doi.org/10.1007/s10772-021-09883-3