Speech Emotion Recognition Using ANN on MFCC Features

Authors:	Harshit Dolka, Sujitha Juliet, Arul Xavier V M
Publication year:	2021
Subjects:	Signal processing; Computer science; Speech recognition; Feature extraction; Engineering and technology; Disgust; Data set; Speech-language pathology & audiology; Medical and health sciences; Surprise; Electrical engineering, electronic engineering, information engineering; Signal processing algorithms; Artificial intelligence & image processing; Mel-frequency cepstrum; Emotion recognition; Other medical science
Source:	2021 3rd International Conference on Signal Processing and Communication (ICSPC).
DOI:	10.1109/icspc51351.2021.9451810
Description:	Speech Emotion Recognition (SER) is an active research topic in Human-Computer Interaction. This paper trains an artificial neural network (ANN) model for SER on Mel Frequency Cepstral Coefficient (MFCC) features and compares its performance across selected audio datasets. The model classifies audio files into up to eight emotional states: happy, sad, angry, fear, surprise, disgust, calm, and neutral; the number of emotions varies by dataset. The proposed model achieves an average accuracy of 99.52% on the TESS dataset, 88.72% on the RAVDESS dataset, 71.69% on the CREMA dataset, and 86.80% on the SAVEE dataset.
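
The pipeline described above (MFCC extraction followed by an ANN classifier) can be sketched as follows. This is a minimal, standard-library illustration and not the authors' implementation: the frame length, hop size, filter count, hidden-layer width, and the untrained random weights are all assumptions, and the eight class names follow the RAVDESS label set. A practical pipeline would more likely use a library such as librosa for the features and a deep learning framework for training.

```python
import math
import random

# Assumed label set (RAVDESS-style); the record does not list the labels per dataset.
EMOTIONS = ["neutral", "calm", "happy", "sad", "angry", "fear", "disgust", "surprise"]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-windowed power spectrum via a naive DFT (slow, but dependency-free)."""
    n = len(frame)
    w = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))) for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(w[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(w[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2."""
    max_mel = hz_to_mel(sr / 2)
    points = [mel_to_hz(i * max_mel / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * p / sr) for p in points]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def mfcc(signal, sr=16000, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Per-frame MFCCs: power spectrum -> mel energies -> log -> DCT-II."""
    bank = mel_filterbank(n_filters, frame_len, sr)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        log_e = [math.log(sum(f * s for f, s in zip(filt, spec)) + 1e-10) for filt in bank]
        frames.append([
            sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters) for j, e in enumerate(log_e))
            for c in range(n_coeffs)
        ])
    return frames

def mean_features(frames):
    """Average each coefficient over time to get one fixed-length vector per clip."""
    return [sum(col) / len(col) for col in zip(*frames)]

def random_layer(n_in, n_out, rng):
    """Untrained placeholder weights; a real model would learn these."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, weights, biases):
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def forward(x, hidden, output):
    """One ReLU hidden layer followed by a softmax over the emotion classes."""
    h = [max(0.0, v) for v in dense(x, *hidden)]
    return softmax(dense(h, *output))

rng = random.Random(0)
# Synthetic 440 Hz tone standing in for a speech clip.
signal = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1024)]
features = mean_features(mfcc(signal))
probs = forward(features, random_layer(13, 32, rng), random_layer(32, len(EMOTIONS), rng))
predicted = EMOTIONS[probs.index(max(probs))]
```

Because the weights are random, `predicted` carries no meaning here; the sketch only shows how a clip becomes a 13-value MFCC vector and then a probability distribution over emotion classes.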
Database:	OpenAIRE
External links:	https://explore.openaire.eu/search/publication?articleId=doi_________::66cb97708088196b894cd7fe9ec47954 https://doi.org/10.1109/icspc51351.2021.9451810