Speech Emotion Recognition Using Cross-Correlation and Acoustic Features

Authors:	Garima Vyas, Zhen Liu, Joyjit Chatterjee, Vajja Mukesh, Hui-Huang Hsu
Year of publication:	2018
Subject:	Zero-crossing rate; Computer science; Speech recognition; Emotion classification; Feature extraction; Lexical analysis; 02 engineering and technology; Spectral centroid; 030507 speech-language pathology & audiology; 03 medical and health sciences; ComputingMethodologies_PATTERNRECOGNITION; Formant; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Mel-frequency cepstrum; 0305 other medical science; Utterance
Source:	DASC/PiCom/DataCom/CyberSciTech
Description:	Speech emotion recognition is an active research topic, motivated chiefly by the goal of improving human-machine interaction. Most current work in this area extracts discriminative features and uses them to classify emotions into categories. Most of it relies solely on Mel Frequency Cepstral Coefficients (MFCCs) as the feature for emotion recognition. Other work performs lexical analysis of the uttered words, which makes it language dependent. In this paper, two different techniques are used to classify emotions into Angry, Happy, or Neutral categories. In the first technique, the maximum cross-correlation between audio files is computed to label the speech data with one of the three emotion categories. Accordingly, a function is developed in MATLAB that identifies the emotion of any audio file passed as an argument. The second technique uses six discriminative features: Energy, Volume, MFCC, Zero Crossing Rate, Formants, and Spectral Centroid. These features serve as predictors for emotion classification. A variety of classifiers are evaluated through the MATLAB Classification Learner toolbox, and an accuracy of 91.3% is achieved using the Cubic SVM classifier. The proposed techniques pave the way for a real-time prototype for speech emotion recognition in the near future.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::a4c23ccd86930dc9deb6781d6e0b0159 https://doi.org/10.1109/dasc/picom/datacom/cyberscitec.2018.00050
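
The first technique in the description, labeling an audio file by its maximum cross-correlation with other audio files, might be sketched roughly as follows. This is not the authors' MATLAB function, which the record does not reproduce. It is a minimal Python illustration under several assumptions: the names `max_normalized_xcorr`, `identify_emotion`, and `tone` are hypothetical, the energy normalization is a choice made here so that scores are comparable between signals, and synthetic sine tones stand in for real labeled recordings.

```python
import math


def max_normalized_xcorr(x, y):
    """Peak of the cross-correlation of x and y over all lags,
    divided by the signal energies (an assumed normalization)."""
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0.0:
        return 0.0
    best = 0.0
    # Slide y across x so that every overlap position is visited.
    for lag in range(-(len(y) - 1), len(x)):
        s = 0.0
        for j, b in enumerate(y):
            i = j + lag
            if 0 <= i < len(x):
                s += x[i] * b
        best = max(best, s / norm)
    return best


def identify_emotion(signal, references):
    """Return (label, scores), where label is the category whose reference
    signals give the highest peak correlation with the input signal.

    references: dict mapping a label to a list of reference signals
    (a hypothetical layout, not taken from the paper).
    """
    scores = {
        label: max(max_normalized_xcorr(signal, ref) for ref in refs)
        for label, refs in references.items()
    }
    return max(scores, key=scores.get), scores


def tone(freq, n=64, phase=0.0):
    """Synthetic stand-in for an audio clip: a sine with freq cycles in n samples."""
    return [math.sin(2 * math.pi * freq * k / n + phase) for k in range(n)]


# Toy references: one synthetic "recording" per emotion category.
references = {"Angry": [tone(9)], "Happy": [tone(5)], "Neutral": [tone(2)]}

# A phase-shifted copy of the "Angry" reference plays the role of an unseen file.
query = tone(9, phase=0.7)
label, scores = identify_emotion(query, references)
print(label)
```

Taking the maximum over all lags is what makes the comparison tolerant of time offsets between recordings. With real speech, the signals would be loaded from audio files and would likely need resampling to a common rate before comparison.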