A novel perceptual feature set for audio emotion recognition

Authors:	Mehmet Cenk Sezgin, Bilge Gunsel, Gunes Karabulut Kurt
Publication year:	2011
Subject:	PEAQ; Computer science; Speech recognition; Feature extraction; Pattern recognition; Speaker recognition; Database normalization; Perception; Artificial intelligence; Valence (psychology); Sound quality; Audio signal processing
Source:	FG
DOI:	10.1109/fg.2011.5771348
Description:	We present a novel system for audio emotion recognition based on the Perceptual Evaluation of Audio Quality (PEAQ) model described in the ITU-R BS.1387-1 standard, which provides a mathematical model resembling the human auditory system. The introduced feature set performs perceptual analysis in the time, spectral, and Bark domains, enabling us to represent the statistics of emotional audio for the arousal and valence modes with a small number of features. Unlike existing systems, the proposed feature set learns the statistical characteristics of emotional differences and hence does not require data normalization to eliminate speaker or corpus dependency. Recognition results obtained on the well-known VAM and EMO-DB corpora show that the classification accuracy achieved by the proposed feature set exceeds the reported benchmark results, particularly for valence, on both natural and acted emotional data.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::882ea516fd4586bb67202f9980ef39dc https://doi.org/10.1109/fg.2011.5771348