Toward an emotion efficient architecture based on the sound spectrum from the voice of Portuguese speakers.

Authors: Filho, Geraldo P. Rocha; Meneguette, Rodolfo I.; Mendonça, Fábio Lúcio Lopes de; Enamoto, Liriam; Pessin, Gustavo; Gonçalves, Vinícius P.
Subject:
Source: Neural Computing & Applications; Nov 2024, Vol. 36, Issue 32, pp. 19939-19950, 12 pp.
Abstract: One of the main challenges in recognizing emotion from the voice is related to the specific characteristics of an individual's sound spectrum, such as accent and speech rhythm, as well as regionalism and the wide variability of spoken phrases. Despite efforts to propose emotion recognition models, increasing the accuracy of emotion classification through specialization remains an open research question. To address these challenges, this work proposes DEEP (DEtection of voice Emotion in Portuguese language), an architecture for detecting voice emotion based on patterns present in the sound spectrum generated by the voice of Brazilian Portuguese speakers. DEEP recognizes each emotion by using a set of specialist Convolutional Neural Networks that receive as input the features extracted from the sound spectrum. In this way, DEEP aims to specialize in each emotion to increase the rate of correct classifications and to adapt to the different tones and voice conditions that may occur in everyday life. Our results show that DEEP outperforms other state-of-the-art techniques on the emotion recognition measures in all evaluated scenarios. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
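
Note: The abstract describes one specialist network per emotion, each fed features extracted from the sound spectrum. The sketch below illustrates only that general "one specialist per emotion" idea and is not the authors' implementation. The abstract does not say how the specialists' outputs are combined, so picking the highest score is an assumption. The names `classify` and `make_stub_specialist`, the emotion labels, and the weights are all hypothetical, and a simple logistic scorer stands in for each Convolutional Neural Network.

```python
import math
from typing import Callable, Dict, List, Tuple

Features = List[float]
Specialist = Callable[[Features], float]


def make_stub_specialist(weights: List[float]) -> Specialist:
    """Return a stand-in scorer (hypothetical; a real specialist would be a CNN).

    The scorer maps a feature vector to a value in (0, 1) with a logistic function.
    """
    def score(features: Features) -> float:
        z = sum(w * x for w, x in zip(weights, features))
        return 1.0 / (1.0 + math.exp(-z))
    return score


def classify(features: Features,
             specialists: Dict[str, Specialist]) -> Tuple[str, Dict[str, float]]:
    """Score the features with every specialist and return the top emotion.

    Assumption: the final label is the emotion whose specialist scores highest.
    """
    if not specialists:
        raise ValueError("at least one specialist is required")
    scores = {emotion: model(features) for emotion, model in specialists.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical emotions and weights, chosen only to make the example run.
specialists = {
    "anger": make_stub_specialist([1.0, -0.5]),
    "joy": make_stub_specialist([-0.5, 1.0]),
    "neutral": make_stub_specialist([0.1, 0.1]),
}

label, scores = classify([2.0, 0.0], specialists)
print(label, scores)
```

Keeping one scorer per emotion means a single specialist can be retrained or replaced without touching the others, which is one plausible motivation for this kind of design.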