Popis: |
Emotional speech recognition and synthesis of expressive speech are highly dependable on the availability of emotional speech corpora. In this paper, we present the creation and verification of the Serbian Emotional Amateur Cellphone Speech Corpus (SEAC), which was released by the University of Novi Sad, Faculty of Technical Sciences in 2022, as the first amateur emotional speech corpus in Serbian language, recorded over cellphones. The corpus contains emotional speech elicited from 53 different speakers (24 male and 29 female) in 5 different emotional states (neutral, happiness, sadness, fear and anger), and its total duration amounts to approximately 8 hours of speech data. Initial objective evaluation of the corpus has confirmed high correlation between the behaviour of acoustic parameters corresponding to different emotional states in the newly recorded corpus and the existing Serbian emotional speech corpus recorded by 6 professional actors, which was used as a source for reference recordings. The corpus was further evaluated through listening tests concerned with human emotion recognition. Finally, we present the results of experiments concerning emotion recognition and speaker recognition by several conventional machine learning systems carried out on the corpus, as well as the results of a cross-lingual emotion recognition experiment involving a state-of-the-art machine learning system based on deep convolutional neural networks. |