Search results - "Busso, Carlos"

Report

Describe Where You Are: Improving Noise-Robustness for Speech Emotion Recognition with Text Description of the Environment

Authors: Leem, Seong-Gyun; Fulford, Daniel; Onnela, Jukka-Pekka; Gard, David; Busso, Carlos

Speech emotion recognition (SER) systems often struggle in real-world environments, where ambient noise severely degrades their performance. This paper explores a novel approach that exploits prior knowledge of testing environments to maximize SER pe

External link: http://arxiv.org/abs/2407.17716

Report

A Layer-Anchoring Strategy for Enhancing Cross-Lingual Speech Emotion Recognition

Authors: Upadhyay, Shreya G.; Busso, Carlos; Lee, Chi-Chun

Cross-lingual speech emotion recognition (SER) is important for a wide range of everyday applications. While recent SER research relies heavily on large pretrained models for emotion training, existing studies often concentrate solely on the final tr

External link: http://arxiv.org/abs/2407.04966

Report

We Need Variations in Speech Synthesis: Sub-center Modelling for Speaker Embeddings

Authors: Ulgen, Ismail Rasim; Busso, Carlos; Hansen, John H. L.; Sisman, Berrak

In speech synthesis, modeling of rich emotions and prosodic variations present in human voice is crucial to synthesize natural speech. Although speaker embeddings have been widely used in personalized speech synthesis as conditioning inputs, they ar

External link: http://arxiv.org/abs/2407.04291

Report

Towards Naturalistic Voice Conversion: NaturalVoices Dataset with an Automatic Processing Pipeline

Authors: Salman, Ali N.; Du, Zongyang; Chandra, Shreeram Suresh; Ulgen, Ismail Rasim; Busso, Carlos; Sisman, Berrak

Voice conversion (VC) research traditionally depends on scripted or acted speech, which lacks the natural spontaneity of real-life conversations. While natural speech data is limited for VC, our study focuses on filling in this gap. We introduce a no

External link: http://arxiv.org/abs/2406.04494

Report

emoDARTS: Joint Optimisation of CNN & Sequential Neural Network Architectures for Superior Speech Emotion Recognition

Authors: Rajapakshe, Thejan; Rana, Rajib; Khalifa, Sara; Sisman, Berrak; Schuller, Bjorn W.; Busso, Carlos

Speech Emotion Recognition (SER) is crucial for enabling computers to understand the emotions conveyed in human communication. With recent advancements in Deep Learning (DL), the performance of SER models has significantly improved. However, designin

External link: http://arxiv.org/abs/2403.14083

Report

Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition

Authors: Ulgen, Ismail Rasim; Du, Zongyang; Busso, Carlos; Sisman, Berrak

Speaker embeddings carry valuable emotion-related information, which makes them a promising resource for enhancing speech emotion recognition (SER), especially with limited labeled data. Traditionally, it has been assumed that emotion information is

External link: http://arxiv.org/abs/2401.11017

Report

Versatile audio-visual learning for emotion recognition

Authors: Goncalves, Lucas; Leem, Seong-Gyun; Lin, Wei-Cheng; Sisman, Berrak; Busso, Carlos

Most current audio-visual emotion recognition models lack the flexibility needed for deployment in practical applications. We envision a multimodal system that works even when only one modality is available and can be implemented interchangeably for

External link: http://arxiv.org/abs/2305.07216

Report

Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion

Authors: Zhou, Kun; Sisman, Berrak; Busso, Carlos; Ma, Bin; Li, Haizhou

Emotional voice conversion (EVC) traditionally targets the transformation of spoken utterances from one emotional state to another, with previous research mainly focusing on discrete emotion categories. This paper departs from the norm by introducing

External link: http://arxiv.org/abs/2210.13756

Report

Driving Anomaly Detection Using Conditional Generative Adversarial Network

Authors: Qiu, Yuning; Misu, Teruhisa; Busso, Carlos

Anomaly driving detection is an important problem in advanced driver assistance systems (ADAS). It is important to identify potential hazard scenarios as early as possible to avoid potential accidents. This study proposes an unsupervised method to qu

External link: http://arxiv.org/abs/2203.08289

Report

Unsupervised Personalization of an Emotion Recognition System: The Unique Properties of the Externalization of Valence in Speech

Authors: Sridhar, Kusha; Busso, Carlos

Published in: IEEE Transactions on Affective Computing, vol. 13, no. 4, pp. 1959-1972, October-December 2022

The prediction of valence from speech is an important, but challenging problem. The externalization of valence in speech has speaker-dependent cues, which contribute to performances that are often significantly lower than the prediction of other emot

External link: http://arxiv.org/abs/2201.07876
