Showing 1 - 10 of 508 for search: '"Schuller, Björn W"'
Increasingly frequent publications in the literature report voice quality differences between depressed patients and controls. Here, we examine the possibility of using voice analysis as an early warning signal for the development of emotion disturba
External link:
http://arxiv.org/abs/2411.11541
Curriculum learning (CL) describes a machine learning training strategy in which samples are gradually introduced into the training process based on their difficulty. Despite a partially contradictory body of evidence in the literature, CL finds popu
External link:
http://arxiv.org/abs/2411.00973
Audio-based kinship verification (AKV) is important in many domains, such as home security monitoring, forensic identification, and social network analysis. A key challenge in the task arises from differences in age across samples from different indi
External link:
http://arxiv.org/abs/2410.11120
The increasing success of audio foundation models across various tasks has led to a growing need for improved interpretability to understand their intricate decision-making processes better. Existing methods primarily focus on explaining these models
External link:
http://arxiv.org/abs/2410.07530
While current emotional text-to-speech (TTS) systems can generate highly intelligible emotional speech, achieving fine control over emotion rendering of the output speech still remains a significant challenge. In this paper, we introduce ParaEVITS, a
External link:
http://arxiv.org/abs/2409.06451
Foundational Large Language Models (LLMs) have changed the way we perceive technology. They have been shown to excel in tasks ranging from poem writing and coding to essay generation and puzzle solving. With the incorporation of image generation capa
External link:
http://arxiv.org/abs/2409.00105
Neural network models for audio tasks, such as automatic speech recognition (ASR) and acoustic scene classification (ASC), are susceptible to noise contamination for real-life applications. To improve audio quality, an enhancement module, which can b
External link:
http://arxiv.org/abs/2408.06264
Published in:
Proc. INTERSPEECH 2023, 2683-2687
Abusive content in online social networks is a well-known problem that can cause serious psychological harm and incite hatred. The ability to upload audio data increases the importance of developing methods to detect abusive content in speech recordi
External link:
http://arxiv.org/abs/2407.20808
Emotion and Intent Joint Understanding in Multimodal Conversation (MC-EIU) aims to decode the semantic information manifested in a multimodal conversational history, while inferring the emotions and intents simultaneously for the current utterance. M
External link:
http://arxiv.org/abs/2407.02751
Author:
Gerczuk, Maurice; Amiriparian, Shahin; Lutz, Justina; Strube, Wolfgang; Papazova, Irina; Hasan, Alkomiet; Schuller, Björn W.
In emergency medicine, timely intervention for patients at risk of suicide is often hindered by delayed access to specialised psychiatric care. To bridge this gap, we introduce a speech-based approach for automatic suicide risk assessment. Our study
External link:
http://arxiv.org/abs/2407.11012