Showing 1 - 10 of 518 for search: '"Schuller, Bjoern"'
Authors:
Körber, Nikolai, Kromer, Eduard, Siebert, Andreas, Hauke, Sascha, Mueller-Gritschneder, Daniel, Schuller, Björn
We introduce PerCo (SD), a perceptual image compression method based on Stable Diffusion v2.1, targeting the ultra-low bit range. PerCo (SD) serves as an open and competitive alternative to the state-of-the-art method PerCo, which relies on a proprie
External link:
http://arxiv.org/abs/2409.20255
Authors:
Ye, Zhengxin Joseph, Schuller, Bjoern
Earnings release is a key economic event in the financial markets and crucial for predicting stock movements. Earnings data gives a glimpse into how a company is doing financially and can hint at where its stock might go next. However, the irregulari
External link:
http://arxiv.org/abs/2409.17392
Authors:
Schuller, Björn, Mallol-Ragolta, Adria, Almansa, Alejandro Peña, Tsangko, Iosif, Amin, Mostafa M., Semertzidou, Anastasia, Christ, Lukas, Amiriparian, Shahin
The dawn of Foundation Models has on the one hand revolutionised a wide range of research problems, and, on the other hand, democratised the access and use of AI-based tools by the general public. We even observe an incursion of these models into dis
External link:
http://arxiv.org/abs/2409.08907
While current emotional text-to-speech (TTS) systems can generate highly intelligible emotional speech, achieving fine control over emotion rendering of the output speech still remains a significant challenge. In this paper, we introduce ParaEVITS, a
External link:
http://arxiv.org/abs/2409.06451
Foundational Large Language Models (LLMs) have changed the way we perceive technology. They have been shown to excel in tasks ranging from poem writing and coding to essay generation and puzzle solving. With the incorporation of image generation capa
External link:
http://arxiv.org/abs/2409.00105
Authors:
Kounadis-Bastian, Dionyssos, Schrüfer, Oliver, Derington, Anna, Wierstorf, Hagen, Eyben, Florian, Burkhardt, Felix, Schuller, Björn
Speech Emotion Recognition (SER) needs high computational resources to overcome the challenge of substantial annotator disagreement. Today SER is shifting towards dimensional annotations of arousal, dominance, and valence (A/D/V). Universal metrics a
External link:
http://arxiv.org/abs/2408.13920
Neural network models for audio tasks, such as automatic speech recognition (ASR) and acoustic scene classification (ASC), are susceptible to noise contamination for real-life applications. To improve audio quality, an enhancement module, which can b
External link:
http://arxiv.org/abs/2408.06264
Published in:
Proc. INTERSPEECH 2023, 2683-2687
Abusive content in online social networks is a well-known problem that can cause serious psychological harm and incite hatred. The ability to upload audio data increases the importance of developing methods to detect abusive content in speech recordi
External link:
http://arxiv.org/abs/2407.20808
Authors:
Triantafyllopoulos, Andreas, Tsangko, Iosif, Gebhard, Alexander, Mesaros, Annamaria, Virtanen, Tuomas, Schuller, Björn
Foundation models (FMs) are increasingly spearheading recent advances on a variety of tasks that fall under the purview of computer audition -- the use of machines to understand sounds. They feature several advantages over traditional pipelines: amon
External link:
http://arxiv.org/abs/2407.15672
Emotion and Intent Joint Understanding in Multimodal Conversation (MC-EIU) aims to decode the semantic information manifested in a multimodal conversational history, while inferring the emotions and intents simultaneously for the current utterance. M
External link:
http://arxiv.org/abs/2407.02751