Search results - "Schuller, Bjoern"

Report

Authors: Körber, Nikolai; Kromer, Eduard; Siebert, Andreas; Hauke, Sascha; Mueller-Gritschneder, Daniel; Schuller, Björn

We introduce PerCo (SD), a perceptual image compression method based on Stable Diffusion v2.1, targeting the ultra-low bit range. PerCo (SD) serves as an open and competitive alternative to the state-of-the-art method PerCo, which relies on a proprie

External link: http://arxiv.org/abs/2409.20255

Report

Trading through Earnings Seasons using Self-Supervised Contrastive Representation Learning

Authors: Ye, Zhengxin Joseph; Schuller, Bjoern

Earnings release is a key economic event in the financial markets and crucial for predicting stock movements. Earnings data gives a glimpse into how a company is doing financially and can hint at where its stock might go next. However, the irregulari

External link: http://arxiv.org/abs/2409.17392

Report

Affective Computing Has Changed: The Foundation Model Disruption

Authors: Schuller, Björn; Mallol-Ragolta, Adria; Almansa, Alejandro Peña; Tsangko, Iosif; Amin, Mostafa M.; Semertzidou, Anastasia; Christ, Lukas; Amiriparian, Shahin

The dawn of Foundation Models has on the one hand revolutionised a wide range of research problems, and, on the other hand, democratised the access and use of AI-based tools by the general public. We even observe an incursion of these models into dis

External link: http://arxiv.org/abs/2409.08907

Report

Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models

Authors: Jing, Xin; Zhou, Kun; Triantafyllopoulos, Andreas; Schuller, Björn W.

While current emotional text-to-speech (TTS) systems can generate highly intelligible emotional speech, achieving fine control over emotion rendering of the output speech still remains a significant challenge. In this paper, we introduce ParaEVITS, a

External link: http://arxiv.org/abs/2409.06451

Report

Negation Blindness in Large Language Models: Unveiling the NO Syndrome in Image Generation

Authors: Nadeem, Mohammad; Sohail, Shahab Saquib; Cambria, Erik; Schuller, Björn W.; Hussain, Amir

Foundational Large Language Models (LLMs) have changed the way we perceive technology. They have been shown to excel in tasks ranging from poem writing and coding to essay generation and puzzle solving. With the incorporation of image generation capa

External link: http://arxiv.org/abs/2409.00105

Report

Wav2Small: Distilling Wav2Vec2 to 72K parameters for Low-Resource Speech emotion recognition

Authors: Kounadis-Bastian, Dionyssos; Schrüfer, Oliver; Derington, Anna; Wierstorf, Hagen; Eyben, Florian; Burkhardt, Felix; Schuller, Björn

Speech Emotion Recognition (SER) needs high computational resources to overcome the challenge of substantial annotator disagreement. Today SER is shifting towards dimensional annotations of arousal, dominance, and valence (A/D/V). Universal metrics a

External link: http://arxiv.org/abs/2408.13920

Report

Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance

Authors: Milling, Manuel; Liu, Shuo; Triantafyllopoulos, Andreas; Aslan, Ilhan; Schuller, Björn W.

Neural network models for audio tasks, such as automatic speech recognition (ASR) and acoustic scene classification (ASC), are susceptible to noise contamination for real-life applications. To improve audio quality, an enhancement module, which can b

External link: http://arxiv.org/abs/2408.06264

Report

Abusive Speech Detection in Indic Languages Using Acoustic Features

Authors: Spiesberger, Anika A.; Triantafyllopoulos, Andreas; Tsangko, Iosif; Schuller, Björn W.

Published in: Proc. INTERSPEECH 2023, 2683-2687

Abusive content in online social networks is a well-known problem that can cause serious psychological harm and incite hatred. The ability to upload audio data increases the importance of developing methods to detect abusive content in speech recordi

External link: http://arxiv.org/abs/2407.20808

Report

Computer Audition: From Task-Specific Machine Learning to Foundation Models

Authors: Triantafyllopoulos, Andreas; Tsangko, Iosif; Gebhard, Alexander; Mesaros, Annamaria; Virtanen, Tuomas; Schuller, Björn

Foundation models (FMs) are increasingly spearheading recent advances on a variety of tasks that fall under the purview of computer audition -- the use of machines to understand sounds. They feature several advantages over traditional pipelines: amon

External link: http://arxiv.org/abs/2407.15672

Report

Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset

Authors: Liu, Rui; Zuo, Haolin; Lian, Zheng; Xing, Xiaofen; Schuller, Björn W.; Li, Haizhou

Emotion and Intent Joint Understanding in Multimodal Conversation (MC-EIU) aims to decode the semantic information manifested in a multimodal conversational history, while inferring the emotions and intents simultaneously for the current utterance. M

External link: http://arxiv.org/abs/2407.02751
