Showing 1 - 10 of 2,055 for search: '"A. Herremans"'
This paper introduces an extendable modular system that compiles a range of music feature extraction models to aid music information retrieval research. The features include musical elements like key, downbeats, and genre, as well as audio characteri
External link:
http://arxiv.org/abs/2411.00469
Recent advancements in Text-to-Speech (TTS) systems have enabled the generation of natural and expressive speech from textual input. Accented TTS aims to enhance user experience by making the synthesized speech more relatable to minority group listen
External link:
http://arxiv.org/abs/2410.13342
In this work, we present a novel method for music emotion recognition that leverages Large Language Model (LLM) embeddings for label alignment across multiple datasets and zero-shot prediction on novel categories. First, we compute LLM embeddings for
External link:
http://arxiv.org/abs/2410.11522
In tandem with the recent advancements in foundation model research, there has been a surge of generative music AI applications within the past few years. As the idea of AI-generated or AI-augmented music becomes more mainstream, many researchers in
External link:
http://arxiv.org/abs/2409.09378
Current strategies for achieving fine-grained prosody control in speech synthesis entail extracting additional style embeddings or adopting more complex architectures. To enable zero-shot application of pretrained text-to-speech (TTS) models, we pres
External link:
http://arxiv.org/abs/2408.06827
Controllable music generation promotes the interaction between humans and composition systems by projecting the users' intent on their desired music. The challenge of introducing controllability is an increasingly important issue in the symbolic musi
External link:
http://arxiv.org/abs/2407.10462
Reinforcement learning has demonstrated impressive performance in various challenging problems such as robotics, board games, and classical arcade games. However, its real-world applications can be hindered by the absence of robustness and safety in
External link:
http://arxiv.org/abs/2406.09976
Authors:
Wang, Kyra; Herremans, Dorien
Laughing, sighing, stuttering, and other forms of paralanguage do not contribute any direct lexical meaning to speech, but they provide crucial propositional context that aids semantic and pragmatic processes such as irony. It is thus important for a
External link:
http://arxiv.org/abs/2406.08820
Authors:
Kang, Jaeyong; Herremans, Dorien
Deep learning models for music have advanced drastically in recent years, but how good are machine learning models at capturing emotion, and what challenges are researchers facing? In this paper, we provide a comprehensive overview of the available m
External link:
http://arxiv.org/abs/2406.08809
Authors:
Ong, Joel; Herremans, Dorien
This paper introduces DeepUnifiedMom, a deep learning framework that enhances portfolio management through a multi-task learning approach and a multi-gate mixture of experts. The essence of DeepUnifiedMom lies in its ability to create unified momentu
External link:
http://arxiv.org/abs/2406.08742