Showing 1 - 10 of 271 for search: '"Richard, Gaël"'
Published in:
2024 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2024), Sep 2024, London (UK), United Kingdom
The Prototypical Network (ProtoNet) has emerged as a popular choice in Few-shot Learning (FSL) scenarios due to its remarkable performance and straightforward implementation. Building upon such success, we first propose a simple (yet novel) method to
External link:
http://arxiv.org/abs/2410.05302
Published in:
EUROPEAN SIGNAL PROCESSING CONFERENCE 2024 [EUSIPCO], Aug 2024, Lyon, France
Latent representation learning has been an active field of study for decades in numerous applications. Inspired among others by the tokenization from Natural Language Processing and motivated by the research of a simple data representation, recent wo
External link:
http://arxiv.org/abs/2409.16677
Neural audio codecs have significantly advanced audio compression by efficiently converting continuous audio signals into discrete tokens. These codecs preserve high-quality sound and enable sophisticated sound generation through generative models tr
External link:
http://arxiv.org/abs/2409.11228
Published in:
2024 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2024), Sep 2024, London (UK), United Kingdom
As diffusion-based deep generative models gain prevalence, researchers are actively investigating their potential applications across various domains, including music synthesis and style alteration. Within this work, we are interested in timbre trans
External link:
http://arxiv.org/abs/2409.15321
Author:
Perera, David; Letzelter, Victor; Mariotte, Théo; Cortés, Adrien; Chen, Mickael; Essid, Slim; Richard, Gaël
We introduce Annealed Multiple Choice Learning (aMCL) which combines simulated annealing with MCL. MCL is a learning framework handling ambiguous tasks by predicting a small set of plausible hypotheses. These hypotheses are trained using the Winner-t
External link:
http://arxiv.org/abs/2407.15580
Published in:
INTERSPEECH, Sep 2024, Kos Island, Greece
Single-channel speech dereverberation aims at extracting a dry speech signal from a recording affected by the acoustic reflections in a room. However, most current deep learning-based approaches for speech dereverberation are not interpretable for ro
External link:
http://arxiv.org/abs/2407.08657
In this article, we investigate the notion of model-based deep learning in the realm of music information research (MIR). Loosely speaking, we refer to the term model-based deep learning for approaches that combine traditional knowledge-based methods
External link:
http://arxiv.org/abs/2406.11540
Author:
Letzelter, Victor; Perera, David; Rommel, Cédric; Fontaine, Mathieu; Essid, Slim; Richard, Gaël; Pérez, Patrick
Winner-takes-all training is a simple learning paradigm, which handles ambiguous tasks by predicting a set of plausible hypotheses. Recently, a connection was established between Winner-takes-all training and centroidal Voronoi tessellations, showing
External link:
http://arxiv.org/abs/2406.04706
Published in:
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2024, Seoul, South Korea
Music generated by deep learning methods often suffers from a lack of coherence and long-term organization. Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-inform
External link:
http://arxiv.org/abs/2402.13301
Published in:
IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2024, Seoul (Korea), South Korea
Diffusion models are receiving a growing interest for a variety of signal generation tasks such as speech or music synthesis. WaveGrad, for example, is a successful diffusion model that conditionally uses the mel spectrogram to guide a diffusion proc
External link:
http://arxiv.org/abs/2402.15516