Showing 1 - 10 of 333 for search: '"Nguyen, Bac"'
Author:
Nguyen, Bac, Lai, Chieh-Hsin, Takida, Yuhta, Murata, Naoki, Uesaka, Toshimitsu, Ermon, Stefano, Mitsufuji, Yuki
Latent diffusion models have enabled continuous-state diffusion models to handle a variety of datasets, including categorical data. However, most methods rely on fixed pretrained embeddings, limiting the benefits of joint training with the diffusion
External link:
http://arxiv.org/abs/2410.14758
Author:
Murata, Naoki, Lai, Chieh-Hsin, Takida, Yuhta, Uesaka, Toshimitsu, Nguyen, Bac, Ermon, Stefano, Mitsufuji, Yuki
Recent literature has effectively utilized diffusion models trained on continuous variables as priors for solving inverse problems. Notably, discrete diffusion models with discrete latent codes have shown strong performance, particularly in modalitie
External link:
http://arxiv.org/abs/2410.14710
Author:
Nguyen, Bac, Uhlich, Stefan, Cardinaux, Fabien, Mauch, Lukas, Edraki, Marzieh, Courville, Aaron
Handling distribution shifts from training data, known as out-of-distribution (OOD) generalization, poses a significant challenge in the field of machine learning. While a pre-trained vision-language model like CLIP has demonstrated remarkable zero-s
External link:
http://arxiv.org/abs/2407.03036
Selective attention helps us focus on task-relevant aspects in the constant flood of our sensory input. This constraint in our perception allows us to robustly generalize under distractions and to new compositions of perceivable concepts. Transformer
External link:
http://arxiv.org/abs/2404.15721
Self-supervised learning (SSL) has achieved remarkable success across various speech-processing tasks. To enhance its efficiency, previous works often leverage the use of compression techniques. A notable recent attempt is DPHuBERT, which applies joi
External link:
http://arxiv.org/abs/2402.16830
Author:
Dang, Nguyen-Bac, Mehmeti, Vlerë
We show that the Hausdorff dimension of the limit set of a Schottky group varies continuously over the moduli space of Schottky groups defined over any complete valued field constructed by Poineau and Turchetti. To obtain this result, we first study
External link:
http://arxiv.org/abs/2401.06107
State-of-the-art non-autoregressive text-to-speech (TTS) models based on FastSpeech 2 can efficiently synthesise high-fidelity and natural speech. For expressive speech datasets however, we observe characteristic audio distortions. We demonstrate tha
External link:
http://arxiv.org/abs/2306.01442
Author:
Nguyen, Bac, Mauch, Lukas
Deep equilibrium models (DEQs) have proven to be very powerful for learning data representations. The idea is to replace traditional (explicit) feedforward neural networks with an implicit fixed-point equation, which allows decoupling the forward an
External link:
http://arxiv.org/abs/2304.11663
Self-supervised learning (SSL) has recently shown remarkable results in closing the gap between supervised and unsupervised learning. The idea is to learn robust features that are invariant to distortions of the input data. Despite its success, this
External link:
http://arxiv.org/abs/2303.03717
Parallel text-to-speech (TTS) models have recently enabled fast and highly-natural speech synthesis. However, they typically require external alignment models, which are not necessarily optimized for the decoder as they are not jointly trained. In th
External link:
http://arxiv.org/abs/2203.11049