Showing 1 - 10 of 45 for search: '"Ponti, Moacir Antonelli"'
Published in:
Long version of the paper of ACM-SAC 2024
Machine learning models typically focus on specific targets like creating classifiers, often based on known population feature distributions in a business context. However, models calculating individual features adapt over time to improve precision,
External link:
http://arxiv.org/abs/2401.05240
Taking advantage of the structure of large datasets to pre-train deep learning models is a promising strategy to decrease the need for supervised data. Self-supervised learning methods, such as contrastive learning and its variations, are a promising way toward
External link:
http://arxiv.org/abs/2312.11240
Sketch-an-Anchor is a novel method to train state-of-the-art Zero-shot Sketch-based Image Retrieval (ZSSBIR) models in under an epoch. Most studies break down the problem of ZSSBIR into two parts: domain alignment between images and sketches, inherit
External link:
http://arxiv.org/abs/2303.16769
Author:
Ponti, Moacir Antonelli, Oliveira, Lucas de Angelis, Esteban, Mathias, Garcia, Valentina, Román, Juan Martín, Argerich, Luis
Real-world datasets contain incorrectly labeled instances that hamper the performance of the model and, in particular, its ability to generalize out of distribution. Also, each example might contribute differently to learning. This motivate
External link:
http://arxiv.org/abs/2210.11327
Author:
Casanova, Edresson, Shulby, Christopher, Korolev, Alexander, Junior, Arnaldo Candido, Soares, Anderson da Silva, Aluísio, Sandra, Ponti, Moacir Antonelli
We explore cross-lingual multi-speaker speech synthesis and cross-lingual voice conversion applied to data augmentation for automatic speech recognition (ASR) systems in low/medium-resource scenarios. Through extensive experiments, we show that our a
External link:
http://arxiv.org/abs/2204.00618
This technical report details changes applied to a noise filter to facilitate its application and improve its results. The filter is applied to denoise natural sounds recorded in the wild and to generate an acoustic index used in soundscape analysis.
External link:
http://arxiv.org/abs/2201.02099
Author:
Casanova, Edresson, Weber, Julian, Shulby, Christopher, Junior, Arnaldo Candido, Gölge, Eren, Ponti, Moacir Antonelli
Published in:
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2709-2720, 2022
YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker TTS. Our method builds upon the VITS model and adds several novel modifications for zero-shot multi-speaker and multilingual training. We achieved state-of-the
External link:
http://arxiv.org/abs/2112.02418
Author:
Ponti, Moacir Antonelli, Santos, Fernando Pereira dos, Ribeiro, Leo Sampaio Ferraz, Cavallari, Gabriel Biscaro
Training deep neural networks on real-world data may be challenging. Using models as black boxes, even with transfer learning, can result in poor generalization or inconclusive results when it comes to small datasets or specific applications. This tu
External link:
http://arxiv.org/abs/2109.02752
Author:
Casanova, Edresson, Shulby, Christopher, Gölge, Eren, Müller, Nicolas Michael, de Oliveira, Frederico Santos, Junior, Arnaldo Candido, Soares, Anderson da Silva, Aluisio, Sandra Maria, Ponti, Moacir Antonelli
In this paper, we propose SC-GlowTTS: an efficient zero-shot multi-speaker text-to-speech model that improves similarity for speakers unseen during training. We propose a speaker-conditional architecture that explores a flow-based decoder that works
External link:
http://arxiv.org/abs/2104.05557
Author:
Casanova, Edresson, Junior, Arnaldo Candido, Shulby, Christopher, de Oliveira, Frederico Santos, Teixeira, João Paulo, Ponti, Moacir Antonelli, Aluisio, Sandra Maria
Speech provides a natural way for human-computer interaction. In particular, speech synthesis systems are popular in different applications, such as personal assistants, GPS applications, screen readers and accessibility tools. However, not all langu
External link:
http://arxiv.org/abs/2005.05144