Showing 1 - 7 of 7 for search: '"Kim, Bongwan"'
Although numerous recent studies have suggested new frameworks for zero-shot TTS using large-scale, real-world data, studies that focus on the intelligibility of zero-shot TTS are relatively scarce. Zero-shot TTS demands additional efforts to ensure
External link:
http://arxiv.org/abs/2401.13921
Most neural vocoders employ band-limited mel-spectrograms to generate waveforms. If full-band spectral features are used as the input, the vocoder can be provided with as much acoustic information as possible. However, in some models employing full-b
External link:
http://arxiv.org/abs/2106.07889
We propose Jointly trained Duration Informed Transformer (JDI-T), a feed-forward Transformer with a duration predictor jointly trained without explicit alignments in order to generate an acoustic feature sequence from an input text. In this work, ins
External link:
http://arxiv.org/abs/2005.07799
Published in:
In HPB 2019 21 Supplement 2:S510-S511
Published in:
In HPB 2019 21 Supplement 2:S395-S395
Published in:
In HPB 2019 21 Supplement 2:S400-S400
Academic article
This result cannot be displayed to users who are not logged in.