Showing 1 - 6 of 6 for search: '"Niu, Zhikang"'
Author:
Chen, Yushen, Niu, Zhikang, Ma, Ziyang, Deng, Keqi, Wang, Chunhui, Zhao, Jian, Yu, Kai, Chen, Xie
This paper introduces F5-TTS, a fully non-autoregressive text-to-speech system based on flow matching with Diffusion Transformer (DiT). Without requiring complex designs such as duration model, text encoder, and phoneme alignment, the text input is s
External link:
http://arxiv.org/abs/2410.06885
Built upon vector quantization (VQ), discrete audio codec models have achieved great success in audio compression and auto-regressive audio generation. However, existing models face substantial challenges in perceptual quality and signal distortion,
External link:
http://arxiv.org/abs/2409.12717
Author:
Du, Chenpeng, Guo, Yiwei, Wang, Hankun, Yang, Yifan, Niu, Zhikang, Wang, Shuai, Zhang, Hui, Chen, Xie, Yu, Kai
Recent TTS models with decoder-only Transformer architecture, such as SPEAR-TTS and VALL-E, achieve impressive naturalness and demonstrate the ability for zero-shot adaptation given a speech prompt. However, such decoder-only TTS models lack monotoni
External link:
http://arxiv.org/abs/2401.14321
Recent years have witnessed significant advancements in self-supervised learning (SSL) methods for speech-processing tasks. Various speech-based SSL models have been developed and present promising performance on a range of downstream tasks including
External link:
http://arxiv.org/abs/2309.13860
Published in:
In Infrared Physics and Technology September 2023 133
Academic article