Showing 1 - 10 of 428 for search: '"Wang, Yuancheng"'
Authors:
He, Haorui, Shang, Zengqiang, Wang, Chaoren, Li, Xuyuan, Gu, Yicheng, Hua, Hua, Liu, Liwei, Yang, Chen, Li, Jiaqi, Shi, Peiyang, Wang, Yuancheng, Chen, Kai, Zhang, Pengyuan, Wu, Zhizheng
Recently, speech generation models have made significant progress by using large-scale training data. However, the research community struggles to produce highly spontaneous and human-like speech due to the lack of large-scale, diverse, and spontaneou
External link:
http://arxiv.org/abs/2407.05361
Authors:
Zhang, Yiming, Gu, Yicheng, Zeng, Yanhong, Xing, Zhening, Wang, Yuancheng, Wu, Zhizheng, Chen, Kai
We study Neural Foley, the automatic generation of high-quality sound effects synchronizing with videos, enabling an immersive audio-visual experience. Despite its wide range of applications, existing approaches encounter limitations when it comes to
External link:
http://arxiv.org/abs/2407.01494
Authors:
Ao, Junyi, Wang, Yuancheng, Tian, Xiaohai, Chen, Dekun, Zhang, Jun, Lu, Lu, Wang, Yuxuan, Li, Haizhou, Wu, Zhizheng
Speech encompasses a wealth of information, including but not limited to content, paralinguistic, and environmental information. This comprehensive nature of speech significantly impacts communication and is crucial for human-computer interaction. Ch
External link:
http://arxiv.org/abs/2406.13340
Authors:
Xin, Detai, Tan, Xu, Shen, Kai, Ju, Zeqian, Yang, Dongchao, Wang, Yuancheng, Takamichi, Shinnosuke, Saruwatari, Hiroshi, Liu, Shujie, Li, Jinyu, Zhao, Sheng
We present RALL-E, a robust language modeling method for text-to-speech (TTS) synthesis. While previous work based on large language models (LLMs) shows impressive performance on zero-shot TTS, such methods often suffer from poor robustness, such as
External link:
http://arxiv.org/abs/2404.03204
Authors:
Ju, Zeqian, Wang, Yuancheng, Shen, Kai, Tan, Xu, Xin, Detai, Yang, Dongchao, Liu, Yanqing, Leng, Yichong, Song, Kaitao, Tang, Siliang, Wu, Zhizheng, Qin, Tao, Li, Xiang-Yang, Ye, Wei, Zhang, Shikun, Bian, Jiang, He, Lei, Li, Jinyu, Zhao, Sheng
While recent large-scale text-to-speech (TTS) models have achieved significant progress, they still fall short in speech quality, similarity, and prosody. Considering speech intricately encompasses various attributes (e.g., content, prosody, timbre,
External link:
http://arxiv.org/abs/2403.03100
Published in:
Julius-Kühn-Archiv, Vol 463, Iss 1, Pp 395-400 (2018)
The insect population in grain stores can be kept under control by maintaining a high concentration of N2 gas throughout the grain bed. The development of controlled atmosphere storage technology for insect control requires an accurate prediction of
External link:
https://doaj.org/article/e8e156c6f9c4403881120833582a2c24
Authors:
Zhang, Xueyao, Xue, Liumeng, Gu, Yicheng, Wang, Yuancheng, He, Haorui, Wang, Chaoren, Chen, Xi, Fang, Zihao, Chen, Haopeng, Zhang, Junan, Tang, Tze Ying, Zou, Lexiao, Wang, Mingxuan, Han, Jun, Chen, Kai, Li, Haizhou, Wu, Zhizheng
Amphion is an open-source toolkit for Audio, Music, and Speech Generation, targeting to ease the way for junior researchers and engineers into these fields. It presents a unified framework that is inclusive of diverse generation tasks and models, wit
External link:
http://arxiv.org/abs/2312.09911
Authors:
Hu, Chuanfei, Xia, Tianyi, Cui, Ying, Zou, Quchen, Wang, Yuancheng, Xiao, Wenbo, Ju, Shenghong, Li, Xinde
Multi-phase liver contrast-enhanced computed tomography (CECT) images convey the complementary multi-phase information for liver tumor segmentation (LiTS), which are crucial to assist the diagnosis of liver cancer clinically. However, the performance
External link:
http://arxiv.org/abs/2305.05344
Audio editing is applicable for various purposes, such as adding background sound effects, replacing a musical instrument, and repairing damaged audio. Recently, some diffusion-based methods achieved zero-shot audio editing by using a diffusion and d
External link:
http://arxiv.org/abs/2304.00830
Image captioning (IC) systems, which automatically generate a text description of the salient objects in an image (real or synthetic), have seen great progress over the past few years due to the development of deep neural networks. IC plays an indisp
External link:
http://arxiv.org/abs/2206.06550