Showing 1 - 10 of 32
for search: '"He, JinZheng"'
Author:
Ye, Zhenhui, Zhong, Tianyun, Ren, Yi, Jiang, Ziyue, Huang, Jiawei, Huang, Rongjie, Liu, Jinglin, He, Jinzheng, Zhang, Chen, Wang, Zehan, Chen, Xize, Yin, Xiang, Zhao, Zhou
Talking face generation (TFG) aims to animate a target identity's face to create realistic talking videos. Personalized TFG is a variant that emphasizes the perceptual identity similarity of the synthesized result (from the perspective of appearance ...
External link:
http://arxiv.org/abs/2410.06734
Author:
Zhang, Yu, Jiang, Ziyue, Li, Ruiqi, Pan, Changhao, He, Jinzheng, Huang, Rongjie, Wang, Chuxin, Zhao, Zhou
Published in:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 1960-1975
Zero-shot singing voice synthesis (SVS) with style transfer and style control aims to generate high-quality singing voices with unseen timbres and styles (including singing method, emotion, rhythm, technique, and pronunciation) from audio and text prompts ...
External link:
http://arxiv.org/abs/2409.15977
Author:
Wu, Siwei, He, Jinzheng, Yuan, Ruibin, Wei, Haojie, Wei, Xipin, Lin, Chenghua, Xu, Jin, Lin, Junyang
The quantity of processed data is crucial for advancing the field of singing voice synthesis. While there are tools available for lyric or note transcription tasks, they all need pre-processed data which is relatively time-consuming (e.g., vocal and ...
External link:
http://arxiv.org/abs/2409.14619
Author:
Zhang, Yu, Pan, Changhao, Guo, Wenxiang, Li, Ruiqi, Zhu, Zhiyuan, Wang, Jialei, Xu, Wenhao, Lu, Jingyu, Hong, Zhiqing, Wang, Chuxin, Zhang, LiChao, He, Jinzheng, Jiang, Ziyue, Chen, Yuxin, Yang, Chen, Zhou, Jiecheng, Cheng, Xinyu, Zhao, Zhou
The scarcity of high-quality and multi-task singing datasets significantly hinders the development of diverse controllable and personalized singing tasks, as existing singing datasets suffer from low quality, limited diversity of languages and singers ...
External link:
http://arxiv.org/abs/2409.13832
Author:
Huang, Jiawei, Zhang, Chen, Ren, Yi, Jiang, Ziyue, Ye, Zhenhui, Liu, Jinglin, He, Jinzheng, Yin, Xiang, Zhao, Zhou
Voice conversion aims to modify the source speaker's voice to resemble the target speaker while preserving the original speech content. Despite notable advancements in voice conversion these days, multi-lingual voice conversion (including both monolingual ...
External link:
http://arxiv.org/abs/2408.04708
Author:
Chu, Yunfei, Xu, Jin, Yang, Qian, Wei, Haojie, Wei, Xipin, Guo, Zhifang, Leng, Yichong, Lv, Yuanjun, He, Jinzheng, Lin, Junyang, Zhou, Chang, Zhou, Jingren
We introduce the latest progress of Qwen-Audio, a large-scale audio-language model called Qwen2-Audio, which is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions ...
External link:
http://arxiv.org/abs/2407.10759
Author:
Yang, An, Yang, Baosong, Hui, Binyuan, Zheng, Bo, Yu, Bowen, Zhou, Chang, Li, Chengpeng, Li, Chengyuan, Liu, Dayiheng, Huang, Fei, Dong, Guanting, Wei, Haoran, Lin, Huan, Tang, Jialong, Wang, Jialin, Yang, Jian, Tu, Jianhong, Zhang, Jianwei, Ma, Jianxin, Yang, Jianxin, Xu, Jin, Zhou, Jingren, Bai, Jinze, He, Jinzheng, Lin, Junyang, Dang, Kai, Lu, Keming, Chen, Keqin, Yang, Kexin, Li, Mei, Xue, Mingfeng, Ni, Na, Zhang, Pei, Wang, Peng, Peng, Ru, Men, Rui, Gao, Ruize, Lin, Runji, Wang, Shijie, Bai, Shuai, Tan, Sinan, Zhu, Tianhang, Li, Tianhao, Liu, Tianyu, Ge, Wenbin, Deng, Xiaodong, Zhou, Xiaohuan, Ren, Xingzhang, Zhang, Xinyu, Wei, Xipin, Ren, Xuancheng, Liu, Xuejing, Fan, Yang, Yao, Yang, Zhang, Yichang, Wan, Yu, Chu, Yunfei, Liu, Yuqiong, Cui, Zeyu, Zhang, Zhenru, Guo, Zhifang, Fan, Zhihao
This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to ...
External link:
http://arxiv.org/abs/2407.10671
Author:
Ye, Zhenhui, Zhong, Tianyun, Ren, Yi, Yang, Jiaqi, Li, Weichuang, Huang, Jiawei, Jiang, Ziyue, He, Jinzheng, Huang, Rongjie, Liu, Jinglin, Zhang, Chen, Yin, Xiang, Ma, Zejun, Zhao, Zhou
One-shot 3D talking portrait generation aims to reconstruct a 3D avatar from an unseen image, and then animate it with a reference video or audio to generate a talking portrait video. The existing methods fail to simultaneously achieve the goals of a ...
External link:
http://arxiv.org/abs/2401.08503
Author:
Zhang, Yu, Huang, Rongjie, Li, Ruiqi, He, JinZheng, Xia, Yan, Chen, Feiyang, Duan, Xinyu, Huai, Baoxing, Zhao, Zhou
Published in:
Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 19597-19605. (2024)
Style transfer for out-of-domain (OOD) singing voice synthesis (SVS) focuses on generating high-quality singing voices with unseen styles (such as timbre, emotion, pronunciation, and articulation skills) derived from reference singing voice samples.
External link:
http://arxiv.org/abs/2312.10741
Author:
Jiang, Ziyue, Liu, Jinglin, Ren, Yi, He, Jinzheng, Ye, Zhenhui, Ji, Shengpeng, Yang, Qian, Zhang, Chen, Wei, Pengfei, Wang, Chunfeng, Yin, Xiang, Ma, Zejun, Zhao, Zhou
Zero-shot text-to-speech (TTS) aims to synthesize voices with unseen speech prompts, which significantly reduces the data and computation requirements for voice cloning by skipping the fine-tuning process. However, the prompting mechanisms of zero-shot ...
External link:
http://arxiv.org/abs/2307.07218