Showing 1 - 10 of 3,907 results for search: '"Wang,Yuping"'
Authors:
Bai, Ye, Chen, Haonan, Chen, Jitong, Chen, Zhuo, Deng, Yi, Dong, Xiaohong, Hantrakul, Lamtharn, Hao, Weituo, Huang, Qingqing, Huang, Zhongyi, Jia, Dongya, La, Feihu, Le, Duc, Li, Bochen, Li, Chumin, Li, Hui, Li, Xingxing, Liu, Shouda, Lu, Wei-Tsung, Lu, Yiqing, Shaw, Andrew, Spijkervet, Janne, Sun, Yakun, Wang, Bo, Wang, Ju-Chiang, Wang, Yuping, Wang, Yuxuan, Xu, Ling, Yang, Yifeng, Yao, Chao, Zhang, Shuo, Zhang, Yang, Zhang, Yilin, Zhao, Hang, Zhao, Ziyi, Zhong, Dejian, Zhou, Shicen, Zou, Pei
We introduce Seed-Music, a suite of music generation systems capable of producing high-quality music with fine-grained style control. Our unified framework leverages both auto-regressive language modeling and diffusion approaches to support two key m
External link:
http://arxiv.org/abs/2409.09214
The cross-domain multicast routing problem in a software-defined wireless network with multiple controllers is a classic NP-hard optimization problem. As the network size increases, designing and implementing cross-domain multicast routing paths in t
External link:
http://arxiv.org/abs/2409.05888
Authors:
Ma, Ziyang, Song, Yakun, Du, Chenpeng, Cong, Jian, Chen, Zhuo, Wang, Yuping, Wang, Yuxuan, Chen, Xie
Dialogue serves as the most natural manner of human-computer interaction (HCI). Recent advancements in speech language models (SLM) have significantly enhanced speech-based conversational AI. However, these models are limited to turn-based conversati
External link:
http://arxiv.org/abs/2408.02622
StreamVoice has recently pushed the boundaries of zero-shot voice conversion (VC) in the streaming domain. It uses a streamable language model (LM) with a context-aware approach to convert semantic features from automatic speech recognition (ASR) int
External link:
http://arxiv.org/abs/2408.02178
Authors:
Bai, Ye, Chen, Jingping, Chen, Jitong, Chen, Wei, Chen, Zhuo, Ding, Chuang, Dong, Linhao, Dong, Qianqian, Du, Yujiao, Gao, Kepan, Gao, Lu, Guo, Yi, Han, Minglun, Han, Ting, Hu, Wenchao, Hu, Xinying, Hu, Yuxiang, Hua, Deyu, Huang, Lu, Huang, Mingkun, Huang, Youjia, Jin, Jishuo, Kong, Fanliu, Lan, Zongwei, Li, Tianyu, Li, Xiaoyang, Li, Zeyang, Lin, Zehua, Liu, Rui, Liu, Shouda, Lu, Lu, Lu, Yizhou, Ma, Jingting, Ma, Shengtao, Pei, Yulin, Shen, Chen, Tan, Tian, Tian, Xiaogang, Tu, Ming, Wang, Bo, Wang, Hao, Wang, Yuping, Wang, Yuxuan, Xia, Hanzhang, Xia, Rui, Xie, Shuangyi, Xu, Hongmin, Yang, Meng, Zhang, Bihong, Zhang, Jun, Zhang, Wanyi, Zhang, Yang, Zhang, Yawei, Zheng, Yijie, Zou, Ming
Modern automatic speech recognition (ASR) models are required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc.) given the specific contextual information in various application scenarios. Classic end-to-e
External link:
http://arxiv.org/abs/2407.04675
Authors:
Yuan, Yi, Jia, Dongya, Zhuang, Xiaobin, Chen, Yuanzhe, Liu, Zhengxi, Chen, Zhuo, Wang, Yuping, Wang, Yuxuan, Liu, Xubo, Kang, Xiyuan, Plumbley, Mark D., Wang, Wenwu
Generative models have shown significant achievements in audio generation tasks. However, existing models struggle with complex and detailed prompts, leading to potential performance degradation. We hypothesize that this problem stems from the simpli
External link:
http://arxiv.org/abs/2407.04416
Authors:
Anastassiou, Philip, Chen, Jiawei, Chen, Jitong, Chen, Yuanzhe, Chen, Zhuo, Chen, Ziyi, Cong, Jian, Deng, Lelai, Ding, Chuang, Gao, Lu, Gong, Mingqing, Huang, Peisong, Huang, Qingqing, Huang, Zhiying, Huo, Yuanyuan, Jia, Dongya, Li, Chumin, Li, Feiya, Li, Hui, Li, Jiaxin, Li, Xiaoyang, Li, Xingxing, Liu, Lin, Liu, Shouda, Liu, Sichao, Liu, Xudong, Liu, Yuchen, Liu, Zhengxi, Lu, Lu, Pan, Junjie, Wang, Xin, Wang, Yuping, Wang, Yuxuan, Wei, Zhen, Wu, Jian, Yao, Chao, Yang, Yifeng, Yi, Yuanhao, Zhang, Junteng, Zhang, Qidi, Zhang, Shuo, Zhang, Wenjie, Zhang, Yang, Zhao, Zilin, Zhong, Dejian, Zhuang, Xiaobin
We introduce Seed-TTS, a family of large-scale autoregressive text-to-speech (TTS) models capable of generating speech that is virtually indistinguishable from human speech. Seed-TTS serves as a foundation model for speech generation and excels in sp
External link:
http://arxiv.org/abs/2406.02430
This survey explores the transformative impact of foundation models (FMs) in artificial intelligence, focusing on their integration with federated learning (FL) for advancing biomedical research. Foundation models such as ChatGPT, LLaMa, and CLIP, wh
External link:
http://arxiv.org/abs/2405.06784
Authors:
Anastassiou, Philip, Tang, Zhenyu, Peng, Kainan, Jia, Dongya, Li, Jiaxin, Tu, Ming, Wang, Yuping, Wang, Yuxuan, Ma, Mingbo
We present VoiceShop, a novel speech-to-speech framework that can modify multiple attributes of speech, such as age, gender, accent, and speech style, in a single forward pass while preserving the input speaker's timbre. Previous works have been cons
External link:
http://arxiv.org/abs/2404.06674
The confluence of the advancement of Autonomous Vehicles (AVs) and the maturity of Vehicle-to-Everything (V2X) communication has enabled the capability of cooperative connected and automated vehicles (CAVs). Building on top of cooperative perception,
External link:
http://arxiv.org/abs/2403.17916