Showing 1 - 10 of 1,157 results for search: '"Zhang, LiChao"'
Author:
Zhang, Yu, Pan, Changhao, Guo, Wenxiang, Li, Ruiqi, Zhu, Zhiyuan, Wang, Jialei, Xu, Wenhao, Lu, Jingyu, Hong, Zhiqing, Wang, Chuxin, Zhang, LiChao, He, Jinzheng, Jiang, Ziyue, Chen, Yuxin, Yang, Chen, Zhou, Jiecheng, Cheng, Xinyu, Zhao, Zhou
The scarcity of high-quality and multi-task singing datasets significantly hinders the development of diverse controllable and personalized singing tasks, as existing singing datasets suffer from low quality, limited diversity of languages and singer
External link:
http://arxiv.org/abs/2409.13832
Author:
Li, Ruiqi, Hong, Zhiqing, Wang, Yongqi, Zhang, Lichao, Huang, Rongjie, Zheng, Siqi, Zhao, Zhou
Text-to-song (TTSong) is a music generation task that synthesizes accompanied singing voices. Current TTSong methods, inherited from singing voice synthesis (SVS), require melody-related information that can sometimes be impractical, such as music sc
External link:
http://arxiv.org/abs/2407.02049
Author:
Zhang, Lichao, Yu, Jia, Zhang, Shuai, Li, Long, Zhong, Yangyang, Liang, Guanbao, Yan, Yuming, Ma, Qing, Weng, Fangsheng, Pan, Fayu, Li, Jing, Xu, Renjun, Lan, Zhenzhong
Large Language Models (LLMs) have significantly advanced user-bot interactions, enabling more complex and coherent dialogues. However, the prevalent text-only modality might not fully exploit the potential for effective user engagement. This paper ex
External link:
http://arxiv.org/abs/2406.15000
Author:
Yu, Jia, Zhang, Lichao, Chen, Zijie, Pan, Fayu, Wen, MiaoMiao, Yan, Yuming, Weng, Fangsheng, Zhang, Shuai, Pan, Lili, Lan, Zhenzhong
The fusion of AI and fashion design has emerged as a promising research area. However, the lack of extensive, interrelated data on clothing and try-on stages has hindered the full potential of AI in this domain. Addressing this, we present the Fashio
External link:
http://arxiv.org/abs/2311.12067
Author:
Guan, Cong, Zhang, Lichao, Fan, Chunpeng, Li, Yichen, Chen, Feng, Li, Lihe, Tian, Yunjia, Yuan, Lei, Yu, Yang
Developing intelligent agents capable of seamless coordination with humans is a critical step towards achieving artificial general intelligence. Existing methods for human-AI coordination typically train an agent to coordinate with a diverse set of p
External link:
http://arxiv.org/abs/2311.00416
Despite significant progress in the field, it is still challenging to create personalized visual representations that align closely with the desires and preferences of individual users. This process requires users to articulate their ideas in words t
External link:
http://arxiv.org/abs/2310.08129
Author:
Xun, Jiahao, Zhang, Shengyu, Yang, Yanting, Zhu, Jieming, Deng, Liqun, Zhao, Zhou, Dong, Zhenhua, Li, Ruiqi, Zhang, Lichao, Wu, Fei
In the field of music information retrieval (MIR), cover song identification (CSI) is a challenging task that aims to identify cover versions of a query song from a massive collection. Existing works still suffer from high intra-song variances and in
External link:
http://arxiv.org/abs/2307.09775
Author:
Huang, Rongjie, Liu, Huadai, Cheng, Xize, Ren, Yi, Li, Linjun, Ye, Zhenhui, He, Jinzheng, Zhang, Lichao, Liu, Jinglin, Yin, Xiang, Zhao, Zhou
Direct speech-to-speech translation (S2ST) aims to convert speech from one language into another, and has demonstrated significant progress to date. Despite the recent success, current S2ST models still suffer from distinct degradation in noisy envir
External link:
http://arxiv.org/abs/2305.15403
The speech-to-singing (STS) voice conversion task aims to generate singing samples corresponding to speech recordings while facing a major challenge: the alignment between the target (singing) pitch contour and the source (speech) content is difficul
External link:
http://arxiv.org/abs/2305.04476
Speech Emotion Recognition (SER) aims to recognize human emotions in natural verbal interactions with machines, and is considered a challenging problem because human emotions are ambiguous. Despite the recent progress in SER, state-of-the
External link:
http://arxiv.org/abs/2305.06273