Showing 1 - 10 of 124
for search: '"Fang, Minghui"'
Author:
Fang, Xinyue, Huang, Zhen, Tian, Zhiliang, Fang, Minghui, Pan, Ziyi, Fang, Quntian, Wen, Zhihua, Pan, Hengyue, Li, Dongsheng
LLMs obtain remarkable performance but suffer from hallucinations. Most research on detecting hallucination focuses on the questions with short and concrete correct answers that are easy to check the faithfulness. Hallucination detections for text ge
External link:
http://arxiv.org/abs/2409.11283
Author:
Ji, Shengpeng, Jiang, Ziyue, Cheng, Xize, Chen, Yifu, Fang, Minghui, Zuo, Jialong, Yang, Qian, Li, Ruiqi, Zhang, Ziang, Yang, Xiaoda, Huang, Rongjie, Jiang, Yidi, Chen, Qian, Zheng, Siqi, Wang, Wen, Zhao, Zhou
Language models have been effectively applied to modeling natural signals, such as images, video, speech, and audio. A crucial component of these models is the codec tokenizer, which compresses high-dimensional natural signals into lower-dimensional
External link:
http://arxiv.org/abs/2408.16532
Author:
Fang, Minghui, Ji, Shengpeng, Zuo, Jialong, Huang, Hai, Xia, Yan, Zhu, Jieming, Cheng, Xize, Yang, Xiaoda, Liu, Wenrui, Wang, Gang, Dong, Zhenhua, Zhao, Zhou
Generative retrieval, which has demonstrated effectiveness in text-to-text retrieval, utilizes a sequence-to-sequence model to directly generate candidate identifiers based on natural language queries. Without explicitly computing the similarity betw
External link:
http://arxiv.org/abs/2406.17507
Author:
Ji, Shengpeng, Zuo, Jialong, Fang, Minghui, Zheng, Siqi, Chen, Qian, Wang, Wen, Jiang, Ziyue, Huang, Hai, Cheng, Xize, Huang, Rongjie, Zhao, Zhou
In this paper, we present ControlSpeech, a text-to-speech (TTS) system capable of fully cloning the speaker's voice and enabling arbitrary control and adjustment of speaking style, merely based on a few seconds of audio prompt and a simple textual st
External link:
http://arxiv.org/abs/2406.01205
Author:
Ji, Shengpeng, Fang, Minghui, Jiang, Ziyue, Zheng, Siqi, Chen, Qian, Huang, Rongjie, Zuo, Jialung, Wang, Shulei, Zhao, Zhou
In recent years, large language models have achieved significant success in generative tasks (e.g., speech cloning and audio generation) related to speech, audio, music, and other signal domains. A crucial element of these models is the discrete acou
External link:
http://arxiv.org/abs/2402.12208
Author:
Ji, Shengpeng, Zuo, Jialong, Fang, Minghui, Jiang, Ziyue, Chen, Feiyang, Duan, Xinyu, Huai, Baoxing, Zhao, Zhou
Published in:
2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Recently, there has been a growing interest in the field of controllable Text-to-Speech (TTS). While previous studies have relied on users providing specific style factor values based on acoustic knowledge or selecting reference speeches that meet ce
External link:
http://arxiv.org/abs/2308.14430
Author:
Wu, Wei, Zhang, Ruiyan, Zheng, Xinyue, Fang, Minghui, Ma, Tianyuan, Hu, Qichang, Kong, Xiangzeng, Zhao, Chen
Published in:
In Applied Acoustics 5 September 2024 224
Published in:
In Journal of Energy Storage 20 November 2024 102 Part B
Published in:
In Journal of Energy Storage 15 December 2023 73 Part C
Published in:
In Journal of Molecular Liquids 1 December 2023 391 Part A