Showing 1 - 10 of 179 for search: '"Du Zhihao"'
Published in:
BMC Public Health, Vol 24, Iss 1, Pp 1-11 (2024)
Abstract Objectives The significance of self-esteem in the relationship between physical activity and Internet addiction among college students cannot be overstated, as it lays a solid foundation for the prevention and control of Internet addiction. Method …
External link:
https://doaj.org/article/40bfaf6341f74f12b6c4c6ce52b2e13e
Author:
Du, Zhihao, Chen, Qian, Zhang, Shiliang, Hu, Kai, Lu, Heng, Yang, Yexin, Hu, Hangrui, Zheng, Siqi, Gu, Yue, Ma, Ziyang, Gao, Zhifu, Yan, Zhijie
Recent years have witnessed large language model (LLM) based text-to-speech (TTS) emerging into the mainstream due to its high naturalness and zero-shot capacity. In this paradigm, speech signals are discretized into token sequences, wh…
External link:
http://arxiv.org/abs/2407.05407
Author:
An, Keyu, Chen, Qian, Deng, Chong, Du, Zhihao, Gao, Changfeng, Gao, Zhifu, Gu, Yue, He, Ting, Hu, Hangrui, Hu, Kai, Ji, Shengpeng, Li, Yabin, Li, Zerui, Lu, Heng, Luo, Haoneng, Lv, Xiang, Ma, Bin, Ma, Ziyang, Ni, Chongjia, Song, Changhe, Shi, Jiaqi, Shi, Xian, Wang, Hao, Wang, Wen, Wang, Yuxuan, Xiao, Zhangyu, Yan, Zhijie, Yang, Yexin, Zhang, Bin, Zhang, Qinglin, Zhang, Shiliang, Zhao, Nan, Zheng, Siqi
This report introduces FunAudioLLM, a model family designed to enhance natural voice interactions between humans and large language models (LLMs). At its core are two innovative models: SenseVoice, which handles multilingual speech recognition, emoti…
External link:
http://arxiv.org/abs/2407.04051
Published in:
Gong-kuang zidonghua, Vol 45, Iss 5, Pp 26-30 (2019)
To address the poor fluidity, unsatisfactory fire-extinguishing effect, and high cost of existing mine-use fire-prevention and extinguishing materials, an inorganic clay composite superabsorbent resin was synthesized by aqueous solution polymerization …
External link:
https://doaj.org/article/92282a6249d8462397d3b537cdf42227
Author:
Ma, Ziyang, Yang, Guanrou, Yang, Yifan, Gao, Zhifu, Wang, Jiaming, Du, Zhihao, Yu, Fan, Chen, Qian, Zheng, Siqi, Zhang, Shiliang, Chen, Xie
In this paper, we focus on solving one of the most important tasks in the field of speech processing, i.e., automatic speech recognition (ASR), with speech foundation encoders and large language models (LLMs). Recent works have complex designs such as …
External link:
http://arxiv.org/abs/2402.08846
Author:
Li, Yangze, Yu, Fan, Liang, Yuhao, Guo, Pengcheng, Shi, Mohan, Du, Zhihao, Zhang, Shiliang, Xie, Lei
Joint modeling of multi-speaker ASR and speaker diarization has recently shown promising results in speaker-attributed automatic speech recognition (SA-ASR). Although able to obtain state-of-the-art (SOTA) performance, most of the studies are ba…
External link:
http://arxiv.org/abs/2310.04863
Author:
Du, Zhihao, Wang, Jiaming, Chen, Qian, Chu, Yunfei, Gao, Zhifu, Li, Zerui, Hu, Kai, Zhou, Xiaohuan, Xu, Jin, Ma, Ziyang, Wang, Wen, Zheng, Siqi, Zhou, Chang, Yan, Zhijie, Zhang, Shiliang
Generative Pre-trained Transformer (GPT) models have achieved remarkable performance on various natural language processing tasks, and have shown great potential as backbones for audio-and-text large language models (LLMs). Previous mainstream audio-…
External link:
http://arxiv.org/abs/2310.04673
Author:
Liang, Yuhao, Shi, Mohan, Yu, Fan, Li, Yangze, Zhang, Shiliang, Du, Zhihao, Chen, Qian, Xie, Lei, Qian, Yanmin, Wu, Jian, Chen, Zhuo, Lee, Kong Aik, Yan, Zhijie, Bu, Hui
With the success of the first Multi-channel Multi-party Meeting Transcription challenge (M2MeT), the second M2MeT challenge (M2MeT 2.0), held at ASRU 2023, particularly aims to tackle the complex task of speaker-attributed ASR (SA-ASR), which dir…
External link:
http://arxiv.org/abs/2309.13573
This paper presents FunCodec, a fundamental neural speech codec toolkit, which is an extension of the open-source speech processing toolkit FunASR. FunCodec provides reproducible training recipes and inference scripts for the latest neural speech cod…
External link:
http://arxiv.org/abs/2309.07405
Author:
Shi, Mohan, Du, Zhihao, Chen, Qian, Yu, Fan, Li, Yangze, Zhang, Shiliang, Zhang, Jie, Dai, Li-Rong
Recently, speaker-attributed automatic speech recognition (SA-ASR) has attracted wide attention; it aims at answering the question "who spoke what". Different from modular systems, end-to-end (E2E) SA-ASR minimizes the speaker-dependent recogn…
External link:
http://arxiv.org/abs/2305.12459