Výsledky vyhledávání

Akademický článek

The influence of physical activity on internet addiction among Chinese college students: the mediating role of self-esteem and the moderating role of gender

Autor: Du Zhihao, Wang Tao, Sun Yingjie, Zhai Feng

Publikováno v: BMC Public Health, Vol 24, Iss 1, Pp 1-11 (2024)

Abstract Objectives The significance of self-esteem in the relationship between physical activity and Internet addiction among college students cannot be over, as it lays a solid foundation for the prevention and control of Internet addiction. Method

Externí odkaz: https://doaj.org/article/40bfaf6341f74f12b6c4c6ce52b2e13e

Zobrazit plný text záznamu

Plný text ve formátu HTML

Report

CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models

Autor: Du, Zhihao, Wang, Yuxuan, Chen, Qian, Shi, Xian, Lv, Xiang, Zhao, Tianyu, Gao, Zhifu, Yang, Yexin, Gao, Changfeng, Wang, Hui, Yu, Fan, Liu, Huadai, Sheng, Zhengyan, Gu, Yue, Deng, Chong, Wang, Wen, Zhang, Shiliang, Yan, Zhijie, Zhou, Jingren

In our previous work, we introduced CosyVoice, a multilingual speech synthesis model based on supervised discrete speech tokens. By employing progressive semantic decoding with two popular generative models, language models (LMs) and Flow Matching, C

Externí odkaz: http://arxiv.org/abs/2412.10117

Zobrazit plný text záznamu

Report

Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap

Autor: Yang, Guanrou, Yu, Fan, Ma, Ziyang, Du, Zhihao, Gao, Zhifu, Zhang, Shiliang, Chen, Xie

While automatic speech recognition (ASR) systems have achieved remarkable performance with large-scale datasets, their efficacy remains inadequate in low-resource settings, encompassing dialects, accents, minority languages, and long-tail hotwords, d

Externí odkaz: http://arxiv.org/abs/2410.16726

Zobrazit plný text záznamu

Report

IntrinsicVoice: Empowering LLMs with Intrinsic Real-time Voice Interaction Abilities

Autor: Zhang, Xin, Lyu, Xiang, Du, Zhihao, Chen, Qian, Zhang, Dong, Hu, Hangrui, Tan, Chaohong, Zhao, Tianyu, Wang, Yuxuan, Zhang, Bin, Lu, Heng, Zhou, Yaqian, Qiu, Xipeng

Current methods of building LLMs with voice interaction capabilities rely heavily on explicit text autoregressive generation before or during speech response generation to maintain content quality, which unfortunately brings computational overhead an

Externí odkaz: http://arxiv.org/abs/2410.08035

Zobrazit plný text záznamu

Report

CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens

Autor: Du, Zhihao, Chen, Qian, Zhang, Shiliang, Hu, Kai, Lu, Heng, Yang, Yexin, Hu, Hangrui, Zheng, Siqi, Gu, Yue, Ma, Ziyang, Gao, Zhifu, Yan, Zhijie

Recent years have witnessed a trend that large language model (LLM) based text-to-speech (TTS) emerges into the mainstream due to their high naturalness and zero-shot capacity. In this paradigm, speech signals are discretized into token sequences, wh

Externí odkaz: http://arxiv.org/abs/2407.05407

Zobrazit plný text záznamu

Report

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

This report introduces FunAudioLLM, a model family designed to enhance natural voice interactions between humans and large language models (LLMs). At its core are two innovative models: SenseVoice, which handles multilingual speech recognition, emoti

Externí odkaz: http://arxiv.org/abs/2407.04051

Zobrazit plný text záznamu

Akademický článek

Research on inhibition of coal spontaneous combustion by composite superabsorbent resi

Autor: ZHANG Kang, DU Zhihao, WANG Deming, DOU Guolan

Publikováno v: Gong-kuang zidonghua, Vol 45, Iss 5, Pp 26-30 (2019)

In view of problems of poor fluidity, unsatisfactory fire-extinguishing effect and high cost of existing mine-used anti-fire extinguishing resistant material,inorganic clay composite superabsorbent resin was synthesized by aqueous solution polymeriza

Externí odkaz: https://doaj.org/article/92282a6249d8462397d3b537cdf42227

Zobrazit plný text záznamu

Report

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity

Autor: Ma, Ziyang, Yang, Guanrou, Yang, Yifan, Gao, Zhifu, Wang, Jiaming, Du, Zhihao, Yu, Fan, Chen, Qian, Zheng, Siqi, Zhang, Shiliang, Chen, Xie

In this paper, we focus on solving one of the most important tasks in the field of speech processing, i.e., automatic speech recognition (ASR), with speech foundation encoders and large language models (LLM). Recent works have complex designs such as

Externí odkaz: http://arxiv.org/abs/2402.08846

Zobrazit plný text záznamu

Report

SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR

Autor: Li, Yangze, Yu, Fan, Liang, Yuhao, Guo, Pengcheng, Shi, Mohan, Du, Zhihao, Zhang, Shiliang, Xie, Lei

Joint modeling of multi-speaker ASR and speaker diarization has recently shown promising results in speaker-attributed automatic speech recognition (SA-ASR).Although being able to obtain state-of-the-art (SOTA) performance, most of the studies are ba

Externí odkaz: http://arxiv.org/abs/2310.04863

Zobrazit plný text záznamu

Report

LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

Autor: Du, Zhihao, Wang, Jiaming, Chen, Qian, Chu, Yunfei, Gao, Zhifu, Li, Zerui, Hu, Kai, Zhou, Xiaohuan, Xu, Jin, Ma, Ziyang, Wang, Wen, Zheng, Siqi, Zhou, Chang, Yan, Zhijie, Zhang, Shiliang

Generative Pre-trained Transformer (GPT) models have achieved remarkable performance on various natural language processing tasks, and have shown great potential as backbones for audio-and-text large language models (LLMs). Previous mainstream audio-

Externí odkaz: http://arxiv.org/abs/2310.04673

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání