Showing 1 - 10 of 1,304 for search: '"Qian, Yao"'
Author:
Chen, Sanyuan, Liu, Shujie, Zhou, Long, Liu, Yanqing, Tan, Xu, Li, Jinyu, Zhao, Sheng, Qian, Yao, Wei, Furu
This paper introduces VALL-E 2, the latest advancement in neural codec language models that marks a milestone in zero-shot text-to-speech synthesis (TTS), achieving human parity for the first time. Based on its predecessor, VALL-E, the new iteration
External link:
http://arxiv.org/abs/2406.05370
Author:
Le, Chenyang, Qian, Yao, Wang, Dongmei, Zhou, Long, Liu, Shujie, Wang, Xiaofei, Yousefi, Midia, Qian, Yanmin, Li, Jinyu, Zhao, Sheng, Zeng, Michael
There is a rising interest and trend in research towards directly translating speech from one language to another, known as end-to-end speech-to-speech translation. However, most end-to-end models struggle to outperform cascade models, i.e., a pipeli
External link:
http://arxiv.org/abs/2405.17809
Author:
Zhang, Leying, Qian, Yao, Zhou, Long, Liu, Shujie, Wang, Dongmei, Wang, Xiaofei, Yousefi, Midia, Qian, Yanmin, Li, Jinyu, He, Lei, Zhao, Sheng, Zeng, Michael
Recent advancements in zero-shot text-to-speech (TTS) modeling have led to significant strides in generating high-fidelity and diverse speech. However, dialogue generation, along with achieving human-like naturalness in speech, continues to be a chal
External link:
http://arxiv.org/abs/2404.06690
Author:
Zhang, Leying, Qian, Yao, Yu, Linfeng, Wang, Heming, Wang, Xinkai, Yang, Hemin, Zhou, Long, Liu, Shujie, Qian, Yanmin, Zeng, Michael
Target Speech Extraction (TSE) is a crucial task in speech processing that focuses on isolating the clean speech of a specific speaker from complex mixtures. While discriminative methods are commonly used for TSE, they can introduce distortion in ter
External link:
http://arxiv.org/abs/2309.13874
Author:
Ling, Shaoshi, Hu, Yuxuan, Qian, Shuangbei, Ye, Guoli, Qian, Yao, Gong, Yifan, Lin, Ed, Zeng, Michael
Most end-to-end (E2E) speech recognition models are composed of encoder and decoder blocks that perform acoustic and language modeling functions. Pretrained large language models (LLMs) have the potential to improve the performance of E2E ASR. Howeve
External link:
http://arxiv.org/abs/2307.08234
Author:
Li, Chenda, Qian, Yao, Chen, Zhuo, Kanda, Naoyuki, Wang, Dongmei, Yoshioka, Takuya, Qian, Yanmin, Zeng, Michael
State-of-the-art large-scale universal speech models (USMs) show a decent automatic speech recognition (ASR) performance across multiple domains and languages. However, it remains a challenge for these models to recognize overlapped speech, which is
External link:
http://arxiv.org/abs/2305.18747
Author:
Le, Chenyang, Qian, Yao, Zhou, Long, Liu, Shujie, Qian, Yanmin, Zeng, Michael, Huang, Xuedong
Joint speech-language training is challenging due to the large demand for training data and GPU consumption, as well as the modality gap between speech and language. We present ComSL, a speech-language model built atop a composite architecture of pub
External link:
http://arxiv.org/abs/2305.14838
Author:
Fang, Yuwei, Khademi, Mahmoud, Zhu, Chenguang, Yang, Ziyi, Pryzant, Reid, Xu, Yichong, Qian, Yao, Yoshioka, Takuya, Yuan, Lu, Zeng, Michael, Huang, Xuedong
Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities. Integrative AI is one important direction to approach AGI, through combin
External link:
http://arxiv.org/abs/2305.13738
Author:
Yang, Ziyi, Khademi, Mahmoud, Xu, Yichong, Pryzant, Reid, Fang, Yuwei, Zhu, Chenguang, Chen, Dongdong, Qian, Yao, Gao, Mei, Chen, Yi-Ling, Gmyr, Robert, Kanda, Naoyuki, Codella, Noel, Xiao, Bin, Shi, Yu, Yuan, Lu, Yoshioka, Takuya, Zeng, Michael, Huang, Xuedong
The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence, however the current Vision-Language-Speech landscape is dominated by encoder-only models which lack generative abilities. We propose closing thi
External link:
http://arxiv.org/abs/2305.12311
Author:
Xiaobo Huang, Lingli Qiu, Tzung‐Dau Wang, Qian Yao, Jianxiong Liu, Ronghua Xu, Qingkun Zheng, Xingping Zhang, Jinhui Wu
Published in:
The Journal of Clinical Hypertension, Vol 26, Iss 7, Pp 757-764 (2024)
The prevalence of isolated systolic hypertension (ISH) has doubled between 2002−2005 and 2014 among the oldest‐old population in China. However, the prevalence and characteristics of ISH among the oldest‐old population in southwestern
External link:
https://doaj.org/article/13842b7ce4bd44cf9706454df665f555