Showing 1 - 10 of 30 for search: '"Liu, Linquan"'
Author:
Hu, Shujie, Zhou, Long, Liu, Shujie, Chen, Sanyuan, Hao, Hongkun, Pan, Jing, Liu, Xunying, Li, Jinyu, Sivasankaran, Sunit, Liu, Linquan, Wei, Furu
The recent advancements in large language models (LLMs) have revolutionized the field of natural language processing, progressively broadening their scope to multimodal perception and generation. However, effectively integrating listening capabilities…
External link:
http://arxiv.org/abs/2404.00656
Author:
Wu, Jian, Gaur, Yashesh, Chen, Zhuo, Zhou, Long, Zhu, Yimeng, Wang, Tianrui, Li, Jinyu, Liu, Shujie, Ren, Bo, Liu, Linquan, Wu, Yu
Large language models (LLMs) have achieved remarkable success in the field of natural language processing, enabling better human-computer interaction using natural language. However, the seamless integration of speech signals into LLMs has not been…
External link:
http://arxiv.org/abs/2307.03917
Author:
Yu, Haibin, Hu, Yuxuan, Qian, Yao, Jin, Ma, Liu, Linquan, Liu, Shujie, Shi, Yu, Qian, Yanmin, Lin, Edward, Zeng, Michael
Code-switching speech refers to a means of expression by mixing two or more languages within a single utterance. Automatic Speech Recognition (ASR) with End-to-End (E2E) modeling for such speech can be a challenging task due to the lack of data. In this…
External link:
http://arxiv.org/abs/2303.10949
Neural text-to-speech (TTS) generally consists of cascaded architecture with separately optimized acoustic model and vocoder, or end-to-end architecture with continuous mel-spectrograms or self-extracted speech frames as the intermediate representation…
External link:
http://arxiv.org/abs/2303.02939
Author:
Sun, Eric, Li, Jinyu, Hu, Yuxuan, Zhu, Yimeng, Zhou, Long, Xue, Jian, Wang, Peidong, Liu, Linquan, Liu, Shujie, Lin, Edward, Gong, Yifan
We propose gated language experts and curriculum training to enhance multilingual transformer transducer models without requiring language identification (LID) input from users during inference. Our method incorporates a gating mechanism and LID loss…
External link:
http://arxiv.org/abs/2303.00786
Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition
Author:
Kumatani, Kenichi, Gmyr, Robert, Salinas, Felipe Cruz, Liu, Linquan, Zuo, Wei, Patel, Devang, Sun, Eric, Shi, Yu
The sparsely-gated Mixture of Experts (MoE) can magnify a network's capacity with little computational complexity. In this work, we investigate how multi-lingual Automatic Speech Recognition (ASR) networks can be scaled up with a simple routing algorithm…
External link:
http://arxiv.org/abs/2112.05820
Author:
Leng, Yichong, Tan, Xu, Wang, Rui, Zhu, Linchen, Xu, Jin, Liu, Wenjie, Liu, Linquan, Qin, Tao, Li, Xiang-Yang, Lin, Edward, Liu, Tie-Yan
Error correction is widely used in automatic speech recognition (ASR) to post-process the generated sentence, and can further reduce the word error rate (WER). Although multiple candidates are generated by an ASR system through beam search, current…
External link:
http://arxiv.org/abs/2109.14420
Author:
Leng, Yichong, Tan, Xu, Zhu, Linchen, Xu, Jin, Luo, Renqian, Liu, Linquan, Qin, Tao, Li, Xiang-Yang, Lin, Ed, Liu, Tie-Yan
Error correction techniques have been used to refine the output sentences from automatic speech recognition (ASR) models and achieve a lower word error rate (WER) than original ASR outputs. Previous works usually use a sequence-to-sequence model to correct…
External link:
http://arxiv.org/abs/2105.03842
Academic article
Published in:
Speech Communication, 2008, 50(7):605-615