Showing 1 - 10 of 16 for search: '"Huang, Chuanzeng"'
Author:
Liu, Mingshuai, Chen, Zhuangqi, Yan, Xiaopeng, Lv, Yuanjun, Xia, Xianjun, Huang, Chuanzeng, Xiao, Yijian, Xie, Lei
In real-time speech communication systems, speech signals are often degraded by multiple distortions. Recently, a two-stage Repair-and-Denoising network (RaD-Net) was proposed with superior speech quality improvement in the ICASSP 2024 Speech Signal
External link:
http://arxiv.org/abs/2406.07498
Audio packet loss is an inevitable problem in real-time speech communication. A band-split packet loss concealment network (BS-PLCNet) targeting full-band signals was recently proposed. Although it performs superiorly in the ICASSP 2024 PLC Challenge
External link:
http://arxiv.org/abs/2406.05961
Author:
Liu, Mingshuai, Chen, Zhuangqi, Yan, Xiaopeng, Lv, Yuanjun, Xia, Xianjun, Huang, Chuanzeng, Xiao, Yijian, Xie, Lei
This paper introduces our repairing and denoising network (RaD-Net) for the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. We extend our previous framework based on a two-stage network and propose an upgraded model. Specifically, we replace t
External link:
http://arxiv.org/abs/2401.04389
Packet loss is a common and unavoidable problem in voice over internet phone (VoIP) systems. To deal with the problem, we propose a band-split packet loss concealment network (BS-PLCNet). Specifically, we split the full-band signal into wide-band (0-
External link:
http://arxiv.org/abs/2401.03687
Multilingual training is effective in improving low-resource ASR, which may partially be explained by phonetic representation sharing between languages. In end-to-end (E2E) ASR systems, graphemes are often used as basic modeling units, however graphe
External link:
http://arxiv.org/abs/2305.11576
We improve low-resource ASR by integrating the ideas of multilingual training and self-supervised learning. Concretely, we leverage an International Phonetic Alphabet (IPA) multilingual model to create frame-level pseudo labels for unlabeled speech,
External link:
http://arxiv.org/abs/2305.11569
Recent studies have shown that using an external Language Model (LM) benefits the end-to-end Automatic Speech Recognition (ASR). However, predicting tokens that appear less frequently in the training set is still quite challenging. The long-tail pred
External link:
http://arxiv.org/abs/2301.00066
Author:
Liu, Haohe, Liu, Xubo, Kong, Qiuqiang, Tian, Qiao, Zhao, Yan, Wang, DeLiang, Huang, Chuanzeng, Wang, Yuxuan
Published in:
Proc. Interspeech 2022
Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus on a single type of distortion, such as speech denoising or dereverberation. However, speech signals can be degraded by several different distortions simultan
External link:
http://arxiv.org/abs/2204.05841
Author:
Liu, Haohe, Kong, Qiuqiang, Tian, Qiao, Zhao, Yan, Wang, DeLiang, Huang, Chuanzeng, Wang, Yuxuan
Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus on single-task speech restoration (SSR), such as speech denoising or speech declipping. However, SSR systems only focus on one task and do not address the gen
External link:
http://arxiv.org/abs/2109.13731
Author:
Shu, Xiaofeng, Zhu, Yehang, Chen, Yanjie, Chen, Li, Liu, Haohe, Huang, Chuanzeng, Wang, Yuxuan
Acoustic echo and background noise can seriously degrade the intelligibility of speech. In practice, echo and noise suppression are usually treated as two separated tasks and can be removed with various digital signal processing (DSP) and deep learni
External link:
http://arxiv.org/abs/2107.09298