Search results - "Huang, Chuanzeng"

Report

RaD-Net 2: A causal two-stage repairing and denoising speech enhancement network with knowledge distillation and complex axial self-attention

Authors: Liu, Mingshuai; Chen, Zhuangqi; Yan, Xiaopeng; Lv, Yuanjun; Xia, Xianjun; Huang, Chuanzeng; Xiao, Yijian; Xie, Lei

In real-time speech communication systems, speech signals are often degraded by multiple distortions. Recently, a two-stage Repair-and-Denoising network (RaD-Net) was proposed with superior speech quality improvement in the ICASSP 2024 Speech Signal

External link: http://arxiv.org/abs/2406.07498

Report

BS-PLCNet 2: Two-stage Band-split Packet Loss Concealment Network with Intra-model Knowledge Distillation

Authors: Zhang, Zihan; Xia, Xianjun; Huang, Chuanzeng; Xiao, Yijian; Xie, Lei

Audio packet loss is an inevitable problem in real-time speech communication. A band-split packet loss concealment network (BS-PLCNet) targeting full-band signals was recently proposed. Although it performs superiorly in the ICASSP 2024 PLC Challenge

External link: http://arxiv.org/abs/2406.05961

Report

RaD-Net: A Repairing and Denoising Network for Speech Signal Improvement

Authors: Liu, Mingshuai; Chen, Zhuangqi; Yan, Xiaopeng; Lv, Yuanjun; Xia, Xianjun; Huang, Chuanzeng; Xiao, Yijian; Xie, Lei

This paper introduces our repairing and denoising network (RaD-Net) for the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. We extend our previous framework based on a two-stage network and propose an upgraded model. Specifically, we replace t

External link: http://arxiv.org/abs/2401.04389

Report

BS-PLCNet: Band-split Packet Loss Concealment Network with Multi-task Learning Framework and Multi-discriminators

Authors: Zhang, Zihan; Sun, Jiayao; Xia, Xianjun; Huang, Chuanzeng; Xiao, Yijian; Xie, Lei

Packet loss is a common and unavoidable problem in voice over internet phone (VoIP) systems. To deal with the problem, we propose a band-split packet loss concealment network (BS-PLCNet). Specifically, we split the full-band signal into wide-band (0-

External link: http://arxiv.org/abs/2401.03687

Report

Language-universal phonetic encoder for low-resource speech recognition

Authors: Feng, Siyuan; Tu, Ming; Xia, Rui; Huang, Chuanzeng; Wang, Yuxuan

Multilingual training is effective in improving low-resource ASR, which may partially be explained by phonetic representation sharing between languages. In end-to-end (E2E) ASR systems, graphemes are often used as basic modeling units, however graphe

External link: http://arxiv.org/abs/2305.11576

Report

Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition

Authors: Feng, Siyuan; Tu, Ming; Xia, Rui; Huang, Chuanzeng; Wang, Yuxuan

We improve low-resource ASR by integrating the ideas of multilingual training and self-supervised learning. Concretely, we leverage an International Phonetic Alphabet (IPA) multilingual model to create frame-level pseudo labels for unlabeled speech,

External link: http://arxiv.org/abs/2305.11569

Report

Memory Augmented Lookup Dictionary based Language Modeling for Automatic Speech Recognition

Authors: Feng, Yukun; Tu, Ming; Xia, Rui; Huang, Chuanzeng; Wang, Yuxuan

Recent studies have shown that using an external Language Model (LM) benefits the end-to-end Automatic Speech Recognition (ASR). However, predicting tokens that appear less frequently in the training set is still quite challenging. The long-tail pred

External link: http://arxiv.org/abs/2301.00066

Report

VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration

Authors: Liu, Haohe; Liu, Xubo; Kong, Qiuqiang; Tian, Qiao; Zhao, Yan; Wang, DeLiang; Huang, Chuanzeng; Wang, Yuxuan

Published in: Proc. Interspeech 2022

Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus on a single type of distortion, such as speech denoising or dereverberation. However, speech signals can be degraded by several different distortions simultan

External link: http://arxiv.org/abs/2204.05841

Report

VoiceFixer: Toward General Speech Restoration with Neural Vocoder

Authors: Liu, Haohe; Kong, Qiuqiang; Tian, Qiao; Zhao, Yan; Wang, DeLiang; Huang, Chuanzeng; Wang, Yuxuan

Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus on single-task speech restoration (SSR), such as speech denoising or speech declipping. However, SSR systems only focus on one task and do not address the gen

External link: http://arxiv.org/abs/2109.13731

Report

Joint Echo Cancellation and Noise Suppression based on Cascaded Magnitude and Complex Mask Estimation

Authors: Shu, Xiaofeng; Zhu, Yehang; Chen, Yanjie; Chen, Li; Liu, Haohe; Huang, Chuanzeng; Wang, Yuxuan

Acoustic echo and background noise can seriously degrade the intelligibility of speech. In practice, echo and noise suppression are usually treated as two separated tasks and can be removed with various digital signal processing (DSP) and deep learni

External link: http://arxiv.org/abs/2107.09298
