Search results

Report

KS-Net: Multi-band joint speech restoration and enhancement network for 2024 ICASSP SSI Challenge

Authors: Yu, Guochen; Han, Runqiang; Xu, Chenglin; Zhao, Haoran; Li, Nan; Zhang, Chen; Zheng, Xiguang; Zhou, Chao; Huang, Qi; Yu, Bing

This paper presents the speech restoration and enhancement system created by the 1024K team for the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. Our system consists of a generative adversarial network (GAN) in the complex domain for speech rest

External link: http://arxiv.org/abs/2402.01808

Report

BAE-Net: A Low complexity and high fidelity Bandwidth-Adaptive neural network for speech super-resolution

Authors: Yu, Guochen; Zheng, Xiguang; Li, Nan; Han, Runqiang; Zheng, Chengshi; Zhang, Chen; Zhou, Chao; Huang, Qi; Yu, Bing

Speech bandwidth extension (BWE) has demonstrated promising performance in enhancing the perceptual speech quality in real communication systems. Most existing BWE research focuses primarily on fixed upsampling ratios, disregarding the fact that the

External link: http://arxiv.org/abs/2312.13722

Report

A General Unfolding Speech Enhancement Method Motivated by Taylor's Theorem

Authors: Li, Andong; Yu, Guochen; Zheng, Chengshi; Liu, Wenzhe; Li, Xiaodong

While deep neural networks have facilitated significant advancements in the field of speech enhancement, most existing methods are developed following either empirical or relatively blind criteria, lacking adequate guidelines in pipeline design. Insp

External link: http://arxiv.org/abs/2211.16764

Report

TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective

Authors: Li, Andong; Yu, Guochen; Liu, Wenzhe; Li, Xiaodong; Zheng, Chengshi

Despite the promising performance of existing frame-wise all-neural beamformers in the speech enhancement field, it remains unclear what the underlying mechanism is. In this paper, we revisit the beamforming behavior from the beam-space dictionar

External link: http://arxiv.org/abs/2211.12024

Report

TMGAN-PLC: Audio Packet Loss Concealment using Temporal Memory Generative Adversarial Network

Authors: Guan, Yuansheng; Yu, Guochen; Li, Andong; Zheng, Chengshi; Wang, Jie

Real-time communications in packet-switched networks have become widely used in daily communication, while they inevitably suffer from network delays and data losses in constrained real-time conditions. To solve these problems, audio packet loss conc

External link: http://arxiv.org/abs/2207.01255

Report

Taylor, Can You Hear Me Now? A Taylor-Unfolding Framework for Monaural Speech Enhancement

Authors: Li, Andong; You, Shan; Yu, Guochen; Zheng, Chengshi; Li, Xiaodong

While the deep learning techniques promote the rapid development of the speech enhancement (SE) community, most schemes only pursue the performance in a black-box manner and lack adequate model interpretability. Inspired by Taylor's approximation the

External link: http://arxiv.org/abs/2205.00206

Report

Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement

Authors: Yu, Guochen; Li, Andong; Liu, Wenzhe; Zheng, Chengshi; Wang, Yutian; Wang, Hui

Due to the high computational complexity to model more frequency bands, it is still intractable to conduct real-time full-band speech enhancement based on deep neural networks. Recent studies typically utilize the compressed perceptually motivated fe

External link: http://arxiv.org/abs/2203.16033

Report

TaylorBeamformer: Learning All-Neural Beamformer for Multi-Channel Speech Enhancement from Taylor's Approximation Theory

Authors: Li, Andong; Yu, Guochen; Zheng, Chengshi; Li, Xiaodong

While existing end-to-end beamformers achieve impressive performance in various front-end speech processing tasks, they usually encapsulate the whole process into a black box and thus lack adequate interpretability. As an attempt to fill the blank, w

External link: http://arxiv.org/abs/2203.07195

Report

DMF-Net: A decoupling-style multi-band fusion model for full-band speech enhancement

Authors: Yu, Guochen; Guan, Yuansheng; Meng, Weixin; Zheng, Chengshi; Wang, Hui

Owing to the difficulty and high computational complexity of modeling more frequency bands, full-band speech enhancement based on deep neural networks is still challenging. Previous studies usually adopt compressed full-band speech features in Bark and E

External link: http://arxiv.org/abs/2203.00472

Report

DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement

Authors: Yu, Guochen; Li, Andong; Wang, Hui; Wang, Yutian; Ke, Yuxuan; Zheng, Chengshi

The decoupling-style concept begins to ignite in the speech enhancement area, which decouples the original complex spectrum estimation task into multiple easier sub-tasks (i.e., magnitude-only recovery and the residual complex spectrum estimation), r

External link: http://arxiv.org/abs/2202.07931
