LiSenNet: Lightweight Sub-band and Dual-Path Modeling for Real-Time Speech Enhancement

Autor: Yan, Haoyin, Zhang, Jie, Fan, Cunhang, Zhou, Yeping, Liu, Peiqi
Rok vydání: 2024
Předmět:
Druh dokumentu: Working Paper
Popis: Speech enhancement (SE) aims to extract the clean waveform from noise-contaminated measurements to improve the speech quality and intelligibility. Although learning-based methods can perform much better than traditional counterparts, the large computational complexity and model size heavily limit the deployment on latency-sensitive and low-resource edge devices. In this work, we propose a lightweight SE network (LiSenNet) for real-time applications. We design sub-band downsampling and upsampling blocks and a dual-path recurrent module to capture band-aware features and time-frequency patterns, respectively. A noise detector is developed to detect noisy regions in order to perform SE adaptively and save computational costs. Compared to recent higher-resource-dependent baseline models, the proposed LiSenNet can achieve a competitive performance with only 37k parameters (half of the state-of-the-art model) and 56M multiply-accumulate (MAC) operations per second.
Comment: 5 pages, submitted to 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025)
Databáze: arXiv