LiSenNet: Lightweight Sub-band and Dual-Path Modeling for Real-Time Speech Enhancement
Autor: | Yan, Haoyin, Zhang, Jie, Fan, Cunhang, Zhou, Yeping, Liu, Peiqi |
---|---|
Rok vydání: | 2024 |
Předmět: | |
Druh dokumentu: | Working Paper |
Popis: | Speech enhancement (SE) aims to extract the clean waveform from noise-contaminated measurements to improve the speech quality and intelligibility. Although learning-based methods can perform much better than traditional counterparts, the large computational complexity and model size heavily limit the deployment on latency-sensitive and low-resource edge devices. In this work, we propose a lightweight SE network (LiSenNet) for real-time applications. We design sub-band downsampling and upsampling blocks and a dual-path recurrent module to capture band-aware features and time-frequency patterns, respectively. A noise detector is developed to detect noisy regions in order to perform SE adaptively and save computational costs. Compared to recent higher-resource-dependent baseline models, the proposed LiSenNet can achieve a competitive performance with only 37k parameters (half of the state-of-the-art model) and 56M multiply-accumulate (MAC) operations per second. Comment: 5 pages, submitted to 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025) |
Databáze: | arXiv |
Externí odkaz: |