Speech Enhancement with Partial Signal Reconstruction Based on Deep Recurrent Neural Networks and Pitch-Specific Codebooks

Authors: Mohammed Krini, Tobias Stegmann
Year of publication: 2020
Subject:
Source: 2020 15th IEEE International Conference on Signal Processing (ICSP).
DOI: 10.1109/icsp48669.2020.9320930
Description: The speech quality achieved by conventional noise suppression methods in high noise conditions is often unsatisfactory. Recovering highly disturbed speech components with speech reconstruction methods can further improve the overall speech quality. The speech reconstruction method presented in this paper is based on the source-filter model of speech production. The contribution focuses on estimating, in high noise conditions, the signal coming from the glottis (the spectral excitation signal) and the vocal tract filter characteristics (the spectral envelope), since both have proven to be very important for speech reconstruction. For this purpose, a deep recurrent neural network (Deep-RNN) operating as a regression model on given noise features is used to estimate the spectral envelope, and a codebook approach is used to estimate the spectral excitation signal. The quality of the resulting enhanced speech is analyzed with objective measures as well as subjective tests; the results indicate a significant quality improvement compared to conventional schemes, especially in high noise conditions.
Database: OpenAIRE
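
Note: The source-filter idea named in the description can be illustrated with a minimal sketch, in which a magnitude spectrum is formed as the product of a spectral envelope and a spectral excitation, and only strongly disturbed frequency bins are replaced. This is not the authors' implementation. The function names, the toy harmonic excitation (standing in for the pitch-specific codebook), the externally supplied envelope (standing in for the Deep-RNN output), and the SNR threshold rule are all assumptions made for illustration.

```python
import numpy as np


def harmonic_excitation(f0_hz, sample_rate, n_fft):
    """Toy spectral excitation: unit peaks at multiples of the pitch f0.

    Hypothetical stand-in for a pitch-specific codebook entry.
    """
    n_bins = n_fft // 2 + 1
    excitation = np.zeros(n_bins)
    bin_hz = sample_rate / n_fft
    # Harmonic frequencies strictly below the Nyquist frequency.
    harmonics = np.arange(f0_hz, sample_rate / 2, f0_hz)
    idx = np.round(harmonics / bin_hz).astype(int)
    excitation[idx] = 1.0
    return excitation


def partial_reconstruction(noisy_mag, envelope, excitation, snr_db,
                           threshold_db=0.0):
    """Replace only low-SNR bins with the source-filter product.

    The threshold rule is an assumed selection criterion, not one
    taken from the paper.
    """
    reconstructed = envelope * excitation  # source-filter model
    use_reconstruction = snr_db < threshold_db
    return np.where(use_reconstruction, reconstructed, noisy_mag)
```

With a 200 Hz pitch, a 16 kHz sample rate, and a 512-point FFT, `harmonic_excitation` marks one bin per harmonic below 8 kHz, and `partial_reconstruction` leaves every bin at or above the threshold untouched.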