Speech Enhancement with Partial Signal Reconstruction Based on Deep Recurrent Neural Networks and Pitch-Specific Codebooks

Authors:	Mohammed Krini, Tobias Stegmann
Publication year:	2020
Subject:	Speech production; Glottis; Excitation signal; Noise measurement; Signal reconstruction; Computer science; Speech recognition; Noise reduction; Speech coding; Codebook; Speech enhancement; Noise; medicine.anatomical_structure; Signal-to-noise ratio; Computer Science::Sound; Spectral envelope; medicine; Vocal tract
Source:	2020 15th IEEE International Conference on Signal Processing (ICSP).
DOI:	10.1109/icsp48669.2020.9320930
Description:	The speech quality achieved by conventional noise suppression methods in high-noise conditions is often unsatisfactory. Recovering highly disturbed speech components with speech reconstruction methods can further improve the overall speech quality. The speech reconstruction method presented in this paper is based on the source-filter model of speech production. This contribution focuses on estimating the signal coming from the glottis (the spectral excitation signal) and the vocal tract filter characteristics (the spectral envelope) in high-noise conditions, since these estimates have proven to be very important for speech reconstruction. For this purpose, a deep recurrent neural network (Deep-RNN), operating as a regression model on given noise features, is used to estimate the spectral envelope, and a codebook approach is used to estimate the spectral excitation signal. The quality of the resulting enhanced speech is analyzed with objective measures as well as subjective tests; the results indicate a significant quality improvement over conventional schemes, especially in high-noise conditions.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::935f8739dccbf36107a19c32f4e370ad https://doi.org/10.1109/icsp48669.2020.9320930
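
The description refers to the source-filter model, in which a short-term speech spectrum is treated as the product of a spectral excitation signal and a spectral envelope, and to reconstructing only the highly disturbed components. The following is a minimal sketch of that idea, not the authors' implementation. The function name, the per-bin SNR input, and the 0 dB threshold are illustrative assumptions; the record does not say how disturbed components are selected, and the envelope and excitation values here are toy numbers standing in for the Deep-RNN and codebook estimates.

```python
def reconstruct_low_snr_bins(enhanced_mag, envelope, excitation, snr_db,
                             snr_threshold_db=0.0):
    """Partial reconstruction sketch (hypothetical helper).

    For each frequency bin, keep the conventionally enhanced magnitude when
    the estimated SNR is at or above the threshold; otherwise substitute the
    source-filter product envelope * excitation.
    """
    if not (len(enhanced_mag) == len(envelope) == len(excitation) == len(snr_db)):
        raise ValueError("all inputs must have one value per frequency bin")

    output = []
    for mag, env, exc, snr in zip(enhanced_mag, envelope, excitation, snr_db):
        if snr < snr_threshold_db:
            # Highly disturbed bin: use the reconstructed magnitude instead.
            output.append(env * exc)
        else:
            # Reliable bin: keep the noise-suppressed magnitude unchanged.
            output.append(mag)
    return output


# Toy values for four frequency bins (assumed, for illustration only).
enhanced = [1.0, 0.2, 0.8, 0.1]      # output of a conventional noise suppressor
envelope = [1.0, 0.9, 0.7, 0.5]      # stand-in for the estimated spectral envelope
excitation = [1.0, 0.5, 1.0, 0.5]    # stand-in for the codebook excitation
snr_db = [10.0, -5.0, 6.0, -8.0]     # assumed per-bin SNR estimates

result = reconstruct_low_snr_bins(enhanced, envelope, excitation, snr_db)
print(result)
```

Only the second and fourth bins fall below the assumed threshold, so only those two are replaced by the envelope-excitation product; the other bins pass through untouched.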