Ideal neighbourhood mask for speech enhancement using deep neural networks

Authors: Christian Arcos, Abraham Alcaim, Marley M. B. R. Vellasco
Year of publication: 2019
Subject:
Source: IJCNN
DOI: 10.1109/ijcnn.2019.8852363
Description: Degradation of the speech signal under adverse conditions is a major challenge for automatic speech recognition (ASR) systems. This paper introduces a novel approach to estimating an Ideal Neighbourhood Mask (INM) for speech segregation using a deep neural network estimator. The method is based on the local binary patterns (LBP) technique often used in digital image processing. The INM indicates which time-frequency (T-F) units of the noisy speech are canceled. The proposed approach is assessed alongside the traditional mask techniques, i.e., the Ideal Binary Mask (IBM) and the Ideal Ratio Mask (IRM), under various environments using objective speech quality measures. Recognition experiments, including results in the AURORA IV framework, indicate that the proposed scheme, when applied in adverse environments, yields significantly better performance than the conventional techniques.
Database: OpenAIRE
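
Note: The description names the Ideal Binary Mask (IBM) and Ideal Ratio Mask (IRM) as baselines without defining them. The sketch below shows their commonly used textbook definitions over time-frequency power values. It is not the paper's INM, which the record does not specify. The function names, the 0 dB local criterion, and the exponent 0.5 are illustrative assumptions rather than settings taken from the paper.

```python
import math

EPS = 1e-12  # guards against division by zero and log of zero


def ideal_binary_mask(speech_pow, noise_pow, lc_db=0.0):
    """Return 1 for T-F units whose local SNR (dB) exceeds lc_db, else 0.

    speech_pow and noise_pow are equally shaped lists of rows holding the
    clean-speech and noise power of each time-frequency unit.
    lc_db is the local criterion (assumed 0 dB here).
    """
    mask = []
    for s_row, n_row in zip(speech_pow, noise_pow):
        row = []
        for s, n in zip(s_row, n_row):
            snr_db = 10.0 * math.log10((s + EPS) / (n + EPS))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def ideal_ratio_mask(speech_pow, noise_pow, beta=0.5):
    """Return a soft mask in [0, 1]: (S / (S + N)) ** beta per T-F unit.

    beta = 0.5 is a common choice and is an assumption here.
    """
    return [
        [(s / (s + n + EPS)) ** beta for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_pow, noise_pow)
    ]


# Toy 2x2 example with made-up power values (not data from the paper).
speech_pow = [[4.0, 1.0], [0.0, 9.0]]
noise_pow = [[1.0, 4.0], [1.0, 1.0]]

ibm = ideal_binary_mask(speech_pow, noise_pow)
irm = ideal_ratio_mask(speech_pow, noise_pow)
print(ibm)
print(irm)
```

The binary mask makes a hard keep-or-discard decision per unit, while the ratio mask attenuates each unit in proportion to its estimated speech share. In practice a network would estimate such masks from noisy features, because the clean speech and noise powers are only available when building training targets.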