2-D psychoacoustic modeling for automatic speech recognition in noisy environment

Authors: Ketan J. Raut, Sampreeta Desai, Prasad D. Khandekar
Year of publication: 2016
Subject:
Source: 2016 Conference on Advances in Signal Processing (CASP).
DOI: 10.1109/casp.2016.7746151
Description: A powerful automatic speech recognition (ASR) system is of commercial importance, as many leading companies are racing toward industry- and consumer-level production. Environmental noise is one of the major causes of degraded speech quality: loud background sound obscures the speech, which adversely affects ASR performance. The human auditory system handles noise comparatively better than machines do. To improve ASR performance, the auditory properties of the human system are therefore studied and modeled with a psychoacoustic filter. The filter is called a 2-D P-filter because its parameters take only zero or positive values. To remove noise, a masking effect is also implemented, in which sounds falling below a predetermined masking threshold are modified. An enhanced set of features is then extracted by applying this filter to the Mel filter bank. The novelty of the paper is the use of different distance metrics for classification and for testing the performance of the ASR system. Experiments are carried out on a database of studio recordings of rhyming words spoken by children with articulatory disabilities. Results for noisy speech signals in the testing phase are expected to improve considerably.
Database: OpenAIRE
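
Illustration: the description above names three processing steps (masking below a threshold, a 2-D filter with zero or positive parameters applied to Mel filter bank outputs, and classification with different distance metrics). The record gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the features form a time-by-frequency matrix of Mel energies, that "modified" means attenuated by a fixed factor, that the filter is a small non-negative kernel with replicated edges, and that classification is nearest-template. All function names, the kernel, and the choice of metrics are hypothetical.

```python
import math

def apply_masking(spectrogram, threshold, attenuation=0.0):
    """Scale every value below `threshold` by `attenuation` (assumed rule)."""
    return [[v if v >= threshold else v * attenuation for v in row]
            for row in spectrogram]

def filter_2d(spectrogram, kernel):
    """Weighted sum over a time-frequency neighbourhood.

    `kernel` is an odd-sized 2-D list whose weights must all be >= 0,
    mirroring the "zero or positive" property named in the description.
    Indices outside the matrix are clamped to the nearest edge.
    """
    if any(w < 0 for krow in kernel for w in krow):
        raise ValueError("kernel weights must be zero or positive")
    rows, cols = len(spectrogram), len(spectrogram[0])
    kr, kc = len(kernel) // 2, len(kernel[0]) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for a, krow in enumerate(kernel):
                for b, w in enumerate(krow):
                    ii = min(max(i + a - kr, 0), rows - 1)
                    jj = min(max(j + b - kc, 0), cols - 1)
                    acc += w * spectrogram[ii][jj]
            out[i][j] = acc
    return out

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine_distance(x, y):
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0.0:
        return 1.0  # convention chosen here for all-zero vectors
    return 1.0 - sum(a * b for a, b in zip(x, y)) / norm

# Hypothetical selection of metrics; the record does not say which were used.
DISTANCES = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "cosine": cosine_distance,
}

def classify(features, templates, metric="euclidean"):
    """Return the label of the template closest to `features`."""
    dist = DISTANCES[metric]
    return min(templates, key=lambda label: dist(features, templates[label]))
```

In use, a feature matrix would pass through apply_masking and filter_2d, be flattened into a vector, and then be compared against one stored template per word with classify, repeating the comparison for each metric of interest.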