Author: |
G, Thimmaraja Yadava; G, Nagaraja B; S, Jayanna H; R, Shivakumar B |
Source: |
Multimedia Tools & Applications; Mar2024, Vol. 83 Issue 10, p28675-28688, 14p |
Abstract: |
We develop two improvements over our previously proposed spectral subtraction with voice activity detection and minimum mean square error spectrum power estimator based on zero crossing (SS-VAD + MMSE-SPZC) enhancement for a real-time spoken query system (SQS). First, we introduce a time delay neural network (TDNN) based modeling technique. Second, to properly train the models, we increase the size of the database by collecting Kannada speech data from an additional 500 farmers under real-time conditions. The proposed combined enhancement technique effectively removes background noise and improves speech quality. When evaluated on the updated degraded speech corpus, our proposed automatic speech recognition (ASR) system achieves better performance than the previous framework. Moreover, experimental results demonstrate improvements of 1.32% and 1.48% in speech recognition accuracy for noisy and enhanced speech data, respectively, compared to our earlier work. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |