Voice Activity Detection Using Fuzzy Entropy and Support Vector Machine
Author: | P. Vasuki, R. Johny Elton, J. Mohanalin |
---|---|
Language: | English |
Publication year: | 2016 |
Subject: |
Computer science
Speech recognition; General Physics and Astronomy; lcsh:Astrophysics; 02 engineering and technology; Cross-validation; 030507 speech-language pathology & audiology; 03 medical and health sciences; Fuzzy entropy; Voice activity detection; Support vector machine; k-NN; lcsh:QB460-466; 0202 electrical engineering, electronic engineering, information engineering; lcsh:Science; TIMIT database; 020206 networking & telecommunications; Ranging; Pattern recognition; lcsh:QC1-999; Feature (computer vision); Computer Science::Sound; lcsh:Q; Noise (video); Artificial intelligence; 0305 other medical science; lcsh:Physics |
Source: | Entropy, Vol 18, Iss 8, p 298 (2016) |
ISSN: | 1099-4300 |
Description: | This paper proposes support vector machine (SVM) based voice activity detection (VAD) that uses fuzzy entropy (FuzzyEn) to improve detection performance under noisy conditions. The proposed VAD extracts FuzzyEn as a feature from noise-reduced speech signals and uses it to train an SVM model for speech/non-speech classification. The method was tested by adding real background noises at signal-to-noise ratios (SNR) ranging from −10 dB to 10 dB to speech signals taken from the TIMIT database. The analysis shows that the FuzzyEn feature gives better results in discriminating between noise and noise-corrupted speech. The efficacy of the SVM classifier was validated using 10-fold cross-validation. Furthermore, the results obtained by the proposed method were compared with those of previous standardized VAD algorithms as well as recently developed methods. The performance comparison suggests that the proposed method detects speech more efficiently under various noisy environments, with an accuracy of 93.29%, and that the FuzzyEn feature detects speech efficiently even at low SNR levels. |
Database: | OpenAIRE |
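
The description names fuzzy entropy as the per-frame feature but gives no formula or parameter values. As a rough illustration only, the sketch below follows the commonly used definition of fuzzy entropy (baseline-removed template vectors, Chebyshev distance, exponential membership function). The function names and the defaults `m=2`, `r=0.2`, `n=2` are assumptions made here, not values taken from the paper, and this is not the authors' implementation.

```python
import math


def _phi(signal, m, tol, n, count):
    """Average fuzzy similarity between `count` template vectors of length m."""
    vectors = []
    for i in range(count):
        window = signal[i:i + m]
        baseline = sum(window) / m
        # Remove the local baseline so only the shape of the window matters.
        vectors.append([x - baseline for x in window])

    total = 0.0
    for i in range(count):
        for j in range(count):
            if i == j:
                continue  # skip self-matches
            # Chebyshev distance between the two templates.
            d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
            # Exponential membership function: 1 when identical, decays with d.
            total += math.exp(-(d ** n) / tol)
    return total / (count * (count - 1))


def fuzzy_entropy(signal, m=2, r=0.2, n=2):
    """Fuzzy entropy of one frame (assumed defaults: m=2, r=0.2*std, n=2)."""
    N = len(signal)
    if N - m < 2:
        raise ValueError("frame too short for the chosen embedding dimension")

    mean = sum(signal) / N
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / N)
    tol = r * std
    if tol == 0:
        return 0.0  # constant frame: all templates are identical

    count = N - m  # same number of templates for dimensions m and m + 1
    return math.log(_phi(signal, m, tol, n, count)) - math.log(
        _phi(signal, m + 1, tol, n, count)
    )
```

In a pipeline like the one described, a value of this kind would be computed for each short frame of the noise-reduced signal and passed to the classifier as its input feature. Frame length, overlap, and the SVM settings are not stated in this record and would have to come from the paper itself.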