Acoustic Event Classification using spectral band selection and Non-Negative Matrix Factorization-based features
Author: | Ascensión Gallardo-Antolín, Jimmy Ludeña-Choez |
---|---|
Contributors: | Ministerio de Economía y Competitividad (España) |
Year of publication: | 2016 |
Subject: |
System
Speech recognition; Feature extraction; Feature selection; 02 engineering and technology; Audio signals; Theoretic Feature Selection; Non-negative matrix factorization; Matrix decomposition; 030507 speech-language pathology & audiology; 03 medical and health sciences; Artificial Intelligence; Robust; 0202 electrical engineering, electronic engineering, information engineering; Filterbank; Mathematics; Non-Negative Matrix Factorization; Telecomunicaciones; business.industry; Variable complementarity; General Engineering; Pattern recognition; Mutual information; Computer Science Applications; Support vector machine; Feature (computer vision); Temporal feature integration; 020201 artificial intelligence & image processing; Mel-frequency cepstrum; Artificial intelligence; Acoustic Event Classification; 0305 other medical science; business |
Source: | e-Archivo. Repositorio Institucional de la Universidad Carlos III de Madrid; Universidad Carlos III de Madrid (UC3M) |
Description: | We propose a new front-end for Acoustic Event Classification (AEC) tasks. It consists of two stages: short-time feature extraction and temporal integration. The first module relies on mutual information-based frequency band selection. The second module is based on Non-Negative Matrix Factorization (NMF). Results show that it outperforms the baseline system in clean and noisy conditions. Feature extraction methods for sound events have traditionally been based on parametric representations specifically developed for speech signals, such as the well-known Mel Frequency Cepstrum Coefficients (MFCC). However, the discrimination capabilities of these features for Acoustic Event Classification (AEC) tasks could be enhanced by taking into account the spectro-temporal structure of acoustic event signals. In this paper, a new front-end for AEC which incorporates this specific information is proposed. It consists of two stages: short-time feature extraction and temporal feature integration. The first stage aims to provide a better spectral representation of the different acoustic events on a frame-by-frame basis, by automatically selecting the optimal set of frequency bands from which cepstral-like features are extracted. The second stage is designed to capture the most relevant temporal information in the short-time features, by applying Non-Negative Matrix Factorization (NMF) to their periodograms computed over long audio segments. The whole front-end has been evaluated in clean and noisy conditions.
Experiments show that removing certain frequency bands (mainly located in the mid-frequency region of the spectrum for clean conditions and in the low frequencies for noisy environments) during short-time feature computation, in conjunction with NMF-based temporal feature integration, significantly improves the performance of a Support Vector Machine (SVM) based AEC system with respect to conventional MFCCs. |
Database: | OpenAIRE |
External link: |
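
The temporal integration stage mentioned in the description (NMF applied to periodograms of short-time features over a long segment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `periodogram` and `nmf`, the synthetic feature matrix, the rank of 4, and the choice of Frobenius-norm multiplicative updates are all assumptions made for illustration, and the paper may use a different cost function or a different way of turning the factors into segment-level features.

```python
import numpy as np

def periodogram(trajectory):
    """Periodogram of one feature trajectory (values of one coefficient over frames)."""
    centered = trajectory - trajectory.mean()
    spectrum = np.fft.rfft(centered)
    return (np.abs(spectrum) ** 2) / len(trajectory)

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative V (rows x cols) as W @ H using
    multiplicative updates for the Frobenius-norm cost (assumed here)."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for short-time features of one long segment:
# 64 frames, 12 cepstral-like coefficients (hypothetical sizes).
rng = np.random.default_rng(1)
n_frames, n_coeffs = 64, 12
features = rng.standard_normal((n_frames, n_coeffs))

# One periodogram per coefficient, stacked as columns: shape (33, 12).
V = np.stack([periodogram(features[:, d]) for d in range(n_coeffs)], axis=1)

# Columns of W act as learned modulation-frequency bases;
# H holds the non-negative activations of those bases per coefficient.
W, H = nmf(V, rank=4)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In this sketch the activations in `H` (or the projections of new periodograms onto `W`) would play the role of segment-level features passed to a classifier such as an SVM; how the factors are actually used in the published system is not stated in this record.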