Acoustic Event Classification using spectral band selection and Non-Negative Matrix Factorization-based features

Authors: Ascensión Gallardo-Antolín, Jimmy Ludeña-Choez
Contributors: Ministerio de Economía y Competitividad (España)
Year of publication: 2016
Subject:
System
Speech recognition
Feature extraction
Feature selection
02 engineering and technology
Audio signals
Theoretic Feature Selection
Non-negative matrix factorization
Matrix decomposition
030507 speech-language pathology & audiology
03 medical and health sciences
Artificial Intelligence
Robust
0202 electrical engineering, electronic engineering, information engineering
Filterbank
Mathematics
Telecomunicaciones
Variable complementarity
General Engineering
Pattern recognition
Mutual information
Computer Science Applications
Support vector machine
Feature (computer vision)
Temporal feature integration
020201 artificial intelligence & image processing
Mel-frequency cepstrum
Acoustic Event Classification
0305 other medical science
Source: e-Archivo. Repositorio Institucional de la Universidad Carlos III de Madrid
Universidad Carlos III de Madrid (UC3M)
Description: We propose a new front-end for Acoustic Event Classification (AEC) tasks. It consists of two stages: short-time feature extraction and temporal integration. The first module relies on mutual information-based frequency band selection. The second module is based on Non-Negative Matrix Factorization (NMF). Results show that it outperforms the baseline system in clean and noisy conditions. Feature extraction methods for sound events have traditionally been based on parametric representations specifically developed for speech signals, such as the well-known Mel Frequency Cepstrum Coefficients (MFCC). However, the discrimination capabilities of these features for AEC tasks could be enhanced by taking into account the spectro-temporal structure of acoustic event signals. In this paper, a new front-end for AEC which incorporates this specific information is proposed. It consists of two different stages: short-time feature extraction and temporal feature integration. The first module aims at providing a better spectral representation of the different acoustic events on a frame-by-frame basis, by automatically selecting the optimal set of frequency bands from which cepstral-like features are extracted. The second stage is designed to capture the most relevant temporal information in the short-time features, through the application of NMF to their periodograms computed over long audio segments. The whole front-end has been evaluated in clean and noisy conditions.
Experiments show that removing certain frequency bands (mainly located in the medium region of the spectrum for clean conditions and in the low frequencies for noisy environments) in the short-time feature computation, in conjunction with the NMF technique for temporal feature integration, significantly improves the performance of a Support Vector Machine (SVM) based AEC system with respect to the use of conventional MFCCs.
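The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the histogram-based mutual information estimate, the plain ranking of bands, the Lee-Seung multiplicative NMF updates, the synthetic data, and all names (`mutual_information`, `select_bands`, `periodogram`, `nmf`) and parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def mutual_information(x, labels, n_bins=8):
    """Histogram estimate (in nats) of I(x; labels) for one band's log-energies."""
    edges = np.histogram_bin_edges(x, bins=n_bins)
    x_bin = np.digitize(x, edges[1:-1])            # bin index in 0 .. n_bins-1
    classes, y = np.unique(labels, return_inverse=True)
    joint = np.zeros((n_bins, len(classes)))
    np.add.at(joint, (x_bin, y), 1.0)              # joint histogram of (bin, class)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def select_bands(log_energies, labels, n_keep):
    """Keep the n_keep bands with the highest estimated MI (simple ranking)."""
    scores = [mutual_information(log_energies[:, b], labels)
              for b in range(log_energies.shape[1])]
    return np.sort(np.argsort(scores)[::-1][:n_keep])

def periodogram(trajectory):
    """Power spectrum of one feature trajectory over a long segment."""
    windowed = trajectory * np.hanning(len(trajectory))
    return np.abs(np.fft.rfft(windowed)) ** 2 / len(trajectory)

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative updates for V ~ W @ H with W, H >= 0 (Frobenius cost)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for frame-level filterbank log-energies (assumption):
# only bands 2 and 7 carry class information.
rng = np.random.default_rng(0)
n_frames, n_bands, seg_len = 384, 10, 64
labels = rng.integers(0, 2, n_frames)
log_e = rng.normal(size=(n_frames, n_bands))
log_e[:, 2] += 3.0 * labels
log_e[:, 7] -= 3.0 * labels

# Stage 1: mutual information-based band selection.
kept = select_bands(log_e, labels, n_keep=6)

# Stage 2: periodograms of each kept band's trajectory per segment, then NMF.
segments = log_e[:, kept].reshape(-1, seg_len, len(kept))
V = np.stack([periodogram(seg[:, j])
              for seg in segments for j in range(len(kept))], axis=1)
W, H = nmf(V, rank=4)   # W: spectral bases, H: non-negative activations
```

In this sketch the columns of `H` could serve as segment-level features for a classifier such as an SVM; the record itself does not specify how the NMF output is turned into the final feature vector.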
Database: OpenAIRE