Reducing dimensionality of spectrograms using convolutional autoencoders

Authors: William F. Jenkins, Peter Gerstoft, Chih-Chieh Chien, Emma Ozanich
Year of publication: 2023
Subject:
Source: The Journal of the Acoustical Society of America. 153:A178-A178
ISSN: 1520-8524, 0001-4966
DOI: 10.1121/10.0018582
Description: Under the “curse of dimensionality,” distance-based algorithms, such as k-means or Gaussian mixture model clustering, can lose meaning and interpretability in high-dimensional space. Acoustic data, specifically spectrograms, are subject to such limitations due to their high dimensionality: for example, a spectrogram with 100 time bins and 100 frequency bins contains 10^4 pixels, and its vectorized form constitutes a point in 10^4-dimensional space. In this talk, we look at four papers that used autoencoding convolutional neural networks to extract salient features of real data. The convolutional autoencoder consists of an encoder, which compresses spectrograms into a low-dimensional latent feature space, and a decoder, which seeks to reconstruct the original spectrogram from the latent feature space. The error between the original spectrogram and its reconstruction is used to train the network. Once trained, the network embeds the salient features of the data in the latent space, and clustering algorithms can then be applied to this lower-dimensional representation. We demonstrate how lower-dimensional representations result in interpretable clustering of complex physical data, which can contribute to reducing errors in classification and clustering tasks and enable exploratory analysis of large data sets.
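The dimensionality argument in the description can be illustrated with a small numerical sketch. It is not taken from the talk: it assumes uniformly random points in a unit cube as a stand-in for vectorized spectrograms, and the helper name `relative_contrast` is hypothetical. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance. When that ratio approaches zero, all points look roughly equidistant and distance-based clustering has little to work with.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points random points in the unit cube.

    Assumption: uniform random vectors stand in for vectorized spectrograms.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


# A spectrogram with 100 time bins and 100 frequency bins, flattened to a vector.
n_time, n_freq = 100, 100
n_pixels = n_time * n_freq  # 10**4 dimensions

# Compare a low-dimensional space, a latent-sized space, and the full pixel space.
for dim in (2, 32, n_pixels):
    print(f"dim={dim:>6}  relative contrast={relative_contrast(dim):.3f}")
```

The contrast shrinks as the dimension grows, which is the motivation for clustering in a compact latent space rather than in the raw pixel space. The intermediate size of 32 is an arbitrary choice for illustration, not a latent dimension reported in the source.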
Database: OpenAIRE