Benchmarking Audio Signal Representation Techniques for Classification with Convolutional Neural Networks

Authors:	Roneel V. Sharan, Hao Xiong, Shlomo Berkovsky
Language:	English
Publication year:	2021
Subject:	convolutional neural networks; fusion; interpolation; machine learning; spectrogram; time-frequency image; Chemical technology; TP1-1185
Source:	Sensors, Vol 21, Iss 10, p 3434 (2021)
Document type:	article
ISSN:	1424-8220
DOI:	10.3390/s21103434
Description:	Audio signal classification finds various applications in detecting and monitoring health conditions in healthcare. Convolutional neural networks (CNN) have produced state-of-the-art results in image classification and are being increasingly used in other tasks, including signal classification. However, audio signal classification using CNN presents various challenges. In image classification tasks, raw images of equal dimensions can be used as a direct input to CNN. Raw time-domain signals, on the other hand, can be of varying dimensions. In addition, the temporal signal often has to be transformed to the frequency domain to reveal unique spectral characteristics, which requires signal transformation. In this work, we overview and benchmark various audio signal representation techniques for classification using CNN, including approaches that deal with signals of different lengths and combine multiple representations to improve the classification accuracy. Hence, this work surfaces important empirical evidence that may guide future works deploying CNN for audio signal classification purposes.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/dc607b9d8c98434c904608fdc8007047