A Fusion of Deep and Shallow Learning to Predict Genres Based on Instrument and Timbre Features

Authors:	Benedikt Adrian, Jurij Kuzmic, Igor Vatolkin
Year of publication:	2021
Subject:	Training set; Computer science; Training time; Instrument recognition; Machine learning; Convolutional neural network; Classifier (linguistics); Deep neural networks; Artificial intelligence; Timbre; Interpretability
Source:	Artificial Intelligence in Music, Sound, Art and Design; ISBN: 9783030729134; EvoMUSART
DOI:	10.1007/978-3-030-72914-1_21
Description:	Deep neural networks have recently received a lot of attention and have contributed very successfully to many music classification tasks. However, they also have drawbacks compared to traditional methods: a very high number of parameters, decreased performance on small training sets, a lack of model interpretability, long training times, and hence a larger environmental impact with regard to computing resources. Therefore, it can still be a better choice to apply shallow classifiers in an application scenario with specific evaluation criteria, such as the size of the training set or the required interpretability of models. In this work, we propose an approach based on both deep and shallow classifiers for music genre classification: convolutional neural networks are trained once to predict instruments, and their outputs are used as features to predict music genres with a shallow classifier. The results show that the individual performance of such descriptors is comparable to that of other instrument-related features, and that they are even better for more than half of the 19 genre categories.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::0835b4027326b6e2a027ff194d76c48c https://doi.org/10.1007/978-3-030-72914-1_21
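
The two-stage idea in the description (instrument predictions used as features for a shallow genre classifier) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the instrument list, the synthetic data, the `predict_instruments` stub that stands in for a pretrained convolutional network, and the nearest-centroid classifier chosen as one simple example of a shallow model.

```python
import random
from statistics import fmean

# Hypothetical instrument label set; the paper's actual set is not given in this record.
INSTRUMENTS = ["guitar", "piano", "drums", "violin"]


def predict_instruments(frame):
    """Stub standing in for a pretrained CNN (assumption).

    A real model would map an audio frame to instrument activations in [0, 1].
    Here a 'frame' already is such a list, so it is returned unchanged.
    """
    return list(frame)


def track_features(frames):
    """Average per-frame instrument activations into one feature vector per track."""
    outputs = [predict_instruments(f) for f in frames]
    return [fmean(column) for column in zip(*outputs)]


class NearestCentroid:
    """A very small shallow classifier: assign the genre with the closest mean vector."""

    def fit(self, X, y):
        groups = {}
        for x, label in zip(X, y):
            groups.setdefault(label, []).append(x)
        self.centroids = {
            label: [fmean(column) for column in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        def squared_distance(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda c: squared_distance(x, self.centroids[c]))
            for x in X
        ]


def synthetic_track(profile, rng, n_frames=20, noise=0.1):
    """Generate fake per-frame activations scattered around a genre's instrument profile."""
    return [
        [min(1.0, max(0.0, p + rng.uniform(-noise, noise))) for p in profile]
        for _ in range(n_frames)
    ]


# Invented instrument profiles per genre, used only to produce toy data.
PROFILES = {
    "rock": [0.9, 0.1, 0.9, 0.05],
    "classical": [0.05, 0.8, 0.05, 0.9],
}

rng = random.Random(0)
train_X, train_y = [], []
for genre, profile in PROFILES.items():
    for _ in range(10):
        train_X.append(track_features(synthetic_track(profile, rng)))
        train_y.append(genre)

clf = NearestCentroid().fit(train_X, train_y)
test_track = track_features(synthetic_track(PROFILES["classical"], rng))
predicted = clf.predict([test_track])[0]
print(predicted)
```

In this arrangement the expensive model is trained once and reused, while the inexpensive second stage can be retrained for each genre task, and each feature dimension remains readable as "how strongly this instrument is present".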