Music emotion recognition using convolutional long short term memory deep neural networks
Authors: | Serdar Yildirim, Zekeriya Tufekci, Serhat Hizlisoy |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: |
Computer Networks and Communications; Computer science; 020209 energy; Speech recognition; 02 engineering and technology; Convolutional neural network; Biomaterials; 0202 electrical engineering, electronic engineering, information engineering; Convolutional long short term memory deep neural networks; Civil and Structural Engineering; Fluid Flow and Transfer Processes; Artificial neural network; Mechanical Engineering; 020208 electrical & electronic engineering; Metals and Alloys; Filter bank; Music emotion recognition; Electronic, Optical and Magnetic Materials; Random forest; Support vector machine; Hardware and Architecture; lcsh:TA1-2040; Mel-frequency cepstrum; lcsh:Engineering (General). Civil engineering (General); Classifier (UML); Turkish emotional music database |
Source: | Engineering Science and Technology, an International Journal, Vol 24, Iss 3, Pp 760-767 (2021) |
ISSN: | 2215-0986 |
Description: | © 2020 Karabuk University. In this paper, we propose an approach for music emotion recognition based on a convolutional long short term memory deep neural network (CLDNN) architecture. In addition, we construct a new Turkish emotional music database composed of 124 Turkish traditional music excerpts, each 30 s long, and evaluate the performance of the proposed approach on this database. We utilize features obtained by feeding convolutional neural network (CNN) layers with log-mel filterbank energies and mel frequency cepstral coefficients (MFCCs), in addition to standard acoustic features. Classification results show that the best performance is obtained when the new feature set is combined with the standard features using the long short term memory (LSTM) + deep neural network (DNN) classifier. An overall accuracy of 99.19% is obtained using the proposed system with 10-fold cross-validation. Specifically, an improvement of 6.45 points is achieved. The results also show that the LSTM + DNN classifier yields improvements of 1.61, 1.61 and 3.23 points in music emotion recognition accuracy compared to the k-nearest neighbor (k-NN), support vector machine (SVM), and Random Forest classifiers, respectively. |
Database: | OpenAIRE |
External link: |
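
The accuracy figures in the description can be read as whole numbers of excerpts, since the database holds only 124 items. The sketch below is a reader's aid and not part of the record. It assumes that accuracy is pooled over all 124 excerpts across the 10 folds (each excerpt tested exactly once), which the abstract does not state explicitly. The helper names `points_to_excerpts` and `excerpts_to_percent` are hypothetical.

```python
# Database size stated in the abstract.
N_EXCERPTS = 124


def points_to_excerpts(points, n=N_EXCERPTS):
    """Convert a percentage (or percentage-point gap) to the nearest excerpt count.

    Assumption: the percentage is computed over all n excerpts at once.
    """
    return round(points / 100 * n)


def excerpts_to_percent(k, n=N_EXCERPTS):
    """Convert an excerpt count back to a percentage, rounded to two decimals."""
    return round(100 * k / n, 2)


# Figures quoted in the abstract (percent or percentage points).
reported = {
    "overall accuracy": 99.19,
    "gain over k-NN": 1.61,
    "gain over SVM": 1.61,
    "gain over Random Forest": 3.23,
    "unspecified improvement": 6.45,
}

for label, value in reported.items():
    k = points_to_excerpts(value)
    print(f"{label}: {value} -> {k} excerpts (check: {excerpts_to_percent(k)})")
```

Under that assumption, 99.19% corresponds to 123 of 124 excerpts classified correctly, and the 1.61, 3.23 and 6.45 point gaps correspond to 2, 4 and 8 excerpts respectively. With a test set this small, a difference of one or two excerpts accounts for the reported gaps between classifiers.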