Birdcall Identification Using Mel-spectrum Based on ResNeSt50 Model

Authors: Yihan Hong, Shizhen Liu, Chenhao Cui, Siyuan Lin
Publication year: 2021
Subject:
Source: 2021 IEEE International Conference on Computer Science, Electronic Information Engineering and Intelligent Control Technology (CEI).
DOI: 10.1109/cei52496.2021.9574552
Description: Audio processing and recognition technology mainly works by converting audio data into image data, such as spectrograms, and then training a network model on that image data. This paper uses the ResNeSt50 model to classify and predict bird calls based on a collection of recorded bird calls. We first preprocess the provided raw data and, with the help of the Mel spectrogram, transform the audio into a form that the model can use. We then train and test the ResNeSt50 network model on the data set to obtain the prediction results. Our experiments show that the ResNeSt50 model has better prediction performance than the ResNet34, VGG19 and Simple NN models.
Database: OpenAIRE
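
The description mentions converting audio into Mel spectrograms before training. As a rough illustration of that step only, the sketch below builds a triangular Mel filterbank and applies it to one frame of a power spectrum. It is not the authors' code: the record does not give their preprocessing parameters, so the sample rate, FFT size, and number of Mel bands are hypothetical, and the HTK-style Mel formula is one common convention assumed here. All function names are illustrative.

```python
import math


def hz_to_mel(hz):
    # HTK-style Mel scale (an assumed convention, not taken from the paper).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Return n_mels triangular filters over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, evenly spaced on the Mel scale.
    edges_hz = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    # Centre frequency of every FFT bin in Hz.
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]

    bank = []
    for m in range(n_mels):
        low, centre, high = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - low) / (centre - low)
            falling = (high - f) / (high - centre)
            # Triangle: rises from low to centre, falls to high, zero elsewhere.
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def apply_filterbank(bank, power_spectrum):
    """Collapse one frame of a power spectrum into Mel-band energies."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]


# Hypothetical parameters, chosen only for illustration.
SAMPLE_RATE = 32000
N_FFT = 512
N_MELS = 8

bank = mel_filterbank(N_MELS, N_FFT, SAMPLE_RATE)
flat_frame = [1.0] * (N_FFT // 2 + 1)  # a flat spectrum as stand-in input
mel_energies = apply_filterbank(bank, flat_frame)
```

Stacking such Mel-band vectors over successive frames, usually after a log transform, yields the image-like array that a convolutional classifier can consume. In practice an audio library would typically handle this step.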