A systematic study of the class imbalance problem: Automatically identifying empty camera trap images using convolutional neural networks

Author: Meng-Tao Liu, Tao Li, Xiaowei Li, Benhui Chen, Dengqi Yang
Year of publication: 2021
Subject:
Source: Ecological Informatics. 64:101350
ISSN: 1574-9541
Description: Camera traps, which are widely used in wildlife surveys, often produce massive numbers of images, many of which are empty images that do not contain animals. Using a deep learning model to automatically identify empty camera trap images can significantly reduce the workload of manual classification. However, the performance of deep learning models is easily affected by class imbalance in the training data sets, which is a common problem in actual wildlife survey projects. Almost all previous studies on empty image recognition used down-sampling or oversampling to eliminate the effect of class imbalance on the performance of deep learning classifiers. The class imbalance problem has been systematically studied in the field of traditional image recognition, yet very limited research is available in the context of identifying camera trap images taken from highly cluttered natural scenes. This study systematically examined the impact of class imbalance on model performance when using a deep learning model to identify empty camera trap images. We then proposed a method for constructing the training sets of the deep learning model when the data set has different levels of class imbalance. Based on the results of our experiments, we concluded that (i) class imbalance had little effect on the performance of the model when the empty image ratio (EIR) of the data set was between 10% and 70%, so the training sets can be built randomly without changing the class distribution; (ii) when the EIR of the data set exceeds 70%, oversampling is recommended to partially eliminate class imbalance and reduce omission errors; (iii) when the EIRs of the training set and the test set were close, the overall error, omission error, and commission error of the model were relatively small, and the model tended to achieve better overall performance; (iv) the omission and commission errors can be adjusted by changing the percentage of empty images in the training set.
Database: OpenAIRE
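
The training-set construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the label encoding (1 = empty, 0 = contains an animal), and the choice to oversample the animal images by random duplication are all assumptions made here for illustration. The sketch computes the empty image ratio (EIR) of a labelled set and, when the EIR is above a chosen target, duplicates non-empty images until the EIR falls to that target, which only partially removes the imbalance.

```python
import math
import random


def empty_image_ratio(labels):
    """Return the fraction of labels marked empty (assumed encoding: 1 = empty)."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(labels) / len(labels)


def oversample_to_target_eir(samples, target_eir, seed=0):
    """Randomly duplicate non-empty samples until the EIR is at most target_eir.

    samples: list of (image_id, label) pairs, label 1 = empty, 0 = animal.
    This is a hypothetical helper, not a method taken from the paper.
    """
    if not 0 < target_eir < 1:
        raise ValueError("target_eir must be strictly between 0 and 1")
    empty = [s for s in samples if s[1] == 1]
    animal = [s for s in samples if s[1] == 0]
    if not animal:
        raise ValueError("need at least one non-empty sample to oversample")

    # Solve empty / (empty + animal_needed) <= target_eir for animal_needed.
    animal_needed = math.ceil(len(empty) * (1 - target_eir) / target_eir)
    extra = max(0, animal_needed - len(animal))

    rng = random.Random(seed)
    duplicated = [rng.choice(animal) for _ in range(extra)]
    result = empty + animal + duplicated
    rng.shuffle(result)
    return result
```

For example, calling `oversample_to_target_eir(samples, 0.7)` on a set whose EIR is 90% would leave the empty images untouched and add duplicated animal images until the EIR is at or just below 70%. Which target to use for a given data set is a judgement call that this sketch does not settle.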