Abstract: |
One-class anomaly detection aims to identify anomalous instances whose distributions differ from those of the expected normal instances. For this task, the Encoder-Decoder-Encoder-type Generative Adversarial Network (EDE-GAN) has shown state-of-the-art performance in previous research. However, little research has explored why this structure performs so well or how hyperparameter settings affect model performance. In this paper, we therefore construct two GAN architectures to study these issues. We conclude that the following three factors play a key role: (1) The EDE-GAN computes the anomaly score as the distance between two latent vectors, unlike previous methods that use the reconstruction error between images. (2) Unlike other GAN architectures, the EDE-GAN obtains its best results when the batch size is set to 1. (3) Constraining the latent space during model training is demonstrably beneficial. Furthermore, to learn a compact and fast model, we propose Progressive Knowledge Distillation with GANs (P-KDGAN), which connects two standard GANs through a designed distillation loss. Two-step progressive learning continuously improves the performance of the student GAN beyond the single-step approach. Experimental results on the CIFAR-10, MNIST, and FMNIST datasets show that P-KDGAN improves the performance of the student GAN by 2.44%, 1.77%, and 1.73% while compressing computation at ratios of 24.45:1, 311.11:1, and 700:1, respectively.
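To make factor (1) concrete, the latent-distance anomaly score described in the abstract can be sketched as below. This is a minimal illustration, not the paper's implementation: the function name and the choice of the Euclidean norm are assumptions; the paper may use a different norm or normalization over the two encoders' latent vectors.

```python
import numpy as np

def anomaly_score(z_first, z_second):
    """Hypothetical sketch of an EDE-GAN-style anomaly score:
    the distance between the latent vector from the first encoder
    and the re-encoded latent vector from the second encoder.
    (Euclidean norm assumed; the paper's exact metric may differ.)"""
    z_first = np.asarray(z_first, dtype=float).ravel()
    z_second = np.asarray(z_second, dtype=float).ravel()
    return float(np.linalg.norm(z_first - z_second))

# For a normal instance, re-encoding the reconstruction should yield a
# latent vector close to the original, giving a low score; an anomalous
# instance reconstructs poorly and its re-encoded latent drifts away.
print(anomaly_score([0.1, 0.2, 0.3], [0.1, 0.2, 0.3]))  # 0.0 (identical)
```

This contrasts with image-space reconstruction error, where the score would be a pixel-wise distance between the input image and its reconstruction rather than a distance in latent space.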