Abstract: |
Recent research demonstrates that Deep Neural Network (DNN) models are vulnerable to backdoor attacks. A backdoored DNN model behaves maliciously when it receives images containing backdoor triggers. To date, almost all existing backdoor attacks are single-trigger, single-target attacks, and the triggers of most of them are conspicuous and therefore easy to detect or notice. In this paper, we propose a novel imperceptible, multi-channel backdoor attack against DNNs that exploits Discrete Cosine Transform (DCT) steganography. The proposed method injects backdoor instances into the training set and does not require control over the whole training process. Specifically, for a color image, we use DCT steganography to construct the trigger and embed it into different channels of the image in the frequency domain. As a result, the trigger as it appears in the spatial domain is stealthy and natural. The generated backdoor instances are then injected into the training dataset used to train the DNN model. Based on the proposed method, we implement two variants of backdoor attacks: an imperceptible N-to-N (multi-target) backdoor attack and an imperceptible N-to-One (multi-trigger) backdoor attack. Experimental results show that the attack success rate of the N-to-N attack is 95.09% on CIFAR-10, 93.33% on TinyImageNet, and 92.45% on ImageNet. The attack success rate of the N-to-One attack is 90.22% on CIFAR-10, 89.56% on TinyImageNet, and 88.29% on ImageNet. Meanwhile, the proposed backdoor attack does not affect the classification accuracy of the DNN models. Moreover, the proposed attack is shown to be robust against two state-of-the-art backdoor defenses, including a recent frequency-domain defense. [ABSTRACT FROM AUTHOR] |
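The frequency-domain embedding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 8x8 block size, the coefficient positions, the embedding strength, and the function names `dct_matrix` and `embed_trigger` are all assumptions chosen for illustration, since the abstract does not specify them. The sketch applies a blockwise 2-D DCT to one color channel, adds a small constant to a few mid-frequency coefficients, and inverts the transform.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # DC row has its own scaling
    return m


def embed_trigger(image, channel, positions=((3, 4), (4, 3)),
                  strength=8.0, block=8):
    """Perturb selected DCT coefficients in every block of one channel.

    `positions`, `strength`, and `block` are hypothetical values, not
    parameters taken from the paper.
    """
    h, w, _ = image.shape
    d = dct_matrix(block)
    out = image.astype(np.float64).copy()
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            patch = out[y:y + block, x:x + block, channel]
            coeffs = d @ patch @ d.T             # forward 2-D DCT
            for u, v in positions:
                coeffs[u, v] += strength         # frequency-domain change
            out[y:y + block, x:x + block, channel] = d.T @ coeffs @ d  # inverse
    # Round back to valid 8-bit pixel values.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Usage: perturb only the green channel (index 1) of a random RGB image.
rng = np.random.default_rng(0)
img = rng.integers(10, 246, size=(32, 32, 3), dtype=np.uint8)
marked = embed_trigger(img, channel=1)
```

Because the DCT basis is orthonormal, a small change to a few coefficients spreads as a low-amplitude pattern across each block, which is why the change is hard to see in the spatial domain. Selecting a different `channel` per trigger is one plausible reading of the "multi-channel" idea, but the mapping between channels, triggers, and target labels is not given in the abstract.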