An information-theoretic perspective of physical adversarial patches.

Authors: Tarchoun B; Université de Sousse, Ecole Nationale d'Ingénieurs de Sousse, LATIS - Laboratory of Advanced Technology and Intelligent Systems, 4023, Sousse, Tunisia. Electronic address: bilel.tarchoun@eniso.u-sousse.tn. Ben Khalifa A; Université de Sousse, Ecole Nationale d'Ingénieurs de Sousse, LATIS - Laboratory of Advanced Technology and Intelligent Systems, 4023, Sousse, Tunisia; Université de Jendouba, Institut National des Technologies et des Sciences du Kef, 7100, Le Kef, Tunisia. Mahjoub MA; Université de Sousse, Ecole Nationale d'Ingénieurs de Sousse, LATIS - Laboratory of Advanced Technology and Intelligent Systems, 4023, Sousse, Tunisia. Abu-Ghazaleh N; University of California Riverside, CA, USA. Alouani I; IEMN CNRS 8520, INSA Hauts-de-France, UPHF, France; CSIT, Queen's University Belfast, UK.
Language: English
Source: Neural Networks: The Official Journal of the International Neural Network Society [Neural Netw] 2024 Nov; Vol. 179, pp. 106590. Date of Electronic Publication: 2024 Aug 03.
DOI: 10.1016/j.neunet.2024.106590
Abstract: Real-world adversarial patches have been shown to compromise state-of-the-art models in various computer vision applications. Most existing defenses rely on analyzing input- or feature-level gradients to detect the patch. However, these methods have been compromised by recent GAN-based attacks that generate naturalistic patches. In this paper, we propose a new perspective for defending against adversarial patches based on the entropy carried by the input, rather than on its saliency. We present Jedi, a new defense against adversarial patches that tackles the patch localization problem from an information-theoretic perspective: it leverages the high entropy of adversarial patches to identify potential patch zones, and uses an autoencoder to complete patch regions from high-entropy kernels. Jedi achieves high-precision adversarial patch localization and removal, detecting on average 90% of adversarial patches across different benchmarks and recovering up to 94% of successful patch attacks. Since Jedi relies on an input entropy analysis, it is model-agnostic and can be applied to off-the-shelf models without changes to their training or inference. Moreover, we present a comprehensive qualitative analysis that investigates the cases where Jedi fails, in comparison with related methods. Interestingly, we find that a significant core of failure cases across the different defenses shares one common property: high entropy. We believe this work offers a new perspective for understanding the adversarial effect in physical-world settings. We also leverage these findings to enhance Jedi's handling of entropy outliers by introducing Adaptive Jedi, which boosts performance by up to 9% on challenging images.
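The entropy-based localization idea described in the abstract can be sketched as follows: compute the Shannon entropy of pixel intensities in sliding windows over the input, then flag windows whose entropy stands out as candidate patch zones. This is an illustrative sketch only; the function names, window size, and thresholding rule below are assumptions for demonstration, not the paper's actual implementation.

```python
import numpy as np

def local_entropy_map(gray, win=16, stride=8, bins=32):
    """Shannon entropy (bits) of pixel intensities in sliding windows.

    gray: 2-D uint8 array. Returns a 2-D array, one entropy per window.
    """
    h, w = gray.shape
    rows = (h - win) // stride + 1
    cols = (w - win) // stride + 1
    ent = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            window = gray[i * stride:i * stride + win,
                          j * stride:j * stride + win]
            hist, _ = np.histogram(window, bins=bins, range=(0, 256))
            p = hist / hist.sum()
            p = p[p > 0]                      # drop empty bins (0 log 0 = 0)
            ent[i, j] = -(p * np.log2(p)).sum()
    return ent

# Toy example: a flat (low-entropy) background with a noisy,
# high-entropy square standing in for an adversarial patch.
rng = np.random.default_rng(0)
img = np.full((64, 64), 128, dtype=np.uint8)
img[16:48, 16:48] = rng.integers(0, 256, (32, 32), dtype=np.uint8)

ent = local_entropy_map(img)
# Flag candidate patch zones: windows well above the typical entropy.
mask = ent > 0.5 * ent.max()
```

In the full defense, flagged regions would then be handed to the autoencoder-based completion step; here the thresholding is deliberately simplistic, just to show the localization signal.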
Competing Interests: Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(Copyright © 2024 Elsevier Ltd. All rights reserved.)
Database: MEDLINE