Black-box Adversarial Attacks on Video Recognition Models

Authors: Xingjun Ma, Linxi Jiang, Yu-Gang Jiang, James Bailey, Shaoxiang Chen
Year of publication: 2019
Subjects:
FOS: Computer and information sciences
Computer Vision and Pattern Recognition (cs.CV)
Machine Learning (cs.LG)
Machine Learning (stat.ML)
Cryptography and Security (cs.CR)
Multimedia (cs.MM)
Source: ACM Multimedia
DOI: 10.48550/arxiv.1904.05181
Description: Deep neural networks (DNNs) are known to be vulnerable to adversarial examples: inputs that have undergone small, carefully crafted perturbations and that can easily fool a DNN into misclassifying at test time. Thus far, adversarial research has focused mainly on image models, under either a white-box setting, where the adversary has full access to the model parameters, or a black-box setting, where the adversary can only query the target model for probabilities or labels. While several white-box attacks have been proposed for video models, black-box video attacks remain unexplored. To close this gap, we propose the first black-box video attack framework, called V-BAD. V-BAD utilizes tentative perturbations transferred from image models and partition-based rectifications found by NES (Natural Evolution Strategies) on partitions (patches) of the tentative perturbations, to obtain good adversarial gradient estimates with fewer queries to the target model. V-BAD is equivalent to estimating the projection of the adversarial gradient onto a selected subspace. Using three benchmark video datasets, we demonstrate that V-BAD can craft both untargeted and targeted attacks that fool two state-of-the-art deep video recognition models. For the targeted attack, it achieves a success rate of over 93% using only an average of $3.4 \sim 8.4 \times 10^4$ queries, comparable to state-of-the-art black-box image attacks, even though videos often have two orders of magnitude higher dimensionality than static images. We believe that V-BAD is a promising new tool for evaluating and improving the robustness of video recognition models against black-box adversarial attacks.
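As a rough illustration of the partition-based NES estimation described above, the following Python sketch shows how per-partition rectification weights might be estimated with antithetic NES sampling over patches of a tentative perturbation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name estimate_rectification, the black-box query_loss oracle, the sigma and num_samples values, and the [0, 1] pixel range are all hypothetical placeholders.

import numpy as np

def estimate_rectification(video, tentative_pert, partitions, query_loss,
                           sigma=1e-3, num_samples=24):
    """NES estimate of per-partition rectification weights for a tentative
    perturbation, queried against a black-box loss oracle.

    video          : np.ndarray, clean input video, e.g. shape (T, H, W, C)
    tentative_pert : np.ndarray, perturbation transferred from an image model
    partitions     : list of boolean masks, one per patch of the perturbation
    query_loss     : callable(adversarial_video) -> scalar loss returned by
                     the black-box target model (hypothetical query interface)
    """
    num_parts = len(partitions)
    grad_est = np.zeros(num_parts)
    for _ in range(num_samples // 2):
        u = np.random.randn(num_parts)      # one Gaussian weight per patch
        for sign in (+1.0, -1.0):           # antithetic sampling
            weights = sign * sigma * u
            pert = np.zeros_like(tentative_pert)
            for w, mask in zip(weights, partitions):
                # scale each patch of the tentative perturbation by its weight
                pert[mask] = w * tentative_pert[mask]
            loss = query_loss(np.clip(video + pert, 0.0, 1.0))
            grad_est += sign * loss * u
    # standard NES gradient estimator, averaged over all queries
    return grad_est / (num_samples * sigma)

In a full attack, such an estimate would be computed repeatedly inside an iterative loop, with the rectified perturbation applied through a standard projected update and the tentative perturbation refreshed from the image model; those details are only summarized in the abstract and are omitted from this sketch.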
Database: OpenAIRE