Fast and Accurate Action Detection in Videos With Motion-Centric Attention Model

Authors: Wenmin Wang, Wen Gao, Jinzhuo Wang
Year: 2020
Source: IEEE Transactions on Circuits and Systems for Video Technology. 30:117-130
ISSN: 1558-2205
1051-8215
Abstract: A key factor that distinguishes action detection in videos from general video classification is the use of human-guided clues, especially motion signals. Since not all pixels in a video are informative for action recognition, the irrelevant and redundant parts introduce considerable noise and burden both feature extraction and classifier training. This motivates the design of attention models that can dynamically focus computation on the key spatiotemporal volumes. In this paper, we propose a motion-centric attention model for action detection in videos that imitates the human saccade-and-fixation process of perception. Specifically, we first present a strategy to generate motion-centric locations based on the density peaks of motion signals, providing reliable candidates around which actions are likely to occur. Then, we introduce an attention model that conducts saccade and fixation procedures on these candidates to observe local spatiotemporal visual information, preserve an internal comprehension state, and produce action proposals with temporal bounds. Afterward, a classifier with several variants classifies the action proposals, decides which one to fixate on, and generates the final predictions. We show how to train our model efficiently to produce fast and accurate action detection by scanning only a small fraction of locations in a video. Extensive experiments on three challenging datasets show promising results in both accuracy and speed.
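The candidate-generation step described above (picking locations at density peaks of motion signals) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the use of absolute frame differencing as a cheap stand-in for optical-flow magnitude, and the greedy peak-suppression loop are all assumptions made for illustration.

```python
import numpy as np

def motion_density_peaks(frames, sigma=5.0, top_k=3):
    """Hypothetical sketch: select candidate locations at motion density peaks.

    frames: float array of shape (T, H, W), grayscale video.
    Motion is approximated by absolute frame differencing, a cheap
    stand-in for the optical-flow magnitude a real system would use.
    """
    # Accumulate per-pixel motion over time -> (H, W) motion map.
    motion = np.abs(np.diff(frames, axis=0)).sum(axis=0)

    # Separable Gaussian smoothing turns the raw map into a motion "density".
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    density = np.apply_along_axis(
        lambda m: np.convolve(m, k, mode="same"), 0, motion)
    density = np.apply_along_axis(
        lambda m: np.convolve(m, k, mode="same"), 1, density)

    # Greedy non-maximum suppression: take the top_k highest peaks,
    # zeroing a neighborhood around each chosen peak.
    peaks = []
    d = density.copy()
    for _ in range(top_k):
        y, xc = np.unravel_index(np.argmax(d), d.shape)
        peaks.append((int(y), int(xc)))
        d[max(0, y - r):y + r + 1, max(0, xc - r):xc + r + 1] = 0.0
    return peaks
```

The returned (row, column) coordinates would then serve as the candidate centers around which the attention model conducts its saccade and fixation procedures.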
Database: OpenAIRE