Moving Foreground-Aware Visual Attention and Key Volume Mining for Human Action Recognition

Authors: Junxuan Zhang, Xinlong Lu, Haifeng Hu
Year of publication: 2019
Source: ACM Transactions on Multimedia Computing, Communications, and Applications. 15:1-16
ISSN: 1551-6865, 1551-6857
DOI: 10.1145/3321511
Description: Recently, many deep learning approaches have shown remarkable progress on human action recognition. However, it remains unclear how to extract useful information from videos, since only video-level labels are available during training. To address this limitation, many efforts have been made to improve action recognition performance by applying a visual attention mechanism within the deep learning model. In this article, we propose a novel deep model called Moving Foreground Attention (MFA) that enhances action recognition performance by guiding the model to focus on discriminative foreground targets. In our work, MFA detects the moving foreground through a proposed variance-based algorithm. Meanwhile, an unsupervised proposal method is utilized to mine action-related key volumes and generate corresponding correlation scores. Based on these scores, a newly proposed stochastic-out scheme is exploited to train the MFA. Experimental results show that action recognition performance can be significantly improved by our proposed techniques, and our model achieves state-of-the-art performance on UCF101 and HMDB51.
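The abstract does not specify the variance-based foreground detector, but the general idea of flagging pixels whose intensity varies strongly over time can be illustrated with a minimal sketch. The function below, including its name and the threshold parameter `k`, is a hypothetical illustration of temporal-variance foreground masking, not the paper's actual algorithm:

```python
import numpy as np

def moving_foreground_mask(frames: np.ndarray, k: float = 0.5) -> np.ndarray:
    """Mask pixels with high temporal variance (hypothetical illustration).

    frames: grayscale clip of shape (T, H, W).
    k: multiplier for an adaptive threshold (assumed parameter).
    """
    var = frames.var(axis=0)                 # per-pixel variance over time
    thresh = var.mean() + k * var.std()      # adaptive global threshold
    return var > thresh                      # True where motion is likely

# Toy clip: static background with a bright spot moving down column 0.
clip = np.zeros((8, 4, 4))
for t in range(8):
    clip[t, t % 4, 0] = 1.0
mask = moving_foreground_mask(clip)          # only column 0 is flagged
```

In a full pipeline such a mask would gate the attention weights toward moving regions; the paper additionally mines key volumes and trains with its stochastic-out scheme, which this sketch does not cover.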
Database: OpenAIRE