Symmetrical Enhanced Fusion Network for Skeleton-Based Action Recognition

Authors: Min Jiang, Haoyang Deng, Jun Kong
Publication year: 2021
Subject:
Source: IEEE Transactions on Circuits and Systems for Video Technology. 31:4394-4408
ISSN: 1558-2205, 1051-8215
DOI: 10.1109/tcsvt.2021.3050807
Description: This article proposes a novel method for skeleton-based action recognition that fuses multi-level spatial features with multi-level temporal features. Graph Convolutional Networks (GCNs) for skeleton-based action recognition have recently attracted considerable attention and achieved strong performance in the field. However, most existing work focuses on changing the architecture of a single-stream network and uses only simple methods, such as average fusion, to combine different forms of skeleton data. In this article, we shift the focus to the insufficient interaction between different forms of features, which prevents networks from fully capturing useful information from skeleton data. To tackle this problem, we propose a multi-stream network called the Symmetrical Enhanced Fusion Network (SEFN). The network is composed of a spatial stream, a temporal stream and a fusion stream. The spatial stream extracts spatial features from skeleton data using a GCN. The temporal stream extracts temporal features from skeleton data with the help of the embedded Motion Sequence Calculation Algorithm. The fusion stream provides an early fusion method and extra fusion information for the whole network. It gathers multi-level features from the two feature-extraction streams and fuses them with the proposed Multi-perspective Attention Fusion Module (MPAFM). The MPAFM enables different forms of data to enhance each other and strengthens feature extraction. Finally, we generalize the skeleton data from joint data to bone data and evaluate our network on three large-scale benchmarks: NTU-RGBD, NTU-RGBD 120 and Kinetics-Skeleton. Experimental results demonstrate that our method achieves competitive performance.
Database: OpenAIRE
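
Illustrative note: the abstract mentions three data-handling ideas (joint data versus bone data, motion information along the time axis, and average fusion as the simple baseline) without giving formulas. The Python sketch below shows one common way each is computed, under stated assumptions. It is not the authors' implementation: the record does not specify the Motion Sequence Calculation Algorithm or the MPAFM, so plain frame differencing stands in for the former and the latter is not modeled. All function names and the `parents` list are hypothetical.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]
Frame = List[Joint]  # one skeleton pose: a list of 3D joint coordinates


def joints_to_bones(frame: Frame, parents: Sequence[int]) -> Frame:
    """Bone vector per joint: the joint minus its parent joint.

    Assumption: `parents[i]` is the index of joint i's parent, and the
    root joint lists itself as parent, so its bone is the zero vector.
    """
    return [
        tuple(c - p for c, p in zip(joint, frame[parent]))
        for joint, parent in zip(frame, parents)
    ]


def frame_differences(frames: List[Frame]) -> List[Frame]:
    """Per-joint displacement between consecutive frames (T-1 results).

    Assumption: a generic stand-in for motion data, not the paper's
    Motion Sequence Calculation Algorithm.
    """
    return [
        [tuple(b - a for a, b in zip(j0, j1)) for j0, j1 in zip(f0, f1)]
        for f0, f1 in zip(frames, frames[1:])
    ]


def average_fusion(stream_scores: List[List[float]]) -> List[float]:
    """Element-wise mean of per-class scores from several streams."""
    n = len(stream_scores)
    return [sum(column) / n for column in zip(*stream_scores)]


if __name__ == "__main__":
    parents = [0, 0, 1]  # hypothetical 3-joint chain: root -> joint 1 -> joint 2
    frames = [
        [(0, 0, 0), (0, 1, 0), (0, 2, 0)],
        [(1, 0, 0), (1, 1, 0), (1, 3, 0)],
    ]
    print(joints_to_bones(frames[0], parents))
    print(frame_differences(frames))
    print(average_fusion([[0.25, 0.75], [0.75, 0.25]]))
```

Average fusion treats every stream's scores equally and only at the output, which is the limitation the abstract cites as motivation for fusing features earlier and across multiple levels.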