Infrared and 3D Skeleton Feature Fusion for RGB-D Action Recognition

Authors: Alban Main De Boissiere, Rita Noumeir
Language: English
Year of publication: 2020
Source: IEEE Access, Vol 8, Pp 168297-168308 (2020)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3023599
Description: For skeleton-based action recognition from depth cameras, distinguishing object-related actions with similar motions is a difficult task. The other available video streams (RGB, infrared, depth) may provide additional clues, given an appropriate feature fusion strategy. We propose a modular network combining skeleton and infrared data. A pre-trained 2D convolutional neural network (CNN) is used as a pose module to extract features from skeleton data. A pre-trained 3D CNN is used as an infrared module to extract visual features from videos. The two feature vectors are then fused and exploited jointly using a multilayer perceptron (MLP). The 2D skeleton coordinates are used to crop a region of interest around the subjects in the infrared videos. Infrared is favored over RGB, as it is less affected by illumination conditions and is usable in the dark. We are the first to combine infrared and skeleton data. We evaluate our method on the NTU RGB+D dataset, the largest dataset for human action recognition from depth cameras. We perform extensive ablation studies. In particular, we show the strong contributions of our cropping strategy and of pre-training to action classification accuracy. We also test various feature fusion schemes. Element-wise summation of the features yields the best results. Our method achieves state-of-the-art performance on NTU RGB+D.
Database: Directory of Open Access Journals
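
Illustrative note: the two steps named in the description (cropping a region of interest from 2D skeleton coordinates, then fusing the pose and infrared feature vectors by element-wise sum before an MLP) can be sketched roughly as follows. This is a minimal NumPy sketch and not the authors' implementation: the function names `crop_box` and `fuse_and_classify`, the pixel margin, the single hidden layer, and all array shapes are assumptions made for illustration. The pre-trained 2D and 3D CNNs are not reproduced here; their outputs are stood in for by plain feature vectors of equal length.

```python
import numpy as np


def crop_box(joints_2d, frame_shape, margin=20):
    """Bounding box around all joints over all frames (hypothetical helper).

    joints_2d: array of shape (T, J, 2) holding (x, y) pixel coordinates.
    frame_shape: (height, width) of the infrared frames.
    margin: assumed padding in pixels around the subject.
    Returns (x0, y0, x1, y1), clipped to the frame.
    """
    height, width = frame_shape
    xs, ys = joints_2d[..., 0], joints_2d[..., 1]
    x0 = max(int(xs.min()) - margin, 0)
    y0 = max(int(ys.min()) - margin, 0)
    x1 = min(int(xs.max()) + margin, width)
    y1 = min(int(ys.max()) + margin, height)
    return x0, y0, x1, y1


def fuse_and_classify(pose_feat, ir_feat, w1, b1, w2, b2):
    """Element-wise sum fusion followed by a one-hidden-layer MLP.

    pose_feat and ir_feat must have the same length for the sum to apply.
    The layer sizes and ReLU activation are assumptions of this sketch.
    """
    fused = pose_feat + ir_feat                   # element-wise sum fusion
    hidden = np.maximum(fused @ w1 + b1, 0.0)     # ReLU hidden layer
    return hidden @ w2 + b2                       # class logits


# Toy usage with made-up values: 2 frames, 2 joints, a 424 x 512 frame.
joints = np.array([[[100.0, 50.0], [140.0, 200.0]],
                   [[90.0, 60.0], [150.0, 210.0]]])
x0, y0, x1, y1 = crop_box(joints, frame_shape=(424, 512))
video = np.zeros((2, 424, 512))                   # placeholder infrared clip
cropped = video[:, y0:y1, x0:x1]                  # same crop on every frame

rng = np.random.default_rng(0)
dim, hidden_dim, num_classes = 8, 16, 5           # arbitrary sizes
logits = fuse_and_classify(
    rng.normal(size=dim), rng.normal(size=dim),
    rng.normal(size=(dim, hidden_dim)), np.zeros(hidden_dim),
    rng.normal(size=(hidden_dim, num_classes)), np.zeros(num_classes),
)
```

Summation requires both modules to emit vectors of the same length, which is the main constraint it adds compared with concatenation; in exchange, the input size of the MLP does not grow with the number of modalities.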