L3D: Light and Mixed Kernel Convolutional Neural Network for Action Recognition

Author: Su, Sheng-Ya, 蘇聖雅
Year of publication: 2019
Document type: 學位論文 ; thesis
Description: 107
In this work, we introduce a new network architecture called “L3D”, inspired by the question of whether a single network can focus on spatial features first, then on temporal features, and finally predict the result. Traditional 3D convolutional neural networks extract spatiotemporal features throughout the network, but not every part of the network is used effectively. In other words, using 3D convolutions throughout the network can lead to redundancy. Our model is highly parameter-efficient because it uses different 3D kernel sizes, so that every feature is used in a focused, purposeful way. We empirically demonstrate the accuracy advantage of our model over other 3D CNNs on UCF101 when trained from scratch, and we also obtain comparable results on Kinetics, UCF101, and HMDB51.
Database: Networked Digital Library of Theses & Dissertations
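
The abstract's parameter-efficiency argument can be illustrated with a small calculation. The sketch below is not the thesis's actual architecture: the record does not state which kernel sizes L3D uses, so the shapes (1×3×3 for spatial-only, 3×1×1 for temporal-only, 3×3×3 for joint spatiotemporal), the channel width of 64, and the helper name `conv3d_params` are all assumptions chosen for illustration. It only shows how the kernel shape of a 3D convolution changes its parameter count.

```python
def conv3d_params(c_in, c_out, kernel, bias=True):
    """Count learnable parameters in one 3D convolution layer.

    kernel is (time, height, width). Hypothetical helper for illustration.
    """
    kt, kh, kw = kernel
    weights = c_in * c_out * kt * kh * kw
    return weights + (c_out if bias else 0)


# Assumed channel width; not taken from the thesis.
C = 64

full = conv3d_params(C, C, (3, 3, 3))      # joint spatiotemporal kernel
spatial = conv3d_params(C, C, (1, 3, 3))   # spatial-only kernel
temporal = conv3d_params(C, C, (3, 1, 1))  # temporal-only kernel

print(f"3x3x3: {full}, 1x3x3: {spatial}, 3x1x1: {temporal}")
```

Under these assumptions, a spatial-only layer needs roughly a third of the weights of a full 3×3×3 layer, and a temporal-only layer roughly a ninth, which is one way mixing kernel shapes across a network could reduce redundancy.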