Unsupervised Deep Networks for Temporal Localization of Human Actions in Streaming Videos

Author: Binu M. Nair
Year of publication: 2016
Subject:
Source: Advances in Visual Computing ISBN: 9783319508313
ISVC (2)
DOI: 10.1007/978-3-319-50832-0_15
Description: We propose a deep neural network that captures latent temporal features suitable for temporally localizing actions in streaming videos. The network uses unsupervised generative models, comprising autoencoders and conditional restricted Boltzmann machines, to model the temporal structure present in an action. Human motions are non-linear in nature and thus require a continuous temporal model of motion, which is crucial for streaming videos. The generative ability helps predict features at future time steps, which can indicate how close an action is to completion at any instant. To accommodate M classes of action, we train an autoencoder to separate the action spaces and learn a generative model per action space. The final layer accumulates statistics from each model and estimates the action class and the percentage of completion in a segment of frames. Experimental results show that this network provides the good predictive and recognition capability required for action localization in streaming videos.
Database: OpenAIRE