Space-Time Tree Ensemble for Action Recognition and Localization
Authors: | Shugao Ma, Jianming Zhang, Nazli Ikizler-Cinbis, Stan Sclaroff, Leonid Sigal |
---|---|
Publication year: | 2017 |
Subject: | Vocabulary; Software engineering; Pattern recognition; Engineering and technology; Machine learning; Ranking (information retrieval); Tree (data structure); Tree structure; Discriminative model; Artificial intelligence; Exponential search; Pattern recognition (psychology); Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Pairwise comparison; Computer Vision and Pattern Recognition; Software; Mathematics |
Source: | International Journal of Computer Vision. 126:314-332 |
ISSN: | 1573-1405 0920-5691 |
DOI: | 10.1007/s11263-016-0980-8 |
Description: | Human actions are, inherently, structured patterns of body movements. We explore ensembles of hierarchical spatio-temporal trees, discovered directly from training data, to model these structures for action recognition and spatial localization. Discovery of frequent and discriminative tree structures is challenging due to the exponential search space, particularly if one allows partial matching. We address this by first building a concise action word vocabulary via discriminative clustering of the hierarchical space-time segments, a two-level video representation that captures both static and non-static relevant space-time segments of the video. Using this vocabulary, we then apply tree mining, followed by tree clustering and ranking, to select a compact set of discriminative tree patterns. Our experiments show that these tree patterns, alone or in combination with shorter patterns (action words and pairwise patterns), achieve promising performance on three challenging datasets: UCF Sports, HighFive and Hollywood3D. Moreover, we perform cross-dataset validation, using trees learned on HighFive to recognize the same actions in Hollywood3D, and using trees learned on UCF Sports to recognize and localize similar actions in JHMDB. The results demonstrate the potential for cross-dataset generalization of the trees our approach discovers. |
Database: | OpenAIRE |
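
The abstract mentions ranking mined patterns to keep a compact discriminative set, without giving the scoring rule. The sketch below is only an illustration of that general idea, not the authors' method: it assumes each video is reduced to a set of pattern identifiers (hypothetical strings such as "A-B") and scores a pattern by its support in the target class minus its support in the other classes. The function name `rank_patterns`, the scoring rule, the labels and the toy data are all assumptions made for this example.

```python
from collections import Counter


def rank_patterns(videos, labels, target, top_k=2):
    """Rank patterns by (support in target class) - (support in other classes).

    This scoring rule is an assumption for illustration; the record above
    does not specify how the paper ranks its tree patterns.
    """
    pos = [v for v, y in zip(videos, labels) if y == target]
    neg = [v for v, y in zip(videos, labels) if y != target]

    # Count in how many videos of each group a pattern occurs at least once.
    pos_count = Counter(p for v in pos for p in set(v))
    neg_count = Counter(p for v in neg for p in set(v))

    scores = {}
    for p in set(pos_count) | set(neg_count):
        pos_support = pos_count[p] / len(pos) if pos else 0.0
        neg_support = neg_count[p] / len(neg) if neg else 0.0
        scores[p] = pos_support - neg_support

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return [(p, scores[p]) for p in ranked[:top_k]]


# Toy data: each video is a set of hypothetical pattern identifiers.
videos = [
    {"A-B", "B-C", "A"},
    {"A-B", "A"},
    {"B-C", "A"},
    {"C-D", "A"},
]
labels = ["wave", "wave", "run", "run"]

top = rank_patterns(videos, labels, target="wave", top_k=2)
print(top)
```

In this toy data the pattern "A-B" occurs in every "wave" video and in no "run" video, so it ranks first, while "A" occurs in every video of both classes and is therefore uninformative under this score.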