Jointly Learning Visual Poses and Pose Lexicon for Semantic Action Recognition
Authors: | Wanqing Li, Philip Ogunbona, Zhengyou Zhang, Lijuan Zhou |
---|---|
Year of publication: | 2020 |
Subject: | Computer science; Probabilistic logic; 02 engineering and technology; Lexicon; Visualization; 0202 electrical engineering, electronic engineering, information engineering; Media Technology; Frame (artificial intelligence); 020201 artificial intelligence & image processing; Artificial intelligence; Electrical and Electronic Engineering; Hidden Markov model; Natural language processing |
Source: | IEEE Transactions on Circuits and Systems for Video Technology. 30:457-467 |
ISSN: | 1558-2205 1051-8215 |
DOI: | 10.1109/tcsvt.2019.2890829 |
Description: | This paper presents a novel method for semantic action recognition through learning a pose lexicon. A pose lexicon comprises a set of semantic poses, a set of visual poses, and a probabilistic mapping between the visual and semantic poses. The paper assumes that both the visual poses and the mapping are hidden, and proposes a method to learn two models simultaneously: a visual pose model that estimates the likelihood of an observed video frame being generated from hidden visual poses, and a pose lexicon model that establishes the probabilistic mapping between the hidden visual poses and the semantic poses parsed from textual instructions. Specifically, the proposed method consists of two-level hidden Markov models. One level represents the alignment between the visual poses and the semantic poses. The other level represents a visual pose sequence, in which each visual pose is modeled as a Gaussian mixture. An expectation-maximization algorithm is developed to train the pose lexicon. With the learned lexicon, action classification is formulated as the problem of finding the maximum posterior probability that a given sequence of video frames follows a given sequence of semantic poses, constrained by the most likely visual pose and alignment sequences. The proposed method was evaluated on the MSRC-12, WorkoutSU-10, WorkoutUOW-18, Combined-15, Combined-17, and Combined-50 action datasets using cross-subject, cross-dataset, zero-shot, and seen/unseen protocols. |
Database: | OpenAIRE |
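
The classification step described in the abstract (scoring how well a sequence of video frames follows a sequence of poses, then picking the best-scoring action) can be illustrated with a much-simplified sketch. It is not the paper's two-level model: it uses a single-level, left-to-right hidden Markov model, one 1-D Gaussian per pose instead of a Gaussian mixture, hand-set parameters instead of expectation-maximization training, and a Viterbi best-path score in place of the full posterior. The names `POSES`, `ACTIONS`, `viterbi_score`, and `classify`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_score(frames, pose_seq, poses,
                  log_stay=math.log(0.5), log_move=math.log(0.5)):
    """Best-path log score of frames aligned monotonically to pose_seq.

    Every frame is assigned to one pose. The alignment starts at the first
    pose, must end at the last pose, and at each step either stays in the
    current pose or advances to the next one.
    """
    n = len(pose_seq)
    neg_inf = float("-inf")
    # score[j]: best log score of any path that ends in pose j at this frame
    score = [neg_inf] * n
    score[0] = log_gauss(frames[0], *poses[pose_seq[0]])
    for x in frames[1:]:
        new = [neg_inf] * n
        for j in range(n):
            stay = score[j] + log_stay
            move = score[j - 1] + log_move if j > 0 else neg_inf
            best = max(stay, move)
            if best > neg_inf:
                new[j] = best + log_gauss(x, *poses[pose_seq[j]])
        score = new
    return score[-1]


def classify(frames, actions, poses):
    """Return the action whose pose sequence gives the highest score."""
    return max(actions, key=lambda name: viterbi_score(frames, actions[name], poses))


# Hypothetical toy lexicon: pose name -> (mean, variance) of a 1-D frame feature
POSES = {
    "arms_down": (0.0, 0.1),
    "arms_side": (0.5, 0.1),
    "arms_up": (1.0, 0.1),
}

# Hypothetical actions, each written as a sequence of pose names
ACTIONS = {
    "raise_arms": ["arms_down", "arms_up", "arms_down"],
    "spread_arms": ["arms_down", "arms_side", "arms_down"],
}

if __name__ == "__main__":
    print(classify([0.0, 0.1, 0.9, 1.0, 0.1], ACTIONS, POSES))
```

Working in the log domain avoids numerical underflow on long frame sequences, and the stay-or-advance transition structure is what forces the frames to follow the pose sequence in order.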