Affective interaction recognition using spatio-temporal features and context

Authors: Jinglian Liang, Chao Xu, Zhiyong Feng, Xirong Ma
Year of publication: 2016
Subject:
Source: Computer Vision and Image Understanding. 144:155-165
ISSN: 1077-3142
DOI: 10.1016/j.cviu.2015.10.008
Description: A hierarchical representation structure for interaction recognition is introduced. We adopt hierarchical coding models to encode low-level features. A segmental clustering method is applied to extract the mid-level features. Contextual information is combined with motion features by extracting the interactive contours. We demonstrate empirical results on three datasets. This paper focuses on recognizing human interactions related to human emotion and addresses the problem of interaction feature representation. We propose a two-layer feature description structure that hierarchically exploits spatio-temporal motion features and context features. On the lower layer, local features for motion and for interactive context are extracted separately. We first characterize the local spatio-temporal trajectories as the motion features. Instead of relying on hand-crafted features, we present a new hierarchical spatio-temporal trajectory coding model that learns and represents the local spatio-temporal trajectories. To further exploit the spatial and temporal relationships in interactive activities, we then propose an interactive context descriptor, which extracts the local interactive contours from frames. These contours implicitly incorporate contextual spatial and temporal information. On the higher layer, semi-global features are built from the local features encoded on the lower layer. A spatio-temporal segment clustering method is designed for feature extraction on this layer. This method takes the spatial relationships and temporal order of the local features into account and produces the mid-level motion features and mid-level context features. Experiments are conducted on three challenging video action datasets: HMDB51, Hollywood2, and UT-Interaction. The results demonstrate the efficacy of the proposed structure and validate the effectiveness of the proposed context descriptor.
Database: OpenAIRE