Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation

Authors:	Pallabi Ghosh, Larry S. Davis, Yi Yao, Ajay Divakaran
Year of publication:	2020
Subject:	FOS: Computer and information sciences; Computer science; business.industry; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Inference; Pattern recognition; 02 engineering and technology; 010501 environmental sciences; 01 natural sciences; Data type; Improved performance; Performance comparison; 0202 electrical engineering, electronic engineering, information engineering; Graph (abstract data type); Leverage (statistics); Action recognition; 020201 artificial intelligence & image processing; Segmentation; Artificial intelligence; business; 0105 earth and related environmental sciences
Source:	WACV
DOI:	10.1109/wacv45572.2020.9093361
Description:	We propose novel Stacked Spatio-Temporal Graph Convolutional Networks (Stacked-STGCN) for action segmentation, i.e., predicting and localizing a sequence of actions over long videos. We extend the Spatio-Temporal Graph Convolutional Network (STGCN) originally proposed for skeleton-based action recognition to enable nodes with different characteristics (e.g., scene, actor, object, action), feature descriptors with varied lengths, and arbitrary temporal edge connections to account for large graph deformation commonly associated with complex activities. We further introduce the stacked hourglass architecture to STGCN to leverage the advantages of an encoder-decoder design for improved generalization performance and localization accuracy. We explore various descriptors such as frame-level VGG, segment-level I3D, RCNN-based object, etc. as node descriptors to enable action segmentation based on joint inference over comprehensive contextual information. We show results on CAD120 (which provides pre-computed node features and edge weights for fair performance comparison across algorithms) as well as a more complex real-world activity dataset, Charades. Our Stacked-STGCN in general achieves improved performance over the state-of-the-art for both CAD120 and Charades. Moreover, due to its generic design, Stacked-STGCN can be applied to a wider range of applications that require structured inference over long sequences with heterogeneous data types and varied temporal extent.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b0878519d2d9d4a5ed931edf2df8f72a https://doi.org/10.1109/wacv45572.2020.9093361
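
Illustrative sketch (not the authors' implementation): the description mentions nodes of different types whose feature descriptors have varied lengths, joined by spatial and temporal edges. One way such a graph could be processed is to project each node type to a shared hidden size and then run a graph-convolution step. The code below assumes a symmetric normalization with self-loops, which is a common choice for graph convolutions but is not confirmed by the record. The node types, feature values, projection matrices, and function names are all hypothetical, the learnable layer weights are omitted, and the stacked hourglass part is not covered.

```python
import math

def project(features, weights):
    """Linear map from a type-specific feature length to the shared hidden size."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j) (assumed scheme)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def graph_conv(adj_norm, hidden):
    """One propagation step: weighted sum over neighbours, followed by ReLU."""
    n, d = len(hidden), len(hidden[0])
    out = []
    for i in range(n):
        row = [sum(adj_norm[i][j] * hidden[j][k] for j in range(n)) for k in range(d)]
        out.append([max(0.0, v) for v in row])
    return out

# Hypothetical toy graph: two time steps, each with a "scene" node (3 features)
# and an "object" node (2 features). Node order: scene_t0, object_t0, scene_t1, object_t1.
raw = [
    ("scene", [1.0, 0.0, 2.0]),
    ("object", [0.5, 1.5]),
    ("scene", [0.0, 1.0, 1.0]),
    ("object", [2.0, 0.5]),
]

# Hypothetical per-type projections into a shared 2-dimensional hidden space.
proj = {
    "scene": [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    "object": [[1.0, 0.0], [0.0, 1.0]],
}
hidden = [project(x, proj[kind]) for kind, x in raw]

# Spatial edges link scene and object within a time step;
# temporal edges link the same node type across time steps.
adj = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]

out = graph_conv(normalize_adjacency(adj), hidden)
```

Because the temporal edges are ordinary entries in `adj`, connecting a node to a non-adjacent time step only means setting a different entry; nothing else in the sketch changes.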