Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning

Authors:	Yu Zhou, Dezhao Luo, Qixiang Ye, Can Ma, Chang Liu, Dongbao Yang, Weiping Wang
Language:	English
Year of publication:	2020
Subject:	FOS: Computer and information sciences; 0209 industrial biotechnology; Computer science; Computer Vision and Pattern Recognition (cs.CV); Flexibility (personality); 02 engineering and technology; General Medicine; Machine learning; Task (project management); 020901 industrial engineering & automation; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; Representation (mathematics); Proxy (statistics); Temporal learning; Feature learning
Source:	AAAI
Description:	We propose a novel self-supervised method, referred to as Video Cloze Procedure (VCP), to learn rich spatio-temporal representations. VCP first generates "blanks" by withholding video clips and then creates "options" by applying spatio-temporal operations to the withheld clips. Finally, it fills the blanks with the "options" and learns representations by predicting the categories of the operations applied to the clips. VCP can act as either a proxy task or a target task in self-supervised learning. As a proxy task, it converts rich self-supervised representations into video clip operations (options), which enhances the flexibility and reduces the complexity of representation learning. As a target task, it can assess learned representation models in a uniform and interpretable manner. With VCP, we train spatio-temporal representation models (3D-CNNs) and apply them to action recognition and video retrieval tasks. Experiments on commonly used benchmarks show that the trained models outperform state-of-the-art self-supervised models by significant margins. AAAI 2020 (Oral).
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c6e4b4fb8814a89d014894047612d678 http://arxiv.org/abs/2001.00294
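
The blank-and-option procedure outlined in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the operation set (`identity`, `rotate90`, `reverse_time`), the helper `make_cloze_sample`, and the toy list-of-lists video format are all assumptions made for illustration, since the record does not list the operations actually used. The sketch only builds one training sample (clips, blank position, operation label); the 3D-CNN that would predict the label is omitted.

```python
import random

# Hypothetical operation set, chosen for illustration only; the record does
# not specify which spatio-temporal operations the authors apply.
def identity(clip):
    # Return an unchanged copy of the clip.
    return [[row[:] for row in frame] for frame in clip]

def rotate90(clip):
    # Spatial operation: rotate every frame 90 degrees clockwise.
    return [[list(row) for row in zip(*frame[::-1])] for frame in clip]

def reverse_time(clip):
    # Temporal operation: play the frames of the clip backwards.
    return [[row[:] for row in frame] for frame in clip[::-1]]

OPERATIONS = [identity, rotate90, reverse_time]

def make_cloze_sample(video, clip_len, rng):
    """Build one cloze sample from a video given as a list of 2D frames.

    Returns (clips, blank_index, label), where clips[blank_index] has been
    replaced by an "option" and label is the index of the applied operation.
    """
    # Split the video into consecutive, non-overlapping clips.
    clips = [video[i:i + clip_len]
             for i in range(0, len(video) - clip_len + 1, clip_len)]
    # Withhold one clip: this position is the "blank".
    blank_index = rng.randrange(len(clips))
    # Create an "option" by applying a randomly chosen operation.
    label = rng.randrange(len(OPERATIONS))
    clips[blank_index] = OPERATIONS[label](clips[blank_index])
    return clips, blank_index, label

if __name__ == "__main__":
    # Toy video: 6 frames of 2x2 pixels, so that rotation keeps the shape.
    video = [[[t, t + 1], [t + 2, t + 3]] for t in range(6)]
    clips, blank_index, label = make_cloze_sample(video, 2, random.Random(0))
    print(len(clips), blank_index, OPERATIONS[label].__name__)
```

A model trained on such samples would receive the clip sequence and be asked to classify which operation was applied at the blank, so the label comes from the data itself rather than from manual annotation.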