Search results

Report

EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation

Authors: Pei, Baoqi; Chen, Guo; Xu, Jilan; He, Yuping; Liu, Yicheng; Pan, Kanghua; Huang, Yifei; Wang, Yali; Lu, Tong; Wang, Limin; Qiao, Yu

In this report, we present our solutions to the EgoVis Challenges in CVPR 2024, including five tracks in the Ego4D challenge and three tracks in the EPIC-Kitchens challenge. Building upon the video-language two-tower model and leveraging our meticulo

External link: http://arxiv.org/abs/2406.18070

Report

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World

Authors: Huang, Yifei; Chen, Guo; Xu, Jilan; Zhang, Mingfang; Yang, Lijin; Pei, Baoqi; Zhang, Hongjie; Dong, Lu; Wang, Yali; Wang, Limin; Qiao, Yu

Being able to map the activities of others into one's own point of view is one fundamental human skill even from a very early age. Taking a step toward understanding this human ability, we introduce EgoExoLearn, a large-scale dataset that emulates th

External link: http://arxiv.org/abs/2403.16182

Report

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding

We introduce InternVideo2, a new family of video foundation models (ViFM) that achieve the state-of-the-art results in video recognition, video-text tasks, and video-centric dialogue. Our core design is a progressive training approach that unifies th

External link: http://arxiv.org/abs/2403.15377

Report

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding

Authors: Chen, Guo; Huang, Yifei; Xu, Jilan; Pei, Baoqi; Chen, Zhe; Li, Zhiqi; Wang, Jiahao; Li, Kunchang; Lu, Tong; Wang, Limin

Understanding videos is one of the fundamental directions in computer vision research, with extensive efforts dedicated to exploring various architectures such as RNN, 3D CNN, and Transformers. The newly proposed architecture of state space model, e.

External link: http://arxiv.org/abs/2403.09626
