Showing 1 - 4 of 4 for search: '"Pei, Baoqi"'
Author:
Pei, Baoqi, Chen, Guo, Xu, Jilan, He, Yuping, Liu, Yicheng, Pan, Kanghua, Huang, Yifei, Wang, Yali, Lu, Tong, Wang, Limin, Qiao, Yu
In this report, we present our solutions to the EgoVis Challenges in CVPR 2024, including five tracks in the Ego4D challenge and three tracks in the EPIC-Kitchens challenge. Building upon the video-language two-tower model and leveraging our meticulo
External link:
http://arxiv.org/abs/2406.18070
Author:
Huang, Yifei, Chen, Guo, Xu, Jilan, Zhang, Mingfang, Yang, Lijin, Pei, Baoqi, Zhang, Hongjie, Dong, Lu, Wang, Yali, Wang, Limin, Qiao, Yu
Being able to map the activities of others into one's own point of view is one fundamental human skill even from a very early age. Taking a step toward understanding this human ability, we introduce EgoExoLearn, a large-scale dataset that emulates th
External link:
http://arxiv.org/abs/2403.16182
Author:
Wang, Yi, Li, Kunchang, Li, Xinhao, Yu, Jiashuo, He, Yinan, Wang, Chenting, Chen, Guo, Pei, Baoqi, Yan, Ziang, Zheng, Rongkun, Xu, Jilan, Wang, Zun, Shi, Yansong, Jiang, Tianxiang, Li, Songze, Zhang, Hongjie, Huang, Yifei, Qiao, Yu, Wang, Yali, Wang, Limin
We introduce InternVideo2, a new family of video foundation models (ViFM) that achieve the state-of-the-art results in video recognition, video-text tasks, and video-centric dialogue. Our core design is a progressive training approach that unifies th
External link:
http://arxiv.org/abs/2403.15377
Author:
Chen, Guo, Huang, Yifei, Xu, Jilan, Pei, Baoqi, Chen, Zhe, Li, Zhiqi, Wang, Jiahao, Li, Kunchang, Lu, Tong, Wang, Limin
Understanding videos is one of the fundamental directions in computer vision research, with extensive efforts dedicated to exploring various architectures such as RNN, 3D CNN, and Transformers. The newly proposed architecture of state space model, e.
External link:
http://arxiv.org/abs/2403.09626