Showing 1 - 10 of 196 for search: '"CARREIRA, João"'
Author:
Carreira, João, Gokay, Dilara, King, Michael, Zhang, Chuhan, Rocco, Ignacio, Mahendran, Aravindh, Keck, Thomas Albert, Heyward, Joseph, Koppula, Skanda, Pot, Etienne, Erdogan, Goker, Hasson, Yana, Yang, Yi, Greff, Klaus, Moing, Guillaume Le, van Steenkiste, Sjoerd, Zoran, Daniel, Hudson, Drew A., Vélez, Pedro, Polanía, Luisa, Friedman, Luke, Duvarney, Chris, Goroshin, Ross, Allen, Kelsey, Walker, Jacob, Kabra, Rishabh, Aboussouan, Eric, Sun, Jennifer, Kipf, Thomas, Doersch, Carl, Pătrăucean, Viorica, Damen, Dima, Luc, Pauline, Sajjadi, Mehdi S. M., Zisserman, Andrew
Scaling has not yet been convincingly demonstrated for pure self-supervised learning from video. However, prior work has focused evaluations on semantic-related tasks – action classification, ImageNet classification, etc. In this paper…
External link:
http://arxiv.org/abs/2412.15212
Author:
Pătrăucean, Viorica, He, Xu Owen, Heyward, Joseph, Zhang, Chuhan, Sajjadi, Mehdi S. M., Muraru, George-Cristian, Zholus, Artem, Karami, Mahdi, Goroshin, Ross, Chen, Yutian, Osindero, Simon, Carreira, João, Pascanu, Razvan
We propose a novel block for video modelling. It relies on a time-space-channel factorisation with dedicated blocks for each dimension: gated linear recurrent units (LRUs) perform information mixing over time, self-attention layers perform mixing over…
External link:
http://arxiv.org/abs/2412.14294
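The entry above describes a time-space-channel factorised video block: gated LRUs mix information over time and self-attention mixes over space. As a rough illustration of the factorisation idea only (not the paper's implementation; the decay constant, the per-frame attention, and the plain linear channel map are all assumptions), here is a minimal NumPy sketch where each axis of a (time, space, channel) tensor gets its own dedicated mixing stage:

```python
import numpy as np

def lru_time_mix(x, decay=0.9):
    """Linear recurrence over the time axis: h_t = a*h_{t-1} + (1-a)*x_t.
    A stand-in for the gated LRU named in the abstract (decay is an assumption)."""
    h = np.zeros_like(x)
    state = np.zeros_like(x[0])
    for t in range(x.shape[0]):
        state = decay * state + (1.0 - decay) * x[t]
        h[t] = state
    return h

def attention_space_mix(x):
    """Plain softmax self-attention over the space axis, applied per frame.
    x: (T, S, C); queries, keys and values are the input itself (no learned weights)."""
    T, S, C = x.shape
    out = np.empty_like(x)
    for t in range(T):
        scores = x[t] @ x[t].T / np.sqrt(C)           # (S, S) similarity
        scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)            # rows sum to 1
        out[t] = w @ x[t]
    return out

def channel_mix(x, W):
    """Channel mixing: a position-wise linear map over the channel axis."""
    return x @ W

def factorised_block(x, W):
    """One dedicated stage per dimension, applied in sequence."""
    x = lru_time_mix(x)         # mix over time
    x = attention_space_mix(x)  # mix over space
    x = channel_mix(x, W)       # mix over channels
    return x
```

Because each stage touches only one axis at a time, the block avoids full spatio-temporal attention; that separability is the point of the factorisation.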
We introduce a hierarchical probabilistic approach to go from a 2D image to multiview 3D: a diffusion "prior" models the unseen 3D geometry, which then conditions a diffusion "decoder" to generate novel views of the subject. We use a pointmap-based g…
External link:
http://arxiv.org/abs/2412.10273
Following the successful 2023 edition, we organised the Second Perception Test challenge as a half-day workshop alongside the IEEE/CVF European Conference on Computer Vision (ECCV) 2024, with the goal of benchmarking state-of-the-art video models and…
External link:
http://arxiv.org/abs/2411.19941
Author:
van Steenkiste, Sjoerd, Zoran, Daniel, Yang, Yi, Rubanova, Yulia, Kabra, Rishabh, Doersch, Carl, Gokay, Dilara, Heyward, Joseph, Pot, Etienne, Greff, Klaus, Hudson, Drew A., Keck, Thomas Albert, Carreira, Joao, Dosovitskiy, Alexey, Sajjadi, Mehdi S. M., Kipf, Thomas
Current vision models typically maintain a fixed correspondence between their representation structure and image space. Each layer comprises a set of tokens arranged "on-the-grid," which biases patches or tokens to encode information at a specific sp…
External link:
http://arxiv.org/abs/2411.05927
Author:
Koppula, Skanda, Rocco, Ignacio, Yang, Yi, Heyward, Joe, Carreira, João, Zisserman, Andrew, Brostow, Gabriel, Doersch, Carl
We introduce a new benchmark, TAPVid-3D, for evaluating the task of long-range Tracking Any Point in 3D (TAP-3D). While point tracking in two dimensions (TAP) has many benchmarks measuring performance on real-world videos, such as TAPVid-DAVIS, three…
External link:
http://arxiv.org/abs/2407.05921
Author:
Doersch, Carl, Luc, Pauline, Yang, Yi, Gokay, Dilara, Koppula, Skanda, Gupta, Ankush, Heyward, Joseph, Rocco, Ignacio, Goroshin, Ross, Carreira, João, Zisserman, Andrew
To endow models with greater understanding of physics and motion, it is useful to enable them to perceive how solid surfaces move and deform in real scenes. This can be formalized as Tracking-Any-Point (TAP), which requires the algorithm to track any…
External link:
http://arxiv.org/abs/2402.00847
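The two entries above concern Tracking-Any-Point (TAP), where an algorithm must track arbitrary query points through a video. A common way to score such tracks is the fraction of visible points predicted within a pixel threshold of ground truth, averaged over a ladder of thresholds. The sketch below assumes the widely used {1, 2, 4, 8, 16}-pixel ladder; the benchmarks' exact protocols (e.g. occlusion handling, per-track averaging) may differ:

```python
import numpy as np

def position_accuracy(pred, gt, visible, thresholds=(1, 2, 4, 8, 16)):
    """Average fraction of visible points predicted within each pixel
    threshold of the ground-truth location.

    pred, gt: (N, T, 2) point tracks in pixels (N tracks, T frames).
    visible:  (N, T) boolean mask of frames where the point is visible.
    Returns the mean fraction over the thresholds."""
    err = np.linalg.norm(pred - gt, axis=-1)  # (N, T) pixel distances
    fracs = []
    for thr in thresholds:
        correct = (err <= thr) & visible
        fracs.append(correct.sum() / max(visible.sum(), 1))
    return float(np.mean(fracs))
```

For example, a prediction that is uniformly 3 pixels off scores 0 at the 1- and 2-pixel thresholds and 1 at the 4-, 8- and 16-pixel thresholds, giving 0.6 overall.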
The First Perception Test challenge was held as a half-day workshop alongside the IEEE/CVF International Conference on Computer Vision (ICCV) 2023, with the goal of benchmarking state-of-the-art video models on the recently proposed Perception Test benchmark…
External link:
http://arxiv.org/abs/2312.13090
Author:
Carreira, João, King, Michael, Pătrăucean, Viorica, Gokay, Dilara, Ionescu, Cătălin, Yang, Yi, Zoran, Daniel, Heyward, Joseph, Doersch, Carl, Aytar, Yusuf, Damen, Dima, Zisserman, Andrew
We introduce a framework for online learning from a single continuous video stream -- the way people and animals learn, without mini-batches, data augmentation or shuffling. This poses great challenges given the high correlation between consecutive v…
External link:
http://arxiv.org/abs/2312.00598
Author:
Venkataramanan, Shashanka, Rizve, Mamshad Nayeem, Carreira, João, Asano, Yuki M., Avrithis, Yannis
Self-supervised learning has unlocked the potential of scaling up pretraining to billions of images, since annotation is unnecessary. But are we making the best use of data? How more economical can we be? In this work, we attempt to answer this question…
External link:
http://arxiv.org/abs/2310.08584