Search results

Report

EgoPoints: Advancing Point Tracking for Egocentric Videos

Authors: Darkhalil, Ahmad, Guerrier, Rhodri, Harley, Adam W., Damen, Dima

We introduce EgoPoints, a benchmark for point tracking in egocentric videos. We annotate 4.7K challenging tracks in egocentric sequences. Compared to the popular TAP-Vid-DAVIS evaluation benchmark, we include 9x more points that go out-of-view and 59

External link: http://arxiv.org/abs/2412.04592

Report

ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions

Authors: Souček, Tomáš, Gatti, Prajwal, Wray, Michael, Laptev, Ivan, Damen, Dima, Sivic, Josef

The goal of this work is to generate step-by-step visual instructions in the form of a sequence of images, given an input image that provides the scene context and the sequence of textual instructions. This is a challenging problem as it requires gen

External link: http://arxiv.org/abs/2412.01987

Report

Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark

Authors: Heyward, Joseph, Carreira, João, Damen, Dima, Zisserman, Andrew, Pătrăucean, Viorica

Following the successful 2023 edition, we organised the Second Perception Test challenge as a half-day workshop alongside the IEEE/CVF European Conference on Computer Vision (ECCV) 2024, with the goal of benchmarking state-of-the-art video models and

External link: http://arxiv.org/abs/2411.19941

Report

Context-Aware Multimodal Pretraining

Authors: Roth, Karsten, Akata, Zeynep, Damen, Dima, Balažević, Ivana, Hénaff, Olivier J.

Large-scale multimodal representation learning successfully optimizes for zero-shot transfer at test time. Yet the standard pretraining paradigm (contrastive learning on large amounts of image-text data) does not explicitly encourage representations

External link: http://arxiv.org/abs/2411.15099

Report

Feasibility study of a novel thermal neutron detection system using event mode camera and LYSO scintillation crystal

Authors: Gao, Tianqi, Alsulimane, Mohammad, Burdin, Sergey, D'Amen, Gabriele, Da Via, Cinzia, Mavrokoridis, Konstantinos, Nomerotski, Andrei, Roberts, Adam, Svihra, Peter, Taylor, Jon, Tricoli, Alessandro

The feasibility study of a new technique for thermal neutron detection using a Timepix3 camera (TPX3Cam) with custom-made optical add-ons operated in event-mode data acquisition is presented. The camera has a spatial resolution of ~16 µm and a tempo

External link: http://arxiv.org/abs/2411.12095

Report

It's Just Another Day: Unique Video Captioning by Discriminative Prompting

Authors: Perrett, Toby, Han, Tengda, Damen, Dima, Zisserman, Andrew

Long videos contain many repeating actions, events and shots. These repetitions are frequently given identical captions, which makes it difficult to retrieve the exact desired clip using a text search. In this paper, we formulate the problem of uniqu

External link: http://arxiv.org/abs/2410.11702

Report

AMEGO: Active Memory from long EGOcentric videos

Authors: Goletto, Gabriele, Nagarajan, Tushar, Averta, Giuseppe, Damen, Dima

Egocentric videos provide a unique perspective into individuals' daily experiences, yet their unstructured nature presents challenges for perception. In this paper, we introduce AMEGO, a novel approach aimed at enhancing the comprehension of very-lon

External link: http://arxiv.org/abs/2409.10917

Report

Inspired by AI? A Novel Generative AI System To Assist Conceptual Automotive Design

Authors: Wang, Ye, Damen, Nicole B., Gale, Thomas, Seo, Voho, Shayani, Hooman

Published in: IDETC 2024

Design inspiration is crucial for establishing the direction of a design as well as evoking feelings and conveying meanings during the conceptual design process. Many practice designers use text-based searches on platforms like Pinterest to gather im

External link: http://arxiv.org/abs/2407.11991

Report

Rank2Reward: Learning Shaped Reward Functions from Passive Video

Authors: Yang, Daniel, Tjia, Davin, Berg, Jacob, Damen, Dima, Agrawal, Pulkit, Gupta, Abhishek

Teaching robots novel skills with demonstrations via human-in-the-loop data collection techniques like kinesthetic teaching or teleoperation puts a heavy burden on human supervisors. In contrast to this paradigm, it is often significantly easier to p

External link: http://arxiv.org/abs/2404.14735

Report

HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision

Authors: Bansal, Siddhant, Wray, Michael, Damen, Dima

Large Vision Language Models (VLMs) are now the de facto state-of-the-art for a number of tasks including visual question answering, recognising objects, and spatial referral. In this work, we propose the HOI-Ref task for egocentric images that aims

External link: http://arxiv.org/abs/2404.09933
