Showing 1 - 10 of 436 results for search: '"Lara, Laura"'
Authors:
Hao, Xinyue, Li, Gen, Gowda, Shreyank N, Fisher, Robert B, Huang, Jonathan, Arnab, Anurag, Sevilla-Lara, Laura
Video understanding has made huge strides in recent years, relying largely on the power of the transformer architecture. As this architecture is notoriously expensive and video is highly redundant, research into improving efficiency has become partic
External link:
http://arxiv.org/abs/2411.13626
Zero-shot action recognition requires a strong ability to generalize from pre-training and seen classes to novel unseen classes. Similarly, continual learning aims to develop models that can generalize effectively and learn new tasks without forgetti
External link:
http://arxiv.org/abs/2410.10497
Authors:
Li, Gen, Tsagkas, Nikolaos, Song, Jifei, Mon-Williams, Ruaridh, Vijayakumar, Sethu, Shao, Kun, Sevilla-Lara, Laura
Affordance, defined as the potential actions that an object offers, is crucial for robotic manipulation tasks. A deep understanding of affordance can lead to more intelligent AI systems. For example, such knowledge directs an agent to grasp a knife b
External link:
http://arxiv.org/abs/2408.10123
We focus on the problem of recognising the end state of an action in an image, which is critical for understanding what action is performed and in which manner. We study this focusing on the task of predicting the coarseness of a cut, i.e., deciding
External link:
http://arxiv.org/abs/2405.07723
Procedural videos, exemplified by recipe demonstrations, are instrumental in conveying step-by-step instructions. However, understanding such videos is challenging as it involves the precise localization of steps and the generation of textual instruc
External link:
http://arxiv.org/abs/2311.15964
Authors:
Gowda, Shreyank N, Hao, Xinyue, Li, Gen, Gowda, Shashank Narayana, Jin, Xiaobo, Sevilla-Lara, Laura
Deep learning models have revolutionized various fields, from image recognition to natural language processing, by achieving unprecedented levels of accuracy. However, their increasing energy consumption has raised concerns about their environmental
External link:
http://arxiv.org/abs/2310.06522
Authors:
Gowda, Shreyank N, Sevilla-Lara, Laura
Video understanding has long suffered from reliance on large labeled datasets, motivating research into zero-shot learning. Recent progress in language modeling presents opportunities to advance zero-shot video analysis, but constructing an effective
External link:
http://arxiv.org/abs/2309.17327
The goal of this work is to understand the way actions are performed in videos. That is, given a video, we aim to predict an adverb indicating a modification applied to the action (e.g. cut "finely"). We cast this problem as a regression task. We mea
External link:
http://arxiv.org/abs/2303.15086