Výsledky vyhledávání - "Şener, Fadime"

Report

Long-Tail Temporal Action Segmentation with Group-wise Temporal Logit Adjustment

Autor: Pang, Zhanzhong, Sener, Fadime, Ramasubramanian, Shrinivas, Yao, Angela

Procedural activity videos often exhibit a long-tailed action distribution due to varying action frequencies and durations. However, state-of-the-art temporal action segmentation methods overlook the long tail and fail to recognize tail actions. Exis

Externí odkaz: http://arxiv.org/abs/2408.09919

Zobrazit plný text záznamu

Report

X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization

Autor: Kukleva, Anna, Sener, Fadime, Remelli, Edoardo, Tekin, Bugra, Sauser, Eric, Schiele, Bernt, Ma, Shugao

Lately, there has been growing interest in adapting vision-language models (VLMs) to image and third-person video classification due to their success in zero-shot recognition. However, the adaptation of these models to egocentric videos has been larg

Externí odkaz: http://arxiv.org/abs/2403.19811

Zobrazit plný text záznamu

Report

DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions

Autor: Christen, Sammy, Hampali, Shreyas, Sener, Fadime, Remelli, Edoardo, Hodan, Tomas, Sauser, Eric, Ma, Shugao, Tekin, Bugra

Publikováno v: SIGGRAPH Asia Conference Papers, Article 145, 2024

Generating natural hand-object interactions in 3D is challenging as the resulting hand and object motions are expected to be physically plausible and semantically meaningful. Furthermore, generalization to unseen objects is hindered by the limited sc

Externí odkaz: http://arxiv.org/abs/2403.17827

Zobrazit plný text záznamu

Report

On the Utility of 3D Hand Poses for Action Recognition

Autor: Shamil, Md Salman, Chatterjee, Dibyadip, Sener, Fadime, Ma, Shugao, Yao, Angela

3D hand pose is an underexplored modality for action recognition. Poses are compact yet informative and can greatly benefit applications with limited compute budgets. However, poses alone offer an incomplete understanding of actions, as they cannot f

Externí odkaz: http://arxiv.org/abs/2403.09805

Zobrazit plný text záznamu

Report

Opening the Vocabulary of Egocentric Actions

Autor: Chatterjee, Dibyadip, Sener, Fadime, Ma, Shugao, Yao, Angela

Human actions in egocentric videos are often hand-object interactions composed from a verb (performed by the hand) applied to an object. Despite their extensive scaling up, egocentric datasets still face two limitations - sparsity of action compositi

Externí odkaz: http://arxiv.org/abs/2308.11488

Zobrazit plný text záznamu

Report

Every Mistake Counts in Assembly

Autor: Ding, Guodong, Sener, Fadime, Ma, Shugao, Yao, Angela

One promising use case of AI assistants is to help with complex procedures like cooking, home repair, and assembly tasks. Can we teach the assistant to interject after the user makes a mistake? This paper targets the problem of identifying ordering m

Externí odkaz: http://arxiv.org/abs/2307.16453

Zobrazit plný text záznamu

Report

AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation

Autor: Ohkawa, Takehiko, He, Kun, Sener, Fadime, Hodan, Tomas, Tran, Luan, Keskin, Cem

We present AssemblyHands, a large-scale benchmark dataset with accurate 3D hand pose annotations, to facilitate the study of egocentric activities with challenging hand-object interactions. The dataset includes synchronized egocentric and exocentric

Externí odkaz: http://arxiv.org/abs/2304.12301

Zobrazit plný text záznamu

Report

Temporal Action Segmentation: An Analysis of Modern Techniques

Autor: Ding, Guodong, Sener, Fadime, Yao, Angela

Temporal action segmentation (TAS) in videos aims at densely identifying video frames in minutes-long videos with multiple action classes. As a long-range video understanding task, researchers have developed an extended collection of methods and exam

Externí odkaz: http://arxiv.org/abs/2210.10352

Zobrazit plný text záznamu

Report

Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities

Autor: Sener, Fadime, Chatterjee, Dibyadip, Shelepov, Daniel, He, Kun, Singhania, Dipika, Wang, Robert, Yao, Angela

Assembly101 is a new procedural activity dataset featuring 4321 videos of people assembling and disassembling 101 "take-apart" toy vehicles. Participants work without fixed instructions, and the sequences feature rich and natural variations in action

Externí odkaz: http://arxiv.org/abs/2203.14712

Zobrazit plný text záznamu

Report

Transformed ROIs for Capturing Visual Transformations in Videos

Autor: Rai, Abhinav, Sener, Fadime, Yao, Angela

Modeling the visual changes that an action brings to a scene is critical for video understanding. Currently, CNNs process one local neighbourhood at a time, thus contextual relationships over longer ranges, while still learnable, are indirect. We pre

Externí odkaz: http://arxiv.org/abs/2106.03162

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání