Showing 1 - 10 of 1,614 for search: '"Little, James"'
The emergence of attention-based transformer models has led to their extensive use in various tasks, due to their superior generalization and transfer properties. Recent research has demonstrated that such models, when prompted appropriately, are exc
External link:
http://arxiv.org/abs/2404.11732
Existing dense or paragraph video captioning approaches rely on holistic representations of videos, possibly coupled with learned object/action representations, to condition hierarchical language decoders. However, they fundamentally lack the commons
External link:
http://arxiv.org/abs/2303.07545
Published in:
2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2024, pp. 988-998
Recent advances in pixel-level tasks (e.g. segmentation) illustrate the benefit of long-range interactions between aggregated region-based representations that can enhance local features. However, such aggregated representations, often in the form
External link:
http://arxiv.org/abs/2212.03338
We propose a bootstrapping framework to enhance human optical flow and pose. We show that, for videos involving humans in scenes, we can improve both the optical flow and the pose estimation quality of humans by considering the two tasks at the same
External link:
http://arxiv.org/abs/2210.15121
Neural Radiance Fields (NeRFs) increase reconstruction detail for novel view synthesis and scene reconstruction, with applications ranging from large static scenes to dynamic human motion. However, the increased resolution and model-free nature of su
External link:
http://arxiv.org/abs/2206.11952
Published in:
Computer Vision and Image Understanding, vol. 247, October 2024
Human pose estimation from single images is a challenging problem that is typically solved by supervised learning. Unfortunately, labeled training data does not yet exist for many human activities since 3D annotation requires dedicated motion capture
External link:
http://arxiv.org/abs/2112.07088
The problem of language grounding has attracted much attention in recent years due to its pivotal role in more general image-lingual high level reasoning tasks (e.g., image captioning, VQA). Despite the tremendous progress in visual grounding, the pe
External link:
http://arxiv.org/abs/1912.00076