Showing 1 - 10 of 91 for search: '"Ehinger, Krista A."'
Exploring the narratives conveyed by fine-art paintings is a challenge in image captioning, where the goal is to generate descriptions that not only precisely represent the visual content but also offer an in-depth interpretation of the artwork's mean
External link:
http://arxiv.org/abs/2409.10921
To fully understand the 3D context of a single image, a visual system must be able to segment both the visible and occluded regions of objects, while discerning their occlusion order. Ideally, the system should be able to handle any object and not be
External link:
http://arxiv.org/abs/2405.05791
We present a novel bi-directional Transformer architecture (BiXT) which scales linearly with input size in terms of computational cost and memory consumption, but does not suffer the drop in performance or limitation to only one input modality seen w
External link:
http://arxiv.org/abs/2402.12138
This paper introduces two key contributions aimed at improving the speed and quality of images generated through inverse diffusion processes. The first contribution involves reparameterizing the diffusion process in terms of the angle on a quarter-ci
External link:
http://arxiv.org/abs/2310.17167
It has been shown recently that successful techniques in classical planning, such as goal-oriented heuristics and landmarks, can improve the ability to compute planning programs for generalized planning (GP) problems. In this work, we introduce the n
External link:
http://arxiv.org/abs/2307.00735
Images of realistic scenes often contain intra-class objects that are heavily occluded from each other, making the amodal perception task that requires parsing the occluded parts of the objects challenging. Although important for downstream tasks suc
External link:
http://arxiv.org/abs/2303.06596
Existing computer vision systems can compete with humans in understanding the visible parts of objects, but still fall far short of humans when it comes to depicting the invisible parts of partially occluded objects. Image amodal completion aims to e
External link:
http://arxiv.org/abs/2207.02062
Convolutional neural network (CNN) models for computer vision are powerful but lack explainability in their most basic form. This deficiency remains a key challenge when applying CNNs in important domains. Recent work on explanations through feature
External link:
http://arxiv.org/abs/2006.15417
Published in:
Pattern Recognition, November 2023, 143
Published in:
Computer Vision and Image Understanding, March 2023, 229