Showing 1 - 10 of 44 for the search: '"Hong, Sunghwan"'
Author:
Shin, Heeseong, Kim, Chaehyun, Hong, Sunghwan, Cho, Seokju, Arnab, Anurag, Seo, Paul Hongsuck, Kim, Seungryong
Large-scale vision-language models like CLIP have demonstrated impressive open-vocabulary capabilities for image-level tasks, excelling in recognizing what objects are present. However, they struggle with pixel-level recognition tasks like semantic segmentation…
External link:
http://arxiv.org/abs/2409.19846
This paper introduces a Transformer-based integrative feature and cost aggregation network designed for dense matching tasks. In the context of dense matching, many works benefit from one of two forms of aggregation: feature aggregation, which pertains to…
External link:
http://arxiv.org/abs/2403.11120
This work delves into the task of pose-free novel view synthesis from stereo pairs, a challenging and pioneering task in 3D vision. Our innovative framework, unlike any before, seamlessly integrates 2D correspondence matching, camera pose estimation…
External link:
http://arxiv.org/abs/2312.07246
In the paradigm of AI-generated content (AIGC), there has been increasing attention to transferring knowledge from pre-trained text-to-image (T2I) models to text-to-video (T2V) generation. Despite their effectiveness, these frameworks face challenges…
External link:
http://arxiv.org/abs/2305.14330
Author:
Cho, Seokju, Shin, Heeseong, Hong, Sunghwan, Arnab, Anurag, Seo, Paul Hongsuck, Kim, Seungryong
Open-vocabulary semantic segmentation presents the challenge of labeling each pixel within an image based on a wide range of text descriptions. In this work, we introduce a novel cost-based approach to adapt vision-language foundation models, notably…
External link:
http://arxiv.org/abs/2303.11797
Author:
Hong, Sunghwan, Nam, Jisu, Cho, Seokju, Hong, Susung, Jeon, Sangryul, Min, Dongbo, Kim, Seungryong
Existing pipelines of semantic correspondence commonly include extracting high-level semantic features for the invariance against intra-class variations and background clutters. This architecture, however, inevitably results in a low-resolution matching…
External link:
http://arxiv.org/abs/2210.02689
We present a novel architecture for dense correspondence. The current state of the art consists of Transformer-based approaches that focus on either feature descriptors or cost volume aggregation. However, they generally aggregate one or the other but not both…
External link:
http://arxiv.org/abs/2209.08742
This paper presents a novel cost aggregation network, called Volumetric Aggregation with Transformers (VAT), for few-shot segmentation. The use of transformers can benefit correlation map aggregation through self-attention over a global receptive field…
External link:
http://arxiv.org/abs/2207.10866
Cost aggregation is a highly important process in image matching tasks, which aims to disambiguate the noisy matching scores. Existing methods generally tackle this by hand-crafted or CNN-based methods, which either lack robustness to severe deformations…
External link:
http://arxiv.org/abs/2202.06817
We introduce a novel cost aggregation network, dubbed Volumetric Aggregation with Transformers (VAT), to tackle the few-shot segmentation task by using both convolutions and transformers to efficiently handle high dimensional correlation maps between…
External link:
http://arxiv.org/abs/2112.11685