Showing 1 - 10 of 23,124 for search: '"In So Kweon"'
Published in:
IEEE Access, Vol 12, Pp 93580-93592 (2024)
We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models. Constructing a large-scale labeled image captioning dataset is expensive in terms of labor, time, and cost. In contrast to manually
External link:
https://doaj.org/article/324f333acf4846b98f34291ad88f0d40
Published in:
IEEE Access, Vol 11, Pp 95201-95212 (2023)
Counterfactuals have been shown to be a powerful method for alleviating unimodal bias in Visual Question Answering. However, existing counterfactual methods tend to generate samples that are not diverse or require
External link:
https://doaj.org/article/73244e79e595422fbd919943a74fe1c5
Diversity control is an important task to alleviate bias amplification and filter bubble problems. The desired degree of diversity may fluctuate based on users' daily moods or business strategies. However, existing methods for controlling diversity o
External link:
http://arxiv.org/abs/2411.11240
This study investigates the sleep characteristics and brain activity of individuals in the gray zone of insomnia, a population that experiences sleep disturbances yet does not fully meet the clinical criteria for chronic insomnia. Thirteen healthy pa
External link:
http://arxiv.org/abs/2411.09875
Semantic Scene Completion (SSC) aims to perform geometric completion and semantic segmentation simultaneously. Despite the promising results achieved by existing studies, the inherently ill-posed nature of the task presents significant challenges in
External link:
http://arxiv.org/abs/2410.15674
In this paper, we propose a new method to enhance compositional understanding in pre-trained vision and language models (VLMs) without sacrificing performance in zero-shot multi-modal tasks. Traditional fine-tuning approaches often improve compositio
External link:
http://arxiv.org/abs/2410.05210
Author:
Miller, John Joshua, Mak, Simon, Sun, Benny, Narayanan, Sai Ranjeet, Yang, Suo, Sun, Zongxuan, Kim, Kenneth S., Kweon, Chol-Bum Mike
The optimization of expensive black-box simulators arises in a myriad of modern scientific and engineering applications. Bayesian optimization provides an appealing solution, by leveraging a fitted surrogate model to guide the selection of subsequent
External link:
http://arxiv.org/abs/2410.01196
Event cameras excel in capturing high-contrast scenes and dynamic objects, offering a significant advantage over traditional frame-based cameras. Despite active research into leveraging event cameras for semantic segmentation, generating pixel-wise d
External link:
http://arxiv.org/abs/2407.11216
The large abundance of perspective camera datasets facilitated the emergence of novel learning-based strategies for various tasks, such as camera localization, single image depth estimation, or view synthesis. However, panoramic or omnidirectional im
External link:
http://arxiv.org/abs/2406.18898
Author:
Oh, Youngtaek, Ahn, Pyunghwan, Kim, Jinhyung, Song, Gwangmo, Lee, Soonyoung, Kweon, In So, Kim, Junmo
Vision and language models (VLMs) such as CLIP have showcased remarkable zero-shot recognition abilities yet face challenges in visio-linguistic compositionality, particularly in linguistic comprehension and fine-grained image-text alignment. This pa
External link:
http://arxiv.org/abs/2406.09388