Showing 1 - 10 of 2,585 for search: '"Kim , Junho"'
Large Language Models (LLMs) have displayed remarkable performances across various complex tasks by leveraging Chain-of-Thought (CoT) prompting. Recently, studies have proposed a Knowledge Distillation (KD) approach, reasoning distillation, which tra
External link:
http://arxiv.org/abs/2410.09037
Authors:
Park, Yong-Hyun; Yun, Sangdoo; Kim, Jin-Hwa; Kim, Junho; Jang, Geonhui; Jeong, Yonghyun; Jo, Junghyo; Lee, Gayoung
Recent advancements in text-to-image (T2I) models have greatly benefited from large-scale datasets, but they also pose significant risks due to the potential generation of unsafe content. To mitigate this issue, researchers have developed unlearning
External link:
http://arxiv.org/abs/2407.21035
Large Multi-modal Models (LMMs) have recently demonstrated remarkable abilities in visual context understanding and coherent response generation. However, alongside these advancements, the issue of hallucinations has emerged as a significant challeng
External link:
http://arxiv.org/abs/2406.01920
We introduce a lightweight and accurate localization method that only utilizes the geometry of 2D-3D lines. Given a pre-captured 3D map, our approach localizes a panorama image, taking advantage of the holistic 360 view. The system mitigates potentia
External link:
http://arxiv.org/abs/2403.19904
This paper presents a way of enhancing the reliability of Large Multi-modal Models (LMMs) in addressing hallucination, where the models generate cross-modal inconsistent responses. Without additional training, we propose Counterfactual Inception, a n
External link:
http://arxiv.org/abs/2403.13513
In the evolving domain of text-to-image generation, diffusion models have emerged as powerful tools in content creation. Despite their remarkable capability, existing models still face challenges in achieving controlled generation with a consistent s
External link:
http://arxiv.org/abs/2402.12974
Unsupervised semantic segmentation aims to achieve high-quality semantic grouping without human-labeled annotations. With the advent of self-supervised pre-training, various frameworks utilize the pre-trained features to train prediction heads for un
External link:
http://arxiv.org/abs/2310.07379
We present the Groupwise Diffusion Model (GDM), which divides data into multiple groups and diffuses one group at one time interval in the forward diffusion process. GDM generates data sequentially from one group at one time interval, leading to seve
External link:
http://arxiv.org/abs/2310.01400
The absolute depth values of surrounding environments provide crucial cues for various assistive technologies, such as localization, navigation, and 3D structure estimation. We propose that accurate depth estimated from panoramic images can serve as
External link:
http://arxiv.org/abs/2308.14005
We introduce LDL, a fast and robust algorithm that localizes a panorama to a 3D map using line segments. LDL focuses on the sparse structural information of lines in the scene, which is robust to illumination changes and can potentially enable effici
External link:
http://arxiv.org/abs/2308.13989