Showing 1 - 10 of 2,141 results for search: '"Lee, Jiyoung"'
This paper introduces VLAP, a novel approach that bridges pretrained vision models and large language models (LLMs) to make frozen LLMs understand the visual world. VLAP transforms the embedding space of pretrained vision models into the LLMs' word e
External link:
http://arxiv.org/abs/2404.09632
Authors:
Lee, Jiyoung; Kim, Minwoo; Kim, Seungho; Kim, Junghwan; Won, Seunghyun; Lee, Hwaran; Choi, Edward
For Large Language Models (LLMs) to be effectively deployed in a specific country, they must possess an understanding of the nation's culture and basic knowledge. To this end, we introduce National Alignment, which measures an alignment between an LL
External link:
http://arxiv.org/abs/2402.13605
Author:
Lee, Jiyoung
Gene expression regulation is dynamic and specific to various factors such as developmental stages, environmental conditions, and stimulation of pathogens. Nowadays, a tremendous amount of transcriptome data sets are available from diverse species. T
External link:
http://hdl.handle.net/10919/99878
Existing text-to-image diffusion models struggle to synthesize realistic images given dense captions, where each text prompt provides a detailed description for a specific image region. To address this, we propose DenseDiffusion, a training-free meth
External link:
http://arxiv.org/abs/2308.12964
Compositional zero-shot learning (CZSL) aims to recognize unseen compositions with prior knowledge of known primitives (attribute and object). Previous works for CZSL often suffer from grasping the contextuality between attribute and object, as well
External link:
http://arxiv.org/abs/2308.04016
Authors:
Lee, Jiyoung; Kim, Seungho; Won, Seunghyun; Lee, Joonseok; Ghassemi, Marzyeh; Thorne, James; Choi, Jaeseok; Kwon, O-Kil; Choi, Edward
AI alignment refers to models acting towards human-intended goals, preferences, or ethical principles. Given that most large-scale deep learning models act as black boxes and cannot be manually controlled, analyzing the similarity between models and
External link:
http://arxiv.org/abs/2308.01525
Authors:
Kim, Soohyun; Kim, Junho; Kim, Taekyung; Heo, Hwan; Kim, Seungryong; Lee, Jiyoung; Kim, Jin-Hwa
In this paper, we tackle the challenging task of Panoramic Image-to-Image translation (Pano-I2I) for the first time. This task is difficult due to the geometric distortion of panoramic images and the lack of a panoramic image dataset with diverse con
External link:
http://arxiv.org/abs/2304.04960
Recovering 3D human mesh in the wild is greatly challenging as in-the-wild (ITW) datasets provide only 2D pose ground truths (GTs). Recently, 3D pseudo-GTs have been widely used to train 3D human mesh estimation networks as the 3D pseudo-GTs enable 3
External link:
http://arxiv.org/abs/2304.04875
In this paper, we efficiently transfer the surpassing representation power of the vision foundation models, such as ViT and Swin, for video understanding with only a few trainable parameters. Previous adaptation methods have simultaneously considered
External link:
http://arxiv.org/abs/2303.09857
Authors:
Seo, Junyoung; Jang, Wooseok; Kwak, Min-Seop; Kim, Hyeonsu; Ko, Jaehoon; Kim, Junho; Kim, Jin-Hwa; Lee, Jiyoung; Kim, Seungryong
Text-to-3D generation has shown rapid progress in recent days with the advent of score distillation, a methodology of using pretrained text-to-2D diffusion models to optimize neural radiance field (NeRF) in the zero-shot setting. However, the lack of
External link:
http://arxiv.org/abs/2303.07937