Showing 1 - 10 of 22 for search: '"Roh, Byungseok"'
Free-text radiology reports present a rich data source for various medical tasks, but effectively labeling these texts remains challenging. Traditional rule-based labeling methods fall short of capturing the nuances of diverse free-text patterns. …
External link:
http://arxiv.org/abs/2401.11505
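As a minimal illustration of why keyword rules struggle with free-text reports (a generic sketch, not the paper's labeler; the findings and patterns are invented for the example):

```python
import re

# Toy rule-based labeler: flags findings by keyword match. This is the
# kind of approach the abstract says falls short on nuanced free text.
RULES = {
    "pneumonia": re.compile(r"\bpneumonia\b", re.IGNORECASE),
    "effusion": re.compile(r"\b(pleural )?effusions?\b", re.IGNORECASE),
}

def label_report(report: str) -> dict:
    """Return {finding: bool} per keyword rule."""
    return {name: bool(rx.search(report)) for name, rx in RULES.items()}

# A negated finding is still flagged positive -- a typical failure mode
# of keyword rules on free-text radiology reports.
print(label_report("No evidence of pneumonia. Small right pleural effusion."))
# -> {'pneumonia': True, 'effusion': True}
```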
In Multimodal Large Language Models (MLLMs), a visual projector plays a crucial role in bridging pre-trained vision encoders with LLMs, enabling profound visual understanding while harnessing the LLMs' robust capabilities. Despite the importance of …
External link:
http://arxiv.org/abs/2312.06742
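For context, a common baseline projector is a small MLP that maps vision-encoder patch features into the LLM's token-embedding space; the sketch below assumes generic dimensions (1024-d vision features, 4096-d LLM embeddings) and is not the design proposed in the paper:

```python
import torch
from torch import nn

# Generic MLP visual projector: maps frozen vision-encoder patch features
# into the LLM's token-embedding space so they can be fed in as tokens.
class MLPProjector(nn.Module):
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: [batch, num_patches, vision_dim] from the vision encoder
        return self.net(patch_feats)  # [batch, num_patches, llm_dim]

proj = MLPProjector()
visual_tokens = proj(torch.randn(1, 256, 1024))  # ready to prepend to LLM input
print(visual_tokens.shape)  # torch.Size([1, 256, 4096])
```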
Open-vocabulary object detection (OVOD) has recently gained significant attention as a crucial step toward achieving human-like visual intelligence. Existing OVOD methods extend the target vocabulary from pre-defined categories to the open world by transferring …
External link:
http://arxiv.org/abs/2312.02103
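The usual way the vocabulary is opened up is to score region features against text embeddings instead of a fixed classifier head; a minimal sketch, assuming CLIP-style normalized embeddings and made-up dimensions (not this paper's specific method):

```python
import torch
import torch.nn.functional as F

def classify_regions(region_feats, text_embeds, temperature=0.01):
    # region_feats: [num_regions, dim] from the detector head
    # text_embeds:  [num_classes, dim] from a text encoder (e.g. CLIP's)
    region_feats = F.normalize(region_feats, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)
    logits = region_feats @ text_embeds.T / temperature
    return logits.softmax(dim=-1)  # per-region class probabilities

# Swapping in text embeddings for new prompts extends the vocabulary
# at inference time with no retraining.
probs = classify_regions(torch.randn(5, 512), torch.randn(3, 512))
print(probs.shape)  # torch.Size([5, 3])
```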
Large Language Models (LLMs) have shown remarkable performance on a wide range of natural language understanding and generation tasks. We observe that LLMs provide effective priors in exploiting $\textit{linguistic shortcuts}$ for temporal and causal …
External link:
http://arxiv.org/abs/2310.15747
Authors:
You, Kihyun, Gu, Jawook, Ham, Jiyeon, Park, Beomhee, Kim, Jiho, Hong, Eun Kyoung, Baek, Woonhyunk, Roh, Byungseok
Large-scale image-text pair datasets have greatly contributed to the development of vision-language pre-training (VLP) models, which enable zero-shot or few-shot classification without costly annotation. However, in the medical domain, the scarcity of …
External link:
http://arxiv.org/abs/2310.13292
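The VLP recipe the abstract refers to is typically a CLIP-style contrastive objective over image-text pairs; a minimal sketch (generic InfoNCE over a batch, not this paper's exact loss):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    # Paired image/text embeddings are pulled together; mismatched
    # pairs in the same batch are pushed apart.
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.T / temperature   # [batch, batch]
    targets = torch.arange(len(img_emb))         # i-th image matches i-th text
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```

Zero-shot classification then falls out for free: embed the class names as text and pick the class whose embedding is most similar to the image's.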
Authors:
Kim, Taehoon, Ahn, Pyunghwan, Kim, Sangyun, Lee, Sihaeng, Marsden, Mark, Sala, Alessandra, Kim, Seung Hwan, Han, Bohyung, Lee, Kyoung Mu, Lee, Honglak, Bae, Kyounghoon, Wu, Xiangyu, Gao, Yi, Zhang, Hailiang, Yang, Yang, Guo, Weili, Lu, Jianfeng, Oh, Youngtaek, Cho, Jae Won, Kim, Dong-jin, Kweon, In So, Kim, Junmo, Kang, Wooyoung, Jhoo, Won Young, Roh, Byungseok, Mun, Jonghwan, Oh, Solgil, Ak, Kenan Emir, Lee, Gwang-Gook, Xu, Yan, Shen, Mingwei, Hwang, Kyomin, Shin, Wonsik, Lee, Kamin, Park, Wonhark, Lee, Dongkwan, Kwak, Nojun, Wang, Yujin, Wang, Yimu, Gu, Tiancheng, Lv, Xingchang, Sun, Mingmao
In this report, we introduce the NICE (New frontiers for zero-shot Image Captioning Evaluation) project and share the results and outcomes of the 2023 challenge. This project is designed to challenge the computer vision community to develop robust image captioning …
External link:
http://arxiv.org/abs/2309.01961
Recent open-vocabulary detection methods aim to detect novel objects by distilling knowledge from vision-language models (VLMs) trained on a vast number of image-text pairs. To improve the effectiveness of these methods, researchers have utilized …
External link:
http://arxiv.org/abs/2303.13040
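ViLD-style distillation is one common instance of this recipe: the detector's region embeddings are aligned with a frozen VLM's embeddings of the corresponding image crops. A hedged sketch with made-up dimensions, not this paper's contribution:

```python
import torch
import torch.nn.functional as F

def distill_loss(region_embeds, vlm_crop_embeds):
    # region_embeds:   [num_regions, dim] from the student detector
    # vlm_crop_embeds: [num_regions, dim] from the frozen teacher VLM,
    #                  computed on crops of the same region proposals
    region_embeds = F.normalize(region_embeds, dim=-1)
    vlm_crop_embeds = F.normalize(vlm_crop_embeds, dim=-1)
    # L1 distance between normalized embeddings, as in ViLD-style distillation
    return (region_embeds - vlm_crop_embeds).abs().mean()

print(distill_loss(torch.randn(10, 512), torch.randn(10, 512)).item())
```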
Authors:
Ko, Dohwan, Choi, Joonmyung, Choi, Hyeong Kyu, On, Kyoung-Woon, Roh, Byungseok, Kim, Hyunwoo J.
Foundation models have shown outstanding performance and generalization capabilities across domains. Since most studies on foundation models mainly focus on the pretraining phase, a naive strategy of minimizing a single task-specific loss is adopted for …
External link:
http://arxiv.org/abs/2303.13009
Image captioning is one of the straightforward tasks that can take advantage of large-scale web-crawled data, which provides rich knowledge about the visual world for a captioning model. However, since web-crawled data contains image-text pairs that …
External link:
http://arxiv.org/abs/2212.13563
We tackle open-world semantic segmentation, which aims at learning to segment arbitrary visual concepts in images, by using only image-text pairs without dense annotations. Existing open-world segmentation methods have shown impressive advances by …
External link:
http://arxiv.org/abs/2212.00785
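Once image patches and text share an embedding space, a mask for an arbitrary concept can be read off as per-patch similarity to that concept's text embedding; an illustrative sketch with made-up dimensions (not the paper's method):

```python
import torch
import torch.nn.functional as F

def text_driven_mask(patch_embeds, text_embed, h, w):
    # patch_embeds: [h*w, dim] dense patch features; text_embed: [dim]
    patch_embeds = F.normalize(patch_embeds, dim=-1)
    text_embed = F.normalize(text_embed, dim=0)
    sim = patch_embeds @ text_embed        # [h*w] similarity scores
    return sim.sigmoid().reshape(h, w)     # soft mask for the queried concept

mask = text_driven_mask(torch.randn(14 * 14, 512), torch.randn(512), 14, 14)
print(mask.shape)  # torch.Size([14, 14])
```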