Search results - "Kang, Wooyoung"

Report

Honeybee: Locality-enhanced Projector for Multimodal LLM

Authors: Cha, Junbum; Kang, Wooyoung; Mun, Jonghwan; Roh, Byungseok

In Multimodal Large Language Models (MLLMs), a visual projector plays a crucial role in bridging pre-trained vision encoders with LLMs, enabling profound visual understanding while harnessing the LLMs' robust capabilities. Despite the importance of t

External link: http://arxiv.org/abs/2312.06742

Report

Large Language Models are Temporal and Causal Reasoners for Video Question Answering

Authors: Ko, Dohwan; Lee, Ji Soo; Kang, Wooyoung; Roh, Byungseok; Kim, Hyunwoo J.

Large Language Models (LLMs) have shown remarkable performances on a wide range of natural language understanding and generation tasks. We observe that the LLMs provide effective priors in exploiting $\textit{linguistic shortcuts}$ for temporal and c

External link: http://arxiv.org/abs/2310.15747

Report

NICE: CVPR 2023 Challenge on Zero-shot Image Captioning

In this report, we introduce NICE (New frontiers for zero-shot Image Captioning Evaluation) project and share the results and outcomes of 2023 challenge. This project is designed to challenge the computer vision community to develop robust image capt

External link: http://arxiv.org/abs/2309.01961

Report

Open-Vocabulary Object Detection using Pseudo Caption Labels

Authors: Cho, Han-Cheol; Jhoo, Won Young; Kang, Wooyoung; Roh, Byungseok

Recent open-vocabulary detection methods aim to detect novel objects by distilling knowledge from vision-language models (VLMs) trained on a vast amount of image-text pairs. To improve the effectiveness of these methods, researchers have utilized dat

External link: http://arxiv.org/abs/2303.13040

Report

Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning

Authors: Kang, Wooyoung; Mun, Jonghwan; Lee, Sungjun; Roh, Byungseok

Image captioning is one of the straightforward tasks that can take advantage of large-scale web-crawled data which provides rich knowledge about the visual world for a captioning model. However, since web-crawled data contains image-text pairs that a

External link: http://arxiv.org/abs/2212.13563

Report

Dense but Efficient VideoQA for Intricate Compositional Reasoning

Authors: Lee, Jihyeon; Kang, Wooyoung; Kim, Eun-Sol

It is well known that most of the conventional video question answering (VideoQA) datasets consist of easy questions requiring simple reasoning processes. However, long videos inevitably contain complex and compositional semantic structures along wit

External link: http://arxiv.org/abs/2210.10300

Academic article

Gray and white matter abnormalities in major depressive disorder patients and its associations with childhood adversity

Authors: Kang, Wooyoung; Kang, Youbin; Kim, Aram; Kim, Hyeyoung; Han, Kyu-Man; Ham, Byung-Joo

Published in: Journal of Affective Disorders, 1 June 2023, 330:16-23
