Search results - "LEE, JOONSEOK"

Report

Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation

Authors: Ha, Seongsu; Kim, Chaeyun; Kim, Donghwa; Lee, Junho; Lee, Sangho; Lee, Joonseok

Referring Image Segmentation is a comprehensive task to segment an object referred by a textual query from an image. In nature, the level of difficulty in this task is affected by the existence of similar objects and the complexity of the referring e

External link: http://arxiv.org/abs/2411.01494

Report

Scalable Frame Sampling for Video Classification: A Semi-Optimal Policy Approach with Reduced Search Space

Authors: Lee, Junho; Shin, Jeongwoo; Ko, Seung Woo; Ha, Seongsu; Lee, Joonseok

Given a video with $T$ frames, frame sampling is a task to select $N \ll T$ frames, so as to maximize the performance of a fixed video classifier. Not just brute-force search, but most existing methods suffer from its vast search space of $\binom{T}{

External link: http://arxiv.org/abs/2409.05260

Report

Isometric Representation Learning for Disentangled Latent Space of Diffusion Models

Authors: Hahm, Jaehoon; Lee, Junho; Kim, Sunghyun; Lee, Joonseok

Published in: Forty-first International Conference on Machine Learning (ICML 2024)

The latent space of diffusion model mostly still remains unexplored, despite its great success and potential in the field of generative modeling. In fact, the latent space of existing diffusion models are entangled, with a distorted mapping from its

External link: http://arxiv.org/abs/2407.11451

Report

General Item Representation Learning for Cold-start Content Recommendations

Authors: Kim, Jooeun; Kim, Jinri; Yeo, Kwangeun; Kim, Eungi; On, Kyoung-Woon; Mun, Jonghwan; Lee, Joonseok

Cold-start item recommendation is a long-standing challenge in recommendation systems. A common remedy is to use a content-based approach, but rich information from raw contents in various forms has not been fully utilized. In this paper, we propose

External link: http://arxiv.org/abs/2404.13808

Report

Modality-Aware Representation Learning for Zero-shot Sketch-based Image Retrieval

Authors: Lyou, Eunyi; Lee, Doyeon; Kim, Jooeun; Lee, Joonseok

Zero-shot learning offers an efficient solution for a machine learning model to treat unseen categories, avoiding exhaustive data collection. Zero-shot Sketch-based Image Retrieval (ZS-SBIR) simulates real-world scenarios where it is hard and costly

External link: http://arxiv.org/abs/2401.04860

Report

Activity Grammars for Temporal Action Segmentation

Authors: Gong, Dayoung; Lee, Joonseok; Jung, Deunsol; Kwak, Suha; Cho, Minsu

Sequence prediction on temporal data requires the ability to understand compositional structures of multi-level semantics beyond individual and contextual properties. The task of temporal action segmentation, which aims at translating an untrimmed ac

External link: http://arxiv.org/abs/2312.04266

Report

Towards Robust and Smooth 3D Multi-Person Pose Estimation from Monocular Videos in the Wild

Authors: Park, Sungchan; You, Eunyi; Lee, Inhoe; Lee, Joonseok

3D pose estimation is an invaluable task in computer vision with various practical applications. Especially, 3D pose estimation for multi-person from a monocular video (3DMPPE) is particularly challenging and is still largely uncharted, far from appl

External link: http://arxiv.org/abs/2309.08644

Report

VisAlign: Dataset for Measuring the Degree of Alignment between AI and Humans in Visual Perception

Authors: Lee, Jiyoung; Kim, Seungho; Won, Seunghyun; Lee, Joonseok; Ghassemi, Marzyeh; Thorne, James; Choi, Jaeseok; Kwon, O-Kil; Choi, Edward

AI alignment refers to models acting towards human-intended goals, preferences, or ethical principles. Given that most large-scale deep learning models act as black boxes and cannot be manually controlled, analyzing the similarity between models and

External link: http://arxiv.org/abs/2308.01525

Report

V2Meow: Meowing to the Visual Beat via Video-to-Music Generation

Authors: Su, Kun; Li, Judith Yue; Huang, Qingqing; Kuzmin, Dima; Lee, Joonseok; Donahue, Chris; Sha, Fei; Jansen, Aren; Wang, Yu; Verzetti, Mauro; Denk, Timo I.

Video-to-music generation demands both a temporally localized high-quality listening experience and globally aligned video-acoustic signatures. While recent music generation models excel at the former through advanced audio codecs, the exploration of

External link: http://arxiv.org/abs/2305.06594

Report

Shuffle & Divide: Contrastive Learning for Long Text

Authors: Lee, Joonseok; Joe, Seongho; Park, Kyoungwon; Kim, Bogun; Kang, Hoyoung; Park, Jaeseon; Gwon, Youngjune

Published in: 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 2022, pp. 2935-2941

We propose a self-supervised learning method for long text documents based on contrastive learning. A key to our method is Shuffle and Divide (SaD), a simple text augmentation algorithm that sets up a pretext task required for contrastive updates to

External link: http://arxiv.org/abs/2304.09374
