Výsledky vyhledávání - "Wee, Dongyoon"

Report

A Simple Baseline with Single-encoder for Referring Image Segmentation

Autor: Yu, Seonghoon, Jung, Ilchae, Han, Byeongju, Kim, Taeoh, Kim, Yunho, Wee, Dongyoon, Son, Jeany

Referring image segmentation (RIS) requires dense vision-language interactions between visual pixels and textual words to segment objects based on a given description. However, commonly adapted dual-encoders in RIS, e.g., Swin transformer and BERT (u

Externí odkaz: http://arxiv.org/abs/2408.15521

Zobrazit plný text záznamu

Report

Classification Matters: Improving Video Action Detection with Class-Specific Attention

Autor: Lee, Jinsung, Kim, Taeoh, Lee, Inwoong, Shim, Minho, Wee, Dongyoon, Cho, Minsu, Kwak, Suha

Video action detection (VAD) aims to detect actors and classify their actions in a video. We figure that VAD suffers more from classification rather than localization of actors. Hence, we analyze how prevailing methods form features for classificatio

Externí odkaz: http://arxiv.org/abs/2407.19698

Zobrazit plný text záznamu

Report

Regularizing Dynamic Radiance Fields with Kinematic Fields

Autor: Im, Woobin, Cha, Geonho, Lee, Sebin, Lee, Jumin, Seon, Juhyeong, Wee, Dongyoon, Yoon, Sung-Eui

This paper presents a novel approach for reconstructing dynamic radiance fields from monocular videos. We integrate kinematics with dynamic radiance fields, bridging the gap between the sparse nature of monocular videos and the real-world physics. Ou

Externí odkaz: http://arxiv.org/abs/2407.14059

Zobrazit plný text záznamu

Report

Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling

Autor: Kim, Jaehyeok, Wee, Dongyoon, Xu, Dan

This paper introduces Motion-oriented Compositional Neural Radiance Fields (MoCo-NeRF), a framework designed to perform free-viewpoint rendering of monocular human videos via novel non-rigid motion modeling approach. In the context of dynamic clothed

Externí odkaz: http://arxiv.org/abs/2407.11962

Zobrazit plný text záznamu

Report

Masked Autoencoder for Unsupervised Video Summarization

Autor: Shim, Minho, Kim, Taeoh, Kim, Jinhyung, Wee, Dongyoon

Summarizing a video requires a diverse understanding of the video, ranging from recognizing scenes to evaluating how much each frame is essential enough to be selected as a summary. Self-supervised learning (SSL) is acknowledged for its robustness an

Externí odkaz: http://arxiv.org/abs/2306.01395

Zobrazit plný text záznamu

Report

Decomposed Cross-modal Distillation for RGB-based Temporal Action Detection

Autor: Lee, Pilhyeon, Kim, Taeoh, Shim, Minho, Wee, Dongyoon, Byun, Hyeran

Temporal action detection aims to predict the time intervals and the classes of action instances in the video. Despite the promising performance, existing two-stream models exhibit slow inference speed due to their reliance on computationally expensi

Externí odkaz: http://arxiv.org/abs/2303.17285

Zobrazit plný text záznamu

Report

You Only Train Once: Multi-Identity Free-Viewpoint Neural Human Rendering from Monocular Videos

Autor: Kim, Jaehyeok, Wee, Dongyoon, Xu, Dan

We introduce You Only Train Once (YOTO), a dynamic human generation framework, which performs free-viewpoint rendering of different human identities with distinct motions, via only one-time training from monocular videos. Most prior works for the tas

Externí odkaz: http://arxiv.org/abs/2303.05835

Zobrazit plný text záznamu

Report

MEEV: Body Mesh Estimation On Egocentric Video

Autor: Monet, Nicolas, Wee, Dongyoon

This technical report introduces our solution, MEEV, proposed to the EgoBody Challenge at ECCV 2022. Captured from head-mounted devices, the dataset consists of human body shape and motion of interacting people. The EgoBody dataset has challenges suc

Externí odkaz: http://arxiv.org/abs/2210.14165

Zobrazit plný text záznamu

Report

Exploring Temporally Dynamic Data Augmentation for Video Recognition

Autor: Kim, Taeoh, Kim, Jinhyung, Shim, Minho, Yun, Sangdoo, Kang, Myunggu, Wee, Dongyoon, Lee, Sangyoun

Data augmentation has recently emerged as an essential component of modern training recipes for visual recognition tasks. However, data augmentation for video recognition has been rarely explored despite its effectiveness. Few existing augmentation r

Externí odkaz: http://arxiv.org/abs/2206.15015

Zobrazit plný text záznamu

Report

Out of Sight, Out of Mind: A Source-View-Wise Feature Aggregation for Multi-View Image-Based Rendering

Autor: Cha, Geonho, Shin, Chaehun, Yoon, Sungroh, Wee, Dongyoon

To estimate the volume density and color of a 3D point in the multi-view image-based rendering, a common approach is to inspect the consensus existence among the given source image features, which is one of the informative cues for the estimation pro

Externí odkaz: http://arxiv.org/abs/2206.04906

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání