Search results

Report

Activating Self-Attention for Multi-Scene Absolute Pose Regression

Authors: Lee, Miso; Kim, Jihwan; Heo, Jae-Pil

Multi-scene absolute pose regression addresses the demand for fast and memory-efficient camera pose estimation across various real-world environments. Nowadays, a transformer-based model has been devised to regress the camera pose directly in multi-sce

External link: http://arxiv.org/abs/2411.01443

Report

Single-shot reconstruction of three-dimensional morphology of biological cells in digital holographic microscopy using a physics-driven neural network

Authors: Kim, Jihwan; Kim, Youngdo; Lee, Hyo Seung; Seo, Eunseok; Lee, Sang Joon

Recent advances in deep learning-based image reconstruction techniques have led to significant progress in phase retrieval using digital in-line holographic microscopy (DIHM). However, existing deep learning-based phase retrieval methods have technic

External link: http://arxiv.org/abs/2409.20013

Report

Prediction-Feedback DETR for Temporal Action Detection

Authors: Kim, Jihwan; Lee, Miso; Cho, Cheol-Ho; Lee, Jihyun; Heo, Jae-Pil

Temporal Action Detection (TAD) is fundamental yet challenging for real-world video applications. Leveraging the unique benefits of transformers, various DETR-based approaches have been adopted in TAD. However, it has recently been identified that th

External link: http://arxiv.org/abs/2408.16729

Report

Long-term Pre-training for Temporal Action Detection with Transformers

Authors: Kim, Jihwan; Lee, Miso; Heo, Jae-Pil

Temporal action detection (TAD) is challenging, yet fundamental for real-world video applications. Recently, DETR-based models for TAD have been prevailing thanks to their unique benefits. However, transformers demand a huge dataset, and unfortunatel

External link: http://arxiv.org/abs/2408.13152

Report

TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models

In this work, we discuss evaluating video foundation models in a fair and robust manner. Unlike language or image foundation models, many video foundation models are evaluated with differing parameters (such as sampling rate, number of frames, pretra

External link: http://arxiv.org/abs/2408.11318

Report

Mutually-Aware Feature Learning for Few-Shot Object Counting

Authors: Jeon, Yerim; Lee, Subeen; Kim, Jihwan; Heo, Jae-Pil

Few-shot object counting has garnered significant attention for its practicality as it aims to count target objects in a query image based on given exemplars without the need for additional training. However, there is a shortcoming in the prevailing

External link: http://arxiv.org/abs/2408.09734

Report

Boundary-Recovering Network for Temporal Action Detection

Authors: Kim, Jihwan; Choi, Jaehyun; Jeon, Yerim; Heo, Jae-Pil

Temporal action detection (TAD) is challenging, yet fundamental for real-world video applications. Large temporal scale variation of actions is one of the primary difficulties in TAD. Naturally, multi-scale features have potential in localizing

External link: http://arxiv.org/abs/2408.09354

Report

Microwave Quantum Illumination with Optical Memory and Single-Mode Phase-Conjugate Receiver

Authors: Jeon, Sangwoo; Kim, Jihwan; Kim, Duk Y.; Kim, Zaeill; Jeong, Taek; Lee, Su-Yong

Microwave quantum illumination with entangled pairs of microwave signal and optical idler modes can achieve the sub-optimal performance with joint measurement of the signal and idler modes. Here, we first propose a testbed of microwave quantum illum

External link: http://arxiv.org/abs/2405.14118

Report

FIFO-Diffusion: Generating Infinite Videos from Text without Training

Authors: Kim, Jihwan; Kang, Junoh; Choi, Jinyoung; Han, Bohyung

Published in: NeurIPS 2024

We propose a novel inference technique based on a pretrained diffusion model for text-conditional video generation. Our approach, called FIFO-Diffusion, is conceptually capable of generating infinitely long videos without additional training. This is

External link: http://arxiv.org/abs/2405.11473

Report

Uncovering a Paleotsunami Triggered by Mass-Movement in an Alpine Lake

Authors: Zafar, Muhammad Naveed; Dutykh, Denys; Sabatier, Pierre; Banjan, Mathilde; Kim, Jihwan

Mass movements and delta collapses are significant sources of tsunamis in lacustrine environments, impacting human societies enormously. Palaeotsunamis play an essential role in understanding historical events and their consequences along with their

External link: http://arxiv.org/abs/2310.17989
