Showing 1 - 10 of 478 for search: '"Kim, Jihwan"'
Multi-scene absolute pose regression addresses the demand for fast and memory-efficient camera pose estimation across various real-world environments. Nowadays, a transformer-based model has been devised to regress the camera pose directly in multi-sce
External link:
http://arxiv.org/abs/2411.01443
Recent advances in deep learning-based image reconstruction techniques have led to significant progress in phase retrieval using digital in-line holographic microscopy (DIHM). However, existing deep learning-based phase retrieval methods have technic
External link:
http://arxiv.org/abs/2409.20013
Temporal Action Detection (TAD) is fundamental yet challenging for real-world video applications. Leveraging the unique benefits of transformers, various DETR-based approaches have been adopted in TAD. However, it has recently been identified that th
External link:
http://arxiv.org/abs/2408.16729
Temporal action detection (TAD) is challenging, yet fundamental for real-world video applications. Recently, DETR-based models for TAD have been prevailing thanks to their unique benefits. However, transformers demand a huge dataset, and unfortunatel
External link:
http://arxiv.org/abs/2408.13152
Authors:
Lee, Hyeongmin, Kim, Jin-Young, Baek, Kyungjune, Kim, Jihwan, Go, Hyojun, Ha, Seongsu, Han, Seokjin, Jang, Jiho, Jung, Raehyuk, Kim, Daewoo, Kim, GeunOh, Kim, JongMok, Kim, Jongseok, Kim, Junwan, Kwon, Soonwoo, Lee, Jangwon, Park, Seungjoon, Seo, Minjoon, Suh, Jay, Yi, Jaehyuk, Lee, Aiden
In this work, we discuss evaluating video foundation models in a fair and robust manner. Unlike language or image foundation models, many video foundation models are evaluated with differing parameters (such as sampling rate, number of frames, pretra
External link:
http://arxiv.org/abs/2408.11318
Few-shot object counting has garnered significant attention for its practicality as it aims to count target objects in a query image based on given exemplars without the need for additional training. However, there is a shortcoming in the prevailing
External link:
http://arxiv.org/abs/2408.09734
Temporal action detection (TAD) is challenging, yet fundamental for real-world video applications. Large temporal scale variation of actions is one of the primary difficulties in TAD. Naturally, multi-scale features have potential in localizing
External link:
http://arxiv.org/abs/2408.09354
Microwave quantum illumination with entangled pairs of microwave signal and optical idler modes can achieve sub-optimal performance with joint measurement of the signal and idler modes. Here, we first propose a testbed of microwave quantum illum
External link:
http://arxiv.org/abs/2405.14118
Published in:
NeurIPS 2024
We propose a novel inference technique based on a pretrained diffusion model for text-conditional video generation. Our approach, called FIFO-Diffusion, is conceptually capable of generating infinitely long videos without additional training. This is
External link:
http://arxiv.org/abs/2405.11473
Mass movements and delta collapses are significant sources of tsunamis in lacustrine environments, impacting human societies enormously. Palaeotsunamis play an essential role in understanding historical events and their consequences along with their
External link:
http://arxiv.org/abs/2310.17989