Showing 1 - 10 of 3,344 results for search: '"Kim, DongHyun"'
Author:
Hwang, Hochul, Suzuki, Ken, Giudice, Nicholas A, Biswas, Joydeep, Lee, Sunghoon Ivan, Kim, Donghyun
While guide dogs offer essential mobility assistance, their high cost, limited availability, and care requirements make them inaccessible to most blind or low vision (BLV) individuals. Recent advances in quadruped robots provide a scalable solution f
External link:
http://arxiv.org/abs/2409.19778
Robotic mobility aids for blind and low-vision (BLV) individuals rely heavily on deep learning-based vision models specialized for various navigational tasks. However, the performance of these models is often constrained by the availability and diver
External link:
http://arxiv.org/abs/2409.11164
Author:
Kim, Donghyun, Oh, Jaeseong
We introduce the Macdonald piece polynomial $\operatorname{I}_{\mu,\lambda,k}[X;q,t]$, which is a vast generalization of the Macdonald intersection polynomial in the science fiction conjecture by Bergeron and Garsia. We demonstrate a remarkable conne
External link:
http://arxiv.org/abs/2409.01041
Large-scale image-text pre-trained models enable zero-shot classification and provide consistent accuracy across various data distributions. Nonetheless, optimizing these models in downstream tasks typically requires fine-tuning, which reduces genera
External link:
http://arxiv.org/abs/2408.05749
Vision-language (VL) models often exhibit a limited understanding of complex expressions of visual objects (e.g., attributes, shapes, and their relations), given complex and diverse language queries. Traditional approaches attempt to improve VL model
External link:
http://arxiv.org/abs/2407.15296
Soccer kicking is a complex whole-body motion that requires intricate coordination of various motor actions. To accomplish such dynamic motion in a humanoid robot, the robot needs to simultaneously: 1) transfer high kinetic energy to the kicking leg,
External link:
http://arxiv.org/abs/2407.14612
Autonomous Vehicles (AV) and Advanced Driver Assistant Systems (ADAS) prioritize safety over comfort. The intertwining factors of safety and comfort emerge as pivotal elements in ensuring the effectiveness of Autonomous Driving (AD). Users often expe
External link:
http://arxiv.org/abs/2407.08073
Image dehazing, addressing atmospheric interference like fog and haze, remains a pervasive challenge crucial for robust vision applications such as surveillance and remote sensing under adverse visibility. While various methodologies have evolved fro
External link:
http://arxiv.org/abs/2407.00972
While advancements in Vision Language Models (VLMs) have significantly improved the alignment of visual and textual data, these models primarily focus on aligning images with short descriptive captions. This focus limits their ability to handle compl
External link:
http://arxiv.org/abs/2407.09541
Author:
Park, Jongwoo, Ranasinghe, Kanchana, Kahatapitiya, Kumara, Ryoo, Wonjeong, Kim, Donghyun, Ryoo, Michael S.
Long-form videos that span across wide temporal intervals are highly information redundant and contain multiple distinct events or entities that are often loosely related. Therefore, when performing long-form video question answering (LVQA), all info
External link:
http://arxiv.org/abs/2406.09396