Showing 1 - 10 of 236 for search: '"Hsu, Winston"'
Author:
Su, Hung-Ting, Hsu, Ya-Ching, Lin, Xudong, Shi, Xiang-Qian, Niu, Yulei, Hsu, Han-Yuan, Lee, Hung-yi, Hsu, Winston H.
Large language models (LLMs) equipped with chain-of-thoughts (CoT) prompting have shown significant multi-step reasoning capabilities in factual content like mathematics, commonsense, and logic. However, their performance in narrative reasoning, whic
External link:
http://arxiv.org/abs/2409.14324
The robust self-training (RST) framework has emerged as a prominent approach for semi-supervised adversarial training. To explore the possibility of tackling more complicated tasks with even lower labeling budgets, unlike prior approaches that rely o
External link:
http://arxiv.org/abs/2409.12946
LiDAR-based 3D object detection is a critical technology for the development of autonomous driving and robotics. However, the high cost of data annotation limits its advancement. We propose a novel and effective active learning (AL) method called Dis
External link:
http://arxiv.org/abs/2409.05425
Pre-explored Semantic Maps, constructed through prior exploration using visual language models (VLMs), have proven effective as foundational elements for training-free robotic applications. However, existing approaches assume the map's accuracy and d
External link:
http://arxiv.org/abs/2409.04837
Author:
Faure, Gueter Josmy, Yeh, Jia-Fong, Chen, Min-Hung, Su, Hung-Ting, Hsu, Winston H., Lai, Shang-Hong
Existing research often treats long-form videos as extended short videos, leading to several limitations: inadequate capture of long-range dependencies, inefficient processing of redundant information, and failure to extract high-level semantic conce
External link:
http://arxiv.org/abs/2408.17443
Author:
Su, Hung-Ting, Chao, Chun-Tong, Hsu, Ya-Ching, Lin, Xudong, Niu, Yulei, Lee, Hung-Yi, Hsu, Winston H.
Large Language Models (LLMs) have demonstrated effectiveness not only in language tasks but also in video reasoning. This paper introduces a novel dataset, Tropes in Movies (TiM), designed as a testbed for exploring two critical yet previously overlo
External link:
http://arxiv.org/abs/2406.10923
We observe that current state-of-the-art (SOTA) methods suffer from the performance imbalance issue when performing multi-task reinforcement learning (MTRL) tasks. While these methods may achieve impressive performance on average, they perform extrem
External link:
http://arxiv.org/abs/2406.00761
We study reward models for long-horizon manipulation tasks by learning from action-free videos and language instructions, which we term the visual-instruction correlation (VIC) problem. Recent advancements in cross-modality modeling have highlighted
External link:
http://arxiv.org/abs/2405.16545
Traditional traffic prediction, limited by the scope of sensor data, falls short in comprehensive traffic management. Mobile networks offer a promising alternative using network activity counts, but these lack crucial directionality. Thus, we present
External link:
http://arxiv.org/abs/2405.17507
Currently, low-light conditions present a significant challenge for machine cognition. In this paper, rather than optimizing models by assuming that human and machine cognition are correlated, we use zero-reference low-light enhancement to improve th
External link:
http://arxiv.org/abs/2405.11478