Showing 1 - 10 of 177 for search: '"Chen-min Hung"'
Author:
Liu, Shih-Yang, Yang, Huck, Wang, Chien-Yi, Fung, Nai Chit, Yin, Hongxu, Sakr, Charbel, Muralidharan, Saurav, Cheng, Kwang-Ting, Kautz, Jan, Wang, Yu-Chiang Frank, Molchanov, Pavlo, Chen, Min-Hung
In this work, we re-formulate the model compression problem into the customized compensation problem: Given a compressed model, we aim to introduce residual low-rank paths to compensate for compression errors under customized requirements from users…
External link:
http://arxiv.org/abs/2410.21271
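The compensation idea in the snippet above lends itself to a short illustration. Below is a minimal, hypothetical sketch of compensating a compressed weight matrix with a residual low-rank path obtained from a truncated SVD of the compression error; it shows the general idea only, not the paper's actual algorithm, and all names and the toy quantizer are illustrative.

```python
# Sketch: compensate a compressed weight with a residual low-rank path.
import numpy as np

def lowrank_compensation(w_full, w_compressed, rank):
    """Approximate the compression error w_full - w_compressed with a
    rank-`rank` factorization B @ A from a truncated SVD (the best
    rank-`rank` approximation by the Eckart-Young theorem)."""
    residual = w_full - w_compressed
    u, s, vt = np.linalg.svd(residual, full_matrices=False)
    b = u[:, :rank] * s[:rank]   # (out_dim, rank)
    a = vt[:rank, :]             # (rank, in_dim)
    return b, a

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))
w_q = np.round(w * 4) / 4        # toy "compression": coarse quantization
b, a = lowrank_compensation(w, w_q, rank=8)
err_before = np.linalg.norm(w - w_q)
err_after = np.linalg.norm(w - (w_q + b @ a))
print(f"error before: {err_before:.3f}, after rank-8 compensation: {err_after:.3f}")
```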
Author:
Faure, Gueter Josmy, Yeh, Jia-Fong, Chen, Min-Hung, Su, Hung-Ting, Lai, Shang-Hong, Hsu, Winston H.
Existing research often treats long-form videos as extended short videos, leading to several limitations: inadequate capture of long-range dependencies, inefficient processing of redundant information, and failure to extract high-level semantic concepts…
External link:
http://arxiv.org/abs/2408.17443
Spatio-temporal action detection encompasses the tasks of localizing and classifying individual actions within a video. Recent works aim to enhance this process by incorporating interaction modeling, which captures the relationship between people and…
External link:
http://arxiv.org/abs/2408.15996
Author:
Hirota, Yusuke, Chen, Min-Hung, Wang, Chien-Yi, Nakashima, Yuta, Wang, Yu-Chiang Frank, Hachiuma, Ryo
Large-scale vision-language models, such as CLIP, are known to contain harmful societal bias regarding protected attributes (e.g., gender and age). In this paper, we aim to address the problems of societal bias in CLIP. Although previous studies have…
External link:
http://arxiv.org/abs/2408.10202
Author:
Lin, Ci-Siang, Liu, I-Jieh, Chen, Min-Hung, Wang, Chien-Yi, Liu, Sifei, Wang, Yu-Chiang Frank
Referring Video Object Segmentation (RVOS) aims to segment the object referred to by the query sentence throughout the entire video. Most existing methods require end-to-end training with dense mask annotations, which could be computation-consuming and…
External link:
http://arxiv.org/abs/2406.12834
Author:
Lai, Chun-Mao, Wang, Hsiang-Chun, Hsieh, Ping-Chun, Wang, Yu-Chiang Frank, Chen, Min-Hung, Sun, Shao-Hua
Imitation learning aims to learn a policy from observing expert demonstrations without access to reward signals from environments. Generative adversarial imitation learning (GAIL) formulates imitation learning as adversarial learning, employing a generator…
External link:
http://arxiv.org/abs/2405.16194
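For readers unfamiliar with GAIL, the following is a compact sketch of its adversarial loop, assuming a discriminator over concatenated (state, action) vectors and random tensors standing in for real batches; the policy update itself is omitted, since GAIL feeds the surrogate reward into a standard RL algorithm (e.g., TRPO or PPO). This illustrates the generic formulation, not this paper's specific method.

```python
# Sketch: the discriminator step and surrogate reward in GAIL-style
# imitation learning.
import torch
import torch.nn as nn

state_dim, action_dim = 4, 2
disc = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.Tanh(),
                     nn.Linear(64, 1))
opt = torch.optim.Adam(disc.parameters(), lr=3e-4)
bce = nn.BCEWithLogitsLoss()

def disc_step(expert_sa, policy_sa):
    """Train D to score expert pairs as 1 and policy pairs as 0."""
    logits_e, logits_p = disc(expert_sa), disc(policy_sa)
    loss = (bce(logits_e, torch.ones_like(logits_e))
            + bce(logits_p, torch.zeros_like(logits_p)))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

def surrogate_reward(sa):
    """Reward the policy for fooling D: r = -log(1 - D(s, a))."""
    with torch.no_grad():
        return -torch.log(1 - torch.sigmoid(disc(sa)) + 1e-8)

expert_sa = torch.randn(32, state_dim + action_dim)  # stand-in batches
policy_sa = torch.randn(32, state_dim + action_dim)
print(disc_step(expert_sa, policy_sa), surrogate_reward(policy_sa).mean().item())
```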
Author:
Wu, Ji-Jia, Chang, Andy Chia-Hao, Chuang, Chieh-Yu, Chen, Chun-Pei, Liu, Yu-Lun, Chen, Min-Hung, Hu, Hou-Ning, Chuang, Yung-Yu, Lin, Yen-Yu
This paper addresses text-supervised semantic segmentation, aiming to learn a model capable of segmenting arbitrary visual concepts within images by using only image-text pairs without dense annotations. Existing methods have demonstrated that contrastive…
External link:
http://arxiv.org/abs/2404.04231
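As a rough illustration of how image-text training can yield dense predictions, the sketch below scores patch embeddings against class-name text embeddings and labels each patch by its best match. This is a generic CLIP-style recipe with random stand-in tensors, not the method of this paper.

```python
# Sketch: per-patch segmentation from image-text similarity.
import torch

patch_emb = torch.randn(196, 512)  # 14x14 grid of visual patch embeddings
text_emb = torch.randn(3, 512)     # embeddings of class-name prompts

# Cosine similarity via L2 normalization, then argmax per patch.
patch_emb = patch_emb / patch_emb.norm(dim=-1, keepdim=True)
text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
seg = (patch_emb @ text_emb.T).argmax(dim=-1).reshape(14, 14)
print(seg.shape)  # torch.Size([14, 14]): a class index per patch
```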
Author:
Liu, Shih-Yang, Wang, Chien-Yi, Yin, Hongxu, Molchanov, Pavlo, Wang, Yu-Chiang Frank, Cheng, Kwang-Ting, Chen, Min-Hung
Among the widely used parameter-efficient fine-tuning (PEFT) methods, LoRA and its variants have gained considerable popularity because they avoid additional inference costs. However, there still often exists an accuracy gap between these methods and…
External link:
http://arxiv.org/abs/2402.09353
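Since the snippet above turns on LoRA, a minimal LoRA layer may help: the frozen base weight is augmented with a trainable low-rank update (alpha / r) * B @ A that can be merged into the weight after training, which is why no extra inference cost is incurred. This sketch is the standard LoRA formulation, not the variant proposed in the paper.

```python
# Sketch: a linear layer with a LoRA adapter.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_dim, out_dim, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weight and bias
        self.A = nn.Parameter(torch.randn(r, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, r))  # zero-init: update starts at 0
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(128, 128)
print(layer(torch.randn(2, 128)).shape)  # torch.Size([2, 128])
```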
Weakly-Supervised Semantic Segmentation (WSSS) aims to train segmentation models using image data with only image-level supervision. Since precise pixel-level annotations are not accessible, existing methods typically focus on producing pseudo masks…
External link:
http://arxiv.org/abs/2401.11791
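One common way such pseudo masks are derived from image-level labels is a class activation map (CAM): project the classifier weights of the labeled class onto the final feature map, normalize, and threshold. The sketch below uses random tensors in place of a trained backbone and is illustrative only, not this paper's method.

```python
# Sketch: CAM-based pseudo-mask generation from an image-level label.
import torch

feats = torch.randn(1, 512, 14, 14)  # backbone features (B, C, H, W)
fc_weight = torch.randn(20, 512)     # classifier weights (num_classes, C)
cls_idx = 3                          # class given by the image-level label

cam = torch.einsum("c,bchw->bhw", fc_weight[cls_idx], feats)  # (B, H, W)
cam = torch.relu(cam)
cam = cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)  # normalize to [0, 1]
pseudo_mask = (cam > 0.4).float()    # threshold into a binary pseudo mask
print(pseudo_mask.shape, pseudo_mask.mean().item())
```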
This paper proposes a cross-modal distillation framework, PartDistill, which transfers 2D knowledge from vision-language models (VLMs) to facilitate 3D shape part segmentation. PartDistill addresses three major challenges in this task: the lack of 3D…
External link:
http://arxiv.org/abs/2312.04016