Showing 1 - 10 of 15 811
for search: '"LIU Ting-an"'
Parameter-efficient tuning (PET) techniques calibrate the model's predictions on downstream tasks by freezing the pre-trained models and introducing a small number of learnable parameters. However, despite the numerous PET methods proposed, their robustness …
External link:
http://arxiv.org/abs/2410.09845
Author:
Du, Yanrui, Zhao, Sendong, Cao, Jiawei, Ma, Ming, Zhao, Danyang, Fan, Fenglei, Liu, Ting, Qin, Bing
Instruction Fine-Tuning (IFT) has become an essential method for adapting base Large Language Models (LLMs) into variants for professional and private use. However, researchers have raised concerns over a significant decrease in LLMs' security following …
External link:
http://arxiv.org/abs/2410.04524
Author:
Zhao, Weixiang, Hu, Yulin, Guo, Jiahe, Sui, Xingyu, Wu, Tongtong, Deng, Yang, Zhao, Yanyan, Qin, Bing, Che, Wanxiang, Liu, Ting
Despite the growing global demand for large language models (LLMs) that serve users from diverse linguistic backgrounds, most cutting-edge LLMs remain predominantly English-centric. This creates a performance gap across languages, restricting access …
External link:
http://arxiv.org/abs/2410.04407
Author:
Zhao, Long, Woo, Sanghyun, Wan, Ziyu, Li, Yandong, Zhang, Han, Gong, Boqing, Adam, Hartwig, Jia, Xuhui, Liu, Ting
In generative modeling, tokenization simplifies complex data into compact, structured representations, creating a more efficient, learnable space. For high-dimensional visual data, it reduces redundancy and emphasizes key features for high-quality generation …
External link:
http://arxiv.org/abs/2410.04081
Diffusion transformers have shown significant effectiveness in both image and video synthesis at the expense of huge computation costs. To address this problem, feature caching methods have been introduced to accelerate diffusion transformers by caching …
External link:
http://arxiv.org/abs/2410.05317
Author:
Ruan, Jiacheng, Yuan, Wenzhen, Lin, Zehao, Liao, Ning, Li, Zhiyu, Xiong, Feiyu, Liu, Ting, Fu, Yuzhuo
Large visual-language models (LVLMs) have achieved great success in multiple applications. However, they still encounter challenges in complex scenes, especially those involving camouflaged objects. This is primarily due to the lack of samples related …
External link:
http://arxiv.org/abs/2409.16084
LLMs' performance on complex tasks is still unsatisfactory. A key issue is that presently LLMs learn in a data-driven schema, while the instructions about these complex tasks are both scarce and hard to collect or construct. On the contrary, a prominent …
External link:
http://arxiv.org/abs/2409.15820
Referring Expression Comprehension (REC), which aims to ground a local visual region via natural language, is a task that heavily relies on multimodal alignment. Most existing methods utilize powerful pre-trained models to transfer visual/linguistic …
External link:
http://arxiv.org/abs/2409.13609
Author:
Liu, Ting-Ru, Yang, Hsuan-Kung, Liu, Jou-Min, Huang, Chun-Wei, Chiang, Tsung-Chih, Kong, Quan, Kobori, Norimasa, Lee, Chun-Yi
Scene coordinate regression (SCR) methods have emerged as a promising area of research due to their potential for accurate visual localization. However, many existing SCR approaches train on samples from all image regions, including dynamic objects and …
External link:
http://arxiv.org/abs/2409.04178
Author:
Wang, Yu, Zhao, Shiwan, Wang, Zhihu, Huang, Heyuan, Fan, Ming, Zhang, Yubo, Wang, Zhixing, Wang, Haijun, Liu, Ting
The Chain-of-Thought (CoT) paradigm has emerged as a critical approach for enhancing the reasoning capabilities of large language models (LLMs). However, despite their widespread adoption and success, CoT methods often exhibit instability due to their …
External link:
http://arxiv.org/abs/2409.03271