Showing 1 - 10 of 15 811
for search: '"LIU Ting-an"'
Parameter-efficient tuning (PET) techniques calibrate the model's predictions on downstream tasks by freezing the pre-trained models and introducing a small number of learnable parameters. However, despite the numerous PET methods proposed, their robustness …
External link:
http://arxiv.org/abs/2410.09845
Author:
Du, Yanrui, Zhao, Sendong, Cao, Jiawei, Ma, Ming, Zhao, Danyang, Fan, Fenglei, Liu, Ting, Qin, Bing
Instruction Fine-Tuning (IFT) has become an essential method for adapting base Large Language Models (LLMs) into variants for professional and private use. However, researchers have raised concerns over a significant decrease in LLMs' security following …
External link:
http://arxiv.org/abs/2410.04524
Author:
Zhao, Weixiang, Hu, Yulin, Guo, Jiahe, Sui, Xingyu, Wu, Tongtong, Deng, Yang, Zhao, Yanyan, Qin, Bing, Che, Wanxiang, Liu, Ting
Despite the growing global demand for large language models (LLMs) that serve users from diverse linguistic backgrounds, most cutting-edge LLMs remain predominantly English-centric. This creates a performance gap across languages, restricting access …
External link:
http://arxiv.org/abs/2410.04407
Author:
Zhao, Long, Woo, Sanghyun, Wan, Ziyu, Li, Yandong, Zhang, Han, Gong, Boqing, Adam, Hartwig, Jia, Xuhui, Liu, Ting
In generative modeling, tokenization simplifies complex data into compact, structured representations, creating a more efficient, learnable space. For high-dimensional visual data, it reduces redundancy and emphasizes key features for high-quality generation …
External link:
http://arxiv.org/abs/2410.04081
Diffusion transformers have shown significant effectiveness in both image and video synthesis at the expense of huge computation costs. To address this problem, feature caching methods have been introduced to accelerate diffusion transformers by caching …
External link:
http://arxiv.org/abs/2410.05317
Author:
Ruan, Jiacheng, Yuan, Wenzhen, Lin, Zehao, Liao, Ning, Li, Zhiyu, Xiong, Feiyu, Liu, Ting, Fu, Yuzhuo
Large visual-language models (LVLMs) have achieved great success in multiple applications. However, they still encounter challenges in complex scenes, especially those involving camouflaged objects. This is primarily due to the lack of samples related …
External link:
http://arxiv.org/abs/2409.16084
LLMs' performance on complex tasks is still unsatisfactory. A key issue is that presently LLMs learn in a data-driven schema, while the instructions about these complex tasks are both scarce and hard to collect or construct. On the contrary, a prominent …
External link:
http://arxiv.org/abs/2409.15820
Referring Expression Comprehension (REC), which aims to ground a local visual region via natural language, is a task that heavily relies on multimodal alignment. Most existing methods utilize powerful pre-trained models to transfer visual/linguistic …
External link:
http://arxiv.org/abs/2409.13609
Author:
Liu, Ting-Ru, Yang, Hsuan-Kung, Liu, Jou-Min, Huang, Chun-Wei, Chiang, Tsung-Chih, Kong, Quan, Kobori, Norimasa, Lee, Chun-Yi
Scene coordinate regression (SCR) methods have emerged as a promising area of research due to their potential for accurate visual localization. However, many existing SCR approaches train on samples from all image regions, including dynamic objects and …
External link:
http://arxiv.org/abs/2409.04178
Author:
Wang, Yu, Zhao, Shiwan, Wang, Zhihu, Huang, Heyuan, Fan, Ming, Zhang, Yubo, Wang, Zhixing, Wang, Haijun, Liu, Ting
The Chain-of-Thought (CoT) paradigm has emerged as a critical approach for enhancing the reasoning capabilities of large language models (LLMs). However, despite their widespread adoption and success, CoT methods often exhibit instability due to their …
External link:
http://arxiv.org/abs/2409.03271