GIST: Improving Parameter Efficient Fine Tuning via Knowledge Interaction

Author:	Ruan, Jiacheng; Gao, Jingsheng; Xie, Mingye; Xiang, Suncheng; Yu, Zefang; Liu, Ting; Fu, Yuzhuo
Publication year:	2023
Subject:	Computer Science - Computation and Language; Computer Science - Computer Vision and Pattern Recognition
Document type:	Working Paper
Description:	The Parameter-Efficient Fine-Tuning (PEFT) method, which adjusts or introduces fewer trainable parameters to calibrate pre-trained models on downstream tasks, has become a recent research interest. However, existing PEFT methods within the traditional fine-tuning framework have two main shortcomings: 1) They overlook the explicit association between trainable parameters and downstream task knowledge. 2) They neglect the interaction between the intrinsic task-agnostic knowledge of pre-trained models and the task-specific knowledge in downstream tasks. To address these shortcomings, we propose a novel plug-and-play fine-tuning framework named GIST. Specifically, our framework first introduces a trainable token, called the Gist token, when applying PEFT methods on downstream tasks. This token serves as an aggregator of the task-specific knowledge learned by the PEFT methods and forms an explicit association with downstream knowledge. Furthermore, to facilitate explicit interaction between task-agnostic and task-specific knowledge, we introduce the concept of Knowledge Interaction via a Bidirectional Kullback-Leibler Divergence objective. As a result, PEFT methods within our framework can make the pre-trained model understand downstream tasks more comprehensively by leveraging the knowledge interaction. Extensive experiments demonstrate the universality and scalability of our framework. Notably, on the VTAB-1K benchmark, we employ the Adapter (a prevalent PEFT method) within our GIST framework and achieve a performance boost of 2.25%, with an increase of only 0.8K parameters. The code will be released. Comment: 17 pages, 8 figures, 22 tables, work in progress
Database:	arXiv
External link:	http://arxiv.org/abs/2312.07255
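
Note on the description: the abstract names a Bidirectional Kullback-Leibler Divergence objective but does not say which distributions it compares. The sketch below is only an illustration of a generic two-way (symmetric) KL term between two sets of class logits. It assumes, hypothetically, that one set comes from a task-agnostic token and the other from the Gist token; the function names, the softmax step, and the unweighted sum are assumptions and are not taken from the paper.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def bidirectional_kl(logits_a, logits_b):
    """Sum of KL in both directions between softmax(logits_a) and softmax(logits_b).

    Assumption: logits_a and logits_b stand in for predictions from a
    task-agnostic token and a task-specific (Gist) token respectively.
    """
    p = softmax(logits_a)
    q = softmax(logits_b)
    return kl_divergence(p, q) + kl_divergence(q, p)


if __name__ == "__main__":
    agnostic_logits = [2.0, 0.5, -1.0]   # hypothetical values
    gist_logits = [1.5, 1.0, -0.5]       # hypothetical values
    print(bidirectional_kl(agnostic_logits, gist_logits))
```

The term is zero when both sets of logits give the same distribution and grows as they diverge, so minimizing it pulls the two predictions toward each other from both sides. How the actual framework weights or combines this term with the task loss is not stated in the record.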