Beyond IID: Optimizing Instruction Learning from the Perspective of Instruction Interaction and Dependency

Autor:	Zhao, Hanyu, Du, Li, Ju, Yiming, Wu, Chengwei, Pan, Tengfei
Rok vydání:	2024
Předmět:	Computer Science - Computation and Language Computer Science - Artificial Intelligence
Druh dokumentu:	Working Paper
Popis:	With the availability of various instruction datasets, a pivotal challenge is how to effectively select and integrate these instructions to fine-tune large language models (LLMs). Previous research mainly focuses on selecting individual high-quality instructions. However, these works overlooked the joint interactions and dependencies between different categories of instructions, leading to suboptimal selection strategies. Moreover, the nature of these interaction patterns remains largely unexplored, let alone optimize the instruction set with regard to them. To fill these gaps, in this paper, we: (1) systemically investigate interaction and dependency patterns between different categories of instructions, (2) manage to optimize the instruction set concerning the interaction patterns using a linear programming-based method, and optimize the learning schema of SFT using an instruction dependency taxonomy guided curriculum learning. Experimental results across different LLMs demonstrate improved performance over strong baselines on widely adopted benchmarks.
Databáze:	arXiv
Externí odkaz:	http://arxiv.org/abs/2409.07045 Zobrazit plný text záznamu View this record from Arxiv