Showing 1 - 10 of 7,105 results for search: '"Liu, Kang"'
Knowledge distillation typically employs the Kullback-Leibler (KL) divergence to constrain the student model's output to exactly match the soft labels provided by the teacher model. However, sometimes the optimization direction of the KL divergence…
External link:
http://arxiv.org/abs/2409.17823
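For context, the conventional objective this abstract refers to fits in a few lines; the sketch below shows the standard temperature-scaled KL distillation loss (a textbook baseline, not the paper's proposed modification):

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard KL-based distillation loss; a generic sketch,
    not the method proposed in the paper above."""
    # Soften both distributions with the temperature.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # KL(teacher || student), scaled by T^2 to keep gradient magnitudes stable.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature**2

# Usage: logits of shape (batch, num_classes)
s = torch.randn(8, 10, requires_grad=True)
t = torch.randn(8, 10)
loss = kd_loss(s, t)
loss.backward()
```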
Author:
Tang, Yuan-Han, Zhang, Xiaoran, Liu, Kang-Yuan, Xia, Fan, Zheng, Huijie, Liu, Xiaobing, Pan, Xin-Yu, Fan, Heng, Liu, Gang-Qin
As a point defect with unique spin and optical properties, the nitrogen-vacancy (NV) center in diamond has attracted much attention in the fields of quantum sensing, quantum simulation, and quantum networks. The optical properties of an NV center are crucial…
External link:
http://arxiv.org/abs/2409.17442
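The spin side of that sensing story is well-established textbook physics; as a small grounded illustration (not taken from the paper), the NV ground-state resonances shift with an axial magnetic field as follows:

```python
# Textbook NV ground-state spin physics, not from the paper: the m_s = 0 -> ±1
# microwave transition frequencies for a magnetic field along the NV axis.
D_GHZ = 2.87             # zero-field splitting of the NV ground state
GAMMA_MHZ_PER_MT = 28.0  # electron gyromagnetic ratio, ~28 MHz/mT

def odmr_frequencies_ghz(b_axial_mt):
    """Return the two ODMR dip frequencies (GHz) for an axial field in mT."""
    zeeman_ghz = GAMMA_MHZ_PER_MT * b_axial_mt / 1000.0
    return D_GHZ - zeeman_ghz, D_GHZ + zeeman_ghz

# A 1 mT axial field splits the resonance by ~56 MHz:
print(odmr_frequencies_ghz(1.0))  # (2.842, 2.898)
```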
In this paper, we propose $\textbf{Ne}$ural-$\textbf{Sy}$mbolic $\textbf{C}$ollaborative $\textbf{D}$istillation ($\textbf{NesyCD}$), a novel knowledge distillation method for learning the complex reasoning abilities of Large Language Models (LLMs, e.g., …
External link:
http://arxiv.org/abs/2409.13203
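The abstract gives only the method's name, so the sketch below shows just the generic teacher-rationale collection step that CoT-distillation methods of this kind build on; `query_teacher` is a hypothetical stand-in for an LLM API call, and the neural-symbolic split itself (the paper's contribution) is not reproduced:

```python
import json

def build_distillation_set(questions, query_teacher):
    """Collect (question, teacher rationale) pairs from a teacher LLM."""
    records = []
    for q in questions:
        reply = query_teacher(
            f"Question: {q}\nThink step by step, then give the final answer."
        )
        records.append({"question": q, "teacher_output": reply})
    return records

def save_for_finetuning(records, path="cot_train.jsonl"):
    # One JSON object per line, the usual format for supervised fine-tuning.
    with open(path, "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
```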
Tool learning enables Large Language Models (LLMs) to interact with the external environment by invoking tools, enriching the accuracy and capability scope of LLMs. However, previous works predominantly focus on improving the model's tool-utilizing ability…
External link:
http://arxiv.org/abs/2409.13202
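A minimal picture of what "invoking tools" means in practice, sketched under a common ReAct-style convention; the tool registry and the `call_llm` function are illustrative assumptions, not the paper's design:

```python
import json

# Hypothetical tool registry; real systems expose search, code execution, etc.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def run_with_tools(question, call_llm, max_steps=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # The model either emits a tool call as JSON or a final answer.
        step = call_llm(transcript)
        if step.startswith("TOOL:"):
            request = json.loads(step[len("TOOL:"):])
            result = TOOLS[request["name"]](request["input"])
            transcript += f"{step}\nOBSERVATION: {result}\n"
        else:
            return step  # final answer
    return "No answer within the step budget."
```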
Small Language Models (SLMs) are attracting attention due to the high computational demands and privacy concerns of Large Language Models (LLMs). Some studies fine-tune SLMs using Chain-of-Thought (CoT) data distilled from LLMs, aiming to enhance their…
External link:
http://arxiv.org/abs/2409.13183
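The fine-tuning step such studies perform can be sketched in plain PyTorch; the Hugging Face-style model interface and the -100 label masking are standard conventions, assumed here rather than taken from the paper:

```python
import torch

def sft_step(model, optimizer, input_ids, labels):
    """One gradient step of next-token prediction on a distilled CoT example.

    `labels` equals `input_ids` with prompt positions set to -100 so the
    loss is computed only on the teacher-written rationale and answer.
    """
    model.train()
    out = model(input_ids=input_ids, labels=labels)  # HF-style causal LM
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()
```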
Large language models encapsulate knowledge and have demonstrated superior performance on various natural language processing tasks. Recent studies have localized this knowledge to specific model parameters, such as the MLP weights in intermediate layers…
External link:
http://arxiv.org/abs/2409.00617
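One simple way to probe such localization, sketched here as a toy and not as the paper's procedure: hook an intermediate MLP and rank its neurons by activation on a fact-bearing prompt (the module paths assume GPT-2's layout):

```python
import torch

def top_mlp_neurons(model, tokenizer, prompt, layer=6, k=10):
    """Rank MLP neurons in one layer by mean activation on the prompt."""
    acts = {}
    mlp = model.transformer.h[layer].mlp  # GPT-2 naming; adjust per model
    handle = mlp.c_fc.register_forward_hook(
        lambda mod, inp, out: acts.__setitem__("h", out.detach())
    )
    with torch.no_grad():
        model(**tokenizer(prompt, return_tensors="pt"))
    handle.remove()
    # Mean activation over tokens; the k largest are candidate "knowledge" units.
    scores = acts["h"][0].mean(dim=0)
    return torch.topk(scores, k).indices.tolist()
```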
Pretrained language models like BERT and T5 serve as crucial backbone encoders for dense retrieval. However, these models often exhibit limited generalization capabilities and face challenges in improving in-domain accuracy. Recent research has explored…
External link:
http://arxiv.org/abs/2408.12194
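For readers unfamiliar with dense retrieval, a standard dual-encoder setup looks like this; the checkpoint name is a common public model, not one evaluated in the paper:

```python
from sentence_transformers import SentenceTransformer, util

# Encode documents and queries into the same vector space.
model = SentenceTransformer("all-MiniLM-L6-v2")
docs = ["BERT is a bidirectional encoder.", "T5 frames tasks as text-to-text."]
doc_emb = model.encode(docs, convert_to_tensor=True, normalize_embeddings=True)

query_emb = model.encode("What is T5?", convert_to_tensor=True,
                         normalize_embeddings=True)
scores = util.cos_sim(query_emb, doc_emb)  # higher = more relevant
print(docs[int(scores.argmax())])
```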
LLMs have achieved success in many fields but are still troubled by problematic content in their training corpora. LLM unlearning aims to reduce its influence and avoid undesirable behaviours. However, existing unlearning methods remain vulnerable to adversarial…
External link:
http://arxiv.org/abs/2408.10682
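Gradient ascent on a forget set is the usual unlearning baseline that robustness studies like this one probe; a sketch under the assumption of an HF-style causal LM interface, not the paper's method:

```python
import torch

def unlearning_step(model, optimizer, forget_batch, max_grad_norm=1.0):
    """One gradient-ascent step on a batch of to-be-forgotten text.

    `forget_batch` is assumed to hold `input_ids` and `attention_mask`.
    """
    model.train()
    out = model(**forget_batch, labels=forget_batch["input_ids"])
    # Ascend on the forget set: minimizing the negative LM loss pushes the
    # model away from reproducing the problematic content.
    (-out.loss).backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()
```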
In the realm of event prediction, temporal knowledge graph forecasting (TKGF) stands as a pivotal technique. Previous approaches face two challenges: they do not utilize experience during testing, and they rely on a single short-term history, which limits…
External link:
http://arxiv.org/abs/2408.07840
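A copy-style heuristic over historical quadruples (s, r, o, t) illustrates the kind of short-term-history baseline the abstract criticizes; purely illustrative, not the paper's model:

```python
from collections import defaultdict

def forecast_object(history, subj, rel, t_query, decay=0.99):
    """Score candidate objects for the query (subj, rel, ?, t_query)."""
    scores = defaultdict(float)
    for s, r, o, t in history:
        if s == subj and r == rel and t < t_query:
            scores[o] += decay ** (t_query - t)  # recent facts weigh more
    return max(scores, key=scores.get) if scores else None

history = [("A", "meets", "B", 1), ("A", "meets", "C", 2), ("A", "meets", "C", 5)]
print(forecast_object(history, "A", "meets", 6))  # -> "C"
```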
Knowledge editing aims to update outdated or incorrect knowledge in large language models (LLMs). However, current knowledge editing methods have limited scalability for lifelong editing. This study explores the fundamental reason why knowledge editing…
External link:
http://arxiv.org/abs/2408.07413
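Locate-then-edit methods typically realize an edit as a low-rank weight update; the toy rank-one edit below conveys the idea (real methods such as ROME add covariance preconditioning; this is not the paper's algorithm):

```python
import torch

def rank_one_edit(W, k, v_new):
    """Return W' with W' @ k_hat == v_new for the unit key k_hat = k / |k|."""
    k = k / k.norm()                      # unit key for a clean projection
    residual = v_new - W @ k              # what the layer currently gets wrong
    return W + torch.outer(residual, k)   # rank-one correction

W = torch.randn(4, 3)
k = torch.randn(3)
v = torch.randn(4)
W_edited = rank_one_edit(W, k, v)
print(torch.allclose(W_edited @ (k / k.norm()), v, atol=1e-5))  # True
```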