Showing 1 - 10 of 218 for search: '"Xiong, Deyi"'
Author:
Leng, Yongqi, Xiong, Deyi
While large language models (LLMs) have demonstrated superior multi-task capabilities, understanding the learning mechanisms behind this is still a challenging problem. In this paper, we attempt to understand such mechanisms from the perspective of n…
External link:
http://arxiv.org/abs/2407.06488
Author:
Li, Zhigen, Peng, Jianxiang, Wang, Yanmeng, Shen, Tianhao, Zhang, Minghui, Su, Linxi, Wu, Shang, Wu, Yihang, Wang, Yuqian, Wang, Ye, Hu, Wei, Li, Jianfeng, Wang, Shaojun, Xiao, Jing, Xiong, Deyi
Controllability and proactivity are crucial properties of autonomous conversational agents (CAs). Controllability requires the CAs to follow the standard operating procedures (SOPs), such as verifying identity before activating credit cards. Proactiv…
External link:
http://arxiv.org/abs/2407.03884
Manual red teaming is a commonly used method to identify vulnerabilities in large language models (LLMs), but it is costly and unscalable. In contrast, automated red teaming uses a Red LLM to automatically generate adversarial prompts to the Target L…
External link:
http://arxiv.org/abs/2407.03876
It is widely acknowledged that large language models (LLMs) encode a vast reservoir of knowledge after being trained on mass data. Recent studies disclose knowledge conflicts in LLM generation, wherein outdated or incorrect parametric knowledge (i.e.…
External link:
http://arxiv.org/abs/2406.18406
The advent of large language models (LLMs) has predominantly catered to high-resource languages, leaving a disparity in performance for low-resource languages. Conventional Continual Training (CT) approaches to bridge this gap often undermine a model…
External link:
http://arxiv.org/abs/2407.00875
Large language models (LLMs) exhibit outstanding performance in machine translation via in-context learning. In contrast to sentence-level translation, document-level translation (DOCMT) by LLMs based on in-context learning faces two major challenges…
External link:
http://arxiv.org/abs/2406.07081
Author:
Shi, Ling, Xiong, Deyi
Large language models (LLMs) possess numerous beneficial capabilities, yet their potential inclinations harbor unpredictable risks that may materialize in the future. We hence propose CRiskEval, a Chinese dataset meticulously designed for ga…
External link:
http://arxiv.org/abs/2406.04752
Author:
Lee, Andrew H., Semnani, Sina J., Castillo-López, Galo, de Chalendar, Gäel, Choudhury, Monojit, Dua, Ashna, Kavitha, Kapil Rajesh, Kim, Sungkyun, Kodali, Prashant, Kumaraguru, Ponnurangam, Lombard, Alexis, Moradshahi, Mehrad, Park, Gihyun, Semmar, Nasredine, Seo, Jiwon, Shen, Tianhao, Shrivastava, Manish, Xiong, Deyi, Lam, Monica S.
Creating multilingual task-oriented dialogue (TOD) agents is challenging due to the high cost of training data acquisition. Following the research trend of improving training data efficiency, we show, for the first time, that in-context learning is su…
External link:
http://arxiv.org/abs/2405.17840
Author:
Sun, Chenxi, Zhang, Hongzhi, Lin, Zijia, Zhang, Jingyuan, Zhang, Fuzheng, Wang, Zhongyuan, Chen, Bin, Song, Chengru, Zhang, Di, Gai, Kun, Xiong, Deyi
Large language models have demonstrated exceptional capability in natural language understanding and generation. However, their generation speed is limited by the inherently sequential nature of their decoding process, posing challenges for real-time…
External link:
http://arxiv.org/abs/2405.15208
Ensuring large language models (LLMs) behave consistently with human goals, values, and intentions is crucial for their safety, yet computationally expensive. To reduce the computational cost of alignment training of LLMs, especially for those with…
External link:
http://arxiv.org/abs/2405.13578