Showing 1 - 10 of 141 for search: '"Chen, Qiguang"'
The development of large language models (LLMs) has significantly expanded model sizes, resulting in substantial GPU memory requirements during inference. The key and value storage of the attention map in the KV (key-value) cache accounts for more th
External link:
http://arxiv.org/abs/2410.18517
Chain-of-Thought (CoT) reasoning has emerged as a promising approach for enhancing the performance of large language models (LLMs) on complex reasoning tasks. Recently, a series of studies attempt to explain the mechanisms underlying CoT, aiming to d
External link:
http://arxiv.org/abs/2410.05695
Authors:
Zhang, Yongheng, Chen, Qiguang, Zhou, Jingxuan, Wang, Peng, Si, Jiasheng, Wang, Jin, Lu, Wenpeng, Qin, Libo
Chain-of-Thought (CoT) has become a vital technique for enhancing the performance of Large Language Models (LLMs), attracting increasing attention from researchers. One stream of approaches focuses on the iterative enhancement of LLMs by continuously
External link:
http://arxiv.org/abs/2410.04463
Authors:
Wang, Dingzirui, Zhang, Xuanliang, Chen, Qiguang, Dou, Longxu, Xu, Xiao, Cao, Rongyu, Ma, Yingwei, Zhu, Qingfu, Che, Wanxiang, Li, Binhua, Huang, Fei, Li, Yongbin
In-context learning (ICL) is an effective approach to help large language models (LLMs) adapt to various tasks by providing demonstrations of the target task. Considering the high cost of labeling demonstrations, many methods propose synthesizing dem
External link:
http://arxiv.org/abs/2410.01548
Authors:
Chen, Zhi, Chen, Qiguang, Qin, Libo, Guo, Qipeng, Lv, Haijun, Zou, Yicheng, Che, Wanxiang, Yan, Hang, Chen, Kai, Lin, Dahua
Recent advancements in large language models (LLMs) with extended context windows have significantly improved tasks such as information extraction, question answering, and complex planning scenarios. In order to achieve success in long context tasks,
External link:
http://arxiv.org/abs/2409.01893
Large Language Model (LLM)-based agents exhibit significant potential across various domains, operating as interactive systems that process environmental observations to generate executable actions for target tasks. The effectiveness of these agents
External link:
http://arxiv.org/abs/2408.09559
Authors:
Chen, Qiguang, Pan, Ya-Jun
Industry 4.0 proposes the integration of artificial intelligence (AI) into manufacturing and other industries to create smart collaborative systems which enhance efficiency. The aim of this paper is to develop a flexible and adaptive framework to gen
External link:
http://arxiv.org/abs/2407.08534
Cross-lingual chain-of-thought can effectively complete reasoning tasks across languages, which gains increasing attention. Recently, dominant approaches in the literature improve cross-lingual alignment capabilities by integrating reasoning knowledg
External link:
http://arxiv.org/abs/2406.13940
Authors:
Qin, Libo, Wei, Fuxuan, Chen, Qiguang, Zhou, Jingxuan, Huang, Shijue, Si, Jiasheng, Lu, Wenpeng, Che, Wanxiang
Slot filling and intent detection are two highly correlated tasks in spoken language understanding (SLU). Recent SLU research attempts to explore zero-shot prompting techniques in large language models to alleviate the data scarcity problem. Neverthe
External link:
http://arxiv.org/abs/2406.10505
Multi-modal Chain-of-Thought (MCoT) requires models to leverage knowledge from both textual and visual modalities for step-by-step reasoning, which gains increasing attention. Nevertheless, the current MCoT benchmark still faces some challenges: (1)
External link:
http://arxiv.org/abs/2405.16473