Showing 1 - 10 of 14 for search: '"Fu, Dayuan"'
Published in:
EMNLP 2024 Main
Long-term memory is significant for agents, and insights play a crucial role in it. However, irrelevant insights and a lack of general insights can greatly undermine their effectiveness. To solve this problem, in this paper, …
External link:
http://arxiv.org/abs/2409.16686
Authors:
Wang, Yejie, He, Keqing, Fu, Dayuan, Gongque, Zhuoma, Xu, Heyang, Chen, Yanxu, Wang, Zhexu, Fu, Yujia, Dong, Guanting, Diao, Muxi, Wang, Jingang, Zhang, Mengdi, Cai, Xunliang, Xu, Weiran
Recently, there has been growing interest in how to construct better code instruction tuning data. However, we observe that code models trained with these datasets exhibit high performance on HumanEval but perform worse on other benchmarks such …
External link:
http://arxiv.org/abs/2409.03810
Authors:
Song, Xiaoshuai, Diao, Muxi, Dong, Guanting, Wang, Zhengyang, Fu, Yujia, Qiao, Runqi, Wang, Zhexu, Fu, Dayuan, Wu, Huangxuan, Liang, Bin, Zeng, Weihao, Wang, Yejie, GongQue, Zhuoma, Yu, Jianing, Tan, Qiuna, Xu, Weiran
Computer Science (CS) stands as a testament to the intricacies of human intelligence, profoundly advancing the development of artificial intelligence and modern society. However, the current community of large language models (LLMs) overly focuses on …
External link:
http://arxiv.org/abs/2406.08587
Language models pre-trained on general text have achieved impressive results in diverse fields. Yet, the distinct linguistic characteristics of task-oriented dialogues (TOD) compared to general text limit the practical utility of existing language models …
External link:
http://arxiv.org/abs/2404.00557
Authors:
Jiang, Che, Qi, Biqing, Hong, Xiangyu, Fu, Dayuan, Cheng, Yang, Meng, Fandong, Yu, Mo, Zhou, Bowen, Zhou, Jie
Large language models are successful in answering factoid questions but are also prone to hallucination. We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating, from the perspective of inference dynamics …
External link:
http://arxiv.org/abs/2403.20009
Pre-trained language models have been successful in many scenarios. However, their usefulness in task-oriented dialogues is limited due to the intrinsic linguistic differences between general text and task-oriented dialogues. Current task-oriented dialogue …
External link:
http://arxiv.org/abs/2403.01163
Addressing the discrepancies between predictions and actual outcomes often helps individuals expand their thought processes and engage in reflection, thereby facilitating reasoning in the correct direction. In this paper, we introduce …
External link:
http://arxiv.org/abs/2402.11534
Authors:
Dong, Guanting, Wang, Zechen, Zhao, Jinxu, Zhao, Gang, Guo, Daichi, Fu, Dayuan, Hui, Tingfeng, Zeng, Chen, He, Keqing, Li, Xuefeng, Wang, Liwen, Cui, Xinyue, Xu, Weiran
The objective of few-shot named entity recognition is to identify named entities with limited labeled instances. Previous works have primarily focused on optimizing the traditional token-wise classification framework, while neglecting the exploration …
External link:
http://arxiv.org/abs/2308.14533
Authors:
Dong, Guanting, Wang, Zechen, Wang, Liwen, Guo, Daichi, Fu, Dayuan, Wu, Yuxiang, Zeng, Chen, Li, Xuefeng, Hui, Tingfeng, He, Keqing, Cui, Xinyue, Gao, Qixiang, Xu, Weiran
Few-shot named entity recognition (NER) aims to identify named entities based on only a few labeled instances. Most existing prototype-based sequence labeling models tend to memorize entity mentions, which are easily confused by close prototypes …
External link:
http://arxiv.org/abs/2302.13610
Authors:
Guo, Daichi, Dong, Guanting, Fu, Dayuan, Wu, Yuxiang, Zeng, Chen, Hui, Tingfeng, Wang, Liwen, Li, Xuefeng, Wang, Zechen, He, Keqing, Cui, Xinyue, Xu, Weiran
In real dialogue scenarios, existing slot filling models, which tend to memorize entity patterns, show significantly reduced generalization when facing Out-of-Vocabulary (OOV) problems. To address this issue, we propose an OOV-robust slot filling model …
External link:
http://arxiv.org/abs/2302.13584