Showing 1 - 10 of 69 results for the search: '"Cui, Leyang"'
Large Language Models (LLMs) have demonstrated impressive capabilities in a wide range of natural language processing tasks when leveraging in-context learning. To mitigate the additional computational and financial costs associated with in-context learning…
External link:
http://arxiv.org/abs/2410.11786
Authors:
Zhang, Yu, Yang, Songlin, Zhu, Ruijie, Zhang, Yue, Cui, Leyang, Wang, Yiqiao, Wang, Bolun, Shi, Freda, Wang, Bailin, Bi, Wei, Zhou, Peng, Fu, Guohong
Linear attention Transformers and their gated variants, celebrated for enabling parallel training and efficient recurrent inference, still fall short in recall-intensive tasks compared to traditional Transformers and demand significant resources for…
External link:
http://arxiv.org/abs/2409.07146
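The "parallel training and efficient recurrent inference" mentioned in this abstract comes from an algebraic identity of (unnormalized, non-gated) linear attention: dropping the softmax lets the same outputs be computed either as a masked matrix product or as a running state of accumulated key-value outer products. A minimal NumPy sketch of that equivalence, not the paper's implementation (all function names here are illustrative):

```python
import numpy as np

def linear_attention_recurrent(q, k, v):
    """Recurrent (inference-time) form: the state S accumulates outer
    products k_t v_t^T, so each step costs O(d^2) regardless of
    sequence length."""
    T, d = q.shape
    S = np.zeros((d, v.shape[1]))       # running key-value state
    out = np.zeros((T, v.shape[1]))
    for t in range(T):
        S = S + np.outer(k[t], v[t])    # fold current token into state
        out[t] = q[t] @ S               # read out with current query
    return out

def linear_attention_parallel(q, k, v):
    """Parallel (training-time) form: a causally masked q k^T score
    matrix applied to v, with no softmax."""
    T = q.shape[0]
    scores = q @ k.T                    # (T, T) unnormalized scores
    mask = np.tril(np.ones((T, T)))     # causal mask: attend to s <= t
    return (scores * mask) @ v
```

Both forms produce identical outputs; that equivalence is what allows training in parallel over the sequence while serving with constant-memory recurrence.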
Iterative preference learning, though yielding superior performances, requires online annotated preference labels. In this work, we study strategies to select worth-annotating response pairs for cost-efficient annotation while achieving competitive…
External link:
http://arxiv.org/abs/2406.17312
Authors:
Cai, Deng, Li, Huayang, Fu, Tingchen, Li, Siheng, Xu, Weiwen, Li, Shuaiyi, Cao, Bowen, Zhang, Zhisong, Huang, Xinting, Cui, Leyang, Wang, Yan, Liu, Lemao, Watanabe, Taro, Shi, Shuming
Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications. In this paper, we demonstrate the interchangeability of three popular and distinct adaptation tools…
External link:
http://arxiv.org/abs/2406.16377
AI-generated text detection has attracted increasing attention as powerful language models approach human-level generation. Limited work is devoted to detecting (partially) AI-paraphrased texts. However, AI paraphrasing is commonly employed in various…
External link:
http://arxiv.org/abs/2405.12689
Authors:
Huang, Jianheng, Cui, Leyang, Wang, Ante, Yang, Chengyi, Liao, Xinting, Song, Linfeng, Yao, Junfeng, Su, Jinsong
Large language models (LLMs) suffer from catastrophic forgetting during continual learning. Conventional rehearsal-based methods rely on previous training data to retain the model's ability, which may not be feasible in real-world applications. When…
External link:
http://arxiv.org/abs/2403.01244
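For context on the "conventional rehearsal-based methods" this abstract contrasts against: rehearsal keeps a bounded buffer of past training examples and mixes them into batches for later tasks. A generic sketch using reservoir sampling, not the paper's proposed method (the class name and interface are hypothetical):

```python
import random

class RehearsalBuffer:
    """Minimal rehearsal buffer: reservoir sampling keeps a uniform
    bounded sample of everything seen so far, so old tasks stay
    represented without storing the full training stream."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        """Offer one training example to the buffer."""
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # replace a random slot with probability capacity / seen
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, n):
        """Draw up to n stored examples to mix into the current batch."""
        return self.rng.sample(self.buffer, min(n, len(self.buffer)))
```

The abstract's point is precisely that filling such a buffer with real past data may be infeasible (e.g. for privacy or availability reasons), which motivates alternatives.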
Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks. However, there are increasing debates regarding whether these models truly understand and apply mathematical knowledge or merely rely…
External link:
http://arxiv.org/abs/2402.19255
Standard language models generate text by selecting tokens from a fixed, finite, and standalone vocabulary. We introduce a novel method that selects context-aware phrases from a collection of supporting documents. One of the most significant challenges…
External link:
http://arxiv.org/abs/2402.17532
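To make the contrast with token-level generation concrete: instead of picking one token from a fixed vocabulary, a phrase-retrieval generator scores variable-length spans drawn from supporting documents against the current context. A toy sketch of that selection step, assuming a trivial bag-of-words scorer in place of the learned one (all names here are hypothetical, not the paper's API):

```python
from collections import Counter

def candidate_phrases(doc, max_len=3):
    """Enumerate word n-grams of a supporting document as candidate
    phrases (a toy stand-in for an indexed phrase table)."""
    words = doc.split()
    return [" ".join(words[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(words) - n + 1)]

def select_phrase(context, docs, max_len=3):
    """Pick the candidate phrase with the highest total word overlap
    with the context -- a crude proxy for the learned, context-aware
    scoring a real phrase-retrieval generator would use."""
    ctx = Counter(context.lower().split())

    def score(phrase):
        return sum(ctx[t] for t in phrase.lower().split())

    candidates = [p for d in docs for p in candidate_phrases(d, max_len)]
    return max(candidates, key=score)
```

Because candidates are multi-word spans rather than single vocabulary entries, the effective "vocabulary" grows and shifts with the supporting documents, which is the property the abstract highlights.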
While large language models (LLMs) have demonstrated exceptional performance across various tasks following human alignment, they may still generate responses that sound plausible but contradict factual knowledge, a phenomenon known as hallucination.
External link:
http://arxiv.org/abs/2401.10768
Despite their impressive capabilities, large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information, a phenomenon commonly known as "hallucination". In this work, we propose a simple…
External link:
http://arxiv.org/abs/2312.15710