Showing 1 - 8 of 8 for search: '"Xiong, Limao"'
Author:
Zhou, Enyu, Zheng, Guodong, Wang, Binghai, Xi, Zhiheng, Dou, Shihan, Bao, Rong, Shen, Wei, Xiong, Limao, Fan, Jessica, Mou, Yurong, Zheng, Rui, Gui, Tao, Zhang, Qi, Huang, Xuanjing
Reward models (RMs) guide the alignment of large language models (LLMs), steering them toward behaviors preferred by humans. Evaluating RMs is the key to better aligning LLMs. However, the current evaluation of RMs may not directly correspond to thei
External link:
http://arxiv.org/abs/2410.09893
Author:
Dou, Shihan, Liu, Yan, Zhou, Enyu, Li, Tianlong, Jia, Haoxiang, Xiong, Limao, Zhao, Xin, Ye, Junjie, Zheng, Rui, Gui, Tao, Zhang, Qi, Huang, Xuanjing
The success of Reinforcement Learning from Human Feedback (RLHF) in language model alignment is critically dependent on the capability of the reward model (RM). However, as the training process progresses, the output distribution of the policy model
External link:
http://arxiv.org/abs/2405.00438
Author:
Zhou, Weikang, Wang, Xiao, Xiong, Limao, Xia, Han, Gu, Yingshuang, Chai, Mingxu, Zhu, Fukang, Huang, Caishuang, Dou, Shihan, Xi, Zhiheng, Zheng, Rui, Gao, Songyang, Zou, Yicheng, Yan, Hang, Le, Yifan, Wang, Ruohui, Li, Lijun, Shao, Jing, Gui, Tao, Zhang, Qi, Huang, Xuanjing
Jailbreak attacks are crucial for identifying and mitigating the security vulnerabilities of Large Language Models (LLMs). They are designed to bypass safeguards and elicit prohibited outputs. However, due to significant differences among various jai
External link:
http://arxiv.org/abs/2403.12171
Author:
Dou, Shihan, Liu, Yan, Jia, Haoxiang, Xiong, Limao, Zhou, Enyu, Shen, Wei, Shan, Junjie, Huang, Caishuang, Wang, Xiao, Fan, Xiaoran, Xi, Zhiheng, Zhou, Yuhao, Ji, Tao, Zheng, Rui, Zhang, Qi, Huang, Xuanjing, Gui, Tao
The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code generation qu
External link:
http://arxiv.org/abs/2402.01391
Author:
Xi, Zhiheng, Chen, Wenxiang, Guo, Xin, He, Wei, Ding, Yiwen, Hong, Boyang, Zhang, Ming, Wang, Junzhe, Jin, Senjie, Zhou, Enyu, Zheng, Rui, Fan, Xiaoran, Wang, Xiao, Xiong, Limao, Zhou, Yuhao, Wang, Weiran, Jiang, Changhao, Zou, Yicheng, Liu, Xiangyang, Yin, Zhangyue, Dou, Shihan, Weng, Rongxiang, Cheng, Wensen, Zhang, Qi, Qin, Wenjuan, Zheng, Yongyan, Qiu, Xipeng, Huang, Xuanjing, Gui, Tao
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decis
External link:
http://arxiv.org/abs/2309.07864
Author:
Zheng, Rui, Dou, Shihan, Gao, Songyang, Hua, Yuan, Shen, Wei, Wang, Binghai, Liu, Yan, Jin, Senjie, Liu, Qin, Zhou, Yuhao, Xiong, Limao, Chen, Lu, Xi, Zhiheng, Xu, Nuo, Lai, Wenbin, Zhu, Minghao, Chang, Cheng, Yin, Zhangyue, Weng, Rongxiang, Cheng, Wensen, Huang, Haoran, Sun, Tianxiang, Yan, Hang, Gui, Tao, Zhang, Qi, Qiu, Xipeng, Huang, Xuanjing
Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence. Their primary objective is to function as a human-centric (helpful, honest, and harmless) assistant. Alignment with humans assumes paramoun
External link:
http://arxiv.org/abs/2307.04964
Author:
Xiong, Limao, Zhou, Jie, Zhu, Qunxi, Wang, Xiao, Wu, Yuanbin, Zhang, Qi, Gui, Tao, Huang, Xuanjing, Ma, Jin, Shan, Ying
Existing models for named entity recognition (NER) are mainly based on large-scale labeled datasets, which are usually obtained through crowdsourcing. However, it is hard to obtain a unified and correct label via majority voting from multiple annotators for N
External link:
http://arxiv.org/abs/2305.12485
Author:
Wang, Xiao, Dou, Shihan, Xiong, Limao, Zou, Yicheng, Zhang, Qi, Gui, Tao, Qiao, Liang, Cheng, Zhanzhan, Huang, Xuanjing
NER models have achieved promising performance on standard NER benchmarks. However, recent studies show that previous approaches may over-rely on entity mention information, resulting in poor performance on out-of-vocabulary (OOV) entity recognition. I
External link:
http://arxiv.org/abs/2204.04391