Showing 1 - 10 of 898 for search: '"Li, Mengdi"'
Author:
Hu, Lijie, Liu, Liang, Yang, Shu, Chen, Xin, Xiao, Hongru, Li, Mengdi, Zhou, Pan, Ali, Muhammad Asif, Wang, Di
Chain-of-Thought (CoT) holds a significant place in augmenting the reasoning performance of large language models (LLMs). While some studies focus on improving CoT accuracy through methods like retrieval enhancement, a rigorous explanation for w…
External link:
http://arxiv.org/abs/2406.12255
Although there has been rapid progress in endowing robots with the ability to solve complex manipulation tasks, generating control policies for bimanual robots to solve tasks involving two hands is still challenging because of the difficulties in eff…
External link:
http://arxiv.org/abs/2404.02018
Author:
Yang, Shu, Su, Jiayuan, Jiang, Han, Li, Mengdi, Cheng, Keyuan, Ali, Muhammad Asif, Hu, Lijie, Wang, Di
With the rise of large language models (LLMs), ensuring they embody the principles of being helpful, honest, and harmless (3H), known as Human Alignment, becomes crucial. While existing alignment methods like RLHF, DPO, etc., effectively fine-tune LL…
External link:
http://arxiv.org/abs/2404.00486
Author:
Lu, Wenhao, Zhao, Xufeng, Fryen, Thilo, Lee, Jae Hee, Li, Mengdi, Magg, Sven, Wermter, Stefan
Reinforcement learning (RL) is a powerful technique for training intelligent agents, but understanding why these agents make specific decisions can be quite challenging. This lack of transparency in RL models has been a long-standing problem, making…
External link:
http://arxiv.org/abs/2401.00104
Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models
Reinforcement Learning (RL) plays an important role in the robotic manipulation domain since it allows self-learning from trial-and-error interactions with the environment. Still, sample efficiency and reward specification seriously limit its potenti…
External link:
http://arxiv.org/abs/2311.02379
Author:
Zhao, Xufeng, Li, Mengdi, Lu, Wenhao, Weber, Cornelius, Lee, Jae Hee, Chu, Kun, Wermter, Stefan
Recent advancements in large language models have showcased their remarkable generalizability across various domains. However, their reasoning abilities still have significant room for improvement, especially when confronted with scenarios requiring…
External link:
http://arxiv.org/abs/2309.13339
Explaining the behaviour of intelligent agents learned by reinforcement learning (RL) to humans is challenging yet crucial due to their incomprehensible proprioceptive states, variational intermediate goals, and resultant unpredictability. Moreover,…
External link:
http://arxiv.org/abs/2304.12958
Programming robot behavior in a complex world faces challenges on multiple levels, from dextrous low-level skills to high-level planning and reasoning. Recent pre-trained Large Language Models (LLMs) have shown remarkable reasoning ability in few-sho…
External link:
http://arxiv.org/abs/2303.08268
We study a class of reinforcement learning problems where the reward signals for policy learning are generated by an internal reward model that is dependent on and jointly optimized with the policy. This interdependence between the policy and the rew…
External link:
http://arxiv.org/abs/2302.00270
Author:
Yao, Yuan, Yu, Tianyu, Zhang, Ao, Li, Mengdi, Xie, Ruobing, Weber, Cornelius, Liu, Zhiyuan, Zheng, Hai-Tao, Wermter, Stefan, Chua, Tat-Seng, Sun, Maosong
Large-scale commonsense knowledge bases empower a broad range of AI applications, for which the automatic extraction of commonsense knowledge (CKE) is a fundamental and challenging problem. CKE from text is known to suffer from the inherent sparsity…
External link:
http://arxiv.org/abs/2211.12054