Showing 1 - 10 of 543 for search: '"ZHAO Xufeng"'
Aligning large language models (LLMs) to human preferences is a crucial step in building helpful and safe AI tools, which usually involve training on supervised datasets. Popular algorithms such as Direct Preference Optimization rely on pairs of AI-g
External link:
http://arxiv.org/abs/2409.17169
Can emergent language models faithfully model the intelligence of decision-making agents? Though modern language models exhibit already some reasoning ability, and theoretically can potentially express any probable distribution over tokens, it remain
External link:
http://arxiv.org/abs/2406.18505
The state of an object reflects its current status or condition and is important for a robot's task planning and manipulation. However, detecting an object's state and generating a state-sensitive plan for robots is challenging. Recently, pre-trained
External link:
http://arxiv.org/abs/2406.09988
Language-conditioned robotic skills make it possible to apply the high-level reasoning of Large Language Models (LLMs) to low-level robotic control. A remaining challenge is to acquire a diverse set of fundamental skills. Existing approaches either m
External link:
http://arxiv.org/abs/2405.15019
Although there has been rapid progress in endowing robots with the ability to solve complex manipulation tasks, generating control policies for bimanual robots to solve tasks involving two hands is still challenging because of the difficulties in eff
External link:
http://arxiv.org/abs/2404.02018
Author:
Lu, Wenhao, Zhao, Xufeng, Fryen, Thilo, Lee, Jae Hee, Li, Mengdi, Magg, Sven, Wermter, Stefan
Reinforcement learning (RL) is a powerful technique for training intelligent agents, but understanding why these agents make specific decisions can be quite challenging. This lack of transparency in RL models has been a long-standing problem, making
External link:
http://arxiv.org/abs/2401.00104
Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models
Reinforcement Learning (RL) plays an important role in the robotic manipulation domain since it allows self-learning from trial-and-error interactions with the environment. Still, sample efficiency and reward specification seriously limit its potenti
External link:
http://arxiv.org/abs/2311.02379
Author:
Zhao, Xufeng, Li, Mengdi, Lu, Wenhao, Weber, Cornelius, Lee, Jae Hee, Chu, Kun, Wermter, Stefan
Recent advancements in large language models have showcased their remarkable generalizability across various domains. However, their reasoning abilities still have significant room for improvement, especially when confronted with scenarios requiring
External link:
http://arxiv.org/abs/2309.13339
Explaining the behaviour of intelligent agents learned by reinforcement learning (RL) to humans is challenging yet crucial due to their incomprehensible proprioceptive states, variational intermediate goals, and resultant unpredictability. Moreover,
External link:
http://arxiv.org/abs/2304.12958