Effective Policy Adjustment via Meta-Learning for Complex Manipulation Tasks

Authors:	Tong Wang, Xue-song Tang, Kuangrong Hao, Xin Cai, Binghong Wu
Year of publication:	2018
Subject:	Meta learning (computer science); Artificial neural network; Computer science; Machine learning; Task (project management); Key (cryptography); Reinforcement learning; Artificial intelligence; Robotic arm
Source:	2018 Chinese Automation Congress (CAC).
Description:	The ability to adjust a policy is key to learning decision making when agents carry out complex manipulation tasks. To address this problem while accounting for both exploration and exploitation, we propose a novel deep reinforcement learning algorithm that combines Hindsight Experience Replay (HER) with Model-Agnostic Meta-Learning (MAML). In environments where rewards are sparse and binary, HER provides relatively effective exploration by converting a single-goal task into a multi-goal one, which improves the search for better policies by learning from failed transition trajectories as well as successful ones. MAML strengthens exploitation: the proposed algorithm learns faster and adjusts the policy model from limited experience within a few iterations. Extensive simulations of object manipulation tasks with a robotic arm show that HER integrated with MAML accelerates fine-tuning of the original policy gradient reinforcement learning method with a neural network policy and also improves the success rate.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::f7adfc2c8e2c2b9e9a59e410921af0c6 https://doi.org/10.1109/cac.2018.8623652
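
Note on hindsight relabeling: the abstract in this record says that HER turns a single-goal task into a multi-goal one so that failed trajectories still yield a learning signal under sparse, binary rewards. The Python sketch below illustrates that general idea only and is not the authors' implementation. The Transition type, the 0 / -1 reward convention, the 0.05 distance tolerance, and the "relabel with the final achieved position" strategy are all assumptions made for this example; the record does not specify any of them.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One step of an episode (hypothetical structure for this sketch)."""
    state: Vec
    action: Vec
    achieved: Vec  # goal-space position actually reached after the action
    goal: Vec      # goal the agent was trying to reach
    reward: float


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.05) -> float:
    """Binary reward: 0.0 within `tol` of the goal, -1.0 otherwise (assumed convention)."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_final(episode: List[Transition]) -> List[Transition]:
    """Return a copy of the episode whose goal is the position reached at the last step.

    Rewards are recomputed against the substituted goal, so the final
    transition of a failed episode becomes a success in hindsight.
    """
    new_goal = episode[-1].achieved
    return [
        replace(t, goal=new_goal, reward=sparse_reward(t.achieved, new_goal))
        for t in episode
    ]


# A failed two-step episode: the arm never gets near the goal (1.0, 1.0).
goal = (1.0, 1.0)
episode = [
    Transition((0.0, 0.0), (0.1, 0.0), (0.1, 0.0), goal, sparse_reward((0.1, 0.0), goal)),
    Transition((0.1, 0.0), (0.1, 0.0), (0.2, 0.0), goal, sparse_reward((0.2, 0.0), goal)),
]
hindsight = relabel_final(episode)
```

In a replay-based learner, both the original and the relabeled transitions would typically be stored, so the policy sees at least one rewarded outcome per episode even when the original goal was missed.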