Exploring reinforcement learning techniques for discrete and continuous control tasks in the MuJoCo environment

Author:	Rahul, Vaddadi Sai; Chakraborty, Debajyoti
Year of publication:	2023
Subject:	Computer Science - Machine Learning; Computer Science - Artificial Intelligence
Document type:	Working Paper
Description:	We leverage the fast physics simulator MuJoCo to run tasks in a continuous control environment and report details such as the observation space, action space, and rewards for each task. We benchmark value-based methods for continuous control by comparing Q-learning and SARSA through a discretization approach and, using them as baselines, progressively move to one of the state-of-the-art deep policy gradient methods, DDPG. Over a large number of episodes, Q-learning outscored SARSA, but DDPG outperformed both in a small number of episodes. Lastly, we also fine-tuned the model hyperparameters, expecting to squeeze out more performance while using less time and fewer resources. We anticipated that the new design for DDPG would vastly improve performance, yet after only a few episodes, we were able to achieve decent average rewards. We expect performance to improve further given adequate time and computational resources. Comment: Released Dec 2021. For associated project files, see https://github.com/chakrabortyde/mujoco-control-tasks
Database:	arXiv
External link:	http://arxiv.org/abs/2307.11166
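
The description mentions applying Q-learning and SARSA to continuous control through a discretization approach. The following is a minimal sketch of that general idea, not the authors' implementation: the function names (`discretize`, `q_learning_target`, `sarsa_target`, `td_update`), the uniform binning scheme, and all numeric values are assumptions chosen for illustration. It shows how a continuous value might be mapped to a table index, and how the two methods differ only in the bootstrap target they use.

```python
def discretize(value, low, high, bins):
    """Map a continuous value to an integer bin index in [0, bins - 1].

    Assumes uniform bins over [low, high]; values outside are clipped.
    """
    clipped = min(max(value, low), high)
    idx = int((clipped - low) / (high - low) * bins)
    return min(idx, bins - 1)  # the upper bound falls into the last bin


def q_learning_target(q, next_state, reward, gamma):
    """Off-policy target: bootstrap from the greedy action in the next state."""
    return reward + gamma * max(q[next_state])


def sarsa_target(q, next_state, next_action, reward, gamma):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * q[next_state][next_action]


def td_update(q, state, action, target, alpha):
    """Move Q(state, action) a fraction alpha toward the given target."""
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical toy setup: 4 state bins, 3 discrete actions.
q_table = [[0.0, 0.0, 0.0] for _ in range(4)]
q_table[1] = [1.0, 5.0, 2.0]

state = discretize(-0.9, -1.0, 1.0, 4)       # continuous observation -> bin
next_state = discretize(-0.3, -1.0, 1.0, 4)

target_q = q_learning_target(q_table, next_state, reward=1.0, gamma=0.9)
target_sarsa = sarsa_target(q_table, next_state, next_action=2, reward=1.0, gamma=0.9)

td_update(q_table, state, action=0, target=target_q, alpha=0.5)
```

In a multi-dimensional task, each observation and action dimension would typically be binned separately and the resulting indices combined into one table key; the table grows quickly with the number of bins, which is one reason a method such as DDPG that works directly in continuous space can be attractive.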