Exploring reinforcement learning techniques for discrete and continuous control tasks in the MuJoCo environment

Author: Rahul, Vaddadi Sai; Chakraborty, Debajyoti
Year of publication: 2023
Subject:
Document type: Working Paper
Description: We leverage the fast physics simulator MuJoCo to run tasks in a continuous control environment and report details such as the observation space, action space, and rewards for each task. We benchmark value-based methods for continuous control by comparing Q-learning and SARSA through a discretization approach and, using them as baselines, progressively move to DDPG, one of the state-of-the-art deep policy gradient methods. Over a large number of episodes, Q-learning outscored SARSA, but DDPG outperformed both in a small number of episodes. Lastly, we fine-tuned the model hyperparameters, aiming to squeeze out more performance while using less time and fewer resources. We anticipated that the new design for DDPG would vastly improve performance; after only a few episodes, we were able to achieve decent average rewards. We expect performance to improve further given adequate time and computational resources.
Comment: Released December 2021. For associated project files, see https://github.com/chakrabortyde/mujoco-control-tasks
Database: arXiv
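
Illustrative note: the description mentions comparing Q-learning and SARSA on continuous tasks through a discretization approach. The sketch below shows one common way this can be done with a tabular value function. It is not taken from the authors' repository; the function names (`discretize`, `q_learning_update`, `sarsa_update`), the bin layout, and the values of `alpha` and `gamma` are all assumptions chosen for illustration.

```python
from collections import defaultdict


def discretize(value, low, high, n_bins):
    """Map a continuous value to a bin index in 0..n_bins-1 (uniform bins, assumed)."""
    clipped = min(max(value, low), high)          # clamp out-of-range inputs
    idx = int((clipped - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)                   # the upper edge falls in the last bin


def q_learning_update(q, s, a, r, s_next, n_actions, alpha=0.1, gamma=0.99):
    """Off-policy update: bootstrap from the greedy action in the next state."""
    best_next = max(q[(s_next, b)] for b in range(n_actions))
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy update: bootstrap from the action actually taken next."""
    q[(s, a)] += alpha * (r + gamma * q[(s_next, a_next)] - q[(s, a)])


# Hypothetical Q-table keyed by (state_bin, action_bin); unseen entries default to 0.0.
q_table = defaultdict(float)
```

The only difference between the two updates is the bootstrap target: Q-learning uses the maximum over next actions, while SARSA uses the value of the next action selected by the behavior policy. For multi-dimensional observations, one option is to discretize each dimension separately and use the tuple of bin indices as the state key.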