Double Deep Q Network with Huber Reward Function for Cart-Pole Balancing Problem.

Author: Mishra, Shaili; Arora, Anuja
Source: International Journal of Performability Engineering; Sep2022, Vol. 18 Issue 9, p644-653, 10p
Abstract: The emergence of reinforcement learning defines a new research direction in control theory, in which feedback influences system behavior so as to achieve a desired output. This work addresses the cart-pole balancing problem using deep reinforcement learning (Deep RL) algorithms. Deep RL provides a comprehensive learning framework for studying the interplay between environmental input parameters and the corresponding output, using that output as feedback for decision making to design a new parameter set that yields better results, validated in terms of the achieved reward. In this paper, the deep Q network (DQN) and Double deep Q network (DDQN) are applied to the cart-pole balancing problem, and the reward is measured using a novel loss function, the Huber loss. A comparison of DQN trained with MSE against DQN trained with the Huber loss shows that the Huber loss converges faster. DQN and Double DQN are then evaluated under the Huber loss itself: the results show that DDQN both reduces the Huber loss and converges considerably faster than DQN. [ABSTRACT FROM AUTHOR]
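The two ingredients named in the abstract, the Huber loss and the Double DQN update, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the function names, the default `delta=1.0` threshold, and the toy Q-values below are illustrative assumptions only.

```python
import numpy as np

def huber_loss(td_error, delta=1.0):
    """Huber loss: quadratic for small errors, linear for large ones.
    Compared with MSE, it damps the influence of outlier TD errors,
    which is one common explanation for its faster, more stable convergence."""
    abs_err = np.abs(td_error)
    quadratic = 0.5 * td_error ** 2
    linear = delta * (abs_err - 0.5 * delta)
    return np.where(abs_err <= delta, quadratic, linear)

def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network *selects* the next action,
    while the target network *evaluates* it. This decoupling reduces the
    overestimation bias of the plain DQN max-operator target."""
    if done:
        return reward
    best_action = int(np.argmax(next_q_online))        # selection: online net
    return reward + gamma * next_q_target[best_action]  # evaluation: target net

# Toy usage with made-up Q-values for a two-action problem (e.g. cart-pole):
print(float(huber_loss(0.5)))   # small error -> quadratic region: 0.125
print(float(huber_loss(3.0)))   # large error -> linear region: 2.5
y = ddqn_target(1.0, np.array([0.2, 0.8]), np.array([0.5, 0.4]), gamma=0.9)
print(y)                        # 1.0 + 0.9 * 0.4 = 1.36
```

In plain DQN the target would instead be `reward + gamma * max(next_q_target)`, i.e. the same (target) network both selects and evaluates the action, which tends to overestimate action values.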
Database: Supplemental Index