A Safety Modulator Actor-Critic Method in Model-Free Safe Reinforcement Learning and Application in UAV Hovering

Authors:	Qi, Qihan; Yang, Xinsong; Xia, Gang; Ho, Daniel W. C.; Tang, Pengyang
Publication year:	2024
Subject:	Computer Science - Artificial Intelligence; Computer Science - Machine Learning; Computer Science - Robotics
Document type:	Working Paper
Description:	This paper proposes a safety modulator actor-critic (SMAC) method to address safety constraints and overestimation mitigation in model-free safe reinforcement learning (RL). A safety modulator is developed to satisfy safety constraints by modulating actions, allowing the policy to ignore the safety constraints and focus on maximizing reward. Additionally, a distributional critic with a theoretical update rule for SMAC is proposed to mitigate the overestimation of Q-values with safety constraints. Both simulation and real-world experiments on Unmanned Aerial Vehicle (UAV) hovering confirm that SMAC can effectively maintain safety constraints and outperform mainstream baseline algorithms.
Database:	arXiv
External link:	http://arxiv.org/abs/2410.06847