Model-Based Actor-Critic with Chance Constraint for Stochastic System
Authors: Peng, Baiyu; Mu, Yao; Guan, Yang; Li, Shengbo Eben; Yin, Yuming; Chen, Jianyu
Publication year: 2021
Subjects: FOS: Computer and information sciences; Computer Science - Machine Learning (cs.LG); Computer Science - Artificial Intelligence (cs.AI); FOS: Electrical engineering, electronic engineering, information engineering; Electrical Engineering and Systems Science - Systems and Control (eess.SY)
Source: 2021 60th IEEE Conference on Decision and Control (CDC)
DOI: 10.1109/cdc45484.2021.9683748
Description: Safety is essential for reinforcement learning (RL) applied in real-world situations. Chance constraints are suitable for representing safety requirements in stochastic systems. Previous chance-constrained RL methods usually converge slowly or learn only a conservative policy. In this paper, we propose a model-based chance-constrained actor-critic (CCAC) algorithm that efficiently learns a safe and non-conservative policy. Unlike existing methods that optimize a conservative lower bound, CCAC directly solves the original chance-constrained problem, where the objective function and the safe probability are optimized simultaneously with adaptive weights. To improve the convergence rate, CCAC utilizes the gradient of the dynamic model to accelerate policy optimization. The effectiveness of CCAC is demonstrated on a stochastic car-following task. Experiments indicate that, compared with previous RL methods, CCAC improves performance while guaranteeing safety, with a convergence rate five times faster. It also achieves 100 times higher online computation efficiency than traditional safety techniques such as stochastic model predictive control.
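The adaptive-weight idea in the abstract — optimizing the objective and the safe probability simultaneously, with a weight that grows while the chance constraint is violated — can be illustrated on a toy scalar problem. This is a minimal sketch of a primal-dual update under assumed analytic gradients, not the paper's actual CCAC algorithm: `ccac_toy`, the objective `J(theta) = -(theta - 4)^2`, the constraint `P(theta + w <= 3) >= delta` with Gaussian noise `w`, and all step sizes are illustrative choices.

```python
import math

def pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ccac_toy(delta=0.9, steps=20000, lr_theta=0.01, lr_lam=0.5):
    """Toy chance-constrained optimization (illustrative, not the paper's CCAC):
    maximize J(theta) = -(theta - 4)^2
    subject to the chance constraint P(theta + w <= 3) >= delta, w ~ N(0, 1).
    The safe probability has the closed form p_safe(theta) = cdf(3 - theta)."""
    theta, lam = 0.0, 0.0
    for _ in range(steps):
        p_safe = cdf(3.0 - theta)           # P(theta + w <= 3)
        grad_J = -2.0 * (theta - 4.0)       # d/dtheta of -(theta - 4)^2
        grad_p = -pdf(3.0 - theta)          # d/dtheta of cdf(3 - theta)
        # ascend the weighted sum of objective and safe probability
        theta += lr_theta * (grad_J + lam * grad_p)
        # adaptive weight: grows while the constraint is violated, shrinks otherwise
        lam = max(0.0, lam + lr_lam * (delta - p_safe))
    return theta, lam
```

Unconstrained, the objective would push `theta` to 4; the adaptive weight settles at a value that holds the safe probability at the target `delta = 0.9`, i.e. `theta` near `3 - 1.28 ≈ 1.72` — a non-conservative solution that sits exactly on the chance-constraint boundary rather than at a pessimistic lower bound.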
Database: OpenAIRE
External link: