Adaptive Reward for CAV Action Planning using Monte Carlo Tree Search

Authors:	Rym Zalila-Wenkstern, Dhruvkumar Patel
Publication year:	2021
Subject:	Computer science; Reliability (computer networking); Monte Carlo method; Monte Carlo tree search; Traffic simulation; Precondition; Task (project management); Reinforcement learning; Artificial intelligence; Function (engineering)
Source:	ITSC
DOI:	10.1109/itsc48978.2021.9564688
Description:	Cooperative action planning for Connected and Autonomous Vehicles (CAVs) in an emergency scenario is an important task in the autonomous driving domain. Reinforcement learning algorithms such as Monte Carlo Tree Search (MCTS) have been widely used to solve this problem with some success. MCTS relies on performing many simulations of CAV actions to learn the expected reward value of each action. A refined reward function design is a necessary precondition for better success rates in MCTS. Traditionally, most MCTS-based algorithms use predefined reward functions with fixed reward parameters in all CAV scenarios. This paper presents a novel MCTS-based algorithm that dynamically modifies the reward function parameters to encourage or discourage particular CAV actions. The proposed algorithm, with its dynamic reward function, is significantly more reliable than MCTS with a fixed reward function. We evaluate the proposed algorithm in a large-scale multi-agent-based traffic simulation system. Experimental results show that our algorithm significantly improves upon current state-of-the-art centralized and decentralized algorithms.
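Illustration:	The record does not give the algorithm itself, so the following is only a minimal sketch of the general idea of adjusting reward parameters during a simulation-based search. It is not the authors' method. It uses a flat, one-level UCB1 search rather than a full search tree, and every name and number in it (the actions, `RewardParams`, `simulate`, the 10% and 2% collision-rate thresholds, the 1.5 scaling factor) is a hypothetical assumption made for the example.

```python
import math
import random

# Hypothetical CAV actions with toy (progress, collision probability) values.
# These numbers are illustrative only and do not come from the paper.
ACTIONS = {
    "keep_lane": (1.0, 0.60),
    "change_lane": (0.8, 0.20),
    "brake": (0.3, 0.05),
}


class RewardParams:
    """Adjustable reward weights (hypothetical names)."""

    def __init__(self, progress=1.0, collision=-1.0):
        self.progress = progress
        self.collision = collision


def reward(outcome, params):
    """Weighted reward for one simulated outcome."""
    penalty = params.collision if outcome["collided"] else 0.0
    return params.progress * outcome["progress"] + penalty


def simulate(action, rng):
    """Toy stand-in for one traffic-simulator rollout."""
    progress, p_collision = ACTIONS[action]
    return {"progress": progress, "collided": rng.random() < p_collision}


def mean_reward(stats, params):
    """Average reward of an action's past rollouts under the current weights."""
    total = params.progress * stats["progress"] + params.collision * stats["collisions"]
    return total / stats["n"]


def ucb1(stats, params, total_visits, c=1.4):
    """UCB1 score: mean reward plus an exploration bonus."""
    if stats["n"] == 0:
        return math.inf  # try every action at least once
    bonus = c * math.sqrt(math.log(total_visits) / stats["n"])
    return mean_reward(stats, params) + bonus


def plan(n_iter=2000, window=100, seed=0):
    rng = random.Random(seed)
    params = RewardParams()
    stats = {a: {"n": 0, "progress": 0.0, "collisions": 0} for a in ACTIONS}
    recent_collisions = 0
    for i in range(1, n_iter + 1):
        action = max(ACTIONS, key=lambda a: ucb1(stats[a], params, i))
        outcome = simulate(action, rng)
        s = stats[action]
        s["n"] += 1
        s["progress"] += outcome["progress"]
        s["collisions"] += int(outcome["collided"])
        recent_collisions += int(outcome["collided"])
        if i % window == 0:
            # Assumed adaptation rule: strengthen the collision penalty when
            # recent rollouts collide often, relax it when they rarely do.
            rate = recent_collisions / window
            if rate > 0.10:
                params.collision = max(params.collision * 1.5, -20.0)
            elif rate < 0.02:
                params.collision = min(params.collision / 1.5, -1.0)
            recent_collisions = 0
    counts = {a: stats[a]["n"] for a in ACTIONS}
    best = max(counts, key=counts.get)  # most-visited action
    return best, params, counts
```

Calling `plan()` returns the most-visited action, the final reward weights, and the visit counts. Because the sketch stores raw outcome statistics rather than accumulated rewards, past rollouts are re-scored consistently whenever the weights change; that is a design choice of this example, not something stated in the record.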
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::e4d5c26d2b8ed464a68374a1d9abe908 https://doi.org/10.1109/itsc48978.2021.9564688