End-to-End AUV Local Motion Planning Method Based on Deep Reinforcement Learning

Author:	Xi Lyu, Yushan Sun, Lifeng Wang, Jiehui Tan, Liwen Zhang
Language:	English
Year of publication:	2023
Subject:	autonomous underwater vehicle (AUV); deep deterministic policy gradient (DDPG); deep reinforcement learning (DRL); local motion planning; Naval architecture. Shipbuilding. Marine engineering; VM1-989; Oceanography; GC1-1581
Source:	Journal of Marine Science and Engineering, Vol 11, Iss 9, p 1796 (2023)
Document type:	article
ISSN:	2077-1312
DOI:	10.3390/jmse11091796
Description:	This study aims to solve the problems of sparse reward, single policy, and poor environmental adaptability in the local motion planning task of autonomous underwater vehicles (AUVs). We propose an end-to-end perception–planning–execution method based on a two-layer deep deterministic policy gradient (DDPG) algorithm to overcome the training and learning difficulties of end-to-end approaches that directly output control forces. In this approach, the state set is established based on the environment information, the action set is established based on the motion characteristics of the AUV, and the control execution force set is established based on the control constraints. The mappings between these sets are trained using deep reinforcement learning, enabling the AUV to perform the corresponding action in the current state and thereby accomplish tasks in an end-to-end manner. Furthermore, we introduce the hindsight experience replay (HER) method in the perception-to-planning mapping process to enhance stability and sample efficiency during training. Finally, we conduct simulation experiments covering planning, execution, and end-to-end performance evaluation. Simulation training demonstrates that the proposed method exhibits improved decision-making capabilities and real-time obstacle avoidance during planning. Compared to global planning, the end-to-end algorithm comprehensively considers constraints in the AUV planning process, resulting in more realistic AUV actions that are gentler and more stable, with controlled tracking errors.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/71da53bb21264dc0afa815ed9701a7c9
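
Note on hindsight experience replay (HER): the description says HER is used to improve stability and sample efficiency under sparse rewards. The record does not give the paper's state, goal, or reward definitions, so the following is only a minimal sketch of the general idea, using the common "final" relabeling strategy. The `Transition` fields, the 2-D positions, the tolerance `tol`, and the 0/-1 reward convention are all illustrative assumptions, not the authors' formulation.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec = Tuple[float, float]  # hypothetical 2-D position; the paper's state is richer


@dataclass
class Transition:
    state: Vec        # position before the action
    action: Vec       # hypothetical planar displacement command
    next_state: Vec   # position after the action
    goal: Vec         # goal the policy was conditioned on
    reward: float


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.5) -> float:
    """Assumed sparse reward: 0 within `tol` of the goal, otherwise -1."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0


def her_relabel_final(episode: List[Transition], tol: float = 0.5) -> List[Transition]:
    """Copy the episode, replacing the goal with the final achieved position."""
    new_goal = episode[-1].next_state
    return [
        Transition(t.state, t.action, t.next_state, new_goal,
                   sparse_reward(t.next_state, new_goal, tol))
        for t in episode
    ]


# A failed three-step episode: the goal (10, 10) is never reached.
goal = (10.0, 10.0)
path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
episode = [
    Transition(s, (1.0, 0.0), s2, goal, sparse_reward(s2, goal))
    for s, s2 in zip(path, path[1:])
]

relabeled = her_relabel_final(episode)
replay_buffer = episode + relabeled  # both versions are stored for off-policy training
```

In this toy episode every original transition carries the failure reward, while the relabeled copy ends with a success signal at the position the vehicle actually reached. That extra signal is what an off-policy learner such as DDPG can exploit when true successes are rare.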