A Deep Reinforcement Learning Approach for the Patrolling Problem of Water Resources Through Autonomous Surface Vehicles: The Ypacarai Lake Case

Authors:	Daniel Gutiérrez Reina, Samuel Yanes Luis, Sergio Luis Toral Marín
Contributors:	Universidad de Sevilla. Departamento de Ingeniería Electrónica
Language:	English
Year of publication:	2020
Subject:	0209 industrial biotechnology; Mathematical optimization; autonomous surface vehicle; General Computer Science; Computer science; Context (language use); 02 engineering and technology; 020901 industrial engineering & automation; Genetic algorithm; 0202 electrical engineering, electronic engineering, information engineering; Redundancy (engineering); Reinforcement learning; General Materials Science; Motion planning; Electrical and Electronic Engineering; path planning; Hyperparameter; Deep reinforcement learning; Artificial neural network; patrolling; General Engineering; complete coverage; monitoring; 020201 artificial intelligence & image processing; Markov decision process; lcsh:Electrical engineering. Electronics. Nuclear engineering; Heuristics; lcsh:TK1-9971
Source:	IEEE Access, Vol 8, Pp 204076-204093 (2020); idUS. Depósito de Investigación de la Universidad de Sevilla; Universidad de Sevilla (US)
ISSN:	2169-3536
Description:	Autonomous Surface Vehicles (ASVs) are highly useful for the continuous monitoring and exploration of water resources due to their autonomy, mobility, and relatively low cost. In the path planning context, the patrolling problem is usually addressed with heuristic approaches, such as Genetic Algorithms (GA) or Reinforcement Learning (RL), because of the complexity and high dimensionality of the problem. In this paper, the patrolling problem of Ypacarai Lake (Asunción, Paraguay) has been formulated as a Markov Decision Process (MDP) for two possible cases: the homogeneous and the non-homogeneous scenarios. A tailored reward function has been designed for the non-homogeneous case. Two Deep Reinforcement Learning algorithms, Deep Q-Learning (DQL) and Double Deep Q-Learning (DDQL), have been evaluated to solve the patrolling problem. Furthermore, due to the high number of parameters and hyperparameters involved in the algorithms, a thorough search has been conducted to find the best values for training the neural networks and for the proposed reward function. According to the results, a suitable configuration of the parameters yields better coverage, reaching more than 93% of the lake surface on average. In addition, the proposed approach achieves higher sample redundancy in important zones than other commonly used algorithms for non-homogeneous coverage path planning, such as Policy Gradient, the lawnmower algorithm, or random exploration, achieving a 64% improvement in the mean time between visits. Funding: Ministerio de Ciencia, Innovación y Universidades RTI2018-098964-B-I00; Junta de Andalucía US-1257508; Junta de Andalucía PY18-RE0009
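Note: the description names Deep Q-Learning (DQL) and Double Deep Q-Learning (DDQL) without saying how they differ. The sketch below illustrates the standard textbook difference in how each computes its bootstrap target. It is not taken from the paper: the function names, the discount factor `gamma`, and the sample Q-values are hypothetical, and plain NumPy arrays stand in for the outputs of the online and target networks.

```python
import numpy as np


def dql_target(reward, q_target_next, gamma=0.95, done=False):
    """DQL target: the target network both selects and evaluates the next action."""
    if done:
        return float(reward)
    return float(reward + gamma * np.max(q_target_next))


def ddql_target(reward, q_online_next, q_target_next, gamma=0.95, done=False):
    """DDQL target: the online network selects the action, the target network evaluates it."""
    if done:
        return float(reward)
    best_action = int(np.argmax(q_online_next))
    return float(reward + gamma * q_target_next[best_action])


# Hypothetical Q-values for three actions in the next state.
q_online_next = np.array([1.0, 3.0, 2.0])
q_target_next = np.array([4.0, 0.5, 1.0])

print(dql_target(1.0, q_target_next, gamma=0.5))
print(ddql_target(1.0, q_online_next, q_target_next, gamma=0.5))
```

Decoupling action selection from action evaluation is the usual motivation for DDQL, since taking the maximum over noisy estimates tends to overestimate action values.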
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c688c89582bdba6d13204f24948d3060 https://ieeexplore.ieee.org/document/9252944/