Planificación de múltiples trayectorias de un solo robot móvil en un entorno de obstáculos dinámicos basado en el aprendizaje de refuerzo = Using reinforcement learning for multiple way-points path planning of single mobile robot in the dynamic obstacle environment

Author: Liu, Yuzhou
Contributors: Bresser, Andreas, Piho, Laura, Campoy Cervera, Pascual
Publication year: 2022
Subject:
Source: Archivo Digital UPM
Universidad Politécnica de Madrid
Description: Mobile robots are being applied to increasingly complex scenarios. Whether the scenario is closed, such as transporting goods or medical supplies in large supermarkets, factories and hospitals, or open, such as delivery by unmanned vehicles on urban roads, it places higher demands on the multi-destination (waypoint) path planning performance of the mobile robot. The purpose of this thesis is to design and demonstrate a multi-waypoint path planning algorithm for the mobile robot that combines reinforcement learning with a heuristic search algorithm, so that the mobile robot can carry out continuous multi-waypoint path planning in a dynamic environment with multiple moving obstacles. The algorithm first selects the next waypoint with the Q-learning algorithm, and then plans a path between the robot's current position and that waypoint using the Anytime Repairing A* (ARA*) algorithm as the global planner. At the same time, the Two-Step Vector Field Histogram with Look-Ahead Verification with Recovery method (2S-VFH*-R) algorithm is used as the local planner to plan an avoidance or escape path when the mobile robot encounters a moving obstacle. The thesis builds a simple Q-learning problem model to solve the Traveling Salesman Problem (TSP). The performance of the Q-learning algorithm is evaluated, demonstrating the feasibility of combining Q-learning with a heuristic search algorithm to solve the multi-waypoint path planning problem in a static simulation environment. Finally, the thesis analyses the performance of the multi-waypoint path planning algorithm and concludes that, in an unknown dynamic environment and with minimum overall time as the optimization objective, the algorithm performs better than the traditional greedy method (a locally optimal solution). This thesis is written in English and is 48 pages long, including 7 chapters, 21 figures and 3 tables.
Database: OpenAIRE
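
Illustrative note: the description mentions a simple Q-learning model of the Traveling Salesman Problem used to choose the order of waypoints. The record does not give the thesis's state, action or reward definitions, so the following is only a minimal sketch under assumptions: the state is taken to be the current waypoint plus the set of visited waypoints, the action is the next unvisited waypoint, and the reward is the negative straight-line distance. The coordinates, the hyperparameters and every function name are hypothetical. The sketch covers only the waypoint-ordering step; the ARA* global planner and the 2S-VFH*-R local planner are not represented.

```python
import itertools
import math
import random

# Hypothetical waypoint coordinates (not taken from the thesis).
WAYPOINTS = [(0.0, 0.0), (2.0, 1.0), (5.0, 0.0), (4.0, 4.0), (1.0, 3.0)]
START = 0  # index of the waypoint where the robot starts and ends


def dist(a, b):
    """Straight-line distance between waypoints with indices a and b."""
    return math.dist(WAYPOINTS[a], WAYPOINTS[b])


def tour_length(order):
    """Length of the closed tour that visits `order` and returns to its first point."""
    return sum(dist(order[i], order[(i + 1) % len(order)]) for i in range(len(order)))


def train_q_table(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning; assumed state = (current waypoint, visited set)."""
    rng = random.Random(seed)
    n = len(WAYPOINTS)
    q = {}  # (current, visited, action) -> estimated return; missing keys count as 0.0

    for _ in range(episodes):
        current, visited = START, frozenset([START])
        while len(visited) < n:
            actions = [a for a in range(n) if a not in visited]
            if rng.random() < epsilon:  # explore
                action = rng.choice(actions)
            else:  # exploit the current estimates
                action = max(actions, key=lambda a: q.get((current, visited, a), 0.0))

            next_visited = visited | {action}
            reward = -dist(current, action)
            if len(next_visited) == n:
                # Last waypoint: also pay for the trip back to the start.
                target = reward - dist(action, START)
            else:
                remaining = [a for a in range(n) if a not in next_visited]
                best_next = max(q.get((action, next_visited, a), 0.0) for a in remaining)
                target = reward + gamma * best_next

            key = (current, visited, action)
            old = q.get(key, 0.0)
            q[key] = old + alpha * (target - old)  # standard Q-learning update
            current, visited = action, next_visited
    return q


def greedy_route(q):
    """Follow the highest-valued action from the start until all waypoints are visited."""
    n = len(WAYPOINTS)
    current, visited, route = START, frozenset([START]), [START]
    while len(visited) < n:
        actions = [a for a in range(n) if a not in visited]
        nxt = max(actions, key=lambda a: q.get((current, visited, a), 0.0))
        route.append(nxt)
        current, visited = nxt, visited | {nxt}
    return route


def brute_force_optimum():
    """Shortest closed tour by exhaustive search, for comparison on tiny instances."""
    others = [i for i in range(len(WAYPOINTS)) if i != START]
    return min(tour_length((START,) + p) for p in itertools.permutations(others))


if __name__ == "__main__":
    route = greedy_route(train_q_table())
    print("learned route:", route, "length:", round(tour_length(route), 3))
    print("brute-force optimum:", round(brute_force_optimum(), 3))
```

In a pipeline like the one described, each consecutive pair in the learned route would be handed to the global planner, and the straight-line distances used here would presumably be replaced by planned path costs or travel times.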