A deep reinforcement learning approach for path following on a quadrotor
Authors: | Bartomeu Rubí, Ramon Pérez, Bernardo Morcego |
---|---|
Contributors: | Universitat Politècnica de Catalunya. Doctorat en Automàtica, Robòtica i Visió; Universitat Politècnica de Catalunya. Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial |
Language: | English |
Publication year: | 2020 |
Subject: |
0209 industrial biotechnology; Learning (artificial intelligence); Informàtica::Intel·ligència artificial::Aprenentatge automàtic [Àrees temàtiques de la UPC]; Computer science; Path following; Avions no tripulats; 02 engineering and technology; Attitude control; Unmanned aerial vehicles; law.invention; 020901 industrial engineering & automation; Control theory; law; Machine learning; Aprenentatge automàtic; 0202 electrical engineering, electronic engineering, information engineering; Reinforcement learning; Training; Heuristic algorithms; Prediction algorithms; computer.programming_language; Drone aircraft; Structure (mathematical logic); 020208 electrical & electronic engineering; Python (programming language); Equivalent control; Autopilot; Multirotor; computer |
Source: | UPCommons. Portal del coneixement obert de la UPC; Universitat Politècnica de Catalunya (UPC); Scopus-Elsevier; ECC |
ISSN: | 2017-8840 |
Description: | © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. This paper proposes the Deep Deterministic Policy Gradient (DDPG) reinforcement learning algorithm to solve the path following problem in a quadrotor vehicle. The agent is implemented using a separated control and guidance structure, with an autopilot tracking the attitude and velocity commands. The DDPG agent is implemented in Python and is trained and tested in the RotorS-Gazebo environment, a realistic multirotor simulator integrated in ROS. Performance is compared with Adaptive NLGL, a geometric algorithm that implements an equivalent control structure. Results show that the DDPG agent is able to outperform the Adaptive NLGL approach while reducing its complexity. This work has been partially funded by the Spanish State Research Agency (AEI) and the European Regional Development Fund (ERDF) through the SCAV project (ref. MINECO DPI2017-88403-R), and by the SMART project (ref. EFA 153/16 Interreg Cooperation Program POCTEFA 2014-2020). Bartomeu Rubí is also supported by the Secretaria d’Universitats i Recerca de la Generalitat de Catalunya, the European Social Fund (ESF) and AGAUR under a FI grant (ref. 2017FI B 00212). |
Database: | OpenAIRE |
External link: |
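
Note on the algorithm named in the description: the record gives no detail on the authors' networks, rewards, or hyperparameters, so the following is only a minimal sketch of two update rules from the generic DDPG algorithm, the critic's bootstrapped target and the soft (Polyak) update of the target networks. The function names `td_target` and `soft_update` and the default values of `gamma` and `tau` are illustrative assumptions, not taken from the paper.

```python
from typing import List


def td_target(reward: float, next_q: float, done: bool, gamma: float = 0.99) -> float:
    """Critic regression target in generic DDPG.

    y = r + gamma * (1 - done) * Q_target(s', mu_target(s'))
    `next_q` stands for the target critic's value at the next state and
    the target actor's action (assumed to be computed elsewhere).
    """
    bootstrap = 0.0 if done else 1.0  # no bootstrapping past a terminal state
    return reward + gamma * bootstrap * next_q


def soft_update(target: List[float], online: List[float], tau: float = 0.005) -> List[float]:
    """Polyak averaging of target-network weights.

    theta_target <- tau * theta_online + (1 - tau) * theta_target
    Weights are flat lists here purely for illustration.
    """
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]
```

In a full agent these two steps would sit inside a training loop that samples transitions from a replay buffer; that loop, the networks, and the exploration noise are omitted here.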