Intelligent control of a quadrotor with proximal policy optimization reinforcement learning

Lopes, Guilherme Cano; Ferreira, Murillo [UNESP]; Da Silva Simoes, Alexandre [UNESP]; Colombini, Esther Luna [UNESP]

Date

2018-12-24

Authors

Lopes, Guilherme Cano

Ferreira, Murillo [UNESP]

Da Silva Simoes, Alexandre [UNESP]

Colombini, Esther Luna [UNESP]

Abstract

Aerial platforms such as quadrotors are inherently unstable systems. Stabilizing the flight of a quadrotor is generally approached with techniques based on classic and modern control algorithms. However, model-free reinforcement learning algorithms have recently been used successfully to control drones. In this work, we show the feasibility of applying reinforcement learning methods to optimize a stochastic control policy (during training) in order to perform position control of a quadrotor without relying on a model of its dynamics. This is achieved while maintaining good sample efficiency, allowing fast convergence even when using computationally expensive off-the-shelf robotics simulators and without requiring any additional exploration strategy. We used the Proximal Policy Optimization (PPO) algorithm to train the agent to learn a reliable control policy. The experiments with the resulting intelligent controller were performed using the V-REP simulator and the Vortex physics engine.
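
For readers unfamiliar with PPO, the following is a minimal sketch of the clipped surrogate objective that the algorithm maximizes when updating a stochastic policy. It is not the authors' implementation: the function name `ppo_clip_objective`, the list-based inputs, and the clipping range `clip_eps=0.2` are assumptions chosen for illustration, since the abstract does not state the hyperparameters or network details used in the paper.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the sampled actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range (0.2 is an assumed value, not from the paper).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum gives a pessimistic bound on the improvement,
        # which discourages large policy changes in a single update.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical numbers, only to show the call shape.
    print(ppo_clip_objective([-0.9, -1.2], [-1.0, -1.0], [0.5, -0.3]))
```

In practice the log-probabilities would come from a policy network evaluated on states sampled in the simulator, and the negated objective would be minimized with a gradient-based optimizer.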

Keywords

Control, Proximal Policy Optimization, Quadrotor, Reinforcement Learning

How to cite

Proceedings - 15th Latin American Robotics Symposium, 6th Brazilian Robotics Symposium and 9th Workshop on Robotics in Education, LARS/SBR/WRE 2018, p. 509-514.