Title: Intelligent control of a quadrotor with proximal policy optimization reinforcement learning

Authors: Lopes, Guilherme Cano; Ferreira, Murillo [UNESP]; Da Silva Simoes, Alexandre [UNESP]; Colombini, Esther Luna [UNESP]

Dates: 2019-10-06; 2019-10-06; 2018-12-24

Source: Proceedings - 15th Latin American Robotics Symposium, 6th Brazilian Robotics Symposium and 9th Workshop on Robotics in Education, LARS/SBR/WRE 2018, p. 509-514.

URI: http://hdl.handle.net/11449/187338

Abstract: Aerial platforms, such as quadrotors, are inherently unstable systems. Generally, the task of stabilizing the flight of a quadrotor is approached with techniques based on classic and modern control algorithms. However, recent model-free reinforcement learning algorithms have been successfully used for controlling drones. In this work, we show the feasibility of applying reinforcement learning methods to optimize a stochastic control policy (during training) in order to perform the position control of the 'model-free' quadrotor. This is achieved while maintaining good sampling efficiency, allowing fast convergence even when using computationally expensive off-the-shelf simulators for robotics, and without the need for any additional exploration strategy. We used the Proximal Policy Optimization (PPO) algorithm to make the agent learn a reliable control policy. The experiments for the resulting intelligent controller were performed using the V-REP simulator and the Vortex physics engine.

Language: eng

Keywords: Control; Proximal Policy Optimization; Quadrotor; Reinforcement Learning

Type: Conference paper

DOI: 10.1109/LARS/SBR/WRE.2018.00094

Access: Restricted access

Scopus: 2-s2.0-85061334198