Publication: Experience generalization for multi-agent reinforcement learning
Date
2001-01-01
Publisher
Institute of Electrical and Electronics Engineers (IEEE), Computer Soc
Type
Conference paper
Access rights
Open access

Abstract
On-line learning methods have been applied successfully in multi-agent systems to achieve coordination among agents. Learning in multi-agent systems implies a non-stationary scenario as perceived by the agents, since the behavior of other agents may change as they simultaneously learn how to improve their actions. Non-stationary scenarios can be modeled as Markov Games, which can be solved using the Minimax-Q algorithm, a combination of Q-learning (a Reinforcement Learning (RL) algorithm that directly learns an optimal control policy) and the Minimax algorithm. However, finding optimal control policies using any RL algorithm (Q-learning and Minimax-Q included) can be very time-consuming. To reduce the learning time of Q-learning, we considered the QS-algorithm, in which a single experience can update more than a single action value by using a spreading function. In this paper, we contribute a Minimax-QS algorithm, which combines the Minimax-Q algorithm and the QS-algorithm. We conduct a series of empirical evaluations of the algorithm in a simplified simulator of the soccer domain. We show that even with a very simple domain-dependent spreading function, the performance of the learning algorithm can be improved.
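The spreading idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not give the update rule, so the form below (a temporal-difference step scaled by a spreading weight) is an assumption. The names `maximin_value`, `minimax_qs_update`, and `neighbor_spread`, the integer state encoding, and the parameter `tau` are all hypothetical. As a further simplification, the state value is a pure-strategy maximin, whereas Minimax-Q normally solves a linear program over mixed strategies.

```python
from collections import defaultdict


def maximin_value(q, state, actions, opp_actions):
    """Best worst-case action value in `state` (pure strategies only).

    Simplification: Minimax-Q proper optimizes over mixed strategies
    with a linear program; this sketch restricts itself to pure ones.
    """
    return max(min(q[(state, a, o)] for o in opp_actions) for a in actions)


def neighbor_spread(s, x, tau=0.5):
    """Hypothetical spreading function over integer states.

    Returns 1.0 for the visited state, `tau` for adjacent states,
    and 0.0 otherwise.
    """
    distance = abs(s - x)
    if distance == 0:
        return 1.0
    if distance == 1:
        return tau
    return 0.0


def minimax_qs_update(q, s, a, o, reward, s_next, states, actions,
                      opp_actions, spread, alpha=0.5, gamma=0.9):
    """Apply one experience (s, a, o, reward, s_next) to several states.

    Assumed rule: every state x with spread(s, x) > 0 moves its value
    for the same action pair (a, o) toward the same target, scaled by
    the spreading weight.
    """
    # The target is computed once, before any entry is modified.
    target = reward + gamma * maximin_value(q, s_next, actions, opp_actions)
    for x in states:
        weight = spread(s, x)
        if weight > 0.0:
            q[(x, a, o)] += alpha * weight * (target - q[(x, a, o)])


# Toy usage: five states on a line, two actions for each player.
states = range(5)
actions = ["left", "right"]
opp_actions = ["left", "right"]
q = defaultdict(float)  # all action values start at 0.0

minimax_qs_update(q, s=2, a="right", o="left", reward=1.0, s_next=3,
                  states=states, actions=actions, opp_actions=opp_actions,
                  spread=neighbor_spread)
```

With a spreading function that returns 1.0 only for the visited state, the update touches a single entry, as in ordinary (pure-strategy) Minimax-Q. Any wider spreading function encodes domain knowledge about which states are similar.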
Language
English
How to cite
SCCC 2001: XXI International Conference of the Chilean Computer Science Society, Proceedings. Los Alamitos: IEEE Computer Soc, p. 233-239, 2001.