Author:
Chakravorty, Jhelum; Ward, Nadeem; Roy, Julien; Chevalier-Boisvert, Maxime; Basu, Sumana; Lupu, Andrei; Precup, Doina
Year of publication:
2019
Subject:

Document type:
Working Paper
Description:
In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems, using the options framework (Sutton et al., 1999). First, we address the planning problem for the decentralized POMDP represented by the multi-agent system by introducing a common information approach: we use the notion of common beliefs and broadcasting to solve an equivalent centralized POMDP problem. Then, we propose the Distributed Option Critic (DOC) algorithm, which uses centralized option evaluation and decentralized intra-option improvement. We theoretically analyze the asymptotic convergence of DOC and build a new multi-agent environment to demonstrate its validity. Our experiments empirically show that DOC performs competitively against baselines and scales with the number of agents.
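The centralized-evaluation / decentralized-improvement split mentioned in the abstract can be pictured with a small tabular sketch. The Python code below is illustrative only, not the authors' implementation: the state encodings, option set, and hyperparameters are assumptions chosen for the example, and tabular Q-learning stands in for the actual DOC update rules.

# Illustrative sketch only: centralized option evaluation over a common
# (broadcast) state, decentralized intra-option improvement per agent.
# All sizes and learning constants are placeholder assumptions.
import random
from collections import defaultdict

N_AGENTS, N_OPTIONS, N_ACTIONS = 2, 3, 4
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

# Centralized critic: value of an option given the common information.
q_option = defaultdict(float)                              # (common_state, option) -> value
# Decentralized actors: each agent keeps its own intra-option action values.
q_intra = [defaultdict(float) for _ in range(N_AGENTS)]    # (local_obs, option, action) -> value

def pick_option(common_state):
    """Epsilon-greedy option choice from the centralized critic."""
    if random.random() < EPS:
        return random.randrange(N_OPTIONS)
    return max(range(N_OPTIONS), key=lambda o: q_option[(common_state, o)])

def pick_action(agent, local_obs, option):
    """Epsilon-greedy intra-option action choice from the agent's own table."""
    if random.random() < EPS:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: q_intra[agent][(local_obs, option, a)])

def update(common_state, local_obs, option, actions, reward,
           next_common_state, next_local_obs):
    """One learning step: the critic is updated from the common information,
    while each agent improves its intra-option values from its local view."""
    # Centralized option evaluation (TD update on the option value).
    best_next = max(q_option[(next_common_state, o)] for o in range(N_OPTIONS))
    key = (common_state, option)
    q_option[key] += ALPHA * (reward + GAMMA * best_next - q_option[key])

    # Decentralized intra-option improvement (each agent touches only its table).
    for i in range(N_AGENTS):
        k = (local_obs[i], option, actions[i])
        best_a = max(q_intra[i][(next_local_obs[i], option, a)] for a in range(N_ACTIONS))
        q_intra[i][k] += ALPHA * (reward + GAMMA * best_a - q_intra[i][k])

In this sketch the "common state" stands in for the common belief formed from broadcast observations; the paper's algorithm uses option-critic style updates rather than the tabular Q-learning shown here.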
Database:
arXiv
External link: