Leader-follower MDP models with factored state space and many followers: follower abstraction, structured dynamics and state aggregation
Author: | Sabbadin, Régis; Viet, Anne France |
Contributors: | Unité de Mathématiques et Informatique Appliquées de Toulouse (MIAT INRA), Institut National de la Recherche Agronomique (INRA), Biologie, Epidémiologie et analyse de risque en Santé Animale (BIOEPAR) |
Language: | English |
Year of publication: | 2016 |
Subject: | |
Source: | Proceedings of the Twenty-second European Conference on Artificial Intelligence (ECAI 2016), The Hague, Netherlands, 29 Aug – 2 Sep 2016. Frontiers in Artificial Intelligence and Applications, vol. 285, pp. 116–124 |
DOI: | 10.3233/978-1-61499-672-9-116 |
Description: | International audience; The Leader-Follower Markov Decision Processes (LF-MDP) framework extends both Markov Decision Processes (MDP) and Stochastic Games. It provides a model in which an agent (the leader) can influence a set of other agents (the followers) that play a stochastic game, by modifying their immediate reward functions but not their dynamics. All agents are assumed to act selfishly, each optimizing its own long-term expected reward. Finding equilibrium strategies in an LF-MDP is hard, especially when the joint state space of the followers is factored: in that case, solving takes time exponential in the number of followers. Our theoretical contribution is threefold. First, we analyze a natural assumption (substitutability of followers), which holds in many applications. Under this assumption, we show that an LF-MDP can be solved exactly in polynomial time, provided deterministic equilibria exist for all games encountered in the LF-MDP. Second, we show that an additional assumption of sparsity of the problem dynamics decreases the exponent of the polynomial. Finally, we present a state-aggregation approximation, which further decreases the exponent and allows us to solve large problems approximately. We empirically validate the LF-MDP approach on a class of realistic animal disease control problems. For problems of this class, we find deterministic equilibria for all games. Using our first two results, we are able to solve the exact LF-MDP problem with 15 followers (compared to 6 or 7 in the original model). Using state aggregation, problems with up to 50 followers can be solved approximately. The approximation quality is evaluated by comparison with the exact approach on problems with 12 and 15 followers. |
Database: | OpenAIRE |
External link: |
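The abstract does not spell out the construction, but a standard way to exploit substitutability of followers is count-based aggregation: since interchangeable followers make joint states that are permutations of each other equivalent, the joint state space of size |S|^n collapses to the set of count vectors, of size C(n + |S| - 1, |S| - 1), which is polynomial in n for fixed |S|. The sketch below illustrates this collapse; the healthy/infected labels and all function names are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from itertools import product
from math import comb

def aggregate(joint_state):
    """Map a joint follower state to its count vector.

    Under substitutability, followers are interchangeable: only how
    many followers occupy each local state matters, not which ones.
    """
    return tuple(sorted(Counter(joint_state).items()))

def aggregated_space(local_states, n_followers):
    """Enumerate the distinct aggregated (count-vector) states."""
    return {aggregate(js) for js in product(local_states, repeat=n_followers)}

# Illustrative disease-control instance (hypothetical labels): each
# follower, e.g. a herd, is either healthy or infected. With 15
# followers the joint space has 2**15 = 32768 states, but only
# C(15 + 2 - 1, 2 - 1) = 16 distinct count vectors remain.
local = ("healthy", "infected")
n = 15
agg = aggregated_space(local, n)
assert len(agg) == comb(n + len(local) - 1, len(local) - 1)
```

With two local states the aggregated space grows linearly in the number of followers, which is consistent with the abstract's jump from exact solving at 15 followers to approximate solving at 50.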