Author: |
Niiyama, Tomoaki; Furuhata, Genki; Uchida, Atsushi; Naruse, Makoto; Sunada, Satoshi |
Source: |
Journal of the Physical Society of Japan; 1/15/2020, Vol. 89 Issue 1, p1-6, 6p |
Abstract: |
Decision making is a fundamental capability of living organisms, and has recently been gaining increasing importance in many engineering applications. Here, we consider a simple decision-making principle to identify an optimal choice in multi-armed bandit (MAB) problems, which is fundamental in the context of reinforcement learning. We demonstrate that the identification mechanism of the method is well described by using a competitive ecosystem model, i.e., the competitive Lotka–Volterra (LV) model. Based on the "winner-take-all" mechanism in the competitive LV model, we demonstrate that non-best choices are eliminated and only the best choice survives; the failure of the non-best choices exponentially decreases while repeating the choice trials. Furthermore, we apply a mean-field approximation to the proposed decision-making method and show that the method has an excellent scalability of O(log N) with respect to the number of choices N. These results allow for a new perspective on optimal search capabilities in competitive systems. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |
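Illustrative note on the abstract: the "winner-take-all" behavior of a competitive Lotka–Volterra system can be seen in a small simulation. The sketch below is not the authors' decision-making method. It assumes a generic textbook form, dx_i/dt = x_i (r_i - sum_j x_j), with all competition coefficients set to 1, and it treats the growth rate r_i as a hypothetical stand-in for the quality of choice i. The function name, parameter values, and Euler integration scheme are likewise assumptions made for illustration.

```python
def simulate_competitive_lv(growth_rates, x0=0.1, dt=0.01, steps=20000):
    """Euler-integrate dx_i/dt = x_i * (r_i - sum_j x_j).

    Assumption: every species competes equally with every other
    (all competition coefficients equal 1). This is a generic
    competitive LV model, not the model fitted in the paper.
    """
    x = [x0] * len(growth_rates)
    for _ in range(steps):
        total = sum(x)  # shared competition term felt by every species
        x = [xi + dt * xi * (r - total) for xi, r in zip(x, growth_rates)]
    return x


# Hypothetical "choice qualities"; index 2 has the largest growth rate.
rates = [0.5, 0.7, 0.9, 0.6]
final = simulate_competitive_lv(rates)
winner = max(range(len(final)), key=final.__getitem__)
print("surviving species index:", winner)
print("final populations:", final)
```

Under these assumptions the ratio x_i / x_k evolves as exp((r_i - r_k) t), so every population with a smaller growth rate decays exponentially relative to the best one, which is the qualitative behavior the abstract describes for non-best choices.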