Showing 1 - 9 of 9 for search: '"Jafferjee, Taher"'
Author:
Mguni, David, Chen, Haojun, Jafferjee, Taher, Wang, Jianhong, Fei, Long, Feng, Xidong, McAleer, Stephen, Tong, Feifei, Wang, Jun, Yang, Yaodong
In multi-agent reinforcement learning (MARL), independent learning (IL) often shows remarkable performance and easily scales with the number of agents. Yet, using IL can be inefficient and runs the risk of failing to successfully train, particularly…
External link:
http://arxiv.org/abs/2302.05910
Author:
Jafferjee, Taher, Ziomek, Juliusz, Yang, Tianpei, Dai, Zipeng, Wang, Jianhong, Taylor, Matthew, Shao, Kun, Wang, Jun, Mguni, David
Centralised training with decentralised execution (CT-DE) serves as the foundation of many leading multi-agent reinforcement learning (MARL) algorithms. Despite its popularity, it suffers from a critical drawback due to its reliance on learning from…
External link:
http://arxiv.org/abs/2209.01054
We consider a context-dependent Reinforcement Learning (RL) setting, which is characterized by: a) an unknown finite number of not directly observable contexts; b) abrupt (discontinuous) context changes occurring during an episode; and c) Markovian…
External link:
http://arxiv.org/abs/2202.06557
Author:
Sootla, Aivar, Cowen-Rivers, Alexander I., Jafferjee, Taher, Wang, Ziyan, Mguni, David, Wang, Jun, Bou-Ammar, Haitham
Satisfying safety constraints almost surely (or with probability one) can be critical for the deployment of Reinforcement Learning (RL) in real-life applications. For example, plane landing and take-off should ideally occur with probability one. We…
External link:
http://arxiv.org/abs/2202.06558
Author:
Mguni, David Henry, Jafferjee, Taher, Wang, Jianhong, Slumbers, Oliver, Perez-Nieves, Nicolas, Tong, Feifei, Yang, Li, Zhu, Jiangcheng, Yang, Yaodong, Wang, Jun
Efficient exploration is important for reinforcement learners to achieve high rewards. In multi-agent systems, coordinated exploration and behaviour is critical for agents to jointly achieve optimal outcomes. In this paper, we introduce a new general…
External link:
http://arxiv.org/abs/2112.02618
Author:
Mguni, David, Jafferjee, Taher, Wang, Jianhong, Perez-Nieves, Nicolas, Yang, Tianpei, Taylor, Matthew, Song, Wenbin, Tong, Feifei, Chen, Hui, Zhu, Jiangcheng, Wang, Jun, Yang, Yaodong
Reward shaping (RS) is a powerful method in reinforcement learning (RL) for overcoming the problem of sparse or uninformative rewards. However, RS typically relies on manually engineered shaping-reward functions whose construction is time-consuming…
External link:
http://arxiv.org/abs/2103.09159
Dyna-style reinforcement learning (RL) agents improve sample efficiency over model-free RL agents by updating the value function with simulated experience generated by an environment model. However, it is often difficult to learn accurate models of…
External link:
http://arxiv.org/abs/2006.04363
Academic article
This result cannot be displayed to unauthenticated users; signing in is required to view it.