Showing 1 - 10 of 1,696 results for the search: '"MANNION, PATRICK"'
Multi-objective reinforcement learning (MORL) is increasingly relevant due to its resemblance to real-world scenarios requiring trade-offs between multiple objectives. Catering to diverse user preferences, traditional reinforcement learning faces amp…
External link:
http://arxiv.org/abs/2404.03997
Author:
Röpke, Willem, Reymond, Mathieu, Mannion, Patrick, Roijers, Diederik M., Nowé, Ann, Rădulescu, Roxana
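To illustrate the preference-based trade-offs the abstract above alludes to, here is a minimal sketch (not taken from the paper) of linear scalarisation, a standard way to collapse a vector-valued MORL reward into a scalar under a user's preference weights. The objective names and weights are hypothetical.

```python
import numpy as np

def scalarise(reward_vec, weights):
    """Collapse a multi-objective reward into a scalar via a preference weighting."""
    return float(np.dot(reward_vec, weights))

# Hypothetical two-objective reward: (throughput, energy cost).
reward = np.array([10.0, -4.0])
prefs = np.array([0.75, 0.25])   # this user weights throughput more heavily

print(scalarise(reward, prefs))  # 10*0.75 + (-4)*0.25 = 6.5
```

Different weight vectors yield different optimal policies, which is why a single scalarisation cannot serve all users at once.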
A significant challenge in multi-objective reinforcement learning is obtaining a Pareto front of policies that attain optimal performance under different preferences. We introduce Iterated Pareto Referent Optimisation (IPRO), a principled algorithm t…
External link:
http://arxiv.org/abs/2402.07182
Author:
Vamplew, Peter, Foale, Cameron, Hayes, Conor F., Mannion, Patrick, Howley, Enda, Dazeley, Richard, Johnson, Scott, Källström, Johan, Ramos, Gabriel, Rădulescu, Roxana, Röpke, Willem, Roijers, Diederik M.
Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards. In this paper we extend…
External link:
http://arxiv.org/abs/2402.02665
Evolutionary Algorithms and Deep Reinforcement Learning have both successfully solved control problems across a variety of domains. Recently, algorithms have been proposed which combine these two methods, aiming to leverage the strengths and mitigate…
External link:
http://arxiv.org/abs/2306.11535
Author:
Röpke, Willem, Hayes, Conor F., Mannion, Patrick, Howley, Enda, Nowé, Ann, Roijers, Diederik M.
For effective decision support in scenarios with conflicting objectives, sets of potentially optimal solutions can be presented to the decision maker. We explore both what policies these sets should contain and how such sets can be computed efficient…
External link:
http://arxiv.org/abs/2305.05560
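The abstract above concerns sets of potentially optimal policies. As a concrete illustration (a naive sketch, not the paper's method), the Pareto-nondominated subset of a finite set of candidate policy returns can be filtered as follows, assuming all objectives are to be maximised:

```python
def pareto_front(points):
    """Return the points not dominated by any other point (maximisation)."""
    front = []
    for p in points:
        dominated = any(
            all(q[i] >= p[i] for i in range(len(p))) and
            any(q[i] > p[i] for i in range(len(p)))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical two-objective returns for five candidate policies.
candidates = [(3, 1), (2, 2), (1, 3), (1, 1), (2, 1)]
print(pareto_front(candidates))  # [(3, 1), (2, 2), (1, 3)]
```

This brute-force check is O(n²) in the number of candidates; the research question is precisely how to obtain such sets efficiently without enumerating all policies.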
Many decision-making problems feature multiple objectives. In such problems, it is not always possible to know the preferences of a decision-maker for different objectives. However, it is often possible to observe the behavior of decision-makers. In…
External link:
http://arxiv.org/abs/2304.14115
In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from a single execution of a policy. In these settings, making decisions based on the average future returns is not suitable. For example, in a…
External link:
http://arxiv.org/abs/2211.13032
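The single-execution setting described above hinges on a basic fact: for a nonlinear utility function, the expected utility of a return distribution differs from the utility of the expected return. A small numerical sketch (returns and utility function are hypothetical):

```python
import math

# Policy yields return 0 or 100 with equal probability.
returns = [0.0, 100.0]
probs = [0.5, 0.5]

def u(r):
    """A concave (risk-averse) utility function, chosen for illustration."""
    return math.sqrt(r)

expected_return = sum(p * r for p, r in zip(probs, returns))      # 50.0
utility_of_mean = u(expected_return)                              # sqrt(50) ≈ 7.07
expected_utility = sum(p * u(r) for p, r in zip(probs, returns))  # 0.5*0 + 0.5*10 = 5.0

print(utility_of_mean, expected_utility)
```

Since a risk-averse user values this policy at 5.0, not 7.07, optimising the average return over many executions would overstate its worth for a single run.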
Many real-world problems contain multiple objectives and agents, where a trade-off exists between objectives. Key to solving such problems is to exploit sparse dependency structures that exist between agents. For example, in wind farm control a trade…
External link:
http://arxiv.org/abs/2207.00368