Showing 1 - 6 of 6
for search query: '"Buening, Thomas Kleine"'
Motivated by the phenomenon of strategic agents gaming a recommender system to maximize the number of times they are recommended to users, we study a strategic variant of the linear contextual bandit problem, where the arms can strategically misreport…
External link:
http://arxiv.org/abs/2406.00551
Author:
Buening, Thomas Kleine, Dimitrakakis, Christos, Eriksson, Hannes, Grover, Divya, Jorge, Emilio
While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution. One idea is to employ a worst-case prior. However, this…
External link:
http://arxiv.org/abs/2302.10831
Learning a reward function from demonstrations suffers from low sample-efficiency. Even with abundant data, current inverse reinforcement learning methods that focus on learning from a single environment can fail to handle slight changes in the environment…
External link:
http://arxiv.org/abs/2210.14972
Author:
Buening, Thomas Kleine, Saha, Aadirupa
We study the problem of non-stationary dueling bandits and provide the first adaptive dynamic regret algorithm for this problem. The only two existing attempts in this line of work fall short across multiple dimensions, including pessimistic measures…
External link:
http://arxiv.org/abs/2210.14322
We study the problem of designing autonomous agents that can learn to cooperate effectively with a potentially suboptimal partner while having no access to the joint reward function. This problem is modeled as a cooperative episodic two-agent Markov…
External link:
http://arxiv.org/abs/2111.04698
Author:
Buening, Thomas Kleine, Segal, Meirav, Basu, Debabrota, Dimitrakakis, Christos, George, Anne-Marie
Typically, merit is defined with respect to some intrinsic measure of worth. We instead consider a setting where an individual's worth is \emph{relative}: when a Decision Maker (DM) selects a set of individuals from a population to maximise expected…
External link:
http://arxiv.org/abs/2102.11932