Showing 1 - 10 of 64 for search: '"Kirschner, Johannes"'
Author:
Kirschner, Johannes, Bakhtiari, Seyed Alireza, Chandak, Kushagra, Tkachuk, Volodymyr, Szepesvári, Csaba
A long line of works characterizes the sample complexity of regret minimization in sequential decision-making by min-max programs. In the corresponding saddle-point game, the min-player optimizes the sampling distribution against an adversarial max-p
External link:
http://arxiv.org/abs/2403.10379
Author:
Tkachuk, Volodymyr, Bakhtiari, Seyed Alireza, Kirschner, Johannes, Jusup, Matej, Bogunovic, Ilija, Szepesvári, Csaba
A practical challenge in reinforcement learning is posed by combinatorial action spaces that make planning computationally demanding. For example, in cooperative multi-agent reinforcement learning, a potentially large number of agents jointly optimize a glob
External link:
http://arxiv.org/abs/2302.04376
Linear Partial Monitoring for Sequential Decision-Making: Algorithms, Regret Bounds and Applications
Partial monitoring is an expressive framework for sequential decision-making with an abundance of applications, including graph-structured and dueling bandits, dynamic pricing and transductive feedback models. We survey and extend recent results on t
External link:
http://arxiv.org/abs/2302.03683
Author:
Li, Xiang, Mehta, Viraj, Kirschner, Johannes, Char, Ian, Neiswanger, Willie, Schneider, Jeff, Krause, Andreas, Bogunovic, Ilija
Many real-world reinforcement learning tasks require control of complex dynamical systems that involve both costly data acquisition processes and large state spaces. In cases where the transition dynamics can be readily evaluated at specified states
External link:
http://arxiv.org/abs/2212.09510
Author:
Zhang, Zichen, Kirschner, Johannes, Zhang, Junxi, Zanini, Francesco, Ayoub, Alex, Dehghan, Masood, Schuurmans, Dale
A default assumption in reinforcement learning (RL) and optimal control is that observations arrive at discrete time points on a fixed clock cycle. Yet, many applications involve continuous-time systems where the time discretization, in principle, ca
External link:
http://arxiv.org/abs/2212.08949
Author:
Kirschner, Johannes, Mutný, Mojmir, Krause, Andreas, de Portugal, Jaime Coello, Hiller, Nicole, Snuverink, Jochem
Tuning machine parameters of particle accelerators is a repetitive and time-consuming task that is challenging to automate. While many off-the-shelf optimization algorithms are available, in practice their use is limited because most methods do not a
External link:
http://arxiv.org/abs/2203.13968
Author:
Kirschner, Johannes, Krause, Andreas
We consider Bayesian optimization in settings where observations can be adversarially biased, for example by an uncontrolled hidden confounder. Our first contribution is a reduction of the confounded setting to the dueling bandit model. Then we propo
External link:
http://arxiv.org/abs/2105.11802
Combinatorial bandits with semi-bandit feedback generalize multi-armed bandits, where the agent chooses sets of arms and observes a noisy reward for each arm contained in the chosen set. The action set satisfies a given structure such as forming a ba
External link:
http://arxiv.org/abs/2101.08534
We introduce a simple and efficient algorithm for stochastic linear bandits with finitely many actions that is asymptotically optimal and (nearly) worst-case optimal in finite time. The approach is based on the frequentist information-directed sampli
External link:
http://arxiv.org/abs/2011.05944
Partial monitoring is a rich framework for sequential decision making under uncertainty that generalizes many well known bandit models, including linear, combinatorial and dueling bandits. We introduce information directed sampling (IDS) for stochast
External link:
http://arxiv.org/abs/2002.11182