Search results - "Kirschner, Johannes"

Report

Regret Minimization via Saddle Point Optimization

Authors: Kirschner, Johannes; Bakhtiari, Seyed Alireza; Chandak, Kushagra; Tkachuk, Volodymyr; Szepesvári, Csaba

A long line of works characterizes the sample complexity of regret minimization in sequential decision-making by min-max programs. In the corresponding saddle-point game, the min-player optimizes the sampling distribution against an adversarial max-p

External link: http://arxiv.org/abs/2403.10379

Report

Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning

Authors: Tkachuk, Volodymyr; Bakhtiari, Seyed Alireza; Kirschner, Johannes; Jusup, Matej; Bogunovic, Ilija; Szepesvári, Csaba

A practical challenge in reinforcement learning are combinatorial action spaces that make planning computationally demanding. For example, in cooperative multi-agent reinforcement learning, a potentially large number of agents jointly optimize a glob

External link: http://arxiv.org/abs/2302.04376

Report

Linear Partial Monitoring for Sequential Decision-Making: Algorithms, Regret Bounds and Applications

Authors: Kirschner, Johannes; Lattimore, Tor; Krause, Andreas

Partial monitoring is an expressive framework for sequential decision-making with an abundance of applications, including graph-structured and dueling bandits, dynamic pricing and transductive feedback models. We survey and extend recent results on t

External link: http://arxiv.org/abs/2302.03683

Report

Near-optimal Policy Identification in Active Reinforcement Learning

Authors: Li, Xiang; Mehta, Viraj; Kirschner, Johannes; Char, Ian; Neiswanger, Willie; Schneider, Jeff; Krause, Andreas; Bogunovic, Ilija

Many real-world reinforcement learning tasks require control of complex dynamical systems that involve both costly data acquisition processes and large state spaces. In cases where the transition dynamics can be readily evaluated at specified states

External link: http://arxiv.org/abs/2212.09510

Report

Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off

Authors: Zhang, Zichen; Kirschner, Johannes; Zhang, Junxi; Zanini, Francesco; Ayoub, Alex; Dehghan, Masood; Schuurmans, Dale

A default assumption in reinforcement learning (RL) and optimal control is that observations arrive at discrete time points on a fixed clock cycle. Yet, many applications involve continuous-time systems where the time discretization, in principle, ca

External link: http://arxiv.org/abs/2212.08949

Report

Tuning Particle Accelerators with Safety Constraints using Bayesian Optimization

Authors: Kirschner, Johannes; Mutný, Mojmír; Krause, Andreas; Coello de Portugal, Jaime; Hiller, Nicole; Snuverink, Jochem

Tuning machine parameters of particle accelerators is a repetitive and time-consuming task that is challenging to automate. While many off-the-shelf optimization algorithms are available, in practice their use is limited because most methods do not a

External link: http://arxiv.org/abs/2203.13968

Report

Bias-Robust Bayesian Optimization via Dueling Bandits

Authors: Kirschner, Johannes; Krause, Andreas

We consider Bayesian optimization in settings where observations can be adversarially biased, for example by an uncontrolled hidden confounder. Our first contribution is a reduction of the confounded setting to the dueling bandit model. Then we propo

External link: http://arxiv.org/abs/2105.11802

Report

Efficient Pure Exploration for Combinatorial Bandits with Semi-Bandit Feedback

Authors: Jourdan, Marc; Mutný, Mojmír; Kirschner, Johannes; Krause, Andreas

Combinatorial bandits with semi-bandit feedback generalize multi-armed bandits, where the agent chooses sets of arms and observes a noisy reward for each arm contained in the chosen set. The action set satisfies a given structure such as forming a ba

External link: http://arxiv.org/abs/2101.08534

Report

Asymptotically Optimal Information-Directed Sampling

Authors: Kirschner, Johannes; Lattimore, Tor; Vernade, Claire; Szepesvári, Csaba

We introduce a simple and efficient algorithm for stochastic linear bandits with finitely many actions that is asymptotically optimal and (nearly) worst-case optimal in finite time. The approach is based on the frequentist information-directed sampli

External link: http://arxiv.org/abs/2011.05944

Report

Information Directed Sampling for Linear Partial Monitoring

Authors: Kirschner, Johannes; Lattimore, Tor; Krause, Andreas

Partial monitoring is a rich framework for sequential decision making under uncertainty that generalizes many well known bandit models, including linear, combinatorial and dueling bandits. We introduce information directed sampling (IDS) for stochast

External link: http://arxiv.org/abs/2002.11182
