Showing 1 - 10 of 308 for search: '"Schapire, Robert E."'
A lexicographic maximum of a set $X \subseteq \mathbb{R}^n$ is a vector in $X$ whose smallest component is as large as possible, and subject to that requirement, whose second smallest component is as large as possible, and so on for the third smallest …
External link:
http://arxiv.org/abs/2405.01387
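The leximax definition above has a compact computational reading for finite sets: sort each vector's components in ascending order and compare the sorted tuples lexicographically. A minimal sketch for that finite case (the paper itself treats general $X \subseteq \mathbb{R}^n$; the function name and example vectors are illustrative, not from the paper):

```python
def leximax(X):
    """Return a lexicographic maximum of a finite collection of vectors:
    the vector whose components, sorted in ascending order, are
    lexicographically largest (smallest component as large as possible,
    then the second smallest, and so on)."""
    return max(X, key=lambda v: sorted(v))

# Example: (2, 2) beats (1, 3) because its smallest component wins (2 > 1).
print(leximax([(1, 3), (2, 2), (0, 5)]))  # (2, 2)
```

Sorting each vector first is what turns "smallest component, then second smallest, …" into an ordinary lexicographic comparison of tuples.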
We study interactive learning in a setting where the agent has to generate a response (e.g., an action or trajectory) given a context and an instruction. In contrast to typical approaches that train the system using reward or expert supervision on …
External link:
http://arxiv.org/abs/2404.09123
We study reinforcement learning (RL) in settings where observations are high-dimensional, but where an RL agent has access to abstract knowledge about the structure of the state space, as is the case, for example, when a robot is tasked to go to a …
External link:
http://arxiv.org/abs/2205.14237
Not all convex functions on $\mathbb{R}^n$ have finite minimizers; some can only be minimized by a sequence as it heads to infinity. In this work, we aim to develop a theory for understanding such minimizers at infinity. We study astral space, a …
External link:
http://arxiv.org/abs/2205.03260
Author:
Simchowitz, Max, Tosh, Christopher, Krishnamurthy, Akshay, Hsu, Daniel, Lykouris, Thodoris, Dudík, Miroslav, Schapire, Robert E.
Thompson sampling and other Bayesian sequential decision-making algorithms are among the most popular approaches to tackle explore/exploit trade-offs in (contextual) bandits. The choice of prior in these algorithms offers flexibility to encode domain …
External link:
http://arxiv.org/abs/2107.01509
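For readers unfamiliar with the method named in this entry, a minimal Thompson sampling loop for a Bernoulli bandit looks as follows; the Beta(1, 1) priors, helper name, and arm means are illustrative assumptions, not details from the paper:

```python
import random

def thompson_sampling(arms, rounds, seed=0):
    """Minimal Thompson sampling for a Bernoulli bandit with Beta(1, 1)
    priors (this is where domain knowledge could be encoded instead).
    Each round: sample a mean from every arm's Beta posterior, pull the
    argmax, then update that arm's posterior with the observed reward."""
    rng = random.Random(seed)
    alpha = [1.0] * len(arms)  # 1 + observed successes per arm
    beta = [1.0] * len(arms)   # 1 + observed failures per arm
    total = 0.0
    for _ in range(rounds):
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(len(arms))]
        i = max(range(len(arms)), key=lambda k: samples[k])
        reward = 1.0 if rng.random() < arms[i] else 0.0
        alpha[i] += reward
        beta[i] += 1.0 - reward
        total += reward
    return total
```

With arms of true means 0.2, 0.5, 0.8, the posterior sampling quickly concentrates pulls on the best arm, which is the explore/exploit trade-off the snippet refers to.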
Recent work across many machine learning disciplines has highlighted that standard descent methods, even without explicit regularization, do not merely minimize the training error, but also exhibit an implicit bias. This bias is typically towards a …
External link:
http://arxiv.org/abs/2006.11226
A major challenge in contextual bandits is to design general-purpose algorithms that are both practically useful and theoretically well-founded. We present a new technique that has the empirical and computational advantages of realizability-based approaches …
External link:
http://arxiv.org/abs/1803.01088
Author:
Dann, Christoph, Jiang, Nan, Krishnamurthy, Akshay, Agarwal, Alekh, Langford, John, Schapire, Robert E.
We study the computational tractability of PAC reinforcement learning with rich observations. We present new provably sample-efficient algorithms for environments with deterministic hidden state dynamics and stochastic rich observations. These methods …
External link:
http://arxiv.org/abs/1803.00606
We study the problem of combining multiple bandit algorithms (that is, online learning algorithms with partial feedback) with the goal of creating a master algorithm that performs almost as well as the best base algorithm if it were to be run on its own …
External link:
http://arxiv.org/abs/1612.06246
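As a rough illustration of the problem this entry poses (not the paper's actual algorithm or its guarantees), a naive master can treat each base bandit algorithm as an arm and reweight it EXP3-style using importance-weighted rewards; all names and parameters here are hypothetical:

```python
import math
import random

def naive_master(bases, rounds, lr=0.1, seed=0):
    """Naive sketch of combining bandit algorithms: keep a weight per base
    algorithm, sample one each round in proportion to the weights, let it
    act, and boost its weight by its importance-weighted reward.
    Each base is modeled as a callable returning a reward in [0, 1]."""
    rng = random.Random(seed)
    n = len(bases)
    weights = [1.0] * n
    total = 0.0
    for _ in range(rounds):
        s = sum(weights)
        probs = [w / s for w in weights]
        i = rng.choices(range(n), weights=probs)[0]
        reward = bases[i]()  # the chosen base algorithm acts this round
        weights[i] *= math.exp(lr * reward / probs[i])  # importance weighting
        total += reward
    return total
```

The subtlety the snippet alludes to is that the master only feeds data to the base it selects, so a strong base that is rarely chosen never gets to demonstrate its strength; handling that feedback starvation is what makes the problem nontrivial.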