Showing 1 - 10 of 417 for search: '"Gaillard , Pierre"'
We address the online unconstrained submodular maximization problem (Online USM), in a setting with stochastic bandit feedback. In this framework, a decision-maker receives noisy rewards from a nonmonotone submodular function, taking values in a know
External link:
http://arxiv.org/abs/2410.08578
We study boosting for adversarial online nonparametric regression with general convex losses. We first introduce a parameter-free online gradient boosting (OGB) algorithm and show that its application to chaining trees achieves minimax optimal regret
External link:
http://arxiv.org/abs/2410.03363
We study a theoretical and algorithmic framework for structured prediction in the online learning setting. The problem of structured prediction, i.e., estimating a function whose output space lacks a vectorial structure, is well studied in the liter
External link:
http://arxiv.org/abs/2406.12366
We explore online learning in episodic loop-free Markov decision processes in non-stationary environments (changing losses and probability transitions). Our focus is on the Concave Utility Reinforcement Learning problem (CURL), an extension of classi
External link:
http://arxiv.org/abs/2405.19807
Author:
Saha, Aadirupa, Gaillard, Pierre
We address the problem of active online assortment optimization with preference feedback, which is a framework for modeling user choices and subsetwise utility maximization. The framework is useful in various real-world applications including
External link:
http://arxiv.org/abs/2402.18917
We address the problem of stochastic combinatorial semi-bandits, where a player selects among P actions from the power set of a set containing d base items. Adaptivity to the problem's structure is essential in order to obtain optimal regret upper bo
External link:
http://arxiv.org/abs/2402.15171
We introduce an online mathematical framework for survival analysis, allowing real time adaptation to dynamic environments and censored data. This framework enables the estimation of event time distributions through an optimal second order online con
External link:
http://arxiv.org/abs/2402.05145
We study the classical problem of approximating a non-decreasing function $f: \mathcal{X} \to \mathcal{Y}$ in $L^p(\mu)$ norm by sequentially querying its values, for known compact real intervals $\mathcal{X}$, $\mathcal{Y}$ and a known probability m
External link:
http://arxiv.org/abs/2309.07530
Counterfactual Risk Minimization (CRM) is a framework for dealing with the logged bandit feedback problem, where the goal is to improve a logging policy using offline data. In this paper, we explore the case where it is possible to deploy learned pol
External link:
http://arxiv.org/abs/2302.12120
Integrating renewable energy into the power grid while balancing supply and demand is a complex issue, given its intermittent nature. Demand side management (DSM) offers solutions to this challenge. We propose a new method for DSM, in particular the
External link:
http://arxiv.org/abs/2302.08190