Showing 1 - 10 of 24 for search: '"Ashwin Pananjady"'
Published in:
IEEE Transactions on Information Theory. 68:1851-1885
Published in:
The Annals of Statistics. 51
Published in:
Mathematics of Operations Research.
Linear fixed-point equations in Hilbert spaces arise in a variety of settings, including reinforcement learning, and computational methods for solving differential and integral equations. We study methods that use a collection of random observations
Published in:
2022 IEEE 61st Conference on Decision and Control (CDC).
Authors:
Dean P. Foster, Ashwin Pananjady
Published in:
IEEE Transactions on Information Theory. 67:4092-4124
A single-index model is given by $y = g^{*}(\langle x, \theta^{*} \rangle) + \epsilon$: the scalar response $y$ depends on the covariate vector $x$ both through an unknown (vector) parameter $\theta^{*}$ as well as an unknown, non-parametric, u
Authors:
Martin J. Wainwright, Ashwin Pananjady
Published in:
IEEE Transactions on Information Theory. 67:566-585
Markov reward processes (MRPs) are used to model stochastic phenomena arising in operations research, control engineering, robotics, and artificial intelligence, as well as communication and transportation networks. In many of these cases, such as in
We study the problem of policy evaluation with linear function approximation and present efficient and practical algorithms that come with strong optimality guarantees. We begin by proving lower bounds that establish baselines on both the determinist
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4e2f8dc63697e8a8a2fc427c4c69c243
http://arxiv.org/abs/2112.13109
Published in:
ISIT
We study the max-affine regression model, where the unknown regression function is modeled as a maximum of a fixed number of affine functions. In recent work [1], we showed that end-to-end parameter estimates were obtainable using this model with an
Published in:
Ann. Statist. 48, no. 2 (2020), 1072-1097
Pairwise comparison data arises in many domains, including tournament rankings, web search and preference elicitation. Given noisy comparisons of a fixed subset of pairs of items, we study the problem of estimating the underlying comparison probabili
We address the problem of policy evaluation in discounted Markov decision processes, and provide instance-dependent guarantees on the $\ell_\infty$-error under a generative model. We establish both asymptotic and non-asymptotic versions of local mini
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9ec55dea340dfbe29aa595409204987d