Showing 1 - 10 of 38 for search: '"Katariya, Sumeet"'
Authors:
Atsidakou, Alexia, Kveton, Branislav, Katariya, Sumeet, Caramanis, Constantine, Sanghavi, Sujay
We derive the first finite-time logarithmic Bayes regret upper bounds for Bayesian bandits. In a multi-armed bandit, we obtain $O(c_\Delta \log n)$ and $O(c_h \log^2 n)$ upper bounds for an upper confidence bound algorithm, where $c_h$ and $c_\Delta$
External link:
http://arxiv.org/abs/2306.09136
Authors:
Krishnamurthy, Sanath Kumar, Modi, Shrey, Gangwani, Tanmay, Katariya, Sumeet, Kveton, Branislav, Rangi, Anshuka
We consider the finite-horizon offline reinforcement learning (RL) setting, and are motivated by the challenge of learning the policy at any step h in dynamic programming (DP) algorithms. To learn this, it is sufficient to evaluate the treatment effe
External link:
http://arxiv.org/abs/2302.00284
Many practical applications, such as recommender systems and learning to rank, involve solving multiple similar tasks. One example is learning of recommendation policies for users with similar movie preferences, where the users may still rank the ind
External link:
http://arxiv.org/abs/2212.04720
Fixed-budget best-arm identification (BAI) is a bandit problem where the agent maximizes the probability of identifying the optimal arm within a fixed budget of observations. In this work, we study this problem in the Bayesian setting. We propose a B
External link:
http://arxiv.org/abs/2211.08572
A contextual bandit is a popular framework for online learning to act under uncertainty. In practice, the number of actions is huge and their expected rewards are correlated. In this work, we introduce a general framework for capturing such correlati
External link:
http://arxiv.org/abs/2205.15124
We develop a meta-learning framework for simple regret minimization in bandits. In this framework, a learning agent interacts with a sequence of bandit tasks, which are sampled i.i.d. from an unknown prior distribution, and learns its meta-parameter
External link:
http://arxiv.org/abs/2202.12888
Authors:
Xie, Yaochen, Katariya, Sumeet, Tang, Xianfeng, Huang, Edward, Rao, Nikhil, Subbian, Karthik, Ji, Shuiwang
Graph Neural Networks (GNNs) have emerged as powerful tools to encode graph-structured data. Due to their broad applications, there is an increasing need to develop tools to explain how GNNs make decisions given graph-structured data. Existing learni
External link:
http://arxiv.org/abs/2202.08335
Mean rewards of actions are often correlated. The form of these correlations may be complex and unknown a priori, such as the preferences of a user for recommended products and their categories. To maximize statistical efficiency, it is important to
External link:
http://arxiv.org/abs/2202.01454
Authors:
Zheng, Wenqing, Huang, Edward W, Rao, Nikhil, Katariya, Sumeet, Wang, Zhangyang, Subbian, Karthik
Graph Neural Networks (GNNs) have achieved state-of-the-art performance in node classification, regression, and recommendation tasks. GNNs work well when rich and high-quality connections are available. However, their effectiveness is often jeopardiz
External link:
http://arxiv.org/abs/2111.04840
Logical reasoning over Knowledge Graphs (KGs) is a fundamental technique that can provide an efficient querying mechanism over large and incomplete databases. Current approaches employ spatial geometries such as boxes to learn query representations that
External link:
http://arxiv.org/abs/2110.13522