Showing 1 - 10 of 24 for search: '"Dimakopoulou, Maria"'
Contextual bandits are widely used in industrial personalization systems. These online learning frameworks learn a treatment assignment policy in the presence of treatment effects that vary with the observed contextual features of the users. …
External link:
http://arxiv.org/abs/2205.04528
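The contextual-bandit loop this abstract refers to can be illustrated with a minimal sketch: an epsilon-greedy agent that keeps a per-arm ridge-regression model of the reward and acts on the arm with the highest predicted reward. This is a generic illustration on assumed toy data, not the method of the paper; the two-arm reward function and all parameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy_bandit(contexts, reward_fn, n_arms=2, epsilon=0.1):
    """Epsilon-greedy contextual bandit with per-arm ridge-regression
    reward models; returns the cumulative reward."""
    d = contexts.shape[1]
    A = [np.eye(d) for _ in range(n_arms)]    # ridge Gram matrix per arm
    b = [np.zeros(d) for _ in range(n_arms)]  # X^T y per arm
    total = 0.0
    for x in contexts:
        if rng.random() < epsilon:
            arm = int(rng.integers(n_arms))   # explore uniformly
        else:
            preds = [x @ np.linalg.solve(A[k], b[k]) for k in range(n_arms)]
            arm = int(np.argmax(preds))       # exploit the model
        r = reward_fn(x, arm)
        A[arm] += np.outer(x, x)              # online model update
        b[arm] += r * x
        total += r
    return total

# hypothetical environment: arm 1 is optimal when x[0] > 0, arm 0 otherwise
contexts = rng.normal(size=(500, 2))
reward_fn = lambda x, arm: 1.0 if arm == int(x[0] > 0) else 0.0
total = epsilon_greedy_bandit(contexts, reward_fn)
```

Because the reward depends on the context, the learned policy substantially outperforms the 0.5-per-round expected reward of uniformly random arm selection.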
Empirical risk minimization (ERM) is the workhorse of machine learning, whether for classification and regression or for off-policy policy learning, but its model-agnostic guarantees can fail when we use adaptively collected data, such as the result…
External link:
http://arxiv.org/abs/2106.01723
Contextual bandit algorithms are increasingly replacing non-adaptive A/B tests in e-commerce, healthcare, and policymaking because they can both improve outcomes for study participants and increase the chance of identifying good or even best policies…
External link:
http://arxiv.org/abs/2106.00418
During online decision making in Multi-Armed Bandits (MAB), one needs to conduct inference on the true mean reward of each arm based on data collected so far at each step. However, since the arms are adaptively selected -- thereby yielding non-iid data…
External link:
http://arxiv.org/abs/2102.13202
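The non-iid issue this abstract raises is easy to see in simulation: when arms are selected greedily, the ordinary sample mean of an arm's reward is biased downward, even when all arms are identical. A minimal sketch of that phenomenon (not the paper's inference procedure; the greedy rule, horizon, and Gaussian rewards are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def greedy_run(T=20):
    """One greedy two-arm bandit run with both true means equal to 0;
    returns the final sample mean of arm 0."""
    sums = np.zeros(2)
    counts = np.zeros(2)
    for k in range(2):                       # pull each arm once
        sums[k] += rng.normal(0.0, 1.0)
        counts[k] += 1
    for _ in range(T - 2):                   # then always pull the leader
        arm = int(np.argmax(sums / counts))
        sums[arm] += rng.normal(0.0, 1.0)
        counts[arm] += 1
    return sums[0] / counts[0]

# averaging over many replications: the sample mean is biased below
# the true mean of 0, because unlucky arms stop being sampled
bias = float(np.mean([greedy_run() for _ in range(5000)]))
```

A naive confidence interval centered at this sample mean would therefore be systematically miscentered, which is the motivation for adaptive-inference corrections.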
We consider adaptive designs for a trial involving N individuals that we follow along T time steps. We allow for the variables of one individual to depend on its past and on the past of other individuals. Our goal is to learn a mean outcome, averaged…
External link:
http://arxiv.org/abs/2101.07380
Published in:
International Conference on Machine Learning (2020)
We propose a new framework for designing estimators for off-policy evaluation in contextual bandits. Our approach is based on the asymptotically optimal doubly robust estimator, but we shrink the importance weights to minimize a bound on the mean squared error…
External link:
http://arxiv.org/abs/1907.09623
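The doubly robust estimator this abstract starts from combines an outcome-model prediction with importance-weighted residuals on the logged actions; the paper then shrinks the weights to trade bias for variance. A minimal sketch using plain weight clipping as a stand-in for the paper's optimized shrinkage (the toy data, the clip threshold, and the function name are assumptions):

```python
import numpy as np

def dr_value(rewards, actions, q_hat, pi, propensities, clip=10.0):
    """Doubly robust off-policy value estimate with clipped importance
    weights. q_hat[i, a]: outcome-model prediction; pi[i, a]: target-policy
    probabilities; propensities[i]: logging probability of the logged action."""
    idx = np.arange(len(rewards))
    # shrunk (here: clipped) importance weights for the logged actions
    w = np.minimum(pi[idx, actions] / propensities, clip)
    model_term = (pi * q_hat).sum(axis=1)            # value under the model
    correction = w * (rewards - q_hat[idx, actions]) # weighted residuals
    return float(np.mean(model_term + correction))

# toy logged data: uniform logging over 2 arms, deterministic rewards
rng = np.random.default_rng(0)
n = 200
actions = rng.integers(2, size=n)
true_q = np.array([1.0, 2.0])
rewards = true_q[actions]
q_hat = np.tile(true_q, (n, 1))      # outcome model happens to be exact
pi = np.tile([0.0, 1.0], (n, 1))     # target policy: always pull arm 1
propensities = np.full(n, 0.5)
value = dr_value(rewards, actions, q_hat, pi, propensities)
```

With an exact outcome model the residual correction vanishes, so the estimate recovers the target policy's true value of 2.0 regardless of the clipping level; shrinkage only matters when the model is misspecified and the weights are large.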
Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation problems…
External link:
http://arxiv.org/abs/1812.06227
We consider a team of reinforcement learning agents that concurrently operate in a common environment, and we develop an approach to efficient coordinated exploration that is suitable for problems of practical scale. Our approach builds on seed sampling…
External link:
http://arxiv.org/abs/1805.08948
Author:
Dimakopoulou, Maria, Van Roy, Benjamin
Published in:
Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 1271-1279, Stockholmsmässan, Stockholm, Sweden, 10-15 Jul 2018
We consider a team of reinforcement learning agents that concurrently learn to operate in a common environment. We identify three properties - adaptivity, commitment, and diversity - which are necessary for efficient coordinated exploration and demonstrate…
External link:
http://arxiv.org/abs/1802.01282
Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation problems…
External link:
http://arxiv.org/abs/1711.07077