Contextual Bandits in Payment Processing: Non-uniform Exploration and Supervised Learning at Adyen

Author:	Vangara, Akhila; Egg, Alex
Year of publication:	2024
Subject:	Computer Science - Machine Learning; Computer Science - Information Retrieval
Document type:	Working Paper
Description:	Uniform random exploration in decision-making systems supports off-policy learning via supervision but incurs high regret, making it impractical for many applications. Conversely, non-uniform exploration offers better immediate performance but lacks support for off-policy learning. Recent research suggests that regression oracles can bridge this gap by combining non-uniform exploration with supervised learning. In this paper, we analyze these approaches within a real-world industrial context at Adyen, a large global payments processor characterized by batch logged delayed feedback, short-term memory, and dynamic action spaces under the Empirical Risk Minimization (ERM) framework. Our analysis reveals that while regression oracles significantly improve performance, they introduce challenges due to rigid algorithmic assumptions. Specifically, we observe that as a policy improves, subsequent generations may perform worse due to shifts in the reward distribution and increased class imbalance in the training data. This degradation occurs despite improvements in other aspects of the training data, leading to decreased performance in successive policy iterations. We further explore the long-term impact of regression oracles, identifying a potential "oscillation effect." This effect arises when regression oracles influence probability estimates and the realizability of subsequent policy models, leading to fluctuations in performance across iterations. Our findings highlight the need for more adaptable algorithms that can leverage the benefits of regression oracles without introducing instability in policy performance over time. Comment: 7 pages, 10 figures, submitted to WWW '25
Database:	arXiv
External link:	http://arxiv.org/abs/2412.00569