Showing 1 - 10 of 646 for search: '"Ozdaglar, Asuman"'
We study a linear contextual optimization problem where a decision maker has access to historical data and contextual features to learn a cost prediction model aimed at minimizing decision error. We adopt the predict-then-optimize framework for this
External link:
http://arxiv.org/abs/2409.10479
In this paper, we consider two-player zero-sum matrix and stochastic games and develop learning dynamics that are payoff-based, convergent, rational, and symmetric between the two players. Specifically, the learning dynamics for matrix games are base
External link:
http://arxiv.org/abs/2409.01447
Policy gradient methods have become a staple of any single-agent reinforcement learning toolbox, due to their combination of desirable properties: iterate convergence, efficient use of stochastic trajectory feedback, and theoretically-sound avoidance
External link:
http://arxiv.org/abs/2408.00751
LiteEFG is an efficient library with easy-to-use Python bindings, which can solve multiplayer extensive-form games (EFGs). LiteEFG enables the user to express computation graphs in Python to define updates on the game tree structure. The graph is the
External link:
http://arxiv.org/abs/2407.20351
We study best-response-type learning dynamics for two-player zero-sum matrix games. We consider two settings that are distinguished by the type of information that each player has about the game and their opponent's strategy. The first setting is the
External link:
http://arxiv.org/abs/2407.20128
Inverse Reinforcement Learning (IRL) and Reinforcement Learning from Human Feedback (RLHF) are pivotal methodologies in reward learning, which involve inferring and shaping the underlying reward function of sequential decision-making problems based o
External link:
http://arxiv.org/abs/2405.12421
In adversarial machine learning, neural networks suffer from a significant issue known as robust overfitting, where the robust test accuracy decreases over epochs (Rice et al., 2020). Recent research conducted by Xing et al., 2021; Xiao et al., 2022 h
External link:
http://arxiv.org/abs/2405.01817
Reinforcement learning from human feedback (RLHF) has been an effective technique for aligning AI systems with human values, with remarkable recent successes in fine-tuning large language models. Most existing RLHF paradigms make the underlying ass
External link:
http://arxiv.org/abs/2405.00254
Large language models (LLMs) have been increasingly employed for (interactive) decision-making, via the development of LLM-based autonomous agents. Despite their emerging successes, the performance of LLM agents in decision-making has not been fully
External link:
http://arxiv.org/abs/2403.16843
Many of today's online platforms, including social media sites, are two-sided markets bridging content creators and users. Most of the existing literature on platform recommendation algorithms focuses on user preferences and decisions, and does
External link:
http://arxiv.org/abs/2401.00313