Showing 1 - 10 of 40 for search: '"Min, Seungki"'
Author:
Jeong, Woojin, Min, Seungki
Published in:
Reinforcement Learning Journal, vol. 1, no. 1, 2024, pp. TBD
We consider a Bayesian budgeted multi-armed bandit problem, in which each arm consumes a different amount of resources when selected and there is a budget constraint on the total amount of resources that can be used. Budgeted Thompson Sampling (BTS)
External link:
http://arxiv.org/abs/2408.15535
Author:
Min, Seungki, Russo, Daniel
In nonstationary bandit learning problems, the decision-maker must continually gather information and adapt their action selection as the latent state of the environment evolves. In each time period, some latent optimal action maximizes expected rewa
External link:
http://arxiv.org/abs/2302.04452
Author:
Min, Seungki
Dynamic programming (DP) has long been an essential framework for solving sequential decision-making problems. However, when the state space is intractably large or the objective contains a risk term, the conventional DP framework often fails to work
We consider a liquidation problem in which a risk-averse trader tries to liquidate a fixed quantity of an asset in the presence of market impact and random price fluctuations. The trader encounters a trade-off between the transaction costs incurred d
External link:
http://arxiv.org/abs/2201.11962
We study the use of policy gradient algorithms to optimize over a class of generalized Thompson sampling policies. Our central insight is to view the posterior parameter sampled by Thompson sampling as a kind of pseudo-action. Policy gradient methods
External link:
http://arxiv.org/abs/2006.16507
We study the competition for partners in two-sided matching markets with heterogeneous agent preferences, with a focus on how the equilibrium outcomes depend on the connectivity in the market. We model random partially connected markets, with each ag
External link:
http://arxiv.org/abs/2006.14653
We consider a finite-horizon multi-armed bandit (MAB) problem in a Bayesian setting, for which we propose an information relaxation sampling framework. With this framework, we define an intuitive family of control policies that include Thompson sampl
External link:
http://arxiv.org/abs/1902.04251
The composition of natural liquidity has been changing over time. An analysis of intraday volumes for the S&P500 constituent stocks illustrates that (i) volume surprises, i.e., deviations from their respective forecasts, are correlated across stocks,
External link:
http://arxiv.org/abs/1811.05524