Showing 1 - 10 of 925 for search: '"GAO Xuefeng"'
We propose a new reinforcement learning (RL) formulation for training continuous-time score-based diffusion models for generative AI to generate samples that maximize reward functions while keeping the generated distributions close to the unknown tar
External link:
http://arxiv.org/abs/2409.04832
The risk-sensitive linear quadratic regulator is one of the most fundamental problems in risk-sensitive optimal control. In this paper, we study online adaptive control of the risk-sensitive linear quadratic regulator in the finite-horizon episodic setting.
External link:
http://arxiv.org/abs/2406.05366
Intensity control is a type of continuous-time dynamic optimization problem with many important applications in Operations Research, including queueing and revenue management. In this study, we adapt the reinforcement learning framework to intensity
External link:
http://arxiv.org/abs/2406.05358
We study continuous-time reinforcement learning (RL) for stochastic control in which system dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized exploratory control problem with stochastic policies to capture the exp
External link:
http://arxiv.org/abs/2405.16449
When two players are engaged in a repeated game with unknown payoff matrices, they may be completely unaware of the existence of each other and use multi-armed bandit algorithms to choose the actions, which is referred to as the "blindfolded game"
External link:
http://arxiv.org/abs/2405.17463
Author:
Gao, Xuefeng, Zhu, Lingjiong
Score-based generative modeling with probability flow ordinary differential equations (ODEs) has achieved remarkable success in a variety of applications. While various fast ODE-based samplers have been proposed in the literature and employed in prac
External link:
http://arxiv.org/abs/2401.17958
Score-based generative models (SGMs) are a recent class of deep generative models with state-of-the-art performance in many applications. In this paper, we establish convergence guarantees for a general class of SGMs in 2-Wasserstein distance, assumin
External link:
http://arxiv.org/abs/2311.11003
The optimized certainty equivalent (OCE) is a family of risk measures that cover important examples such as entropic risk, conditional value-at-risk and mean-variance models. In this paper, we propose a new episodic risk-sensitive reinforcement learn
External link:
http://arxiv.org/abs/2301.12601
Author:
Gao, Xuefeng, Zhou, Xun Yu
We study reinforcement learning for continuous-time Markov decision processes (MDPs) in the finite-horizon episodic setting. In contrast to discrete-time MDPs, the inter-transition times of a continuous-time MDP are exponentially distributed with rat
External link:
http://arxiv.org/abs/2210.00832
Author:
Gao, Xuefeng, Zhou, Xun Yu
We consider reinforcement learning for continuous-time Markov decision processes (MDPs) in the infinite-horizon, average-reward setting. In contrast to discrete-time MDPs, a continuous-time process moves to a state and stays there for a random holdin
External link:
http://arxiv.org/abs/2205.11168