Výsledky vyhledávání - "Gao, XueFeng"

Report

Reward-Directed Score-Based Diffusion Models via q-Learning

Autor: Gao, Xuefeng, Zha, Jiale, Zhou, Xun Yu

We propose a new reinforcement learning (RL) formulation for training continuous-time score-based diffusion models for generative AI to generate samples that maximize reward functions while keeping the generated distributions close to the unknown tar

Externí odkaz: http://arxiv.org/abs/2409.04832

Zobrazit plný text záznamu

Report

Regret Bounds for Episodic Risk-Sensitive Linear Quadratic Regulator

Autor: Xu, Wenhao, Gao, Xuefeng, He, Xuedong

Risk-sensitive linear quadratic regulator is one of the most fundamental problems in risk-sensitive optimal control. In this paper, we study online adaptive control of risk-sensitive linear quadratic regulator in the finite horizon episodic setting.

Externí odkaz: http://arxiv.org/abs/2406.05366

Zobrazit plný text záznamu

Report

Reinforcement Learning for Intensity Control: An Application to Choice-Based Network Revenue Management

Autor: Meng, Huiling, Chen, Ningyuan, Gao, Xuefeng

Intensity control is a type of continuous-time dynamic optimization problems with many important applications in Operations Research including queueing and revenue management. In this study, we adapt the reinforcement learning framework to intensity

Externí odkaz: http://arxiv.org/abs/2406.05358

Zobrazit plný text záznamu

Report

Reinforcement Learning for Jump-Diffusions, with Financial Applications

Autor: Gao, Xuefeng, Li, Lingfei, Zhou, Xun Yu

We study continuous-time reinforcement learning (RL) for stochastic control in which system dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized exploratory control problem with stochastic policies to capture the exp

Externí odkaz: http://arxiv.org/abs/2405.16449

Zobrazit plný text záznamu

Report

No Algorithmic Collusion in Two-Player Blindfolded Game with Thompson Sampling

Autor: Chen, Ningyuan, Gao, Xuefeng, Xiong, Yi

When two players are engaged in a repeated game with unknown payoff matrices, they may be completely unaware of the existence of each other and use multi-armed bandit algorithms to choose the actions, which is referred to as the ``blindfolded game''

Externí odkaz: http://arxiv.org/abs/2405.17463

Zobrazit plný text záznamu

Report

Convergence Analysis for General Probability Flow ODEs of Diffusion Models in Wasserstein Distances

Autor: Gao, Xuefeng, Zhu, Lingjiong

Score-based generative modeling with probability flow ordinary differential equations (ODEs) has achieved remarkable success in a variety of applications. While various fast ODE-based samplers have been proposed in the literature and employed in prac

Externí odkaz: http://arxiv.org/abs/2401.17958

Zobrazit plný text záznamu

Report

Wasserstein Convergence Guarantees for a General Class of Score-Based Generative Models

Autor: Gao, Xuefeng, Nguyen, Hoang M., Zhu, Lingjiong

Score-based generative models (SGMs) is a recent class of deep generative models with state-of-the-art performance in many applications. In this paper, we establish convergence guarantees for a general class of SGMs in 2-Wasserstein distance, assumin

Externí odkaz: http://arxiv.org/abs/2311.11003

Zobrazit plný text záznamu

Report

Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents

Autor: Xu, Wenhao, Gao, Xuefeng, He, Xuedong

The optimized certainty equivalent (OCE) is a family of risk measures that cover important examples such as entropic risk, conditional value-at-risk and mean-variance models. In this paper, we propose a new episodic risk-sensitive reinforcement learn

Externí odkaz: http://arxiv.org/abs/2301.12601

Zobrazit plný text záznamu

Report

Square-root regret bounds for continuous-time episodic Markov decision processes

Autor: Gao, Xuefeng, Zhou, Xun Yu

We study reinforcement learning for continuous-time Markov decision processes (MDPs) in the finite-horizon episodic setting. In contrast to discrete-time MDPs, the inter-transition times of a continuous-time MDP are exponentially distributed with rat

Externí odkaz: http://arxiv.org/abs/2210.00832

Zobrazit plný text záznamu

Elektronická kniha

Oxidation & 1, 5-hydride shift of sulfoximine derivatives

Autor: Gao, Xuefeng, Harmata, Michael

Publikováno v: Restricted access.

Title from PDF of title page (University of Missouri--Columbia, viewed on March 10, 2010). The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in th

Externí odkaz: http://hdl.handle.net/10355/6646

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání