Policy Iteration for Exploratory Hamilton--Jacobi--Bellman Equations
Author: | Tran, Hung Vinh; Wang, Zhenhua; Zhang, Yuming Paul |
---|---|
Year of publication: | 2024 |
Subject: | |
Document type: | Working Paper |
Description: | We study the policy iteration algorithm (PIA) for entropy-regularized stochastic control problems on an infinite time horizon with a large discount rate, focusing on two main scenarios. First, we analyze PIA with bounded coefficients where the controls applied to the diffusion term satisfy a smallness condition. We demonstrate the convergence of PIA based on a uniform $\mathcal{C}^{2,\alpha}$ estimate for the value sequence generated by PIA, and provide a quantitative convergence analysis for this scenario. Second, we investigate PIA with unbounded coefficients but no control over the diffusion term. In this scenario, we first establish the well-posedness of the exploratory Hamilton--Jacobi--Bellman equation with linear growth coefficients and a polynomial growth reward function. Using this well-posedness result, we prove the convergence of PIA by establishing quantitative locally uniform $\mathcal{C}^{1,\alpha}$ estimates for the generated value sequence. Comment: 21 pages |
Database: | arXiv |
External link: |