Deep neural networks algorithms for stochastic control problems on finite horizon: convergence analysis

Author:	Achref Bachouch, Côme Huré, Huyên Pham, Nicolas Langrené
Year of publication:	2018
Subject:	FOS: Computer and information sciences; Machine Learning (stat.ML); 010103 numerical & computational mathematics; 01 natural sciences; Approximation error; Statistics - Machine Learning; Bellman equation; FOS: Mathematics; Reinforcement learning; 0101 mathematics; Mathematics - Optimization and Control; Mathematics; Stochastic control; Numerical Analysis; Artificial neural network; business.industry; Applied Mathematics; Deep learning; Probability (math.PR); Recursion (computer science); Computational Mathematics; Rate of convergence; Optimization and Control (math.OC); Artificial intelligence; business; Algorithm; 65C05; 90C39; 93E35; 68T07; Mathematics - Probability
DOI:	10.48550/arxiv.1812.04300
Description:	This paper develops algorithms for high-dimensional stochastic control problems based on deep learning and dynamic programming. Unlike classical approximate dynamic programming approaches, we first approximate the optimal policy by means of neural networks in the spirit of deep reinforcement learning, and then the value function by Monte Carlo regression. This is achieved in the dynamic programming recursion by performance or hybrid iteration, and by regress-now methods from numerical probability. We provide a theoretical justification of these algorithms. Consistency and rate of convergence for the control and value function estimates are analyzed and expressed in terms of the universal approximation error of the neural networks and of the statistical error when estimating the network function, leaving aside the optimization error. Numerical results on various applications are presented in a companion paper (arxiv.org/abs/1812.05916) and illustrate the performance of the proposed algorithms. Comment: To appear in SIAM Journal on Numerical Analysis
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::eb7d1ef0eb616e4634d21d763b3a0314
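
The policy-then-value recursion summarised in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the toy one-dimensional linear-quadratic problem, the function name `hybrid_now_lq`, the Gaussian training distribution, and all parameter values are assumptions made for illustration. A grid search over a one-parameter linear policy and a degree-2 polynomial least-squares fit stand in for the neural network training that the paper analyses.

```python
import numpy as np

def hybrid_now_lq(n_steps=3, n_samples=20000, sigma=0.5, seed=0):
    """Backward policy-then-value recursion on a toy 1-D linear-quadratic problem.

    Assumed dynamics: x' = x + a + sigma * eps.
    Assumed costs: running cost x**2 + a**2, terminal cost x**2.
    Returns the fitted policy slopes, ordered from time 0 to time n_steps - 1.
    """
    rng = np.random.default_rng(seed)
    # Candidate slopes for the linear policy a(x) = theta * x (stand-in for a network).
    theta_grid = np.linspace(-1.0, 0.0, 101)
    # Terminal value x**2, stored as coefficients in np.polyval order.
    value_coeffs = np.array([1.0, 0.0, 0.0])
    thetas = []
    for _ in range(n_steps):  # backward in time: n_steps - 1 down to 0
        x = rng.normal(size=n_samples)    # training states (assumed sampling law)
        eps = rng.normal(size=n_samples)  # exogenous noise

        def targets(theta):
            # One-step cost plus the current estimate of the next-step value.
            a = theta * x
            x_next = x + a + sigma * eps
            return x**2 + a**2 + np.polyval(value_coeffs, x_next)

        # Step 1: policy first, by minimising the empirical cost-to-go.
        costs = [targets(theta).mean() for theta in theta_grid]
        best_theta = theta_grid[int(np.argmin(costs))]
        # Step 2: value second, by regressing the targets on the current state.
        value_coeffs = np.polyfit(x, targets(best_theta), deg=2)
        thetas.append(best_theta)
    return thetas[::-1]

thetas = hybrid_now_lq()
print(thetas)
```

For this toy problem the value function is quadratic, so the result can be checked against the Riccati recursion, which gives optimal slopes of -0.5 at the last time step and -0.6 at the step before it.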