Model-free safe reinforcement learning for chemical processes using Gaussian processes

Author: Dongda Zhang, Thomas R. Savage, Max Mowbray, Ehecatl Antonio del Rio Chanona
Publication year: 2021
Subject:
Source: IFAC-PapersOnLine. 54:504-509
ISSN: 2405-8963
DOI: 10.1016/j.ifacol.2021.08.292
Description: Model-free reinforcement learning has recently been investigated for use in chemical process control. Through the iterative creation of an approximate process model, control actions can be explored and optimal policies generated. Typically, this approximate process model has taken the form of a neural network that is continuously updated. However, when only small quantities of historical data are available, for example in novel processes, neural networks tend to overfit the data, giving poor performance. In this paper, Gaussian processes are used as a method of function approximation to describe the action-value function of a non-isothermal semi-batch reactor. The analytical uncertainty obtained from Gaussian process predictions enables a trade-off between exploration and exploitation, allowing effective policies to be generated efficiently. Importantly, Gaussian processes also enable the probability of constraint violation to be modelled, ensuring safe constraint satisfaction throughout the learning procedure. On application to the in-silico case study, a safe, effective policy was generated using only 100 process trajectory evaluations and no prior knowledge of the process dynamics. A neural-network-based approach would require significantly more trajectory evaluations to achieve the same result.
Database: OpenAIRE
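
Illustration: the description mentions using Gaussian process predictive uncertainty both to balance exploration against exploitation and to model the probability of constraint violation. The following is a minimal sketch of that general idea, not the authors' implementation. The squared-exponential kernel, the upper-confidence-bound rule with weight `beta`, the 5% violation threshold, and all function names (`rbf_kernel`, `gp_posterior`, `select_safe_action`) are assumptions made for illustration only.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gp_posterior(x_train, y_train, x_query, noise=1e-4, length_scale=1.0, variance=1.0):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_train, x_train, length_scale, variance) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query, length_scale, variance)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = np.maximum(variance - np.sum(v**2, axis=0), 0.0)
    return mean, np.sqrt(var)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def select_safe_action(q_mean, q_std, g_mean, g_std, limit, beta=2.0, max_violation_prob=0.05):
    """Pick the candidate with the highest optimistic value among 'safe' ones.

    q_mean, q_std: GP prediction of the action value for each candidate action.
    g_mean, g_std: GP prediction of a constrained quantity (must stay <= limit).
    Returns the chosen index and the estimated violation probabilities.
    """
    # P(g > limit) under a Gaussian predictive distribution.
    viol = np.array([
        1.0 - normal_cdf((limit - m) / max(s, 1e-12))
        for m, s in zip(g_mean, g_std)
    ])
    safe = viol <= max_violation_prob
    if not np.any(safe):
        # Assumed fallback: take the least risky candidate.
        return int(np.argmin(viol)), viol
    # Upper confidence bound: mean rewards exploitation, std rewards exploration.
    ucb = np.where(safe, q_mean + beta * q_std, -np.inf)
    return int(np.argmax(ucb)), viol


if __name__ == "__main__":
    # Toy data: three evaluated (state, action) points with observed returns
    # and an observed constrained quantity (e.g. a temperature, hypothetical).
    x_seen = np.array([[0.0], [1.0], [2.0]])
    returns = np.array([0.0, 1.0, 0.0])
    constraint_obs = np.array([1.0, 3.0, 6.0])

    candidates = np.linspace(0.0, 2.5, 6).reshape(-1, 1)
    q_mean, q_std = gp_posterior(x_seen, returns, candidates)
    g_mean, g_std = gp_posterior(x_seen, constraint_obs, candidates)

    idx, viol = select_safe_action(q_mean, q_std, g_mean, g_std, limit=5.0)
    print("chosen candidate:", candidates[idx, 0], "violation prob:", viol[idx])
```

In this sketch a larger `beta` favours candidates with uncertain value estimates (more exploration), while a smaller `max_violation_prob` restricts the search to candidates the constraint model is more confident about.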