Control of a bioreactor using a new partially supervised reinforcement learning algorithm

Authors:	B. Jaganatha Pandian, Mathew Mithra Noel
Year of publication:	2018
Subject:	0209 industrial biotechnology; Artificial neural network; Computer science; business.industry; Control (management); Inverse; 02 engineering and technology; Construct (python library); Optimal control; Industrial and Manufacturing Engineering; Computer Science Applications; Nonlinear system; 020901 industrial engineering & automation; Control and Systems Engineering; Modeling and Simulation; 0202 electrical engineering, electronic engineering, information engineering; Reinforcement learning; 020201 artificial intelligence & image processing; Markov decision process; Artificial intelligence; business
Source:	Journal of Process Control. 69:16-29
ISSN:	0959-1524
DOI:	10.1016/j.jprocont.2018.07.013
Description:	In recent years, researchers have explored the application of Reinforcement Learning (RL) and Artificial Neural Networks (ANNs) to the control of complex nonlinear and time-varying industrial processes. However, RL algorithms rely on exploratory actions to learn an optimal control policy and converge slowly, while popular inverse-model ANN-based control strategies require extensive training data to learn the inverse model of a complex nonlinear system. This paper proposes a novel approach that avoids three requirements: extensive training data to construct an exact inverse model (as in the inverse ANN approach), the existence of an exact and stable inverse, and exhaustive, costly exploration (as in pure RL-based strategies). In this approach, an initial approximate control policy learnt by an artificial neural network is refined using a reinforcement learning strategy. This Partially Supervised Reinforcement Learning (PSRL) strategy is applied to the economically important problem of controlling a semi-continuous batch-fed bioreactor used for yeast fermentation. The bioreactor control problem is formulated as a Markov Decision Process (MDP) and solved using pure RL and PSRL algorithms. Model-based and model-free RL control experiments and simulations demonstrate the superior performance of the PSRL strategy compared to the pure RL and inverse-model ANN-based control strategies on a variety of performance metrics.
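The idea of refining an approximate initial policy with reinforcement learning can be illustrated with a small Python sketch. This is not the authors' PSRL algorithm or bioreactor model, neither of which is given in this record. It assumes a toy one-dimensional setpoint problem, tabular Q-learning as the RL method, and a hand-written biased rule (`prior_action`) standing in for the pre-trained neural network. All names and parameter values are hypothetical.

```python
import random

N_STATES = 9          # discretised process levels 0..8 (toy stand-in, not a bioreactor model)
TARGET = 4            # true setpoint
ACTIONS = (-1, 0, 1)  # decrease, hold, or increase the manipulated input


def step(state, action):
    """Deterministic toy dynamics; reward is the negative distance to the setpoint."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, -abs(next_state - TARGET)


def prior_action(state):
    """Approximate policy (assumed stand-in for the ANN): it aims at a biased setpoint."""
    biased_target = TARGET + 2
    return (biased_target > state) - (biased_target < state)


def init_q_from_prior():
    """Warm start: the action preferred by the prior gets the higher initial value."""
    return [[0.0 if a == prior_action(s) else -1.0 for a in ACTIONS]
            for s in range(N_STATES)]


def greedy_action(q, state):
    """Action with the largest Q-value in this state (ties go to the first action)."""
    values = q[state]
    return ACTIONS[values.index(max(values))]


def refine_with_q_learning(episodes=2000, horizon=20, alpha=0.5,
                           gamma=0.9, epsilon=0.2, seed=0):
    """Refine the warm-started Q-table with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = init_q_from_prior()
    for _ in range(episodes):
        state = rng.randrange(N_STATES)  # random start state for each episode
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)        # exploratory action
            else:
                action = greedy_action(q, state)    # exploit current estimate
            next_state, reward = step(state, action)
            a_idx = ACTIONS.index(action)
            td_target = reward + gamma * max(q[next_state])
            q[state][a_idx] += alpha * (td_target - q[state][a_idx])
            state = next_state
    return q


q_refined = refine_with_q_learning()
print("prior policy:  ", [prior_action(s) for s in range(N_STATES)])
print("refined policy:", [greedy_action(q_refined, s) for s in range(N_STATES)])
```

Before any learning, the greedy policy of the warm-started table reproduces the biased prior. The Q-learning updates then correct it in the states where the prior aims at the wrong setpoint.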
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::86d190ec1c828bb9350e076d0c3c7126 https://doi.org/10.1016/j.jprocont.2018.07.013