Optimal recovery of unsecured debt via interpretable reinforcement learning

Author:	Michael Mark, Naveed Chehrazi, Huanxi Liu, Thomas A. Weber
Language:	English
Year of publication:	2022
Subject:	Reinforcement learning; Interpretable machine learning; Deterministic policy gradient; Monotonicity constrained learning; Debt recovery; Control of Hawkes processes; Cybernetics (Q300-390); Electronic computers. Computer science (QA75.5-76.95)
Source:	Machine Learning with Applications, Vol 8, Iss , Pp 100280- (2022)
Document type:	article
ISSN:	2666-8270
DOI:	10.1016/j.mlwa.2022.100280
Description:	This paper addresses the issue of interpretability and auditability of reinforcement-learning agents employed in the recovery of unsecured consumer debt. To this end, we develop a deterministic policy-gradient method that allows for a natural integration of domain expertise into the learning procedure so as to encourage learning of consistent, and thus interpretable, policies. Domain knowledge can often be expressed in terms of policy monotonicity and/or convexity with respect to relevant state inputs. We augment the standard actor–critic policy approximator using a monotonically regularized loss function which integrates domain expertise into the learning. Our formulation overcomes the challenge of learning interpretable policies by constraining the search to policies satisfying structural-consistency properties. The resulting state-feedback control laws can be readily understood and implemented by human decision makers. This new domain-knowledge enhanced learning approach is applied to the problem of optimal debt recovery which features a controlled Hawkes process and an asynchronous action–feedback relationship.
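The abstract does not give the form of the monotonically regularized loss, so the following is only a minimal sketch of one common way such a regularizer can be written, not the authors' formulation. It assumes a scalar state, a requirement that the policy be non-decreasing in that state, and a squared-hinge penalty on finite-difference slopes; the names `monotonicity_penalty`, `regularized_actor_loss`, `step`, and `weight` are hypothetical.

```python
from typing import Callable, Sequence


def monotonicity_penalty(
    policy: Callable[[float], float],
    states: Sequence[float],
    step: float = 1e-2,
) -> float:
    """Mean squared hinge on violations of 'policy is non-decreasing in state'.

    Hypothetical helper: uses a forward finite difference at each sampled state.
    """
    total = 0.0
    for s in states:
        slope = (policy(s + step) - policy(s)) / step
        violation = max(0.0, -slope)  # positive only where the policy decreases
        total += violation ** 2
    return total / len(states)


def regularized_actor_loss(
    base_loss: float,
    policy: Callable[[float], float],
    states: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Base actor loss plus a weighted monotonicity penalty (assumed form)."""
    return base_loss + weight * monotonicity_penalty(policy, states)


def increasing(s: float) -> float:
    """Toy policy that satisfies the monotonicity requirement."""
    return 2.0 * s


def decreasing(s: float) -> float:
    """Toy policy that violates the monotonicity requirement everywhere."""
    return -3.0 * s


# Sampled states at which the constraint is checked (illustrative values).
states = [0.0, 0.5, 1.0, 1.5, 2.0]
```

Under these assumptions the penalty vanishes for a policy that already respects the ordering and grows with the size of each violation, so minimizing the combined loss steers the search toward structurally consistent policies. A convexity requirement could be penalized analogously using second differences.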
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/95cb6c9ac15d48dbb888625463147c8a