Multi-Echelon Inventory Optimization Using Deep Reinforcement Learning

Author:	Hammler, Patric; Riesterer, Nicolas; Mu, Gang; Braun, Torsten
Contributors:	Canci, Jung Kyu; Mekler, Philipp; Mu, Gang
Year of publication:	2023
Subject:	510 Mathematics; 000 Computer science, knowledge & systems
Source:	Hammler, Patric; Riesterer, Nicolas; Mu, Gang; Braun, Torsten (2023). Multi-Echelon Inventory Optimization Using Deep Reinforcement Learning. In: Canci, Jung Kyu; Mekler, Philipp; Mu, Gang (eds.) Quantitative Models in Life Science Business. SpringerBriefs in economics (pp. 73-93). Springer 10.1007/978-3-031-11814-2_5 Quantitative Models in Life Science Business ISBN: 9783031118135
DOI:	10.48350/176625
Description:	In this chapter, we provide an overview of inventory management in the pharmaceutical industry and of how to model and optimize it. Inventory management is a highly relevant topic because it incurs substantial holding, shortage, and reordering costs. A stock-out in particular can cause damage that goes beyond the monetary loss from missed sales. Minimizing these costs is the task of an optimized reorder policy. A reorder policy is optimal when it minimizes the accumulated cost in every situation. However, finding an optimal policy is not trivial. First, the problem is highly stochastic, as variable demands and lead times must be considered. Second, the supply chain consists of several warehouses, including the factory, global distribution warehouses, and local affiliate warehouses, and the reorder policy of each warehouse affects the optimal reorder policy of related warehouses. In this context, we discuss the concept of multi-echelon inventory optimization and a methodology capable of capturing both the stochastic behavior of the environment and how it is affected by the reorder policy: Markov decision processes (MDPs). On this basis, we introduce Reinforcement Learning (RL), together with its benefits and weaknesses. RL is capable of finding (near-)optimal reorder policies for MDPs. Furthermore, some simulation-based results and current research directions are presented.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0a363bf6b33f9c23f0a7fb4eb0cc02fe