Multi-Agent Deep Reinforcement Learning for Multi-Echelon Inventory Management

40 Pages. Posted: 5 Dec 2022. Last revised: 11 Feb 2024

Xiaotian Liu

Georgia Institute of Technology; Peking University

Ming Hu

University of Toronto - Rotman School of Management

Yijie Peng

Peking University

Yaodong Yang

Peking University

Date Written: October 30, 2022

Abstract

This work investigates the application of Multi-Agent Deep Reinforcement Learning (MADRL) to decentralized inventory management problems with multiple echelons. Specifically, we apply Heterogeneous Agent Proximal Policy Optimization (HAPPO) to decentralized multi-echelon inventory management problems in both a serial supply chain and a supply chain network. We formulate these problems as Partially Observable Markov Games (POMGs) and investigate the effective design of reward functions for multiple actors. We find that, when the goal is to optimize the overall performance of the system, the best objective for each actor lies between being fully self-interested and being fully system-focused. Our numerical results show that policies constructed by HAPPO achieve lower overall costs than policies constructed by single-agent deep reinforcement learning and than other heuristic policies. In addition, the upfront-only information-sharing mechanism used in MADRL produces a weaker bullwhip effect than policies constructed by single-agent deep reinforcement learning, under which information is not shared among actors. Our results provide a new perspective on the benefit of information sharing in supply chains: when MADRL is applied, sharing helps alleviate the bullwhip effect and improves overall performance. Our results also verify MADRL’s potential for solving multi-echelon inventory management problems with complex supply chain structures and in non-stationary market environments.
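
The reward-design finding can be read as an interpolation between an actor's own cost and a system-wide cost. The sketch below is illustrative only and is not the paper's reward function: the weight `alpha`, the use of the system-average cost, and the function names are all assumptions introduced here. The second function computes one common bullwhip measure, the ratio of order variance to demand variance; the abstract does not state which measure the authors use.

```python
from statistics import pvariance
from typing import List, Sequence


def blended_rewards(local_costs: Sequence[float], alpha: float) -> List[float]:
    """Return one reward per actor, mixing own cost with the system-average cost.

    alpha is a hypothetical weight (an assumption, not from the paper):
    alpha = 0 is fully self-interested, alpha = 1 is fully system-focused.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(local_costs) == 0:
        raise ValueError("local_costs must not be empty")
    system_avg = sum(local_costs) / len(local_costs)
    # Rewards are negative costs, so lower cost means higher reward.
    return [-((1.0 - alpha) * cost + alpha * system_avg) for cost in local_costs]


def bullwhip_ratio(orders: Sequence[float], demand: Sequence[float]) -> float:
    """Ratio of order variance to demand variance (assumed measure).

    A value above 1 means orders fluctuate more than the demand driving them.
    """
    demand_var = pvariance(demand)
    if demand_var == 0:
        raise ValueError("demand variance is zero; ratio is undefined")
    return pvariance(orders) / demand_var


if __name__ == "__main__":
    costs = [10.0, 20.0, 30.0]  # made-up per-echelon costs for one period
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, blended_rewards(costs, alpha))
    print(bullwhip_ratio([8.0, 12.0, 8.0, 12.0], [9.0, 11.0, 9.0, 11.0]))
```

Under this reading, an intermediate `alpha` corresponds to the abstract's claim that the best per-actor objective lies strictly between the two extremes; the value that works best would have to be found empirically.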

Keywords: Multi-Echelon Inventory Management, Multi-Agent Deep Reinforcement Learning, Bullwhip Effect

Suggested Citation

Liu, Xiaotian and Hu, Ming and Peng, Yijie and Yang, Yaodong, Multi-Agent Deep Reinforcement Learning for Multi-Echelon Inventory Management (October 30, 2022). Rotman School of Management Working Paper No. 4262186, Available at SSRN: https://ssrn.com/abstract=4262186 or http://dx.doi.org/10.2139/ssrn.4262186

Xiaotian Liu

Georgia Institute of Technology

Atlanta, GA 30332
United States

Peking University

Ming Hu

University of Toronto - Rotman School of Management

105 St. George St
Toronto, ON M5S 3E6
Canada
416-946-5207 (Phone)

HOME PAGE: http://ming.hu

Yijie Peng (Contact Author)

Peking University

No. 5 Yiheyuan Rd
Haidian District
Beijing, Beijing 100871
China

Yaodong Yang

Peking University

No. 38 Xueyuan Road
Haidian District
Beijing, 100871
China

Paper statistics

Downloads: 486
Abstract Views: 1,824
Rank: 108,598