A Theoretical Analysis of Cooperative Behavior in Multi-Agent Q-Learning

23 Pages, Posted: 7 Feb 2006

Ludo Waltman

Erasmus University Rotterdam - Faculty of Economics and Business

Uzay Kaymak

Erasmus University Rotterdam (EUR) - Faculty of Economics - Department of Computer Science; Erasmus Research Institute of Management (ERIM)

Date Written: February 1, 2006

Abstract

A number of experimental studies have investigated whether cooperative behavior may emerge in multi-agent Q-learning. In some studies cooperative behavior did emerge, in others it did not. This report provides a theoretical analysis of this issue. The analysis focuses on multi-agent Q-learning in iterated prisoner’s dilemmas. It is shown that under certain assumptions cooperative behavior may emerge when multi-agent Q-learning is applied in an iterated prisoner’s dilemma. An important consequence of the analysis is that multi-agent Q-learning may result in non-Nash behavior. It is found experimentally that the theoretical results derived in this report are quite robust to violations of the underlying assumptions.
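As an illustration of the setting described in the abstract, the following is a minimal sketch of two independent Q-learners playing an iterated prisoner's dilemma. It is not the report's own setup: the payoff values (5, 3, 1, 0), the stateless Q-table, the epsilon-greedy exploration rule, and all parameter defaults and function names are assumptions chosen for illustration only.

```python
import random

# Actions: cooperate (C) or defect (D).
C, D = 0, 1

# Assumed payoffs for (own action, opponent action). These are common
# textbook prisoner's dilemma values, not numbers taken from the report.
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}


def choose(q, epsilon, rng):
    """Epsilon-greedy choice over a two-entry Q-table (assumed exploration rule)."""
    if rng.random() < epsilon:
        return rng.choice((C, D))
    return C if q[C] >= q[D] else D


def q_update(q, action, reward, alpha, gamma):
    """Stateless Q-learning update: Q(a) += alpha * (r + gamma * max Q - Q(a))."""
    target = reward + gamma * max(q)
    q[action] += alpha * (target - q[action])


def simulate(rounds=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run two independent learners and return their Q-tables and the
    fraction of rounds in which both cooperated."""
    rng = random.Random(seed)
    q1, q2 = [0.0, 0.0], [0.0, 0.0]
    mutual_cooperation = 0
    for _ in range(rounds):
        a1 = choose(q1, epsilon, rng)
        a2 = choose(q2, epsilon, rng)
        # Each agent is rewarded according to its own action and the opponent's.
        q_update(q1, a1, PAYOFF[(a1, a2)], alpha, gamma)
        q_update(q2, a2, PAYOFF[(a2, a1)], alpha, gamma)
        mutual_cooperation += (a1 == C and a2 == C)
    return q1, q2, mutual_cooperation / rounds


if __name__ == "__main__":
    q1, q2, rate = simulate()
    print("Agent 1 Q-values:", q1)
    print("Agent 2 Q-values:", q2)
    print("Mutual cooperation rate:", rate)
```

Varying `alpha`, `gamma`, `epsilon`, and `seed` in such a sketch is one way to explore informally whether mutual cooperation persists; the outcome under these assumed settings says nothing about the report's theoretical results.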

Keywords: Cooperation, Multi-Agent Q-Learning, Multi-Agent Reinforcement Learning, Nash Equilibrium, Prisoner’s Dilemma

Suggested Citation

Waltman, Ludo and Kaymak, Uzay, A Theoretical Analysis of Cooperative Behavior in Multi-Agent Q-Learning (February 1, 2006). ERIM Report Series Reference No. ERS-2006-006-LIS, Available at SSRN: https://ssrn.com/abstract=880523

Ludo Waltman (Contact Author)

Erasmus University Rotterdam - Faculty of Economics and Business

P.O. Box 1738
3000 DR Rotterdam, NL 3062 PA
Netherlands
+31 10 408 1182 (Phone)
+31 10 408 9640 (Fax)

Uzay Kaymak

Erasmus University Rotterdam (EUR) - Faculty of Economics - Department of Computer Science

P.O. Box 1738
3000 DR Rotterdam
Netherlands

Erasmus Research Institute of Management (ERIM)

P.O. Box 1738
3000 DR Rotterdam
Netherlands
