Competitive Multi-agent RL with Coupling Constraints

Motivation. In multi-agent reinforcement learning, several agents interact in a shared environment that evolves over time based on the decisions of all agents. Each agent tries to maximize its own reward, which often depends not only on its own actions but also on the actions of others. In many real-world applications, agents must also satisfy additional constraints. For example, in autonomous driving, vehicles must avoid collisions and respect physical limits while navigating the road — goals that naturally involve coordination with other vehicles. In such settings, each agent must not only pursue its own objective but also take into account shared safety or resource constraints. Constrained Markov games [AS00] provide a mathematical framework for modeling these types of problems, where constraints are shared across agents.

Existing algorithms for learning in constrained Markov games focus on a restricted class of cooperative games [JBH24]. In this project, we aim to go beyond this setting by empirically exploring learning in competitive zero-sum Markov games with coupling constraints.
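To fix ideas, the following is one standard way to write such a problem (a sketch only; the symbols r, c_j, alpha_j, gamma below are our illustrative notation and may differ from [AS00] and [JBH24]): a two-player zero-sum Markov game in which player 1 maximizes and player 2 minimizes the same discounted reward, while the shared constraint costs couple both players' policies.

```latex
% Two-player zero-sum Markov game with coupling constraints (illustrative notation):
% r is the zero-sum stage reward, c_1,...,c_m the shared constraint costs,
% alpha_j the constraint thresholds, and gamma the discount factor.
\[
  \max_{\pi^1}\,\min_{\pi^2}\; V_r(\pi^1,\pi^2)
  \quad\text{s.t.}\quad
  V_{c_j}(\pi^1,\pi^2) \le \alpha_j,
  \qquad j = 1,\dots,m,
\]
\[
  \text{where}\quad
  V_f(\pi^1,\pi^2)
  \;=\;
  \mathbb{E}_{\pi^1,\pi^2}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, f\bigl(s_t, a_t^1, a_t^2\bigr)\right],
  \qquad f \in \{r, c_1, \dots, c_m\}.
\]
```

The key feature is that each constraint value V_{c_j} depends on both policies, so neither agent can ensure feasibility unilaterally; this coupling is what distinguishes the setting from single-agent constrained RL.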

Outline. As a first step, you will familiarize yourself with the literature on classical RL and its multi-agent generalization in terms of Markov games. Next, we will explore practical settings in which RL agents interact in a zero-sum game while being subject to shared coupling constraints. One such example could be a simple gridworld race in which two “drivers” compete to reach a target while adhering to safety/collision constraints; a minimal sketch of such an environment is given below. In this simple setting, we will explore different algorithmic approaches to learning competitive policies and empirically evaluate their convergence.
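As a concrete starting point, here is a minimal sketch of such a gridworld race as a self-contained Python environment. Everything here (the class name GridRace, the grid size, and the reward and cost conventions) is an illustrative assumption, not a fixed specification for the project:

```python
import numpy as np

# Actions: up, down, left, right, stay.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]

class GridRace:
    """Two 'drivers' race to a target cell; sharing a cell counts as a collision."""

    def __init__(self, size=5, target=(4, 4), max_steps=30, seed=0):
        self.size = size
        self.target = target
        self.max_steps = max_steps
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.pos = [(0, 0), (0, self.size - 1)]  # opposite starting corners
        self.t = 0
        return tuple(self.pos)

    def _move(self, p, a):
        dr, dc = ACTIONS[a]
        # Clip to the grid so the walls act as physical limits.
        return (min(max(p[0] + dr, 0), self.size - 1),
                min(max(p[1] + dc, 0), self.size - 1))

    def step(self, a1, a2):
        self.pos = [self._move(self.pos[0], a1), self._move(self.pos[1], a2)]
        self.t += 1
        reached = [p == self.target for p in self.pos]
        # Zero-sum reward from driver 1's perspective: +1 if it wins the race,
        # -1 if driver 2 wins, 0 otherwise (driver 2's reward is -r1).
        r1 = float(reached[0]) - float(reached[1])
        # Shared coupling-constraint cost: incurred by *both* agents on collision.
        cost = 1.0 if self.pos[0] == self.pos[1] else 0.0
        done = any(reached) or self.t >= self.max_steps
        return tuple(self.pos), r1, cost, done

# Example: a rollout under uniformly random policies.
env = GridRace()
state, done, total_cost = env.reset(), False, 0.0
while not done:
    a1 = int(env.rng.integers(len(ACTIONS)))
    a2 = int(env.rng.integers(len(ACTIONS)))
    state, r1, cost, done = env.step(a1, a2)
    total_cost += cost
print("driver-1 terminal reward:", r1, "| cumulative collision cost:", total_cost)
```

A constrained learning algorithm would then aim to keep the expected cumulative collision cost below a threshold while the two drivers compete on the reward; Lagrangian (primal-dual) methods are one natural family of approaches to try in such a setting.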

Requirements. We are looking for motivated students with a strong mathematical or computer science background. This project offers the opportunity to delve into recent literature at the intersection of optimization, RL, and game theory, and to apply novel algorithmic approaches to meaningful practical multi-agent learning scenarios. If you are interested, please send an email containing: a) one paragraph on your background and fit for the project, and b) your BS/MS transcripts to [email protected].

This project will be supervised by Prof. Maryam Kamgarpour ([email protected]), Anna Maddux ([email protected]), and Philip Jordan ([email protected]).

References

[AS00] Eitan Altman and Adam Shwartz. Constrained Markov Games: Nash Equilibria. In Jerzy A. Filar, Vladimir Gaitsgory, and Koichi Mizukami, editors, Advances in Dynamic Games and Applications, Annals of the International Society of Dynamic Games, pages 213–221, Boston, MA, 2000. Birkhäuser.

[JBH24] Philip Jordan, Anas Barakat, and Niao He. Independent Learning in Constrained Markov Potential Games. In Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 4024–4032. PMLR, 2024.