Motivation
Online learning problems involving multiple self-interested players with coupled objectives arise in applications across economics, computer science, and control systems. In standard game-theoretic frameworks used to model such scenarios, each player’s objective is represented by a single scalar-valued utility function. However, in many real-world settings, players face multiple, potentially conflicting objectives. For example, in autonomous driving, relevant objectives for each agent include travel time, energy efficiency, and safety. Reducing such problems to a game with scalar utilities formed as a linear combination of multiple objectives may fail to capture these conflicts. In this project, we aim to develop a mathematical framework for multi-objective games by studying questions of equilibrium existence and the convergence of (multi-objective) online learning algorithms to such equilibria.
Background
Online learning and game theory are rich fields with a long history of study, and their intersection has led to a number of fundamental connections [CBL06]. Online learning is often concerned with designing algorithms that achieve “no regret” [Haz16]: in the long run, the decision-maker would not have preferred any other fixed action in hindsight. A classical result connecting regret minimization to game theory states that, in a normal-form game, if each player follows a no-regret algorithm, then the empirical distribution of play converges to a coarse correlated equilibrium (a weaker notion than Nash equilibrium) [HMC00]. Recent work has extended notions of regret and learning algorithms to the multi-objective setting [ABH11, JZZ+23]. It is an interesting question whether, as in the single-objective case, these notions of regret lead to equilibria in multi-objective games.
Outline
The project will begin with a study of the classical literature on online learning, regret minimization, and equilibrium learning in games. Then, we will consider more recent work on multi-objective learning and explore how these ideas can be extended to a game settings. A wide range of questions can be investigated, from characterizing equilibria to designing algorithms and analyzing their behavior when deployed by all players in a game. The precise direction can be adapted to the students’ interests.
Requirements
We are looking for motivated Master students with a strong background in mathematics or computer science. This project can be done either as a Master thesis or semester project. It offers the opportunity to delve into both classical and recent literature at the intersection of online learning and game theory, and to explore interesting open questions.
Contact
he project/thesis will be supervised by Prof. Maryam Kamgarpour and Philip Jordan. If you are interested, please send an email including (a) a short paragraph describing your background and fit for the project, and (b) your BS/MS transcripts to [email protected].
References
[ABH11] Jacob Abernethy, Peter L Bartlett, and Elad Hazan. Blackwell approachability and no-regret learning are equivalent. In Proceedings of the 24th Annual Conference on Learning Theory, pages 27–46. JMLR Workshop and Conference Proceedings, 2011.
[CBL06] Nicolo Cesa-Bianchi and Gabor Lugosi. Prediction, learning, and games. Cambridge university press, 2006.
[Haz16] Elad Hazan. Introduction to online convex optimization. Foundations and Trends in Optimization, 2(3-4):157–325, 2016.
[HMC00] Sergiu Hart and Andreu Mas-Colell. A simple adaptive procedure leading to correlated equilibrium. Econometrica, 68(5):1127–1150, 2000.
[JZZ+23] Jiyan Jiang, Wenpeng Zhang, Shiji Zhou, Lihong Gu, Xiaodong Zeng, and Wenwu Zhu. Multi-objective online learning. In The Eleventh International Conference on Learning Representations, 2023.
