EPFL

Prof. Kaiqing Zhang, University of Maryland

Title: Principled Learning-to-Communicate with Quasi-Classical Information Structures

Abstract: Learning-to-communicate (LTC) in partially observable environments has emerged and received increasing attention in deep multi-agent reinforcement learning, where the control and communication strategies are jointly learned. On the other hand, the impact of communication has been extensively studied in control theory. In this paper, we seek to formalize and better understand LTC by bridging these two lines of work, through the lens of information structures (ISs). To this end, we formalize LTC in decentralized partially observable Markov decision processes (Dec-POMDPs) under the common-information-based (CIB) framework, and classify LTCs based on the ISs before additional information sharing. We first show that non-classical LTCs are computationally intractable in general, and thus focus on quasi-classical (QC) LTCs. We then propose a series of conditions for QC LTCs, the violation of which can cause computational hardness in general. Further, we develop provable planning and learning algorithms for QC LTCs, and show that examples of QC LTCs satisfying the above conditions can be solved without computationally intractable oracles. Along the way, we also establish some relationships between (strictly) QC IS and the condition of strategy-independent CIB beliefs (SI-CIB), as well as solving general Dec-POMDPs without computationally intractable oracles beyond those with the SI-CIB condition, which may be of independent interest.

Brief bio: Kaiqing Zhang is an Assistant Professor in the Department of Electrical and Computer Engineering (ECE) and the Institute for Systems Research (ISR), with an affiliate appointment in the Department of Computer Science (CS), at the University of Maryland, College Park. He is also a member of the University of Maryland Institute for Advanced Computer Studies (UMIACS), the Center for Machine Learning, the Maryland Robotics Center (MRC), and the Applied Mathematics & Statistics, and Scientific Computation (AMSC) program. During a deferral period before joining Maryland, he was a postdoctoral scholar at LIDS and CSAIL at MIT, and a Research Fellow at the Simons Institute for the Theory of Computing at Berkeley. He received his Ph.D. from the Department of ECE and CSL at the University of Illinois at Urbana-Champaign (UIUC). He also received M.S. degrees in both ECE and Applied Math from UIUC, and a B.E. from Tsinghua University. His research interests lie broadly in Systems and Control Theory, Reinforcement/Machine Learning, Game Theory, Robotics, Computation, and their intersections.