Session: Recent Advances in Theory and Algorithms for Multiagent Systems
Chair: Andrew Liu
Cluster: Multi-agent Optimization and Games
Talk 1: Approximate Global Convergence of Independent Learning in Multi-Agent Systems
Speaker: Zaiwei Chen
Abstract: Independent learning (IL), despite being a popular approach in practice to achieve scalability in large-scale multi-agent systems, usually lacks global convergence guarantees. In this paper, we study two representative algorithms, independent Q-learning and independent natural actor-critic, within the value-based and policy-based frameworks, respectively, and provide the first finite-sample analysis for approximate global convergence. Our results indicate that IL can achieve global convergence up to a fixed error, which arises from the dependence among agents and characterizes the fundamental limit of IL in attaining global convergence. To establish the result, we develop a novel approach for analyzing IL by constructing a separable Markov decision process (MDP) for convergence analysis and then bounding the gap due to the model difference between the separable MDP and the original one. Moreover, we conduct numerical experiments using a synthetic MDP and an electric vehicle charging example to demonstrate our results and the practical applicability of IL.
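To make the independent-learning idea concrete, here is a minimal, hypothetical sketch (not the paper's algorithm, environment, or analysis): two agents each run tabular Q-learning on their own Q-table in a repeated 2x2 cooperative game, treating the other agent as part of the environment. All payoffs and hyperparameters below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
payoff = np.array([[1.0, 0.0],   # shared team reward for joint action (a1, a2)
                   [0.0, 1.0]])

n_actions, alpha, eps = 2, 0.1, 0.1   # stateless game, so no bootstrapping term
Q = [np.zeros(n_actions), np.zeros(n_actions)]  # one independent Q-table per agent

for _ in range(5000):
    # epsilon-greedy action for each agent, using only its own Q-table
    acts = [a if rng.random() > eps else int(rng.integers(n_actions))
            for a in (int(np.argmax(q)) for q in Q)]
    r = payoff[acts[0], acts[1]]
    for i in range(2):               # independent TD update, ignoring the other agent
        Q[i][acts[i]] += alpha * (r - Q[i][acts[i]])

print([int(np.argmax(q)) for q in Q])  # greedy action of each agent after learning
```

Because each agent's update ignores the other's policy, the environment is nonstationary from either agent's viewpoint; this dependence among agents is the source of the fixed error term the abstract refers to.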
Talk 2: Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies
Speaker: Alex Deweese
Abstract: Many multi-agent systems in practice are decentralized and have dynamically varying dependencies, yet such systems have received little theoretical analysis in the literature. In this paper, we propose and theoretically analyze a decentralized model with dynamically varying dependencies called the Locally Interdependent Multi-Agent MDP. This model can represent problems in many disparate domains, such as cooperative navigation, obstacle avoidance, and formation control. Despite the intractability that general partially observable multi-agent systems suffer from, we propose three closed-form policies that are theoretically near-optimal in this setting and scalable to compute and store. Consequently, we reveal a fundamental property of Locally Interdependent Multi-Agent MDPs: the partially observable decentralized solution is exponentially close to the fully observable solution with respect to the visibility radius. We then discuss extensions of our closed-form policies that further improve tractability. We also provide simulations to investigate some long-horizon behaviors of our closed-form policies.
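The phrase "dynamically varying dependencies" can be illustrated with a small, hypothetical sketch (not the paper's model or policies): agents become interdependent when they fall within a visibility radius R of one another, and the dependency groups, computed here as connected components via union-find, change as agents move.

```python
def dependency_groups(positions, R):
    """Group agent indices whose pairwise Chebyshev distance is <= R, transitively.

    positions: list of (x, y) agent coordinates; R: visibility radius (illustrative).
    """
    n = len(positions)
    parent = list(range(n))

    def find(i):                      # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = positions[i], positions[j]
            if max(abs(xi - xj), abs(yi - yj)) <= R:
                parent[find(i)] = find(j)   # merge the two dependency groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Agents 0 and 1 are within radius 2 of each other; agent 2 is isolated.
print(dependency_groups([(0, 0), (1, 1), (5, 5)], R=2))  # [[0, 1], [2]]
```

Under such a model, each group can be treated as a temporary sub-problem, which is one intuition for why decentralized policies can remain tractable while dependencies vary over time.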
Talk 3: Hybrid Mean-Field Control and Mean-Field Equilibrium: Theories, Algorithms and Applications
Speaker: Andrew Liu
Abstract: In this talk, we introduce a hybrid multiagent modeling framework that combines Mean Field Control (MFC) and Mean Field Equilibrium (MFE). A prime example of this framework is the operation of multiple virtual power plants (VPPs) or aggregators, each applying an MFC algorithm to manage the distributed energy resources (DERs) within its portfolio. These aggregators participate in the wholesale energy market by bidding on behalf of the DERs they represent, navigating a dynamic and uncertain market environment. Traditional game-theoretic approaches fall short in capturing the complexity of repeated and dynamic interactions under such uncertainties. Hence, we leverage the mean field game (MFG) approach to study these agent interactions and the resulting market dynamics. The MFC framework empowers each aggregator to determine optimal control policies despite uncertainties in solar output, demand fluctuations, and price volatility. Simultaneously, the MFE framework models strategic interactions between aggregators and other market participants, enabling a scalable approach for large systems. We establish the existence of a strong Nash equilibrium within this hybrid structure and propose a reinforcement learning-based algorithm to help aggregators learn and optimize their strategies over time. Crucially, this prescriptive approach facilitates control automation, enabling the integration of advanced AI and machine learning techniques at the grid edge to optimize resource management and achieve system-wide benefits. We validate this framework through simulations of the Oahu Island electricity grid, showing that the combination of energy storage and mean-field learning significantly reduces price volatility and yields stable market outcomes. This work demonstrates the power and flexibility of the hybrid MFC-MFE approach, offering a robust foundation for scalable, automated decision-making in energy markets and beyond.
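The mean-field equilibrium concept underlying this talk can be sketched with a deliberately simple, hypothetical example (not the talk's market model): a continuum of identical aggregators each choose a charging quantity q to maximize u(q, m) = v*q - q^2/2 - p(m)*q, where the price p(m) = p0 + k*m depends on the population's mean quantity m. The best response is q*(m) = max(0, v - p(m)), and the equilibrium mean field is the fixed point m* = BR(m*). All parameter values below are made up.

```python
v, p0, k = 10.0, 1.0, 0.5          # hypothetical valuation and price parameters

def best_response(m):
    """Each small agent's optimal quantity, taking the mean field m as given."""
    return max(0.0, v - (p0 + k * m))

m = 0.0                            # initial guess for the mean field
for _ in range(100):               # damped fixed-point iteration m <- 0.5*m + 0.5*BR(m)
    m = 0.5 * m + 0.5 * best_response(m)

# At equilibrium, m* = v - p0 - k*m*, i.e. m* = (v - p0) / (1 + k) = 6.0
print(round(m, 4))  # 6.0
```

The damping factor keeps the iteration contractive here; in richer dynamic settings, as in the talk, such fixed points are instead learned over time, for example with reinforcement-learning updates.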