Wednesday July 23, 2025 4:15pm - 5:30pm PDT
Session: On Hierarchical Optimization, Games, and Federated Learning
Chair: Farzad Yousefian
Cluster: Multi-agent Optimization and Games

Talk 1: Modelling the non-stationarity of capacity in continual learning
Speaker: Krishnan Raghavan
Abstract: Continual learning is the problem of learning on a sequence of tasks, and the core issue in this domain is balancing catastrophic forgetting of prior knowledge against generalization to new tasks, known as the stability-plasticity dilemma. This work introduces continual learning's effective model capacity (CLEMC) to theoretically formalize how this balance depends on the neural network (NN), the tasks, and the optimization procedure. In this talk, we demonstrate that CLEMC, and thus the balance point, is non-stationary, and that the interplay between the tasks, the neural network, and the optimization procedure is an evolving dynamical game. We discuss the use of optimal control techniques to model this dynamical game and study the evolution of CLEMC. We hypothesize that, regardless of the NN architecture and optimization method, the network's ability to represent new tasks diminishes if the new tasks' data distributions differ significantly from previous ones, i.e., the Nash equilibrium becomes more and more difficult to find. We establish this hypothesis theoretically and then validate it empirically using various NNs, from small feed-forward and convolutional networks to transformer-based language models with millions of parameters (8M and 134M).
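
As a loose illustration of the stability-plasticity tension discussed above (a minimal sketch, not the CLEMC formalism itself), the following Python snippet trains one model sequentially on regression tasks whose input distributions drift and tracks the loss on the current task against the mean loss on earlier tasks; all dimensions, shifts, and step sizes are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
dim, n_per_task, n_tasks = 20, 200, 5
lr, steps = 5e-3, 400

def make_task(shift):
    # Each task shifts the input distribution; larger shifts mimic the
    # distributional drift that the talk argues moves the balance point.
    X = rng.normal(loc=shift, scale=1.0, size=(n_per_task, dim))
    w_task = rng.normal(size=dim)
    y = X @ w_task + 0.1 * rng.normal(size=n_per_task)
    return X, y

tasks = [make_task(shift=0.5 * t) for t in range(n_tasks)]
w = np.zeros(dim)

for t, (X, y) in enumerate(tasks):
    for _ in range(steps):  # plain gradient descent on the current task only
        w -= lr * (2.0 / n_per_task) * X.T @ (X @ w - y)
    # Loss on the current task measures plasticity; loss on earlier tasks measures stability.
    losses = [np.mean((Xi @ w - yi) ** 2) for Xi, yi in tasks[: t + 1]]
    old = np.mean(losses[:-1]) if t > 0 else float("nan")
    print(f"after task {t}: current-task loss {losses[-1]:.3f}, mean loss on earlier tasks {old:.3f}")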

Talk 2: Federated Simple Bilevel Optimization: A Universal Regularized Scheme with Guarantees
Speaker: Yuyang Qiu
Abstract: We study a class of bilevel federated learning (FL) problems, where clients cooperatively seek to find, among the multiple optimal solutions of a primary distributed learning problem, a solution that minimizes a secondary distributed global loss function. This problem has attracted increasing attention in machine learning, in particular in over-parameterized learning and hyperparameter optimization. Despite some recent progress, communication-efficient FL methods equipped with complexity guarantees for resolving this problem are largely absent. Motivated by this lacuna, we propose a universal regularized scheme and derive promising error bounds in terms of both the lower-level and upper-level loss functions. Leveraging this unifying theory, we then enable existing FL methods, including FedAvg and SCAFFOLD, to solve the corresponding bilevel FL problem, and we derive novel communication complexity guarantees for each method. Intriguingly, the universal scheme can be employed to provably enable many other state-of-the-art optimization methods to address the bilevel problem. We validate the theoretical findings on the EMNIST and CIFAR-10 datasets.
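
As a rough sketch of the regularization idea described above (the paper's exact scheme, schedule, and guarantees are not reproduced here), the snippet below folds a secondary minimum-norm objective into each client's lower-level loss with a small weight lam and hands the resulting single-level problem to plain FedAvg; the client data, weight, and step sizes are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
dim, n_clients, n_local = 20, 4, 3
lam = 0.01                                         # weight on the secondary loss (assumed small)

# Each client holds a consistent, under-determined least-squares problem, so the
# primary (lower-level) problem has infinitely many solutions; the secondary loss
# g(x) = ||x||^2 / 2 is used to select the minimum-norm solution among them.
w_true = rng.normal(size=dim)
A = [rng.normal(size=(n_local, dim)) for _ in range(n_clients)]
b = [Ai @ w_true for Ai in A]

def local_grad(i, x):
    lower = A[i].T @ (A[i] @ x - b[i]) / n_local   # gradient of client i's primary loss
    return lower + lam * x                         # plus lam * gradient of g(x)

x = np.zeros(dim)
for _ in range(1000):                   # FedAvg communication rounds
    local_models = []
    for i in range(n_clients):
        xi = x.copy()
        for _ in range(10):             # local gradient steps on the regularized loss
            xi -= 0.05 * local_grad(i, xi)
        local_models.append(xi)
    x = np.mean(local_models, axis=0)   # server-side averaging

x_min_norm = np.linalg.pinv(np.vstack(A)) @ np.concatenate(b)
# The gap shrinks as lam -> 0 (up to FedAvg's client-drift bias).
print("distance to the minimum-norm lower-level solution:", np.linalg.norm(x - x_min_norm))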

Talk 3: Improved guarantees for optimal Nash equilibrium seeking and bilevel variational inequalities
Speaker: Sepideh Samadi
Abstract: We consider a class of hierarchical variational inequality (VI) problems that subsumes VI-constrained optimization and several other important problem classes including the optimal solution selection problem and the optimal Nash equilibrium (NE) seeking problem. Our main contributions are threefold. (i) We consider bilevel VIs with monotone and Lipschitz continuous mappings and devise a single-timescale iteratively regularized extragradient method, named IR-EG(m,m). We improve the existing iteration complexity results for addressing both bilevel VI and VI-constrained convex optimization problems. (ii) Under the strong monotonicity of the outer level mapping, we develop a method named IR-EG(s,m) and derive faster guarantees than those in (i). We also study the iteration complexity of this method under a constant regularization parameter. These results appear to be new for both bilevel VIs and VI-constrained optimization. (iii) To our knowledge, complexity guarantees for computing the optimal NE in nonconvex settings do not exist. Motivated by this lacuna, we consider VI-constrained nonconvex optimization problems and devise an inexactly-projected gradient method, named IPR-EG, where the projection onto the unknown set of equilibria is performed using IR-EG(s,m) with a prescribed termination criterion and an adaptive regularization parameter. We obtain new complexity guarantees in terms of a residual map and an infeasibility metric for computing a stationary point. We validate the theoretical findings using preliminary numerical experiments for computing the best and the worst Nash equilibria.
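
As a rough sketch of the iteratively regularized extragradient idea behind IR-EG (the specific IR-EG(m,m)/IR-EG(s,m) parameter choices and guarantees are those of the talk and are not reproduced here), the snippet below applies extragradient steps to the regularized mapping F + eta_k*G with a decaying eta_k on a toy bilevel VI whose inner solution set is an entire line; the mappings, step size, and schedule are illustrative assumptions.

import numpy as np

def F(x):                            # monotone inner mapping; its solution set is {x : x[0] = 1}
    return np.array([x[0] - 1.0, 0.0])

def G(x):                            # strongly monotone outer mapping with zero at (0, 3)
    return x - np.array([0.0, 3.0])

def project(x, lo=-10.0, hi=10.0):   # Euclidean projection onto a box constraint set
    return np.clip(x, lo, hi)

x = np.array([5.0, -5.0])
gamma = 0.2                          # extragradient step size (assumed)
for k in range(5000):
    eta = 1.0 / np.sqrt(k + 1.0)     # decaying regularization parameter (assumed schedule)
    def T(z):                        # regularized mapping F + eta * G
        return F(z) + eta * G(z)
    y = project(x - gamma * T(x))    # extrapolation step
    x = project(x - gamma * T(y))    # update step

# The inner VI has infinitely many solutions (x[0] = 1); driving eta -> 0 selects the
# one that solves the outer VI, i.e. the point of that set closest to (0, 3).
print("approximate optimal equilibrium:", np.round(x, 2))   # expected near [1, 3]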

Speakers

Farzad Yousefian


Krishnan Raghavan


Yuyang Qiu

PhD from Rutgers U. Incoming postdoc at UCSB.
Hi, I'm Yuyang, currently a 5th-year Ph.D. candidate in the ISE department at Rutgers University. My advisor is Prof. Farzad Yousefian. My research spans federated learning, hierarchical optimization, and distributed optimization over networks. I am currently interested in...
Taper Hall (THH) 116, 3501 Trousdale Pkwy, Los Angeles, CA 90089

