Session: Federated optimization and learning algorithms
Chair: Laurent Condat
Cluster: Optimization for Data Science
Talk 1: Stabilized Proximal-Point Methods for Federated Optimization
Speaker: Sebastian Stich
Abstract: Federated learning has emerged as an important paradigm in modern large-scale machine learning. Unlike traditional centralized learning, where models are trained using large datasets stored on a central server, federated learning keeps the training data distributed across many clients, such as phones, network sensors, hospitals, or other local information sources. In this setting, communication-efficient optimization algorithms are crucial. In this talk, we introduce a generic framework based on a distributed proximal point algorithm. This framework consolidates many of our insights and allows for the adaptation of arbitrary centralized optimization algorithms to the convex federated setting (even with acceleration). Our theoretical analysis shows that the derived methods enjoy faster convergence if the similarity among clients is high.
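For orientation (a generic sketch, not necessarily the speaker's exact formulation), the classical proximal-point iteration applied to the federated objective $f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$, where $f_i$ denotes the local function of client $i$ and $\gamma > 0$ is a step size (notation chosen here for illustration), reads
\[
x^{k+1} = \operatorname{prox}_{\gamma f}(x^k)
        = \arg\min_{x} \; \frac{1}{n}\sum_{i=1}^{n} f_i(x) \;+\; \frac{1}{2\gamma}\,\|x - x^k\|^2 .
\]
In a distributed variant, each client approximately solves its own regularized subproblem $\arg\min_{x}\, f_i(x) + \frac{1}{2\gamma}\|x - x^k\|^2$ with a centralized solver of choice, and the server aggregates the resulting points; the framework presented in the talk builds on this kind of template.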
Talk 2: Taming Heterogeneity in Federated Linear Stochastic Approximation
Speaker: Paul Mangold
Abstract: In federated learning, multiple agents collaboratively train a machine learning model without exchanging local data. To achieve this, each agent locally updates a global model, and the updated models are periodically aggregated. In this talk, I will focus on federated linear stochastic approximation (FedLSA), with a strong emphasis on agent heterogeneity. I will derive upper bounds on the sample and communication complexity of FedLSA, and present a method to reduce the communication cost using control variates. Particular attention will be paid to the "linear speed-up" phenomenon, showing that the sample complexity scales with the inverse of the number of agents in both methods.
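As a schematic illustration (with notation chosen here, not necessarily the talk's), federated linear stochastic approximation lets each agent $c$ run local updates of the form
\[
\theta_c^{t+1} = \theta_c^{t} + \eta\,\bigl(\mathbf{b}_c(\xi_c^{t}) - \mathbf{A}_c(\xi_c^{t})\,\theta_c^{t}\bigr),
\]
where $(\mathbf{A}_c, \mathbf{b}_c)$ are the agent's random estimates of its local linear system and $\eta$ is a step size; every $H$ local steps, the server replaces all local iterates with the average $\bar{\theta} = \frac{1}{N}\sum_{c=1}^{N}\theta_c$. Control variates, as mentioned in the abstract, correct each agent's update for the drift induced by heterogeneous $(\mathbf{A}_c, \mathbf{b}_c)$, which is what allows the communication cost to be reduced.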
Talk 3: Convergence results for some federated learning algorithms
Speaker: Ming Yan
Abstract: In Federated Learning (FL), multiple nodes collaborate to solve a shared problem while keeping their private data decentralized, ensuring privacy by never transferring raw data. This process is typically framed as minimizing the average of the private functions held by the individual nodes. The FL algorithm FedDyn is particularly effective at handling heterogeneous and non-IID data. In this talk, I will present recent advances on FedDyn, in which we relax the strong convexity requirement from the individual functions to the averaged function. I will also discuss the addition of nonsmooth convex functions whose proximal operators can be computed efficiently.
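For context, a minimal sketch of the setting (symbols chosen here for illustration): the $N$ nodes jointly minimize the average objective
\[
\min_{\theta} \; \frac{1}{N}\sum_{i=1}^{N} f_i(\theta),
\]
and one common way to write the FedDyn local step (the talk may use a different formulation) is that each participating node $i$ solves, at round $t$, the dynamically regularized problem
\[
\theta_i^{t} \in \arg\min_{\theta}\; f_i(\theta) \;-\; \langle \nabla f_i(\theta_i^{t-1}),\, \theta\rangle \;+\; \frac{\alpha}{2}\,\|\theta - \theta^{t-1}\|^2 ,
\]
where $\theta^{t-1}$ is the current server model and $\alpha > 0$ a regularization parameter. The results in the talk relax the assumptions on the $f_i$ under which such iterations converge and allow an additional nonsmooth convex term handled through its proximal operator.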