Session: Communication-efficient distributed optimization
Chair: Laurent Condat
Cluster: Nonlinear Optimization
Talk 1: Local Exact Diffusion for Decentralized Stochastic Optimization
Speaker: Sulaiman Alghunaim
Abstract: Distributed optimization methods featuring local updates have gained significant attention for their potential to reduce communication costs in distributed systems. In these algorithms, nodes perform multiple local updates based on their own data before exchanging estimates. Although many studies have explored distributed local methods over centralized network connections, fewer have focused on decentralized networks. In this study, we introduce and examine a locally updated decentralized method named Local Exact-Diffusion (LED). This method allows each node to perform either a fixed or a random number of local updates, with communication occurring randomly at each iteration. We establish the convergence of LED in both the convex and nonconvex cases in the stochastic online setting. Our convergence rates improve upon those of existing decentralized methods. When the network is specialized to a centralized setup, we recover the state-of-the-art bound for centralized methods. Additionally, we connect LED to several independently studied distributed methods, including Scaffnew, FedGate, and VRL-SGD. Our analysis extends to these methods, and notably, we demonstrate that Scaffnew achieves a previously unattained linear speedup property, in which the communication complexity is inversely proportional to the number of nodes, so Scaffnew can reduce communication overhead as the number of nodes grows. In the strongly convex setting, we further prove that Scaffnew achieves this linear speedup with network-independent step sizes. Lastly, we numerically explore the advantages of local updates in decentralized networks and validate the effectiveness of our proposed method.
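To illustrate the general pattern the abstract describes (local stochastic gradient steps interleaved with random, infrequent communication over a decentralized network), here is a minimal NumPy sketch. The names, the mixing matrix W, and the communication probability p are illustrative assumptions; this is plain decentralized local SGD, not the LED / Exact-Diffusion update itself, which additionally applies a bias-correction step.

```python
# Illustrative decentralized "local updates + occasional communication" loop
# (assumed names; a simplified stand-in, not the LED algorithm).
import numpy as np

def local_decentralized_sgd(grad, x0, W, T=1000, p=0.1, lr=0.01, rng=None):
    """grad(i, x): stochastic gradient of node i's loss at x.
    x0: (n_nodes, dim) initial models; W: (n_nodes, n_nodes) doubly stochastic mixing matrix.
    p: probability of a communication (gossip) round at each iteration."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    n = x.shape[0]
    for _ in range(T):
        for i in range(n):                 # local stochastic gradient step at every node
            x[i] = x[i] - lr * grad(i, x[i])
        if rng.random() < p:               # communicate only with probability p
            x = W @ x                      # gossip averaging with neighbors
    return x.mean(axis=0)                  # network-average model
```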
Talk 2: In-depth Analysis of Low-rank Matrix Factorisation in a Federated Setting
Speaker: Constantin Philippenko
Abstract: We analyze a distributed algorithm to compute a low-rank matrix factorization across several clients, each holding a local dataset. Using a power initialization, we rewrite the original smooth non-convex problem as a smooth strongly convex problem, which we solve with a parallel Nesterov gradient descent that potentially requires only a single communication step at initialization. We obtain a global-local matrix factorization: one part contains information on features shared by all clients (the item embeddings), while the other part captures the unique characteristics of each client (the user embeddings). We provide a linear rate of convergence of the excess loss that depends on the truncated condition number; this improves upon the convergence rates given in the literature, which depend on the condition number. We also provide an upper bound on the Frobenius-norm reconstruction error under the power initialization strategy. We complete our analysis with experiments on both synthetic and real data.
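As a concrete picture of the structure described above (a shared factor built from a single power-initialization communication round, followed by purely local fitting of client-specific factors), here is a minimal NumPy sketch. The names and shapes (S_list for the local data matrices, rank r) are assumptions for illustration, and the local subproblem is solved in closed form here rather than with the parallel Nesterov gradient descent analyzed in the talk.

```python
# Illustrative federated low-rank factorization with a power initialization
# (assumed notation; a sketch of the overall structure, not the analyzed method).
import numpy as np

def federated_low_rank(S_list, r, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    d = S_list[0].shape[1]
    G = rng.standard_normal((d, r))
    # One communication round: each client sends S_i^T S_i G, the server aggregates
    Y = sum(S.T @ (S @ G) for S in S_list)
    V, _ = np.linalg.qr(Y)                 # shared (global) factor, d x r
    # Local step on each client: fit U_i = argmin_U ||S_i - U V^T||_F^2.
    # Closed form here because V has orthonormal columns; the talk instead solves
    # this strongly convex subproblem with parallel Nesterov gradient descent.
    U_list = [S @ V for S in S_list]       # client-specific (local) factors
    return U_list, V
```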
Talk 3: LoCoDL: Communication-Efficient Distributed Optimization with Local Training and Compression
Speaker: Laurent Condat
Abstract: In distributed optimization, and even more so in federated learning, communication is the main bottleneck. We introduce LoCoDL, a communication-efficient algorithm that leverages two techniques: Local training, which reduces the communication frequency, and Compression, with a large class of unbiased compressors that includes sparsification and quantization strategies. LoCoDL provably benefits from both mechanisms and enjoys a doubly accelerated communication complexity, with respect to both the condition number of the functions and the model dimension, in the general heterogeneous regime with strongly convex functions.
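To make the two ingredients concrete, the following NumPy sketch combines local training with an unbiased rand-k sparsifier, one example of the compressor class mentioned in the abstract. The function names and parameters (rand_k, communication probability p, sparsity k) are illustrative assumptions; this simplified loop is a stand-in for the general idea, not the LoCoDL algorithm or its convergence guarantees.

```python
# Illustrative combination of local training and unbiased compression
# (assumed names and parameters; a simplified stand-in, not LoCoDL itself).
import numpy as np

def rand_k(v, k, rng):
    """Unbiased rand-k sparsifier: keep k random coordinates, rescale by d/k."""
    d = v.size
    out = np.zeros_like(v)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = v[idx] * (d / k)
    return out

def local_training_with_compression(grads, x0, T=500, p=0.1, lr=0.05, k=10, rng=None):
    """grads[i](x): stochastic gradient of client i's loss at x; x0: (dim,) initial model."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    clients = [x0.copy() for _ in range(len(grads))]
    for _ in range(T):
        for i, g in enumerate(grads):
            clients[i] = clients[i] - lr * g(clients[i])     # local training step
        if rng.random() < p:                                 # infrequent communication round
            deltas = [rand_k(c - x, k, rng) for c in clients]
            x = x + np.mean(deltas, axis=0)                  # server averages compressed updates
            clients = [x.copy() for _ in range(len(grads))]  # clients resynchronize
    return x
```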