Session: Privacy Preserving Collaborative Learning
Chair: Sai Praneeth Karimireddy
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)
Talk 1: Efficient Distributed Optimization under Heavy-Tailed Noise
Speaker: Tian Li
Abstract: In distributed learning, to mitigate communication overhead, local updates are often applied before global aggregation, resulting in a nested optimization approach with inner and outer steps. However, heavy-tailed stochastic gradient noise remains a significant challenge, particularly in attention-based models, hindering effective training. In this work, we propose TailOPT, an efficient framework designed to address heavy-tailed noise by leveraging adaptive optimization and novel clipping techniques. We establish convergence guarantees for the TailOPT framework under heavy-tailed noise with potentially unbounded gradient variance and local updates. Among its variants, we propose a memory- and communication-efficient instantiation (named Bi^2Clip) that performs coordinate-wise clipping from both above and below at both the inner and outer optimizers. Bi^2Clip achieves the benefits of adaptive optimization (e.g., Adam) without the cost of maintaining or transmitting additional gradient statistics. Empirically, TailOPT, including Bi^2Clip, demonstrates superior performance on several language tasks and models compared with state-of-the-art methods.
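To make the clipping idea concrete, here is a minimal sketch of coordinate-wise two-sided clipping applied at both the inner (client) and outer (server) steps, assuming a FedOpt-style pseudo-gradient aggregation. The function names, thresholds, and loop structure are illustrative assumptions, not the authors' exact algorithm.

```python
# Sketch only: coordinate-wise clipping from below and above (a hypothetical
# "biclip"), applied at both the inner and outer optimizers, under an assumed
# FedOpt-style setup. Thresholds lower/upper are illustrative hyperparameters.
import numpy as np

def biclip(g, lower, upper):
    """Clip each coordinate's magnitude into [lower, upper], keeping its sign."""
    return np.sign(g) * np.clip(np.abs(g), lower, upper)

def local_updates(x, grad_fn, steps, lr, lower, upper):
    """Inner loop: SGD steps with two-sided coordinate-wise clipping."""
    for _ in range(steps):
        x = x - lr * biclip(grad_fn(x), lower, upper)
    return x

def server_round(x_global, client_grad_fns, steps, lr_in, lr_out, lower, upper):
    """Outer loop: average client pseudo-gradients, clip them, then step."""
    deltas = []
    for grad_fn in client_grad_fns:
        x_local = local_updates(x_global.copy(), grad_fn, steps, lr_in, lower, upper)
        deltas.append(x_global - x_local)  # pseudo-gradient for this client
    avg_delta = np.mean(deltas, axis=0)
    return x_global - lr_out * biclip(avg_delta, lower, upper)
```

Note that, unlike Adam, no per-coordinate second-moment statistics are stored or transmitted here; the only extra state is the pair of scalar thresholds.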
Talk 2: Parameter-Efficient Federated Learning Algorithms for Large-Scale Models
Speaker: Kibaek Kim
Abstract: Federated learning (FL) is a collaborative paradigm for training large-scale models across distributed data sources without sharing raw data. However, applying FL to large models presents significant challenges in terms of communication efficiency and resource utilization. In this work, we introduce novel parameter-efficient algorithms tailored to FL with large models. Our approach optimizes model updates by reducing the number of parameters communicated across clients. We will present preliminary results for the new algorithms.
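The abstract does not specify the mechanism, so the sketch below illustrates one common way to cut communication in FL with large models: a LoRA-style scheme where each client trains and transmits only a small low-rank adapter while the large base weight stays frozen. The adapter construction and aggregation rule are assumptions for illustration, not the talk's proposed algorithm.

```python
# Illustrative sketch: clients communicate only low-rank adapter parameters
# (A, B) per layer, so the server aggregates rank*(d_in + d_out) floats per
# layer instead of d_in*d_out. All names here are hypothetical.
import numpy as np

def init_adapter(d_out, d_in, rank, scale=0.01):
    """Low-rank adapter: delta_W = B @ A, with far fewer parameters than W."""
    A = scale * np.random.randn(rank, d_in)
    B = np.zeros((d_out, rank))
    return A, B

def effective_weight(W_frozen, A, B):
    """Clients apply the trained adapter on top of the shared frozen base weight."""
    return W_frozen + B @ A

def aggregate_adapters(client_adapters):
    """Server averages only the adapter parameters received from clients."""
    A_avg = np.mean([A for A, _ in client_adapters], axis=0)
    B_avg = np.mean([B for _, B in client_adapters], axis=0)
    return A_avg, B_avg
```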
Talk 3: Data-Centric ML Needs Statistics and Game Theory
Speaker: Sai Praneeth Karimireddy
Abstract: Data is the most important factor determining the quality of an AI system. Data-centric ML is an emerging research direction that constructs metrics to quantify the usefulness of data (data valuation / data attribution). However, we show that existing methods do not properly account for the randomness inherent in ML training. We also show that they are unsuitable as a basis for compensation in data markets, since they may not be incentive-compatible.
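The randomness issue can be seen with a toy experiment: the leave-one-out "value" of a single data point changes from one training seed to the next, so a value computed from a single run can mis-rank data. The model, dataset, and value definition below are illustrative assumptions, not the methods analyzed in the talk.

```python
# Toy sketch: the leave-one-out value of one point varies across training seeds,
# motivating a statistical treatment of data valuation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def loo_value(idx, seed):
    """Leave-one-out value of training point idx under one training seed."""
    full = SGDClassifier(random_state=seed).fit(X_tr, y_tr).score(X_te, y_te)
    mask = np.arange(len(X_tr)) != idx
    drop = SGDClassifier(random_state=seed).fit(X_tr[mask], y_tr[mask]).score(X_te, y_te)
    return full - drop

values = [loo_value(idx=0, seed=s) for s in range(10)]
print(f"LOO value of point 0 across seeds: mean={np.mean(values):.3f}, "
      f"std={np.std(values):.3f}")
```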