Tuesday July 22, 2025 4:15pm - 5:30pm PDT
Session: Machine Learning Algorithms
Chair: Mark Schmidt
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: Leveraging Variable Sparsity to Refine Pareto Stationarity in Multi-Objective Optimization
Speaker: Yaoliang Yu
Abstract: Gradient-based multi-objective optimization (MOO) is essential in modern machine learning, with applications in, e.g., multi-task learning, federated learning, algorithmic fairness, and reinforcement learning. In this work, we first reveal some limitations of Pareto stationarity, a widely accepted first-order condition for Pareto optimality, in the presence of sparse function-variable structures. Next, to account for such sparsity, we propose a novel solution concept termed Refined Pareto Stationarity (RPS), which we prove is always sandwiched between Pareto optimality and Pareto stationarity. We give an efficient partitioning algorithm that automatically mines the function-variable dependency and substantially trims non-optimal Pareto stationary solutions. Then, we show that gradient-based descent algorithms in MOO can be enhanced with our refined partitioning. In particular, we propose the Multiple Gradient Descent Algorithm with Refined Partition (RP-MGDA) as an example method that converges to RPS while still enjoying a similar per-step complexity and convergence rate. Lastly, we validate our approach through experiments on both synthetic examples and realistic application scenarios where distinct function-variable dependency structures appear. Our results highlight the importance of exploiting function-variable structure in gradient-based MOO and provide a seamless enhancement to existing approaches.
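
To make the blockwise idea concrete, here is a minimal sketch in which a standard MGDA-style min-norm step is applied separately to each block of a given function-variable partition. The partition, the Frank-Wolfe min-norm solver, and the helper names (min_norm_weights, blockwise_mgda_step) are illustrative assumptions; the sketch does not reproduce the partition-mining algorithm or the RP-MGDA method from the talk.

```python
# Illustrative sketch: a plain MGDA-style update applied per variable block.
# The partition (which objectives depend on which variables) is assumed given;
# this is not the talk's RP-MGDA, only a blockwise variant of standard MGDA.
import numpy as np

def min_norm_weights(G, n_iter=100):
    """Frank-Wolfe solve of min_{w in simplex} ||G^T w||^2 for gradients G of shape (m, d)."""
    m = G.shape[0]
    w = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        g = G @ (G.T @ w)                 # gradient of the quadratic in w (up to a factor 2)
        i = int(np.argmin(g))             # best simplex vertex for the linearized objective
        d = -w
        d[i] += 1.0                       # Frank-Wolfe direction e_i - w
        Gw, Gd = G.T @ w, G.T @ d
        denom = float(Gd @ Gd)
        step = 0.0 if denom == 0.0 else float(np.clip(-(Gw @ Gd) / denom, 0.0, 1.0))
        w = w + step * d
    return w

def blockwise_mgda_step(x, grads, partition, lr=0.1):
    """One MGDA-style step per variable block.

    x         : (d,) current point
    grads     : list of per-objective gradients, each of shape (d,)
    partition : dict mapping a block (tuple of variable indices) to the list of
                objective indices that depend on those variables
    """
    x = x.copy()
    for block, objs in partition.items():
        idx = np.array(block)
        G = np.stack([grads[j][idx] for j in objs])   # gradients restricted to this block
        w = min_norm_weights(G)
        x[idx] -= lr * (G.T @ w)                      # common descent direction for the block
    return x

# Toy usage: f0 depends only on x[0:2], f1 only on x[2:4].
x = np.array([1.0, -2.0, 0.5, 3.0])
grads = [np.array([2.0, -4.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0, 6.0])]
x_next = blockwise_mgda_step(x, grads, {(0, 1): [0], (2, 3): [1]})
```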

Talk 2: High-dimensional Optimization with Applications to Compute-Optimal Neural Scaling Laws
Speaker: Courtney Paquette
Abstract: Given the massive scale of modern ML models, we now only get a single shot to train them effectively. This restricts our ability to test multiple architectures and hyper-parameter configurations. Instead, we need to understand how these models scale, allowing us to experiment with smaller problems and then apply those insights to larger-scale models. In this talk, I will present a framework for analyzing scaling laws in stochastic learning algorithms using a power-law random features model, leveraging high-dimensional probability and random matrix theory. I will then use this scaling law to address the compute-optimal question: How should we choose model size and hyper-parameters to achieve the best possible performance in the most compute-efficient manner?
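
The compute-optimal question can be illustrated with a toy calculation. Assuming a hypothetical Chinchilla-style loss surface L(N, D) = a/N^alpha + b/D^beta + c and a FLOPs budget C ≈ 6·N·D, one can sweep model sizes to find the best split of compute between parameters and data. The functional form, exponents, and constants below are placeholders for illustration, not results from the talk.

```python
# Toy compute-optimal allocation under an assumed power-law loss surface.
# All coefficients and exponents are illustrative placeholders.
import numpy as np

a, b, c = 400.0, 4000.0, 1.7        # hypothetical fitted coefficients
alpha, beta = 0.34, 0.28            # hypothetical scaling exponents

def loss(N, D):
    """Assumed loss as a function of parameter count N and training tokens D."""
    return a / N**alpha + b / D**beta + c

def compute_optimal_split(C, n_grid=2000):
    """Brute-force the model size N minimizing the loss under the budget C = 6*N*D."""
    N = np.logspace(6, 12, n_grid)  # candidate parameter counts
    D = C / (6.0 * N)               # tokens implied by the FLOPs budget
    L = loss(N, D)
    i = int(np.argmin(L))
    return N[i], D[i], L[i]

for C in [1e20, 1e22, 1e24]:
    N_star, D_star, L_star = compute_optimal_split(C)
    print(f"C={C:.0e}: N*={N_star:.2e}, D*={D_star:.2e}, loss={L_star:.3f}")
```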

Talk 3: A Robustness Metric for Distribution Shifts
Speaker: John Duchi
Abstract: We revisit the stability of optimizers in statistical estimation and stochastic optimization problems, but instead of providing guarantees on the stability of the minimizers themselves, we investigate which shifts to the underlying data-generating process perturb solutions the most. To do so, we develop some new mathematical tools for stability analyses, with guarantees beyond typical differentiable problems. We also make connections with statistical hypothesis testing and discovery, showing how these new results provide certificates of validity, or potential invalidity, of statistical estimates.
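
One elementary way to ask "which shift perturbs the solution the most" is through influence functions: for a smooth M-estimator, the first-order effect of reweighting the data is linear in the weights, so the worst mean-zero, unit-norm reweighting is a top singular direction of the influence matrix. The sketch below carries this out for ridge-regularized logistic regression; it is an illustrative baseline under standard smoothness assumptions, not the metric or the certificates developed in the talk.

```python
# Illustrative worst-case data reweighting (a crude proxy for a distribution shift)
# for a ridge-regularized logistic-regression estimator, via influence functions.
import numpy as np

def fit_logreg(X, y, lam=1e-2, n_iter=200, lr=0.5):
    """Plain gradient descent on the ridge-regularized logistic loss."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ theta))
        theta -= lr * (X.T @ (p - y) / n + lam * theta)
    return theta

def worst_case_reweighting(X, y, theta, lam=1e-2):
    """Mean-zero sample reweighting that perturbs theta the most, to first order."""
    n, d = X.shape
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    G = X * (p - y)[:, None]                                        # per-sample gradients, (n, d)
    H = (X * (p * (1 - p))[:, None]).T @ X / n + lam * np.eye(d)    # Hessian at theta
    J = -np.linalg.solve(H, G.T).T / n                              # influence of sample i on theta
    J -= J.mean(axis=0, keepdims=True)                              # restrict to mean-zero reweightings
    U, S, _ = np.linalg.svd(J, full_matrices=False)
    return U[:, 0], S[0]                   # worst reweighting direction and its sensitivity

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) + 0.5 * rng.normal(size=200) > 0).astype(float)
theta = fit_logreg(X, y)
delta, sensitivity = worst_case_reweighting(X, y, theta)
print("largest first-order sensitivity of the estimate:", sensitivity)
```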

Speakers
Mark Schmidt

Yaoliang Yu

Courtney Paquette

John Duchi
Taper Hall (THH) 201, 3501 Trousdale Pkwy, Los Angeles, CA 90089
