Session: Adaptive Methods
Chair: Ilyas Fatkhullin
Cluster: Optimization For Data Science

Talk 1: The Price of Adaptivity in Stochastic Convex Optimization
Speaker: Oliver Hinder
Abstract: We prove impossibility results for adaptivity in non-smooth stochastic convex optimization. Given a set of problem parameters we wish to adapt to, we define a "price of adaptivity" (PoA) that, roughly speaking, measures the multiplicative increase in suboptimality due to uncertainty in these parameters. When the initial distance to the optimum is unknown but a gradient norm bound is known, we show that the PoA is at least logarithmic for expected suboptimality, and double-logarithmic for median suboptimality. When there is uncertainty in both distance and gradient norm, we show that the PoA must be polynomial in the level of uncertainty. Our lower bounds nearly match existing upper bounds, and establish that there is no parameter-free lunch. En route, we also establish tight upper and lower bounds for (known-parameter) high-probability stochastic convex optimization with heavy-tailed and bounded noise, respectively.
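
As a rough illustration only (the paper's precise definition may differ), suppose the sole unknown parameter is the initial distance D to an optimizer. One natural way to formalize a price-of-adaptivity ratio is

\[
\mathrm{PoA} \;=\; \sup_{D}\;
\frac{\sup_{f \in \mathcal{F}_D} \mathbb{E}\big[f(x_{\mathrm{alg}}) - f^{\star}\big]}
     {\varepsilon^{\star}(D)},
\]

where \mathcal{F}_D collects problem instances whose optimum lies at distance D from the starting point and \varepsilon^{\star}(D) is the minimax expected suboptimality achievable when D is known. A PoA of 1 would mean adaptivity is free; the talk's lower bounds show it cannot be.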

Talk 2: Adaptive Online Learning and Optimally Scheduled Optimization
Speaker: Ashok Cutkosky
Abstract: In this talk I will describe some recent advances in online learning, and how these advances result in improved algorithms for stochastic optimization. We will first describe new online optimization algorithms that achieve optimal regret with neither prior knowledge of Lipschitz constants nor bounded domain assumptions, which yield stochastic optimization algorithms that perform as well as SGD with an optimally tuned learning rate. We will then survey new and improved conversions from online to stochastic optimization that shed light on heuristic learning rate schedules popular in practice, and illustrate how this analysis allows us to begin identifying an optimal schedule of learning rates. This is in contrast to most literature on adaptive stochastic optimization, which typically seeks to compete only with a single fixed learning rate. We will conclude by highlighting open problems in both online and stochastic optimization.
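
For readers unfamiliar with the online-to-stochastic pipeline the talk builds on, the sketch below shows the classical conversion (a generic illustration, not the improved conversions or schedules from the talk): stochastic gradients are fed to an online learner, here online gradient descent with a 1/sqrt(t) step size, and the averaged iterate inherits an expected-suboptimality bound of roughly R_T / T from the learner's regret R_T on convex problems. Function names and the toy objective are illustrative assumptions.

import numpy as np

def online_to_batch_sgd(stoch_grad, x0, T, eta0=0.1):
    # Classical online-to-batch conversion: run an online learner on the
    # stream of stochastic gradients and return the average of its iterates.
    x = np.asarray(x0, dtype=float)
    avg = np.zeros_like(x)
    for t in range(1, T + 1):
        avg += (x - avg) / t                  # running mean of x_1, ..., x_t
        g = stoch_grad(x)                     # stochastic gradient at x_t
        x = x - eta0 / np.sqrt(t) * g         # online gradient descent step
    return avg

# Toy usage: minimize E[(x - 1)^2 / 2] from noisy gradients.
rng = np.random.default_rng(0)
noisy_grad = lambda x: (x - 1.0) + 0.5 * rng.standard_normal(x.shape)
print(online_to_batch_sgd(noisy_grad, x0=np.zeros(1), T=10_000))   # close to [1.0]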

Talk 3: Unveiling the Power of Adaptive Methods Over SGD: A Parameter-Agnostic Perspective
Speaker: Junchi Yang
Abstract: Adaptive gradient methods are popular in optimizing modern machine learning models, yet their theoretical benefits over vanilla Stochastic Gradient Descent (SGD) remain unclear. We examine the convergence of SGD and adaptive methods when their hyperparameters are set without knowledge of problem-specific parameters. First, for smooth functions, we compare SGD to well-known adaptive methods like AdaGrad, Normalized SGD with Momentum (NSGD-M), and AMSGrad. While untuned SGD attains the optimal convergence rate, it comes at the expense of an unavoidable exponential dependence on the smoothness constant. In contrast, several adaptive methods reduce this exponential dependence to polynomial. Second, for a broader class of functions characterized by (L0, L1)-smoothness, SGD fails without proper tuning. We show that NSGD-M achieves a near-optimal rate despite an exponential dependence on the L1 constant, which is unavoidable for a family of normalized momentum methods.
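
For concreteness, below is a minimal sketch of the NSGD-M update mentioned above (an illustrative implementation based on the standard description of normalized SGD with momentum, not code from the speaker): the step direction is a momentum average of stochastic gradients normalized to unit length, so the step size need not be matched to an unknown smoothness constant. Hyperparameter values and the toy objective are assumptions for illustration.

import numpy as np

def nsgd_m(stoch_grad, x0, T, eta=0.01, beta=0.9):
    # Normalized SGD with Momentum: move a fixed distance eta per step along
    # the direction of an exponential moving average of stochastic gradients.
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)
    for _ in range(T):
        g = stoch_grad(x)
        m = beta * m + (1.0 - beta) * g                 # momentum buffer
        x = x - eta * m / (np.linalg.norm(m) + 1e-12)   # unit-norm step
    return x

# Toy usage: noisy gradients of f(x) = x^2 / 2, starting far from the optimum.
rng = np.random.default_rng(0)
noisy_grad = lambda x: x + 0.1 * rng.standard_normal(x.shape)
print(nsgd_m(noisy_grad, x0=np.array([5.0]), T=5_000))  # close to [0.0]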

Speakers

Ilyas Fatkhullin

Oliver Hinder

Ashok Cutkosky

Junchi Yang

Tuesday July 22, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC), Room 156, 3518 Trousdale Pkwy, Los Angeles, CA 90089
