Monday July 21, 2025 10:30am - 11:45am PDT
Session: Adaptive and Accelerated First-Order Methods
Chair: Wenzhi Gao

Talk 1: Gradient Descent as a Collaborative Game
Speaker: Wenzhi Gao
Abstract: We introduce a framework that uses online learning to accelerate the convergence of gradient-based methods: an online learning algorithm updates the stepsize in gradient descent, and the resulting scheme provably accelerates gradient-based methods. A key insight is to view gradient descent as a collaborative game between the stepsize scheduler and the optimization landscape, with both players working together for faster convergence. We also discuss implications of the framework, including global and local convergence properties and several extensions. Numerical experiments on deterministic convex and nonconvex problems demonstrate the promising performance of our method. Reference: https://arxiv.org/pdf/2411.01803
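
The abstract gives no pseudocode; as a rough illustration of the general idea of updating a gradient-descent stepsize with an online rule, here is a minimal Python sketch. The function name, the hypergradient-style update, and the constants are illustrative assumptions, not the authors' algorithm from the referenced paper.

```python
import numpy as np

def gd_with_online_stepsize(grad, x0, eta0=1e-2, beta=1e-4, iters=200):
    """Gradient descent whose scalar stepsize is adjusted by an online rule.

    Hedged illustration only (a generic hypergradient-style update),
    not the stepsize scheduler proposed in the talk.
    """
    x, eta = np.asarray(x0, dtype=float), eta0
    g_prev = grad(x)
    for _ in range(iters):
        x = x - eta * g_prev                  # standard GD step with current stepsize
        g = grad(x)
        # Online update: grow eta when consecutive gradients align,
        # shrink it when they point in opposing directions.
        eta = max(eta + beta * float(np.dot(g, g_prev)), 1e-12)
        g_prev = g
    return x

# Example: minimize an ill-conditioned quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 10.0, 100.0])
x_opt = gd_with_online_stepsize(lambda x: A @ x, x0=np.ones(3))
```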

Talk 2: An Adaptive and Parameter-Free Nesterov's Accelerated Gradient Method
Speaker: Jaewook J. Suh
Abstract: In this talk, we introduce AdaNAG, an adaptive accelerated gradient method based on Nesterov's accelerated gradient (NAG). The algorithm is line-search-free, parameter-free, and achieves the accelerated convergence rates $f(x_k) - f_\star = O(1/k^2)$ and $\min_{i\in\{1,\dots,k\}} \|\nabla f(x_i)\|^2 = O(1/k^3)$ for an $L$-smooth convex function $f$. We provide a Lyapunov analysis for the convergence proof of AdaNAG, which additionally enables us to propose a novel adaptive gradient descent (GD) method, AdaGD. AdaGD achieves the non-ergodic convergence rate $f(x_k) - f_\star = O(1/k)$, like the original GD. Motivated by the relationship between the parameter choice and the convergence guarantee of AdaGD, we obtain a generalized AdaNAG that provides a practically useful variant of AdaNAG. We provide numerical results showing that our method outperforms other recently proposed adaptive methods in certain scenarios.
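
For orientation, the sketch below shows a generic adaptive NAG iteration in which the stepsize comes from a crude local Lipschitz estimate. It is not AdaNAG: the talk's parameter-free stepsize rule and its guarantees come from the Lyapunov analysis described above, and everything here (names, the estimation rule, constants) is an assumption made purely for illustration.

```python
import numpy as np

def adaptive_nag_sketch(grad, x0, L0=1.0, iters=300):
    """Generic NAG with a local Lipschitz estimate (illustrative, not AdaNAG)."""
    x = x_prev = np.asarray(x0, dtype=float)
    y_prev, g_prev = x.copy(), grad(x)
    t_prev = t = 1.0
    L = L0
    for _ in range(iters):
        y = x + ((t_prev - 1.0) / t) * (x - x_prev)   # momentum / extrapolation step
        g = grad(y)
        # Crude local Lipschitz estimate from successive extrapolated points.
        dy = np.linalg.norm(y - y_prev)
        if dy > 0:
            L = max(L, np.linalg.norm(g - g_prev) / dy)
        x_prev, x = x, y - g / L                      # gradient step with stepsize 1/L
        y_prev, g_prev = y, g
        t_prev, t = t, 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
    return x
```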

Talk 3: Stochastic Gradient Methods with Block Coordinate Optimistic Stepsizes
Speaker: Tao Jiang
Abstract: Ill-conditioning is a major challenge for optimization with first-order methods. This is especially the case for stochastic optimization, where preconditioners in the classical sense are hard to construct due to the nature of stochastic gradients. We propose a block-coordinate stepsize rule that can effectively combat ill-conditioning as well as inhomogeneous noise in the stochastic setting. Our method is motivated by minimizing the expected distance to an optimal point during each iteration. Specifically, we use optimistic stepsizes, chosen as if the expected search directions (e.g., stochastic gradients with or without momentum) along each coordinate always point to the optimal point. These stepsizes rely on online estimates of the second moments of the coordinate-wise search directions. The popular Adam algorithm can be interpreted as a heuristic for such an estimation. Compared with Adam, our method requires fewer hyperparameters, obtains similar or better performance, and is numerically more stable.
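
To make the notion of coordinate-wise stepsizes driven by online second-moment estimates concrete, here is an Adam/RMSProp-style sketch. The talk's optimistic stepsize rule is derived differently (from minimizing the expected distance to an optimum), so this code is only an assumed illustration of the general idea, with all names and constants chosen for the example.

```python
import numpy as np

def sgd_coordinatewise_scaling(stoch_grad, x0, alpha=0.1, beta2=0.99,
                               eps=1e-8, iters=1000):
    """SGD with per-coordinate stepsizes from online second-moment estimates.

    RMSProp/Adam-style sketch of coordinate-wise scaling; not the
    optimistic-stepsize rule proposed in the talk.
    """
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)                        # running second-moment estimate
    for k in range(1, iters + 1):
        g = stoch_grad(x)
        v = beta2 * v + (1.0 - beta2) * g * g   # per-coordinate second moment
        v_hat = v / (1.0 - beta2 ** k)          # bias correction
        x = x - alpha * g / (np.sqrt(v_hat) + eps)
    return x
```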

Speakers

Wenzhi Gao

Ph.D. student, Stanford University
Second-year Ph.D. student at Stanford ICME, working on large-scale numerical optimization, first-order methods, and online decision-making problems.

Jaewook J. Suh

Rice University

Tao Jiang

Name: Dr. Slothington "Slow Convergence" McNapface
Title: Distinguished Professor of Continuous Optimization & Energy Minimization
Affiliation: The Lush Canopy Institute of Sluggish Algorithms
Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing...
Taper Hall (THH) 112 3501 Trousdale Pkwy, 112, Los Angeles, CA 90089
