Session: Optimization For Data Science
Chair: Niao He
Cluster: Optimization For Data Science

Talk 1: Momentum & stochasticity — some insights from continuous time models
Speaker: Stephan Wojtowytsch
Abstract: Gradient descent and its variations (stochastic, with or without momentum) are the workhorse of machine learning. We give examples in which continuous time models yield insight into high-dimensional optimization problems in a machine learning context, and discuss possible effects of large step sizes relative to the continuous dynamics.
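The step-size point can be made concrete on a one-dimensional quadratic. The sketch below is illustrative only, not from the talk; the curvature lam, horizon T, and step sizes are assumed values. It compares gradient descent iterates against the exact gradient flow solution x(t) = x0 * exp(-lam * t): small steps track the flow, while steps past 2/lam oscillate or diverge in a way the flow cannot.

# Minimal sketch (assumed parameters): gradient descent vs. its
# continuous-time limit (gradient flow) on f(x) = (lam / 2) * x**2.
import numpy as np

lam = 10.0   # curvature of the quadratic (assumed)
x0 = 1.0     # initial iterate
T = 1.0      # time horizon for the flow

def gd_trajectory(eta, n_steps):
    # Gradient descent: x_{k+1} = x_k - eta * f'(x_k) = (1 - eta * lam) * x_k.
    xs = [x0]
    for _ in range(n_steps):
        xs.append((1.0 - eta * lam) * xs[-1])
    return np.array(xs)

for eta in [0.01, 0.19, 0.21]:   # well below, near, and above 2/lam = 0.2
    n = round(T / eta)
    gd = gd_trajectory(eta, n)
    # Exact gradient flow x(t) = x0 * exp(-lam * t), sampled at t = k * eta.
    flow = x0 * np.exp(-lam * eta * np.arange(n + 1))
    print(f"eta={eta:5.2f}  final gd: {gd[-1]: .3e}  final flow: {flow[-1]: .3e}")

With eta = 0.01 the iterates shadow the flow; at eta = 0.19 they oscillate around the minimum; at eta = 0.21 they diverge, even though the flow converges for every positive curvature.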

Talk 2: On a continuous time model of gradient descent dynamics and instability in deep learning
Speaker: Mihaela Rosca
Abstract: The recipe behind the success of deep learning has been the combination of neural networks and gradient-based optimization. Understanding the behavior of gradient descent, however, and particularly its instability, has lagged behind its empirical success. To add to the theoretical tools available to study gradient descent, we propose the principal flow (PF), a continuous time flow that approximates gradient descent dynamics. To our knowledge, the PF is the only continuous flow that captures the divergent and oscillatory behaviors of gradient descent, including escaping local minima and saddle points. Through its dependence on the eigendecomposition of the Hessian, the PF sheds light on the recently observed edge-of-stability phenomenon in deep learning. Using our new understanding of instability, we propose a learning rate adaptation method that lets us control the trade-off between training stability and test set performance.
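The dependence on the Hessian eigendecomposition is easiest to see on a quadratic, where gradient descent decouples across eigendirections. The sketch below is an illustration under assumed values, not the paper's principal flow: eigencomponents with Hessian eigenvalue below the threshold 2/eta contract, while those above it grow with alternating sign, the oscillatory/divergent behavior a standard gradient flow never exhibits.

# Illustrative sketch (assumed spectrum and step size): on f(x) = 0.5 * x^T H x,
# gradient descent acts independently on each Hessian eigendirection.
import numpy as np

rng = np.random.default_rng(0)
eigvals = np.array([0.5, 1.0, 25.0])   # assumed Hessian eigenvalues
eta = 0.1                              # 2 / eta = 20 < 25: top direction is unstable
x = rng.normal(size=3)                 # coordinates in the eigenbasis

for _ in range(20):
    x = (1.0 - eta * eigvals) * x      # exact GD update in the eigenbasis

stable = eigvals < 2.0 / eta
print("stable components: ", x[stable])    # decayed toward 0
print("unstable component:", x[~stable])   # amplified by |1 - eta * 25| = 1.5 per step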

Talk 3: A Hessian-Aware Stochastic Differential Equation for Modelling SGD
Speaker: Zebang Shen
Abstract: Continuous-time approximation of Stochastic Gradient Descent (SGD) is a crucial tool for studying its escaping behaviors from stationary points. However, existing stochastic differential equation (SDE) models fail to fully capture these behaviors, even for simple quadratic objectives. Building on a novel stochastic backward error analysis framework, we derive the Hessian-Aware Stochastic Modified Equation (HA-SME), an SDE that incorporates Hessian information of the objective function into both its drift and diffusion terms. Our analysis shows that HA-SME matches the best-known order of approximation error among existing SDE models in the literature, while achieving a significantly reduced dependence on the smoothness parameter of the objective. Further, for quadratic objectives, under mild conditions, HA-SME is proved to be the first SDE model that recovers the SGD dynamics exactly in the distributional sense. Consequently, when the local landscape near a stationary point can be approximated by quadratics, HA-SME is expected to accurately predict the local escaping behaviors of SGD.
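The abstract does not give HA-SME's drift and diffusion terms, but the gap it targets can be sketched with the classical first-order SDE model dX = -f'(X) dt + sqrt(eta) * sigma dW. On a one-dimensional quadratic f(x) = 0.5 * h * x**2 with additive gradient noise (all parameter values below are assumptions), that model's Ornstein-Uhlenbeck stationary variance eta * sigma^2 / (2h) slightly misses SGD's exact stationary variance, the kind of discrepancy a Hessian-aware SDE is designed to remove.

# Sketch under assumed parameters: SGD on f(x) = 0.5 * h * x**2 with
# additive Gaussian gradient noise, compared against the stationary
# variance predicted by the classical first-order SDE (an OU process).
import numpy as np

rng = np.random.default_rng(1)
h, eta, sigma = 2.0, 0.05, 1.0   # curvature, step size, noise scale (assumed)
n_steps, n_samples = 2000, 100_000

# SGD: x_{k+1} = x_k - eta * (h * x_k + sigma * z_k),  z_k ~ N(0, 1)
x = np.ones(n_samples)
for _ in range(n_steps):
    x = x - eta * (h * x + sigma * rng.normal(size=n_samples))

var_ou = eta * sigma**2 / (2.0 * h)                         # OU prediction
var_exact = eta**2 * sigma**2 / (1.0 - (1.0 - eta * h)**2)  # exact SGD value
print(f"empirical SGD variance: {x.var():.5f}")
print(f"OU-model prediction:    {var_ou:.5f}")
print(f"exact quadratic value:  {var_exact:.5f}")

The empirical variance matches the exact quadratic formula, while the OU prediction is off by an O(eta) relative factor; recovering SGD exactly on quadratics, in distribution, is precisely the property the abstract claims for HA-SME.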

Speakers

Stephan Wojtowytsch

Name: Stephan Wojtowytsch
Affiliation: University of Pittsburgh
Bio: I am an assistant professor in the Department of Mathematics at the University of Pittsburgh. My research interests lie in the mathematics of machine learning and data science. Previously, I was an assistant profe...

Mihaela Rosca


Zebang Shen

Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 258, 3518 Trousdale Pkwy, Los Angeles, CA 90089

