Monday, July 21
 

8:00am PDT

Auditorium opens for seating
Monday July 21, 2025 8:00am - 8:45am PDT
USC Bovard Auditorium 3551 Trousdale Pkwy, Los Angeles, CA 90089

8:00am PDT

Check In
Monday July 21, 2025 8:00am - 8:45am PDT
Taper Hall (THH) 3501 Trousdale Pkwy, Los Angeles, CA 90089

8:45am PDT

Opening Remarks
Monday July 21, 2025 8:45am - 9:00am PDT
USC Bovard Auditorium 3551 Trousdale Pkwy, Los Angeles, CA 90089

9:00am PDT

Plenary 1
Monday July 21, 2025 9:00am - 10:00am PDT
Speakers

Stephen J. Wright

UW-Madison
Stephen J. Wright is the George B. Dantzig Professor of Computer Sciences, Sheldon Lubar Chair of Computer Sciences, and Hilldale Professor at the University of Wisconsin-Madison. He also serves as Chair of the Computer Sciences Department. His research is in computational optimization...
Monday July 21, 2025 9:00am - 10:00am PDT
USC Bovard Auditorium 3551 Trousdale Pkwy, Los Angeles, CA 90089

10:00am PDT

Coffee & Snack Break (Provided)
Monday July 21, 2025 10:00am - 10:30am PDT
TBA

10:30am PDT

Parallel Sessions 1A: Optimization and Resilience in the Power Grid
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Optimization and Resilience in the Power Grid
Chair: Madeleine Udell
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: Leveraging predictions for optimal voltage control: an adaptive approach
Speaker: Wenqi Cui
Abstract: High variability of solar PV and sudden changes in load (e.g., electric vehicles and storage) can lead to large voltage fluctuations. In recent years, a number of advanced controllers have been designed to optimize voltage control. These controllers, however, almost always assume that the net load in the system remains constant over a sufficiently long time. Given the intermittent and uncertain nature of renewable resources, it is becoming important to explicitly consider net load that is time-varying. This talk will describe an adaptive approach to voltage control in power systems with a significant time-varying net load. We leverage the advances in short-term load forecasting, where the net load in the system can be predicted using local measurements. We integrate these predictions into the design of adaptive controllers, and prove that the overall control architecture achieves input-to-state stability in a decentralized manner. We optimize the control policy through a sample-efficient reinforcement learning framework, which updates the control policy successively with online data collection. Case studies are conducted using time-varying load data from Caltech's campus grid.

Talk 2: Learning-enhanced Design and Optimization of Microgrids under Uncertainty
Speaker: Harsha Nagarajan
Abstract: To mitigate the vulnerability of distribution grids to severe weather events, some electric utilities use preemptive de-energization as a primary defense, leading to significant power outages. Networked microgrids can enhance resiliency and maximize load delivery, but challenges arise from modeling unbalanced three-phase networks and managing uncertainties in renewables and loads. We present a two-stage mixed-integer robust optimization approach to configure and operate networked microgrids, ensuring robustness to all net-load uncertainties while maximizing load delivery. To solve this problem, we propose an ML-accelerated cutting-plane algorithm with convergence guarantees, which approximates a recourse function with cuts predicted by an ML regressor for worst-case uncertain scenarios. A case study on the IEEE 37-bus system demonstrates the economic benefits of networking microgrids to maximize load delivery.

Talk 3: A Reliability Puzzle for Large Scale Batteries
Speaker: Steven Diamond
Abstract: Large scale batteries are playing an increasingly prominent role in electricity markets. Battery operators earn revenue through two main sources: energy arbitrage and ancillary services. Ancillary services are commitments to support the grid through actions outside of the standard market mechanism. For example, a battery might promise to provide extra power to the grid in an emergency. In this talk we explore the reliability opportunities and challenges posed by batteries selling ancillary services. We discuss the contrasting approaches taken by the California and Texas electricity markets and propose a new mechanism that better aligns the interests of market regulators and battery operators.

Speakers

Madeleine Udell

Postdoctoral Fellow, Caltech Center for the Mathematics of Information
Madeleine Udell is a postdoctoral fellow at Caltech's Center for the Mathematics of Information, hosted by Joel Tropp. She will be joining Cornell as an Assistant Professor in the School of Operations Research and Information Engineering in July 2016. Her research focus is on modeling...

Harsha Nagarajan

Los Alamos National Laboratory

Steven Diamond

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 101 3501 Trousdale Pkwy, 101, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1B: Regularization Methods for stochastic operator splitting dynamics
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Regularization Methods for stochastic operator splitting dynamics
Chair: Mathias Staudigl
Cluster: Nonsmooth Optimization

Talk 1: Tikhonov Regularization for Stochastic Non-Smooth Convex Optimization in Hilbert Spaces
Speaker: Rodrigo Maulen
Abstract: To solve convex optimization problems with a noisy gradient input, we analyze the global behavior of subgradient-like flows under stochastic errors. The objective function is composite, being equal to the sum of two convex functions, one being differentiable and the other potentially non-smooth. We then use stochastic differential inclusions where the drift term is minus the subgradient of the objective function, and the diffusion term is either bounded or square-integrable. In this context, under Lipschitz continuity of the differentiable term and a growth condition of the non-smooth term, our first main result shows almost sure weak convergence of the trajectory process towards a minimizer of the objective function. Then, using Tikhonov regularization with a properly tuned vanishing parameter, we can obtain almost sure strong convergence of the trajectory towards the minimum norm solution. We find an explicit tuning of this parameter when our objective function satisfies a local error-bound inequality. We also provide a comprehensive complexity analysis by establishing several new pointwise and ergodic convergence rates in expectation for the convex and strongly convex case.
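
For orientation, the Tikhonov-regularized stochastic dynamics studied in this line of work can be written schematically as the differential inclusion below; the notation (f the smooth term, g the nonsmooth term, ε(t) the vanishing Tikhonov parameter, σ the diffusion coefficient) is an illustrative assumption rather than the talk's exact formulation.

dX(t) \in -\left[\nabla f(X(t)) + \partial g(X(t)) + \varepsilon(t)\, X(t)\right] dt + \sigma(t)\, dW(t), \qquad \varepsilon(t) \to 0 \ \text{as}\ t \to \infty.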

Talk 2: Accelerated Variance-Reduced Forward-Reflected Methods for Generalized Equations
Speaker: Quoc Tran-Dinh
Abstract: In this talk, we propose a novel class of Nesterov's stochastic accelerated forward-reflected-based methods with variance reduction to solve [non]linear equations under a co-coerciveness assumption. Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for a wider class of root-finding algorithms. It achieves both O(L^{2}/k^{2}) and o(1/k^{2}) last-iterate convergence rates in terms of expected operator squared norm, where k denotes the iteration counter. We instantiate our framework for two prominent estimators: SVRG and SAGA. By an appropriate choice of parameters, both variants attain an oracle complexity of O(n + Ln^{2/3}ϵ^{-1}) to reach an ϵ-solution, where n represents the number of summands in the finite-sum operator. Furthermore, under µ-strong quasi-monotonicity, our method achieves a linear convergence rate and an oracle complexity of O(n + max{n, n^{2/3}κ} log(ϵ^{-1})), where κ := L/µ. We extend our approach to solve a class of finite-sum monotone inclusions, demonstrating that our schemes retain the same theoretical guarantees as in the equation setting. Finally, we present some numerical experiments to validate our algorithms and demonstrate their promising performance compared to state-of-the-art methods.
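
As a point of reference, a minimal deterministic sketch of the forward-reflected update for a root-finding problem G(x) = 0 is shown below (function names, step size, and toy data are illustrative assumptions); the methods in this talk add Nesterov-type acceleration and the SVRG/SAGA variance-reduced estimators on top of such a step.

import numpy as np

def forward_reflected(G, x0, step, iters=1000):
    # Forward-reflected step: x_{k+1} = x_k - step * (2*G(x_k) - G(x_{k-1})),
    # i.e., a forward step with an extrapolated ("reflected") operator value.
    x_prev, x = x0.copy(), x0.copy()
    g_prev = G(x_prev)
    for _ in range(iters):
        g = G(x)
        x_next = x - step * (2.0 * g - g_prev)
        x_prev, x, g_prev = x, x_next, g
    return x

# toy usage: root of a linear monotone operator G(x) = A x - b (hypothetical data)
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 1.0])
x_star = forward_reflected(lambda z: A @ z - b, np.zeros(2), step=0.2)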

Talk 3: Tikhonov regularized exterior penalty dynamics for constrained variational inequalities
Speaker: Mathias Staudigl
Abstract: A recent trend in optimization and variational analysis is the investigation of dynamical systems, either in discrete or continuous time, with explicit regularization of the trajectory using either the Tikhonov or the Halpern method. These schemes exhibit a higher degree of stability and in some cases also exhibit interesting acceleration phenomena. Within an infinite-dimensional Hilbert space context, this paper is concerned with the study of a class of constrained variational inequalities in which the constrained domain can be represented as the zero set of a maximally monotone operator. The Tikhonov regularization parameter is assumed to tend to zero as time tends to infinity, which preserves equilibria and serves to enforce strong convergence of the trajectory. To steer the dynamical system towards the feasible set over which the original variational problem is to be solved, we append to the trajectory an exterior penalization term. The combination of the Tikhonov regularization and the penalization leads to a multiscale dynamical system, pioneered in the work of Attouch, Czarnecki, Peypouquet, and many others. In this paper we study the cases of a cocoercive and a non-cocoercive monotone operator separately. Fully split dynamical systems, based on the Forward-Backward and Forward-Backward-Forward splitting methods, are constructed. Proofs of existence and uniqueness of the solution trajectories of these new time-varying dynamical systems, as well as the strong convergence towards the least norm solution of the underlying variational inequality problem, are derived. Numerical experiments conclude this work, showing the reconstruction power of the approach.

Speakers

Rodrigo Maulen


Quoc Tran-Dinh


Mathias Staudigl

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 201 3501 Trousdale Pkwy, 201, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1C: Methods for Large-Scale Nonlinear Optimization I
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Methods for Large-Scale Nonlinear Optimization I
Chair: Albert Berahas
Cluster: Nonlinear Optimization

Talk 1: Advanced, Adaptive and Flexible Algorithms for Decentralized Optimization
Speaker: Albert Berahas
Abstract: The problem of optimizing an objective function by employing a decentralized procedure using multiple agents in a connected network has gained significant attention over the last decades. This is due to the wide applicability of decentralized optimization to many important science and engineering applications such as optimal control, machine learning, robotics, sensor networks, and smart grids. Decentralized optimization problems come in diverse shapes and forms, and could have very different characteristics. In this talk, we discuss novel flexible approaches for solving decentralized optimization problems that adapt to problem characteristics. We present two unifying algorithmic frameworks that recover popular algorithms as special cases. We discuss the rationale behind our proposed techniques, convergence in expectation and complexity guarantees for our algorithms, and present encouraging numerical results.

Talk 2: Nearly Optimal L_p Risk Minimization
Speaker: Zhichao Jia
Abstract: Convex risk measures play a foundational role in the area of stochastic optimization. However, in contrast to risk neutral models, their applications are still limited due to the lack of efficient solution methods. In particular, the mean L_p semi-deviation is a classic risk minimization model, but its solution is highly challenging due to the composition of concave-convex functions and the lack of uniform Lipschitz continuity. In this talk, we discuss some progress on the design of efficient algorithms for L_p risk minimization, including a novel lifting reformulation to handle the concave-convex composition, and a new stochastic approximation method to handle the non-Lipschitz continuity. We establish an upper bound on the sample complexity associated with this approach and show that this bound is not improvable for L_p risk minimization in general.
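
For context, the mean L_p semi-deviation model mentioned above is usually written in the standard textbook form below (stated here as an assumption about the talk's setting); the outer p-th root composed with the inner expectation is the concave-convex composition, and the lack of uniform Lipschitz continuity, discussed in the abstract.

\min_{x \in X} \; \mathbb{E}[F(x,\xi)] + c \left( \mathbb{E}\!\left[ \left( F(x,\xi) - \mathbb{E}[F(x,\xi)] \right)_+^{\,p} \right] \right)^{1/p}, \qquad c \in [0,1], \; p > 1.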

Talk 3: Higher order polynomial model-based derivative-free methods
Speaker: Abraar Chaudhry
Abstract: Traditional model-based derivative-free methods, such as those pioneered by Powell, iteratively construct second-order interpolation models of the objective function and optimize these models over trust regions. Higher-order polynomials may provide better models; however, they are hard to optimize. We propose a new approach to constructing and optimizing higher-degree polynomial models for derivative-free optimization. This is done using techniques from polynomial optimization and sums of squares. We will discuss the practical and theoretical properties of our method.

Speakers

Albert Berahas


Zhichao Jia

PhD Student, Georgia Institute of Technology
Name: Zhichao Jia. Title: PhD Student. Affiliation: Georgia Institute of Technology.
Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 208 3501 Trousdale Pkwy, 208, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1D: GPU-Accelerated Mathematical Programming (Part 1)
Monday July 21, 2025 10:30am - 11:45am PDT
Session: GPU-Accelerated Mathematical Programming (Part 1)
Chair: Haihao Lu
Cluster: Computational Software

Talk 1: HPR-LP: An implementation of an HPR method for solving linear programming
Speaker: Defeng Sun
Abstract: In this talk, we aim to introduce an HPR-LP solver, an implementation of a Halpern Peaceman-Rachford (HPR) method with semi-proximal terms for solving linear programming (LP). We start by showing that the HPR method enjoys the highly desired iteration complexity of O(1/k) in terms of the Karush-Kuhn-Tucker residual and the objective error via the theory developed recently for accelerated degenerate proximal point methods. Based on the complexity results, we then design an adaptive strategy of restart and penalty parameter update to improve the efficiency and robustness of the HPR method. We conduct extensive numerical experiments on different LP benchmark datasets using an NVIDIA A100-SXM4-80GB GPU under different stopping tolerances. Our solver's Julia version achieves a 2.39x to 5.70x speedup measured by SGM10 on benchmark datasets with presolve (2.03x to 4.06x without presolve) over the award-winning solver PDLP with the tolerance of 10^{-8}. Several practical techniques underlying the efficiency of the solver will be highlighted.
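
A minimal sketch of the Halpern anchoring idea for a generic nonexpansive fixed-point map T is given below (operator, toy data, and iteration count are illustrative assumptions); HPR-LP applies this anchoring to a Peaceman-Rachford splitting with semi-proximal terms on GPU, not to the plain iteration shown here.

import numpy as np

def halpern(T, x0, iters=500):
    # Halpern iteration: anchor each step back toward x0 with weight 1/(k+2),
    # which gives the O(1/k) residual rate mentioned in the abstract for nonexpansive T.
    x = x0.copy()
    for k in range(iters):
        lam = 1.0 / (k + 2)
        x = lam * x0 + (1.0 - lam) * T(x)
    return x

# toy usage with an averaged linear map (hypothetical example)
A = np.array([[0.6, 0.2], [0.2, 0.6]])
fixed_point = halpern(lambda z: A @ z, np.ones(2))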

Talk 2: Restarted Halpern PDHG for linear programming
Speaker: Jinwen Yang
Abstract: We propose and analyze a new matrix-free primal-dual algorithm, called restarted Halpern primal-dual hybrid gradient (rHPDHG), for solving linear programming (LP). We show that rHPDHG can achieve optimal accelerated linear convergence on feasible and bounded LP. Furthermore, we present a refined analysis that demonstrates an accelerated two-stage convergence of rHPDHG over the vanilla PDHG, with an improved complexity for identification and an accelerated eventual linear convergence that does not depend on the conservative global Hoffman constant. Regarding infeasible LP, we show that rHPDHG can recover infeasibility certificates with an accelerated linear rate, improving the previous convergence rates. Furthermore, we discuss an extension of rHPDHG obtained by adding a reflection operation, and demonstrate that it shares all theoretical guarantees of rHPDHG with an additional factor-of-2 speedup in the complexity bound. Lastly, we build a GPU-based LP solver, and the experiments showcase an improved numerical performance compared to cuPDLP.jl.

Talk 3: GPU-Accelerated Linear Programming and Beyond
Speaker: Haihao Lu
Abstract: In this talk, I will talk about the recent trend of research on new first-order methods for scaling up and speeding up linear programming (LP) and convex quadratic programming (QP) with GPUs. The state-of-the-art solvers for LP and QP are mature and reliable at delivering accurate solutions. However, these methods are not suitable for modern computational resources, particularly GPUs. The computational bottleneck of these methods is matrix factorization, which usually requires significant memory usage and is highly challenging to take advantage of the massive parallelization of GPUs. In contrast, first-order methods (FOMs) only require matrix-vector multiplications, which work well on GPUs and have already accelerated machine learning training during the last 15 years. This ongoing line of research aims to scale up and speed up LP and QP by using FOMs and GPUs. I’ll discuss how we can achieve this by explaining: (i) the behaviors of FOMs for LP; (ii) computational results on GPUs; (iii) theoretical results, including complexity theory and condition number theory, and how theory can lead to better computation and better understanding of the algorithm’s performance. If time permits, I’ll also talk about how to extend it to QP.
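
To illustrate the "matrix-vector multiplications only" point above, here is a bare-bones PDHG sketch for the standard-form LP min c'x s.t. Ax = b, x >= 0 (step sizes, toy data, and the absence of a stopping rule are illustrative assumptions; production solvers such as PDLP and cuPDLP add restarts, scaling, and adaptive step sizes).

import numpy as np

def pdhg_lp(c, A, b, tau, sigma, iters=10000):
    # Primal-dual hybrid gradient for min c'x s.t. Ax = b, x >= 0.
    # Each iteration needs only A @ x and A.T @ y products plus a projection onto x >= 0,
    # which is why the scheme maps well onto GPUs; tau*sigma*||A||^2 < 1 is assumed.
    m, n = A.shape
    x, y = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        x_new = np.maximum(x - tau * (c - A.T @ y), 0.0)   # primal step + projection
        y = y + sigma * (b - A @ (2 * x_new - x))           # dual step with extrapolation
        x = x_new
    return x, y

# toy usage (hypothetical data)
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0, 3.0])
x_opt, y_opt = pdhg_lp(c, A, b, tau=0.3, sigma=0.3)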

Speakers

Defeng Sun


Jinwen Yang


Haihao Lu

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 210 3501 Trousdale Pkwy, 210, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1E: Large-scale Optimization Algorithms and Implementations
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Large-scale Optimization Algorithms and Implementations
Chair: Johannes Brust
Cluster: Computational Software

Talk 1: GPU Implementation of Algorithm NCL
Speaker: Michael Saunders
Abstract: For constrained optimization, LANCELOT solves about 10 subproblems that minimize an augmented Lagrangian subject to bounds and are immune to LICQ and MPEC difficulties. Algorithm NCL solves equivalent subproblems that are suited to nonlinear interior methods. We focus on reducing the associated KKT systems to smaller systems that can be solved by Cholesky-type factorizations. Our NCL implementation is based on MadNLP.jl (a nonlinear optimization solver working on GPU) and CUDSS.jl (a Julia interface to the NVIDIA library cuDSS). We present numerical results on large SCOPF problems (which are MPECs). (joint work with Alexis Montoison, François Pacaud, Sungho Shin, Dominique Orban)

Talk 2: A Two-Stage Optimization Based Algorithm for Tensor Decomposition
Speaker: Zequn Zheng
Abstract: Tensor canonical polyadic decomposition is important in exploring the multi-dimensional tensor structure. It has become an increasingly active area of research and has important applications in data science, statistics, and engineering. However, it is difficult to find a tensor decomposition when the tensor's rank is greater than the second largest dimension. In this case, traditional optimization methods like nonlinear least squares or alternating least squares usually fail to find the tensor decomposition. Direct methods typically suffer from high computation costs. We propose a novel two-stage optimization based algorithm for the general tensor decomposition problem when the rank is between the largest dimension and the second largest dimension. We will discuss the equivalence between tensor decompositions and the global minimizers of the two-stage optimization problems. We will also show promising numerical results of our algorithm compared with other state-of-the-art methods for tensor decomposition. (joint work with Hongchao Zhang)

Talk 3: A nonsmooth exact penalty method for equality-constrained optimization: Complexity and implementation
Speaker: Dominique Orban
Abstract: Penalty methods are a well known class of algorithms for constrained optimization. They transform a constrained problem into a sequence of unconstrained penalized problems in the hope that approximate solutions of the latter converge to a solution of the former. If Lagrange multipliers exist, exact penalty methods ensure that the penalty parameter only needs to increase a finite number of times, but they are typically scorned in smooth optimization because the penalized problems are not smooth. This led researchers to regard the implementation of exact penalty methods as inconvenient. Recently, advances in proximal methods have led to increasingly efficient solvers for nonsmooth optimization. We show that the exact ℓ2-penalty method for equality-constrained optimization can in fact be implemented efficiently by solving the penalized problem with a proximal-type algorithm. We study the convergence of our algorithm and establish a worst-case complexity bound of O(ϵ^{−2}) to bring a stationarity measure below ϵ > 0 under the Mangasarian-Fromovitz constraint qualification and Lipschitz continuity of the objective gradient and constraint Jacobian. In a degenerate scenario where the penalty parameter grows unbounded, the complexity becomes O(ϵ^{−8}), which is worse than another bound found in the literature. We justify the difference by arguing that our feasibility measure is properly scaled. Finally, we report numerical experience on small-scale problems from a standard collection and compare our solver with an augmented-Lagrangian and an SQP method. Our preliminary implementation is on par with the augmented Lagrangian in terms of robustness and efficiency. It is on par with the SQP method in terms of robustness, though the former remains ahead in terms of number of problem function evaluations.
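
For reference, the nonsmooth exact penalty subproblem described above has the form below (notation is assumed for illustration: f the objective, c the equality constraints, ν the penalty parameter); because the ℓ2 norm is not squared, a finite ν suffices when Lagrange multipliers exist, and the resulting nonsmoothness is what the proximal-type inner solver handles.

\min_{x} \; f(x) + \nu \, \| c(x) \|_2 .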

Speakers

Michael Saunders

Professor (Research) Emeritus, Stanford University

Dominique Orban

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 212 3501 Trousdale Pkwy, 212, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1F: Optimization as the Engine of Generative AI - I
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Optimization as the Engine of Generative AI - I
Chair: Yinbin Han
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: InfAlign: Inference-aware language model alignment
Speaker: Theertha Suresh
Abstract: Language model alignment is a critical step in training modern generative language models. Alignment aims to improve the win rate of a sample from the aligned model against the base model. Today, we are increasingly using inference-time algorithms (e.g., Best-of-N, controlled decoding, tree search) to decode from language models rather than standard sampling. In this talk, we will first overview different inference-time algorithms and the standard RLHF procedure. We then show that this train/test mismatch makes the standard RLHF framework sub-optimal in view of such inference-time methods. To this end, we propose a framework for inference-aware alignment (InfAlign), which aims to optimize the inference-time win rate of the aligned policy against the base model. We prove that for any inference-time decoding procedure, the optimal aligned policy is the solution to the standard RLHF problem with a transformation of the reward. This motivates us to provide the calibrate-and-transform RL (InfAlign-CTRL) algorithm to solve this problem, which involves a reward calibration step and a KL-regularized reward maximization step with a transformation of the calibrated reward. For best-of-N sampling and best-of-N jailbreaking, we propose specific transformations offering up to 3-8% improvement on inference-time win rates. Finally, we also show that our proposed reward calibration method is a strong baseline for optimizing standard win rate.

Talk 2: LLMs for MILP Solver Configuration
Speaker: Connor Lawless
Abstract: Mixed integer linear programming (MILP) solvers ship with a staggering number of parameters that are challenging to select a priori for all but expert optimization users, but can have an outsized impact on the performance of the MILP solver. We introduce a new LLM-based framework to configure which cutting plane separators to use for a given MILP problem with little to no training data based on characteristics of the instance, such as a natural language description of the problem and the associated LaTeX formulation. Our LLM-based methodology requires no custom solver interface, can find a high-performing configuration by solving only a small number of MILPs, and can generate the configuration with simple API calls that run in under a second.

Talk 3: Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence
Speaker: Yinbin Han
Abstract: Diffusion models have emerged as powerful tools for generative modeling, demonstrating exceptional capability in capturing target data distributions from large datasets. However, fine-tuning these massive models for specific downstream tasks, constraints, and human preferences remains a critical challenge. While recent advances have leveraged reinforcement learning algorithms to tackle this problem, much of the progress has been empirical, with limited theoretical understanding. To bridge this gap, we propose a stochastic control framework for fine-tuning diffusion models. Building on denoising diffusion probabilistic models as the pre-trained reference dynamics, our approach integrates linear dynamics control with Kullback-Leibler regularization. We establish the well-posedness and regularity of the stochastic control problem and develop a policy iteration algorithm (PI-FT) for numerical solution. We show that PI-FT achieves global convergence at a linear rate. Unlike existing work that assumes regularities throughout training, we prove that the control and value sequences generated by the algorithm maintain the regularity. Additionally, we explore extensions of our framework to parametric settings and continuous-time formulations.

Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 156 3518 Trousdale Pkwy, 156, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1G: Robust decision making in dynamic environments
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Robust decision making in dynamic environments
Chair: Tobias Sutter
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: Towards Optimal Offline Reinforcement Learning
Speaker: Mengmeng Li
Abstract: We study offline reinforcement learning problems with a long-run average reward objective. The state-action pairs generated by any fixed behavioral policy thus follow a Markov chain, and the empirical state-action-next-state distribution satisfies a large deviations principle. We use the rate function of this large deviations principle to construct an uncertainty set for the unknown true state-action-next-state distribution. We also construct a distribution shift transformation that maps any distribution in this uncertainty set to a state-action-next-state distribution of the Markov chain generated by a fixed evaluation policy, which may differ from the unknown behavioral policy. We prove that the worst-case average reward of the evaluation policy with respect to all distributions in the shifted uncertainty set provides, in a rigorous statistical sense, the least conservative estimator for the average reward under the unknown true distribution. This guarantee is available even if one has only access to one single trajectory of serially correlated state-action pairs. The emerging robust optimization problem can be viewed as a robust Markov decision process with a non-rectangular uncertainty set. We adapt an efficient policy gradient algorithm to solve this problem. Numerical experiments show that our methods compare favorably against state-of-the-art methods.

Talk 2: Optimal Decision Making in Abstract Stochastic Environments
Speaker: Radek Salac
Abstract: Given data from an abstract stochastic process, we study how to construct an optimal decision for a general stochastic optimization problem in a statistically optimal manner. Our approach seeks to identify decisions where the corresponding risk of the shifted regret, evaluated on the underlying stochastic data process, converges to zero at a specified exponential rate under a minimal shift. This optimal decision emerges as a solution to a specific multi-objective optimization problem driven by the properties of the data-generating process, particularly by a corresponding rate function, a generalization of the well-known concept from large deviation theory. Moreover, the regret of such a decision is itself proven to converge to zero, providing a notion of consistency. Our findings are established within a highly abstract framework, of which the above interpretation is a mere instance. The characterization of risk, crucial to our main results, is grounded in the concept of a so-called maxitive integral and its properties, which resemble the less universal Varadhan Lemma in the context of asymptotic relative entropy. Several well-known results from the literature on data-driven decision-making under uncertainty can be recovered as special cases within our general framework.

Talk 3: Revisiting Model Selection for Math Programming-Based Approximate Dynamic Programming
Speaker: Negar Soheili
Abstract: Approximations of Markov decision processes are widely used to develop policies for large-scale sequential decision-making problems. Math-programming-based approximate dynamic programming (ADP) methods, such as approximate linear programming (ALP) and pathwise optimization (PO), are notable for providing both policies and upper bounds on policy performance. ALP and PO typically solve large-scale linear or convex programming models, with recent advances in first-order methods improving solvability. Traditionally, ADP models are compared, assuming fixed features and that models can be solved optimally, in which case PO outperforms ALP. However, with machine learning techniques like random features, both ALP and PO become asymptotically optimal as more features are added. We revisit model selection between ALP and PO under random features and introduce a new ALP relaxation that improves both quality and solvability compared to the original ALP for fixed features. When no clear dominance exists between the ALP relaxation and PO models, solution methods and model structure become critical. Our ALP relaxation has a separability structure, making it preferable to PO both theoretically and numerically when using block-coordinate descent, the state-of-the-art method for solving PO. These findings offer new insights into model selection for math-programming-based ADP, where feature architecture, model, and solution method must be considered collectively, which is potentially relevant beyond these specific approaches.

Speakers

Tobias Sutter


Mengmeng Li


Radek Salac

Doctoral Student, University of Konstanz
Name: Radek Salac. Title: Doctoral Student at the Chair of Machine Learning and Optimization. Affiliation: University of Konstanz.
Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 114 3501 Trousdale Pkwy, 114, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1H: Frontiers of Optimization for Machine Learning - Part I
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Frontiers of Optimization for Machine Learning - Part I
Chair: Fred Roosta
Cluster: Nonlinear Optimization

Talk 1: A KL-based Analysis Framework with Applications to Non-Descent Optimization Methods
Speaker: Andre Milzarek
Abstract: In this talk, we discuss a novel analysis framework for non-descent-type optimization methodologies in nonconvex scenarios based on the Kurdyka-Lojasiewicz property. The proposed framework allows covering a broad class of algorithms, including those commonly employed in stochastic and distributed optimization. Specifically, it enables the asymptotic analysis of first-order methods that lack a sufficient descent property and do not require access to full (deterministic) gradient information. We leverage this framework to establish, for the first time, iterate convergence and corresponding rates of convergence for the decentralized gradient method and federated averaging under mild assumptions. Additional applications of the developed analysis techniques and new convergence results will be shown (e.g., for the random reshuffling and the stochastic gradient descent method) to illustrate the generality of the approach.

Talk 2: Rockafellian Relaxation and Stochastic Optimization Under Perturbations
Speaker: Johannes Royset
Abstract: In practice, optimization models are often prone to unavoidable inaccuracies because of dubious assumptions and corrupted data. Traditionally, this placed special emphasis on risk-based and robust formulations, and their focus on “conservative” decisions. We develop, in contrast, an “optimistic” framework based on Rockafellian relaxations in which optimization is conducted not only over the original decision space but also jointly with a choice of model perturbation. The framework enables us to address challenging problems with ambiguous probability distributions from the areas of two-stage stochastic optimization without relatively complete recourse, probability functions lacking continuity properties, expectation constraints, and outlier analysis. We are also able to circumvent the fundamental difficulty in stochastic optimization that convergence of distributions fails to guarantee convergence of expectations. The framework centers on the novel concepts of exact and limit-exact Rockafellians, with interpretations of “negative” regularization emerging in certain settings. We illustrate the role of Phi-divergence, examine rates of convergence under changing distributions, and explore extensions to first-order optimality conditions. The main development is free of assumptions about convexity, smoothness, and even continuity of objective functions. Numerical results in the setting of computer vision and text analytics with label noise illustrate the framework.

Talk 3: Convergence analyses under computable models of nonsmoothness
Speaker: Vivak Patel
Abstract: In this talk, we discuss limitations of popular approaches to automatic differentiation for nonsmooth functions, and the issues that they can create for the reliability of gradient-based methods. We then discuss correct ways of computing differentials for nonsmooth functions, and we analyze the behavior of gradient-based methods under these corrected differentials.

Speakers

Fred Roosta


Andre Milzarek


Johannes Royset


Vivak Patel

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 116 3501 Trousdale Pkwy, 116, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1I: Nonconvex Optimization and Applications
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Nonconvex Optimization and Applications
Chair: Bissan Ghaddar
Cluster: Interplay Between Continuous and Discrete Optimization

Talk 1: Solution framework for the facility location problem under endogenous road congestion
Speaker: Marten Soer
Abstract: We consider a bi-level optimization problem, where the upper-level decision maker chooses locations of facilities by minimizing the total system travel time of the lower-level decision makers, while the lower-level decision makers, in turn, choose their fastest routes to the facilities. When the number of lower-level decision makers is large, their route choices affect the road congestion and thus the travel times. This setup is inspired by locating remote offices within the highly congested road network of Mexico City, but it is also applicable to other problems, such as the placement of electric vehicle charging stations and modeling electricity networks. To the best of our knowledge, the current literature focuses on finding a heuristic solution approach due to the complexity of the problem, and no approximation or exact solution framework for large instances exists. We study the properties of the problem and introduce an approximation algorithm that allows evaluating the quality of existing solutions and obtaining new close-to-optimal solutions. Moreover, we conduct numerical experiments on real-life instances to illustrate the applicability of the proposed endogenous road congestion solution framework and the effect of opening facility locations.

Talk 2: Finding long TSP tours in the unit square using the FICO Xpress global optimization solver
Speaker: Imre Polik
Abstract: We will investigate the problem of finding a set of points in the unit square such that the length of the optimal TSP tour over them is maximal. We will look at some interesting duality relations and provide nonconvex quadratic computational formulations. Additionally, we will look at related problems such as matching and the minimum pairwise distance. The theory will enable us to conduct computational experiments with the FICO Xpress global optimization MINLP solver and to find previously unknown optimal configurations. We will present some theoretical results and formulate several conjectures.

Talk 3: Facial Reduction for Semidefinite Relaxations of Combinatorial Optimization Problems
Speaker: Hao Hu
Abstract: In this talk, we present new findings on facial reduction for semidefinite relaxations of combinatorial optimization problems. In semidefinite programming (SDP), Slater’s condition is crucial for both theoretical convergence guarantees and the practical performance of optimization algorithms. When Slater’s condition fails, facial reduction can restore it through a finite sequence of reformulations. However, these reformulations often involve solving auxiliary optimization problems that can be as challenging as the original. Recent research has therefore focused on developing more efficient strategies for performing facial reduction. In our work, we specifically consider SDP problems that arise as relaxations of combinatorial optimization problems. This perspective enables us to exploit the underlying combinatorial structure, allowing the development of novel and highly efficient facial reduction techniques. We also establish theoretical results demonstrating the effectiveness of our approach. Numerical experiments further show that applying our specialized facial reduction method significantly improves both the speed and accuracy of solving SDP problems.

Speakers

Marten Soer


Imre Polik

Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 100 3518 Trousdale Pkwy, 100, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1J: Distributed Optimization and Learning
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Distributed Optimization and Learning
Chair: Shi Pu
Cluster: Multi-agent Optimization and Games

Talk 1: Variance-reduced accelerated methods for decentralized stochastic double-regularized nonconvex strongly-concave minimax problems
Speaker: Yangyang Xu
Abstract: In this talk, I will present an algorithmic framework for solving the decentralized, stochastic nonconvex strongly-concave (NCSC) minimax problem with nonsmooth regularization terms on both primal and dual variables, wherein a network of $m$ computing agents collaborate via peer-to-peer communications. We consider when the coupling function is in expectation or finite-sum form and the double regularizers are convex functions, applied separately to the primal and dual variables. Our algorithmic framework introduces a Lagrangian multiplier to eliminate the consensus constraint on the dual variable. Coupling this with variance-reduction (VR) techniques, our proposed method, entitled VRLM, by a single neighbor communication per iteration, is able to achieve an $\mathcal{O}(\kappa^3\varepsilon^{-3})$ sample complexity under the general stochastic setting, with either a big-batch or small-batch VR option, where $\kappa$ is the condition number of the problem and $\varepsilon$ is the desired solution accuracy. With a big-batch VR, we can additionally achieve $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity. Under the special finite-sum setting, our method with a big-batch VR can achieve an $\mathcal{O}(n + \sqrt{n} \kappa^2\varepsilon^{-2})$ sample complexity and $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity, where $n$ is the number of components in the finite sum. All complexity results match the best-known results achieved by a few existing methods for solving special cases of the problem we consider. To the best of our knowledge, this is the first work which provides convergence guarantees for NCSC minimax problems with general convex nonsmooth regularizers applied to both the primal and dual variables in the decentralized stochastic setting.

Talk 2: A Moreau Envelope Approach for LQR Meta-Policy Estimation
Speaker: César Uribe
Abstract: We study the problem of policy estimation for the Linear Quadratic Regulator (LQR) in discrete-time linear time-invariant uncertain dynamical systems. We propose a Moreau Envelope-based surrogate LQR cost, built from a finite set of realizations of the uncertain system, to define a meta-policy efficiently adjustable to new realizations. Moreover, we design an algorithm to find an approximate first-order stationary point of the meta-LQR cost function. Numerical results show that the proposed approach outperforms naive averaging of controllers on new realizations of the linear system. We also provide empirical evidence that our method has better sample complexity than Model-Agnostic Meta-Learning (MAML) approaches.
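
As a reminder of the central construction, the Moreau envelope of a function f with parameter λ > 0 is defined below; the talk builds a surrogate LQR cost by applying this kind of smoothing to the costs of a finite set of system realizations (the exact surrogate is not reproduced here).

M_{\lambda} f(x) \;=\; \min_{y} \; f(y) + \tfrac{1}{2\lambda}\, \| x - y \|^2 .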

Talk 3: An Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization
Speaker: Emre Sahinoglu
Abstract: We investigate the finite-time analysis of finding Goldstein stationary points for nonsmooth nonconvex objectives in decentralized stochastic optimization. A set of agents aim at minimizing a global function using only their local information by interacting over a network. We present a novel algorithm, called Multi Epoch Decentralized Online Learning (ME-DOL), for which we establish the sample complexity in various settings. First, using a recently proposed online-to-nonconvex technique, we show that our algorithm recovers the optimal convergence rate of smooth nonconvex objectives. We then extend our analysis to the nonsmooth setting, building on properties of randomized smoothing and Goldstein-subdifferential sets. We establish the sample complexity of $O(\delta^{-1}\epsilon^{-3})$, which to the best of our knowledge is the first finite-time guarantee for decentralized nonsmooth nonconvex stochastic optimization in the first-order setting (without weak-convexity), matching its optimal centralized counterpart. We further prove the same rate for the zero-order oracle setting without using variance reduction.
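
For intuition on the zero-order oracle setting, a standard two-point randomized-smoothing gradient estimator is sketched below (function names and the sampling scheme are illustrative assumptions; the talk's ME-DOL algorithm and its Goldstein-stationarity analysis are not captured by this snippet).

import numpy as np

def two_point_grad_estimate(f, x, delta, rng):
    # Estimate a gradient of the randomized smoothing of f at x using only two
    # function evaluations along a random unit direction u.
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    d = x.size
    return d / (2.0 * delta) * (f(x + delta * u) - f(x - delta * u)) * u

# toy usage on a nonsmooth objective (hypothetical)
rng = np.random.default_rng(0)
g = two_point_grad_estimate(lambda z: np.abs(z).sum(), np.array([1.0, -2.0]), delta=0.01, rng=rng)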

Speakers

Yangyang Xu


César Uribe

Assistant Professor, Electrical and Computer Engineering, Rice University
César A. Uribe received his BSc. in Electronic Engineering from Universidad de Antioquia in 2010. He then received an MSc. in Systems and Control from Delft University of Technology in the Netherlands in 2013. In 2016, he received an MSc. in Applied Mathematics from the University...
Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 102 3501 Trousdale Pkwy, 102, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1K: Optimization for GenAI -- diffusion models and LLMs
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Optimization for GenAI -- diffusion models and LLMs
Chair: Wenpin Tang
Cluster: Optimization For Data Science

Talk 1: Gradient Guidance for Diffusion Models: An Optimization Perspective
Speaker: Minshuo Chen
Abstract: Diffusion models have demonstrated empirical successes in various applications and can be adapted to task-specific needs via guidance. This talk introduces a form of gradient guidance for adapting or fine-tuning diffusion models towards user-specified optimization objectives. We study the theoretical aspects of a guided score-based sampling process, linking the gradient-guided diffusion model to first-order optimization. We show that adding gradient guidance to the sampling process of a pre-trained diffusion model is essentially equivalent to solving a regularized optimization problem, where the regularization term acts as a prior determined by the pre-training data. We further consider an iteratively fine-tuned version of gradient-guided diffusion where one can query gradients at newly generated data points and update the score network using new samples. This process mimics a first-order optimization iteration in expectation, for which we prove an O(1/K) convergence rate to the global optimum when the objective function is concave.
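
A minimal Langevin-style sketch of how a gradient-guidance term enters a score-based sampling update is given below (function names, the step size, and the update form are illustrative assumptions; the talk analyzes a specific guided score process and fine-tuning loop rather than this simplified step).

import torch

def guided_langevin_step(x, t, score_net, grad_f, step_size, guidance_weight):
    # Guided score = pre-trained score plus a scaled gradient of the user objective f,
    # so samples are nudged toward regions with higher objective value.
    s = score_net(x, t) + guidance_weight * grad_f(x)
    noise = torch.randn_like(x)
    return x + step_size * s + (2.0 * step_size) ** 0.5 * noise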

Talk 2: RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Speaker: Hanyang Zhao
Abstract: Recently, numerous preference optimization algorithms have been introduced as extensions to the Direct Preference Optimization (DPO) family. While these methods have successfully aligned models with human preferences, there is a lack of understanding regarding the contributions of their additional components. Moreover, fair and consistent comparisons are scarce, making it difficult to discern which components genuinely enhance downstream performance. In this work, we propose RainbowPO, a unified framework that demystifies the effectiveness of existing DPO methods by categorizing their key components into seven broad directions. We integrate these components into a single cohesive objective, enhancing the performance of each individual element. Through extensive experiments, we demonstrate that RainbowPO outperforms existing DPO variants. Additionally, we provide insights to guide researchers in developing new DPO methods and assist practitioners in their implementations.
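
For orientation, the vanilla DPO objective that these variants extend can be written as in the sketch below (tensor names and the beta value are illustrative assumptions); RainbowPO's contribution lies in categorizing and combining the additional components that the DPO variants layer on top of this loss.

import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Direct Preference Optimization: increase the policy's log-likelihood margin on the
    # preferred response relative to the reference model's margin, scaled by beta.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(beta * margin).mean()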

Talk 3: A preliminary study on the generation process of diffusion models with different noise distributions
Speaker: Nanshan Jia
Abstract: We propose a class of structured diffusion models, in which the prior distribution is chosen as a mixture of Gaussians, rather than a standard Gaussian distribution. The specific mixed Gaussian distribution, as prior, can be chosen to incorporate certain structured information of the data. We develop a simple-to-implement training procedure that smoothly accommodates the use of mixed Gaussian as prior. Theory is provided to quantify the benefits of our proposed models, compared to the classical diffusion models. Numerical experiments with synthetic, image and operational data are conducted to show comparative advantages of our model. Our method is shown to be robust to mis-specifications and in particular suits situations where training resources are limited or faster training in real time is desired.

Speakers

Wenpin Tang


Minshuo Chen


Hanyang Zhao


Nanshan Jia

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 118 3501 Trousdale Pkwy, 118, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1L: Applications of polynomial optimization to data analysis I
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Applications of polynomial optimization to data analysis I
Chair: Victor Magron
Cluster: Conic and Semidefinite Optimization

Talk 1: Learning polynomial Lyapunov functions from fixed data
Speaker: Oumayma Khattabi
Abstract: Stability analysis, a cornerstone of control theory, usually relies on an explicit system model. Recently, a surge of data-driven control methods occurred, due to data abundance, high computational power and model shortcomings (inaccuracies, model complexity...). This has given rise to new frameworks relying less on modelling and more on data. In particular, optimization-based frameworks, which minimize generalization errors, play a key role in this area of research, see e.g. Kernel Predictive Control (KPC) and Data-enabled Predictive Control (DeePC). More specifically, polynomial optimization and sum-of-squares (SoS) programming found an important application in this field, in a recent contribution by T. Martin and F. Allgöwer, relying on polynomial representation of model-free nonlinear systems, through a set-membership characterization for Taylor polynomials derived from a noisy dataset. For better accuracy, multiple Taylor polynomials are combined into a piecewise polynomial representation, which enhances system property inference and allows for the verification of dissipativity properties. In this context, we propose a novel data-driven methodology to compute certified inner approximations of a region of attraction, for an equilibrium point associated to a dynamical system with unknown model. The approach requires an input-output dataset and information on the variations of the dynamics (e.g. Lipschitz bounds), and returns a Lyapunov function, valid for any dynamical system that matches the dataset and bounds given. To perform this task, the (compact) admissible state space is first partitioned into an appropriate tessellation, after which a polynomial Lyapunov candidate is assigned to each of the resulting cells. The Lyapunov condition is enforced on each cell, and complemented with boundary conditions enforcing continuity of the resulting global, piecewise polynomial Lyapunov candidate. A key contribution is that the Lyapunov condition is split into learning subproblems, following the observation that the more datapoints, the more difficult it is to analyze the uncertainty set for the ground truth. The whole learning problem can be recast in the form of an SoS programming problem, resulting in semidefinite programming problems (SDP) of increasing size. Interestingly, thanks to our data-wise splitting, the special case of degree one, i.e. piecewise-affine Lyapunov candidates, can be relaxed into a second-order cone programming problem (SOCP) while maintaining convergence guarantees, resulting in much faster computations than the higher-degree SDP formulations. Another key contribution of this work, for higher degrees of the polynomial, is the inclusion of Lagrangian duality, which hasn't figured in previous works in data-driven SoS programming for dynamical systems. This approach opens the door to a probabilistic interpretation of the methodology.

Talk 2: A sparsified Christoffel function for high-dimensional inference
Speaker: Lucas Slot
Abstract: Christoffel polynomials are classical tools from approximation theory. They can be used to estimate the (compact) support of a measure based on its low-degree moments. Recently, they have been applied to problems in data science, including outlier detection and support inference. A major downside of Christoffel polynomials in such applications is the fact that, in order to compute their coefficients, one must invert a matrix whose size grows rapidly with the dimension. In this talk, we propose a modification of the Christoffel polynomial which is significantly cheaper to compute, but retains many of its desirable properties. Our approach relies on sparsity of the underlying measure, described by a graphical model. The complexity of our modification depends on the treewidth of this model. Based on joint work with Jean-Bernard Lasserre.
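For context, the classical (unsparsified) Christoffel function referred to above admits a closed form via the moment matrix; the sparsified variant proposed in the talk is not reproduced here. Given a measure $\mu$ on $\mathbb{R}^n$ with moment matrix $M_d(\mu) = \int v_d(x) v_d(x)^\top \, d\mu(x)$, where $v_d(x)$ collects all monomials of degree at most $d$, the degree-$d$ Christoffel function is $\Lambda_d^\mu(x) = \left( v_d(x)^\top M_d(\mu)^{-1} v_d(x) \right)^{-1}$; inverting $M_d(\mu)$, whose size grows rapidly with $n$ and $d$, is precisely the costly step that motivates the sparsified construction.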

Talk 3: Verifying Properties of Binary Neural Networks Using Sparse Polynomial Optimization
Speaker: Srećko Ðurašinović
Abstract: In this talk, we explore methods for verifying the properties of Binary Neural Networks (BNNs), focusing on robustness against adversarial attacks. Despite their lower computational and memory needs, BNNs, like their full-precision counterparts, are also sensitive to input perturbations. Established methods for solving this problem are predominantly based on Satisfiability Modulo Theories and Mixed-Integer Linear Programming techniques, which often face scalability issues. We introduce an alternative approach using Semidefinite Programming relaxations derived from sparse Polynomial Optimization. Our approach, compatible with continuous input space, not only mitigates numerical issues associated with floating-point calculations but also enhances verification scalability through the strategic use of tighter first-order semidefinite relaxations. We demonstrate the effectiveness of our method in verifying robustness against both infinity-norm and L2-norm based adversarial attacks.

Speakers
avatar for Oumayma Khattabi

Oumayma Khattabi

PhD student, Paris-Saclay University
I am currently working on stability analysis of dynamic systems using data.
LS

Lucas Slot

Postdoc, ETH Zurich
avatar for Srećko Ðurašinović

Srećko Ðurašinović

PhD Student, Nanyang Technological University, Singapore
Areas of interest: Sparse Polynomial Optimization, Neural Network Verification, Christoffel-Darboux Kernels
Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 119 3501 Trousdale Pkwy, 119, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1M: Optimization Method Generation with Large Language Models
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Optimization Method Generation with Large Language Models
Chair: Xiangfeng Wang
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: Generative Models in Reinforcement Learning
Speaker: Wenhao Li
Abstract: ~

Talk 2: LLM-based Simulation Optimization
Speaker: Jun Luo
Abstract: ~

Talk 3: LLM-based Optimization Method for Scheduling
Speaker: Xiangfeng Wang
Abstract: ~

Speakers
WL

Wenhao Li

JL

Jun Luo

XW

Xiangfeng Wang

Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 157 3518 Trousdale Pkwy, 157, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1N: Optimization for Robotics I
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Optimization for Robotics I
Chair: Panos Patrinos
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: Robotics Applications of the Direct Optimal Control of Nonsmooth Systems
Speaker: Anton Pozharskiy
Abstract: When developing control algorithms for robotic systems, practitioners must contend with modeling complex environmental interactions, including contact and friction, which are often modeled as nonsmooth dynamical systems. We discuss several alternate models of varying fidelity that can be applied to robotic manipulation problems, particularly comparing those coming from complementarity-Lagrangian models and those coming from simpler projected dynamical systems. To be used efficiently in an optimal control context, these systems must be accurately discretized, as naive discretizations result in low accuracy and incorrect sensitivities. To this end, the Finite Elements with Switch Detection (FESD) discretization can be applied, which results in nonsmooth optimization problems called Mathematical Programs with Complementarity Constraints (MPCCs). The theoretical and practical difficulties of solving MPCCs coming from optimal control and several solution methods are then described. Finally, we present the open-source package nosnoc, in which both the discretization and MPCC solution methods are implemented.

Talk 2: Real-time constrained nonlinear MPC in robotics: augmented Lagrangians and fast block-sparse matrix factorizations
Speaker: Wilson Jallet
Abstract: In high-dimensional robotic platforms, such as legged robots and humanoids, achieving real-time control is a critical challenge, particularly when managing complex dynamics and constraints in nonlinear model predictive control (MPC). This talk presents recent advances in constrained nonlinear MPC, focusing on augmented Lagrangian methods and fast block-sparse matrix factorizations. By exploiting the block-banded structure arising from the time dependency in MPC, we extend the Riccati recursion to efficiently handle constraints. Additionally, a Schur complement-like approach enables parallelization, significantly accelerating computation. We also discuss ongoing developments in a flexible C++ library, open-sourced last year, designed for real-time robotic applications. Current work emphasizes performance optimization, including updating OCPs and warm-starting MPC, improvements to cache-friendliness and future work on a computation graph. Flexibility remains a key focus, enabling users to define dynamics from their own ODE or DAE models (such as those provided by the Pinocchio rigid-body dynamics library), with support for a variety of time integrators, such as Euler and Runge-Kutta (with potential support for more advanced, energy-conserving integrators in the future). Additionally, we explore the use of generalized augmented Lagrangian methods, which allow geometric handling of more complex constraint sets, further enhancing the library's capabilities for constrained optimization. These advancements aim to make real-time control in complex robotic systems, particularly humanoids, more efficient and adaptable.

Talk 3: High-performance linear algebra in quadratic programming solvers for real-time optimal control
Speaker: Pieter Pas
Abstract: Model predictive control (MPC) is a powerful control strategy that is widely used in robotics due to its excellent performance and the ability to handle constraints. However, the real-time implementation of MPC presents significant computational challenges, especially in high-speed or large-scale control applications. Efficient numerical optimization solvers are therefore essential, and remain an active area of research. Solvers based on quadratic programming and interior point methods both rely on the fast solution of linear systems with a particular KKT structure. In this talk, we explore how the specific block-wise structure of KKT systems that arise in optimal control problems can be exploited in specialized batched linear algebra routines. By employing tailored storage schemes and highly optimized micro-kernels, combined with advanced vectorization and parallelization techniques, these routines leverage the full power of modern hardware, even for small to moderately sized models. We conclude by demonstrating that the practical performance of the quadratic programming solver QPALM can be substantially improved by replacing its general-purpose linear solver with optimal-control-specific variants based on the aforementioned batched linear algebra routines. The resulting QPALM-OCP solver is released as an open-source software library.

Speakers
AP

Anton Pozharskiy

WJ

Wilson Jallet

PP

Pieter Pas

Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 256 3518 Trousdale Pkwy, 256, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1O: Optimization For Data Science
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Optimization For Data Science
Chair: Niao He
Cluster: Optimization For Data Science

Talk 1: Momentum & stochasticity — some insights from continuous time models
Speaker: Stephan Wojtowytsch
Abstract: Gradient descent and its variations (stochastic, with or without momentum) are the workhorses of machine learning. We give some examples where continuous-time models yield insight into high-dimensional optimization problems in a machine learning context, including possible effects of large step sizes relative to the continuous dynamics.
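As background for this talk, the two standard continuous-time models are the gradient flow $\dot{x}(t) = -\nabla f(x(t))$ for plain gradient descent and, for Nesterov-type momentum, the ODE $\ddot{x}(t) + \frac{3}{t}\dot{x}(t) + \nabla f(x(t)) = 0$ of Su, Boyd, and Candès; the specific models and step-size effects analyzed in the talk may go beyond these, so the equations here are for orientation only.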

Talk 2: On a continuous time model of gradient descent dynamics and instability in deep learning
Speaker: Mihaela Rosca
Abstract: The recipe behind the success of deep learning has been the combination of neural networks and gradient-based optimization. Understanding the behavior of gradient descent however, and particularly its instability, has lagged behind its empirical success. To add to the theoretical tools available to study gradient descent we propose the principal flow (PF), a continuous time flow that approximates gradient descent dynamics. To our knowledge, the PF is the only continuous flow that captures the divergent and oscillatory behaviors of gradient descent, including escaping local minima and saddle points. Through its dependence on the eigendecomposition of the Hessian the PF sheds light on the recently observed edge of stability phenomena in deep learning. Using our new understanding of instability we propose a learning rate adaptation method which enables us to control the trade-off between training stability and test set evaluation performance.

Talk 3: A Hessian-Aware Stochastic Differential Equation for Modelling SGD
Speaker: Zebang Shen
Abstract: Continuous-time approximation of Stochastic Gradient Descent (SGD) is a crucial tool to study its escaping behaviors from stationary points. However, existing stochastic differential equation (SDE) models fail to fully capture these behaviors, even for simple quadratic objectives. Built on a novel stochastic backward error analysis framework, we derive the Hessian-Aware Stochastic Modified Equation (HA-SME), an SDE that incorporates Hessian information of the objective function into both its drift and diffusion terms. Our analysis shows that HA-SME matches the order-best approximation error guarantee among existing SDE models in the literature, while achieving a significantly reduced dependence on the smoothness parameter of the objective. Further, for quadratic objectives, under mild conditions, HA-SME is proved to be the first SDE model that recovers exactly the SGD dynamics in the distributional sense. Consequently, when the local landscape near a stationary point can be approximated by quadratics, HA-SME is expected to accurately predict the local escaping behaviors of SGD.
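For orientation, the classical first-order SDE model of SGD that HA-SME refines is, for step size $\eta$ and gradient-noise covariance $\Sigma(x)$, $\mathrm{d}X_t = -\nabla f(X_t)\,\mathrm{d}t + \sqrt{\eta}\,\Sigma(X_t)^{1/2}\,\mathrm{d}W_t$; HA-SME additionally injects Hessian information of the objective into both the drift and the diffusion terms, and its exact form is given in the talk rather than reproduced here.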

Speakers
avatar for Stephan Wojtowytsch

Stephan Wojtowytsch

Name: Stephan Wojtowytsch. Affiliation: University of Pittsburgh. Bio: I am an assistant professor in the Department of Mathematics at the University of Pittsburgh. My research interests lie in the mathematics of machine learning and data science. Previously, I was an assistant profe... Read More →
MR

Mihaela Rosca

ZS

Zebang Shen

Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 258 3518 Trousdale Pkwy, 258, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1P: Advances in large-scale nonlinear optimization for data science
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Advances in large-scale nonlinear optimization for data science
Chair: Jiawei Zhang
Cluster: Nonlinear Optimization

Talk 1: On Squared-Variable Formulations for Nonlinear Semidefinite Programming
Speaker: Lijun Ding
Abstract: We study squared-variable formulations for nonlinear semidefinite programming. We show an equivalence result of second-order stationary points of the nonsymmetric-squared-variable formulations and the nonlinear semidefinite programs. We also show that such an equivalence fails for the local minimizers and second-order stationary points of the symmetric-squared-variable formulations and the nonlinear semidefinite programs, correcting a false understanding in the literature and providing sufficient conditions for such a correspondence to hold.

Talk 2: High-probability complexity guarantees for nonconvex minimax problems
Speaker: Yasa Syed
Abstract: Stochastic smooth nonconvex minimax problems are prevalent in machine learning, e.g., GAN training, fair classification, and distributionally robust learning. Stochastic gradient descent ascent (GDA)-type methods are popular in practice due to their simplicity and single-loop nature. However, there is a significant gap between the theory and practice regarding high-probability complexity guarantees for these methods on stochastic nonconvex minimax problems. Existing high-probability bounds for GDA-type single-loop methods only apply to convex/concave minimax problems and to particular non-monotone variational inequality problems under some restrictive assumptions. In this work, we address this gap by providing the first high-probability complexity guarantees for nonconvex/PL minimax problems corresponding to a smooth function that satisfies the PL-condition in the dual variable. Specifically, we show that when the stochastic gradients are light-tailed, the smoothed alternating GDA method can compute an $\varepsilon$-stationary point within $\mathcal{O}(\frac{\ell \kappa^2 \delta^2}{\varepsilon^4} + \frac{\kappa}{\varepsilon^2}(\ell+\delta^2\log({1}/{q})))$ stochastic gradient calls with probability at least $1-q$ for any $q\in(0,1)$, where $\mu$ is the PL constant, $\ell$ is the Lipschitz constant of the gradient, $\kappa=\ell/\mu$ is the condition number, and $\delta^2$ denotes a bound on the variance of stochastic gradients. We also present numerical results on a nonconvex/PL problem with synthetic data and on distributionally robust optimization problems with real data, illustrating our theoretical findings.

Talk 3: Sparse Solutions to Linear Systems via Polyak’s Stepsize
Speaker: Yura Malitsky
Abstract: This talk explores the implicit bias of entropic mirror descent in finding sparse solutions to linear systems, emphasizing the importance of appropriate initialization. We present an adaptive approach to improving the algorithm, using Polyak's stepsizes as a key tool.
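To make the stepsize concrete, below is a minimal sketch of Polyak's stepsize applied to a consistent linear system, where the optimal value of $f(x) = \tfrac{1}{2}\|Ax-b\|^2$ is zero and the stepsize $f(x_k)/\|\nabla f(x_k)\|^2$ is therefore computable. This is plain gradient descent, not the entropic mirror descent studied in the talk; it is included only to illustrate the stepsize rule.

    import numpy as np

    def polyak_gd_linear_system(A, b, x0, iters=500):
        """Plain gradient descent with Polyak's stepsize on f(x) = 0.5*||Ax - b||^2.

        For a consistent linear system the optimal value f* is 0, so Polyak's
        stepsize f(x_k)/||grad f(x_k)||^2 is computable without extra knowledge.
        Illustration of the stepsize rule only; the talk studies entropic
        mirror descent, whose update differs from the one below.
        """
        x = np.asarray(x0, dtype=float).copy()
        for _ in range(iters):
            r = A @ x - b                  # residual
            g = A.T @ r                    # gradient of 0.5*||Ax - b||^2
            gnorm2 = float(g @ g)
            if gnorm2 == 0.0:
                break
            step = 0.5 * float(r @ r) / gnorm2   # Polyak stepsize with f* = 0
            x = x - step * g
        return x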

Speakers
JZ

Jiawei Zhang

LD

Lijun Ding

YS

Yasa Syed

YM

Yura Malitsky

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 106 3501 Trousdale Pkwy, 106, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1Q: Optimization with Approximate or Uncertain Models
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Optimization with Approximate or Uncertain Models
Chair: Matthias Heinkenschloss
Cluster: PDE-constrained Optimization

Talk 1: Preconditioned Pseudo-time Continuation for Parameterized Inverse Problems
Speaker: Bart van Bloemen Waanders
Abstract: In this presentation we discuss a continuation approach to overcome linearization limitations associated with evaluating the post-optimality sensitivity of uncertainties with respect to optimization solutions. The post-optimality sensitivities (sensitivities of the first-order optimality condition) arise from the Implicit Function Theorem and depend on second-order derivatives. If the magnitude of uncertainty is large, the post-optimality sensitivities are insufficient to predict the effects of perturbed uncertainty parameters on the optimization solution. To address this issue, we introduce a continuation process that uses a pseudo time-stepping scheme to evolve the sensitivities. A combination of specialized time discretization and preconditioning helps to accelerate convergence. A key computational challenge is the calculation of the inverse Hessian as part of the post-optimality sensitivity evaluation at each iteration of the continuation process. To that end, we use a preconditioned Conjugate Gradient (PCG) solution strategy in which two novel quasi-Newton update schemes are implemented that exploit the pseudo-time continuation structure. Our first update scheme introduces a secant equation to capture the uncertainty variations. The second is an adaptation of block BFGS methods that leverages the PCG iteration history. We demonstrate our approach on an insightful yet simple Poisson PDE with nonlinear boundary conditions and a nonlinear forcing term that in turn embeds uncertainty. We invert for a spatially distributed diffusion coefficient and demonstrate the efficacy of our time-stepping and preconditioning algorithms.

Talk 2: Shape and topology optimization under uncertainty by robust approaches with application to electric machines
Speaker: Stefan Ulbrich
Abstract: We consider shape and topology optimization for PDE-constrained problems, where parameters in the PDE (e.g. coefficients) as well as the design itself (e.g. manufacturing tolerances) are uncertain. We propose a robust optimization approach, where the usually nonsmooth maximum value functions of constraints and objective function on the uncertainty sets are used in the robust counterpart. We discuss the efficient calculation of generalized derivatives of the robustified objective function and constraints. In particular, we introduce a novel robust topological derivative that can be used for robust topology optimization. We apply the methodology to shape and topology optimization of electric machines.

Talk 3: Adaptive Surrogate Modeling for Trajectory Optimization with Model Inexactness
Speaker: Matthias Heinkenschloss
Abstract: In many applications, one must compute optimal trajectories from imperfect knowledge of the dynamics. For example, solving trajectory optimization problems for hypersonic vehicles requires computing lift and drag coefficients at many flight configurations. Determining these coefficients over the entire state space would require expensive high-fidelity computations using detailed representations of the hypersonic vehicle at prohibitively many samples. This talk proposes using computationally inexpensive adaptive kernel regression models constructed from high-fidelity samples to approximate the components of the dynamics that are expensive to evaluate. To reduce the effect of model errors on the optimal trajectory, the current kernel regression model is updated as needed at the cost of evaluating the components of the dynamics at a small number of additional sample points. First, the optimal control problem is solved using the current kernel model to represent the dynamics. Next, a new optimality sensitivity analysis is combined with error estimates of the kernel model to determine whether the kernel regression model needs to be updated and, if so, at which samples the dynamics should be evaluated to update it. This talk outlines our current model refinement procedure and demonstrates its performance on a trajectory optimization problem for a hypersonic vehicle with lift and drag models that are known but expensive to evaluate.

Speakers
BV

Bart van Bloemen Waanders

SU

Stefan Ulbrich

MH

Matthias Heinkenschloss

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 214 3501 Trousdale Pkwy, 214, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1R: Advances in Riemannian Optimization: Algorithms and Applications
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Advances in Riemannian Optimization: Algorithms and Applications
Chair: Andi Han
Cluster: Optimization on Manifolds

Talk 1: Riemannian Accelerated Zeroth-order Algorithm: Improved Robustness and Lower Query Complexity
Speaker: Chang He
Abstract: Optimization problems with access to only zeroth-order information of the objective function on Riemannian manifolds arise in various applications, spanning from statistical learning to robot learning. While various zeroth-order algorithms have been proposed in Euclidean space, they are not inherently designed to handle the challenging constraints imposed by Riemannian manifolds. The proper adaptation of zeroth-order techniques to Riemannian manifolds remained unknown until the pioneering work of (Li et al., 2023a). However, zeroth-order algorithms are widely observed to converge slowly and be unstable in practice. To alleviate these issues, we propose a Riemannian accelerated zeroth-order algorithm with improved robustness. Regarding efficiency, our accelerated algorithm has a function query complexity of $\mathcal{O}(\epsilon^{-7/4}d)$ for finding an $\epsilon$-approximate first-order stationary point. By introducing a small perturbation, it exhibits a function query complexity of $\tilde{\mathcal{O}}(\epsilon^{-7/4}d)$ for seeking a second-order stationary point with high probability, matching the state-of-the-art result in Euclidean space. Moreover, we further establish almost sure convergence in the asymptotic sense through the Stable Manifold Theorem. Regarding robustness, our algorithm requires larger smoothing parameters on the order of $\tilde{\mathcal{O}}(\epsilon^{7/8}d^{-1/2})$, improving the existing result by a factor of $\tilde{\mathcal{O}}(\epsilon^{3/4})$.

Talk 2: Extragradient Type Methods for Riemannian Variational Inequality Problems
Speaker: Zihao Hu
Abstract: In this work, we consider monotone Riemannian Variational Inequality Problems (RVIPs), which encompass both Riemannian convex optimization and minimax optimization as particular cases. In Euclidean space, the last iterates of both the extragradient (EG) and past extragradient (PEG) methods converge to the solution of monotone variational inequality problems at a rate of $O\left(\frac{1}{\sqrt{T}}\right)$ (Cai et al., 2022). However, whether analogous behavior holds on Riemannian manifolds remains open. To bridge this gap, we introduce the Riemannian extragradient (REG) and Riemannian past extragradient (RPEG) methods. We show that both exhibit $O\left(\frac{1}{\sqrt{T}}\right)$ last-iterate convergence and $O\left(\frac{1}{T}\right)$ average-iterate convergence, aligning with observations in the Euclidean case. These results are enabled by judiciously addressing the holonomy effect, so that additional complications in the Riemannian case can be reduced and the Euclidean proof, inspired by the performance estimation problem (PEP) technique, can be applied again.
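For readers less familiar with these methods, the Euclidean templates being generalized are as follows: with operator $F$ and stepsize $\eta$, the extragradient method iterates $x_{k+1/2} = x_k - \eta F(x_k)$ and $x_{k+1} = x_k - \eta F(x_{k+1/2})$, while the past-extragradient (optimistic) method reuses the previous extrapolation point via $x_{k+1/2} = x_k - \eta F(x_{k-1/2})$ and $x_{k+1} = x_k - \eta F(x_{k+1/2})$. In the Riemannian versions, the subtractions are replaced by exponential maps or retractions and tangent vectors must be transported between tangent spaces, which is where the holonomy issues mentioned above enter; the precise REG/RPEG updates are given in the talk.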

Talk 3: Riemannian ADMM
Speaker: Jiaxiang Li
Abstract: We consider a class of Riemannian optimization problems where the objective is the sum of a smooth function and a nonsmooth function, considered in the ambient space. This class of problems finds important applications in machine learning and statistics, such as sparse principal component analysis, sparse spectral clustering, and orthogonal dictionary learning. We propose a Riemannian alternating direction method of multipliers (ADMM) to solve this class of problems. Our algorithm adopts easily computable steps in each iteration. The iteration complexity of the proposed algorithm for obtaining an $\epsilon$-stationary point is analyzed under mild assumptions. Existing ADMM methods for nonconvex problems either do not allow a nonconvex constraint set or do not allow a nonsmooth objective function. In contrast, our complexity result is established for problems with a simultaneously nonsmooth objective and manifold constraint. Numerical experiments are conducted to demonstrate the advantage of the proposed method.

Speakers
AH

Andi Han

CH

Chang He

ZH

Zihao Hu

JL

Jiaxiang Li

Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 154 3518 Trousdale Pkwy, 154, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1S: Advances in Semidefinite Programming: Relaxations, Hierarchies, and Optimization Techniques
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Advances in Semidefinite Programming: Relaxations, Hierarchies, and Optimization Techniques
Chair: Makoto Yamashita & Sunyoung Kim
Cluster: Conic and Semidefinite Optimization

Talk 1: Well-conditioned primal-dual interior-point method for accurate low-rank semidefinite programming
Speaker: Hong-Ming Chiu
Abstract: We describe how the low-rank structure in an SDP can be exploited to reduce the per-iteration cost of a convex primal-dual interior-point method down to cubic time and quadratic memory, even at very high accuracies. A traditional difficulty is the dense Newton subproblem at each iteration, which becomes progressively ill-conditioned as progress is made towards the solution. Preconditioners have previously been proposed to improve conditioning, but these can be expensive to set up, and become ineffective as the preconditioner itself becomes increasingly ill-conditioned at high accuracies. Instead, we present a well-conditioned reformulation of the Newton subproblem that is cheap to set up, and whose condition number is guaranteed to remain bounded over all iterations. In theory, applying an inner iterative method to the reformulation reduces the per-iteration cost of the outer interior-point method to cubic time and quadratic memory. We also present a well-conditioned preconditioner that greatly improves the convergence of the inner iterations.

Talk 2: Exact SDP relaxations for a class of quadratic programs with finite and infinite quadratic constraints
Speaker: Sunyoung Kim
Abstract: We study exact semidefinite programming (SDP) relaxations for the problem of minimizing a nonconvex quadratic objective function over a feasible region defined by both finitely and infinitely many nonconvex quadratic inequality constraints (semi-infinite QCQPs). Specifically, we present two sufficient conditions on the feasible region under which the QCQP, with any quadratic objective function over the feasible region, is equivalent to its SDP relaxation. The first condition is an extension of a result recently proposed by the authors (arXiv:2308.05922, to appear in SIAM J. Optim.) from finitely constrained quadratic programs to semi-infinite QCQPs. The newly introduced second condition offers a clear geometric characterization of the feasible region for a broad class of QCQPs that are equivalent to their SDP relaxations. Several illustrative examples, including quadratic programs with ball-, parabola-, and hyperbola-based constraints, are also provided.

Talk 3: Semidefinite Programming Relaxation Hierarchy Using Third-Order Tensors for Constrained Polynomial Optimization
Speaker: Makoto Yamashita
Abstract: Exploiting the computational structure of third-order tensors, Zheng et al. (2022) proposed a semidefinite programming (SDP) hierarchy of relaxations for unconstrained polynomial optimization problems (POPs). We extend this by employing the Lagrange function to propose a hierarchy of SDP relaxations for constrained polynomial optimization problems involving third-order tensors. This relaxation can be computationally efficient, as it can be transformed into an SDP problem with a block-diagonal matrix structure via the discrete Fourier transform. Additionally, we show that, under a mild assumption, the objective value of the hierarchy converges to the optimal value of the POP as the relaxation degree increases.

Speakers
HC

Hong-Ming Chiu

PhD student, University of Illinois Urbana Champaign
SK

Sunyoung Kim

avatar for Makoto Yamashita

Makoto Yamashita

Professor, Institute of Science Tokyo
Name: Dr. Makoto Yamashita. Title: Professor. Affiliation: Institute of Science Tokyo. Bio: Dr. Makoto Yamashita is a professor in the Department of Mathematical and Computing Science of the Institute of Science Tokyo. His recent research interests include conic optimization and its applications... Read More →
Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 155 3518 Trousdale Pkwy, 155, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1T: Recent Advances in Stochastic Optimization: Complexity, Adaptivity, and Nonsmooth Extensions (I)
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Recent Advances in Stochastic Optimization: Complexity, Adaptivity, and Nonsmooth Extensions (I)
Chair: Sen Na & Zhaosong Lu
Cluster: Nonlinear Optimization

Talk 1: Adaptive Optimization with Highly Corrupted Inputs: A Unified Framework for High-Probability Iteration Complexity Analysis
Speaker: Miaolan Xie
Abstract: We consider an unconstrained continuous optimization problem in which gradient estimates may be arbitrarily corrupted in each iteration with a probability greater than $\frac 1 2$. Additionally, function value estimates may be noisy or adversarially corrupted throughout the algorithm’s execution. This framework is applicable to many real-world problems and is particularly relevant to stochastic and derivative-free optimization settings. We introduce an algorithmic and analytical framework that provides high probability bounds on iteration complexity for this highly corrupted setting. The analysis offers a unified approach, accommodating noisy or corrupted inputs and encompassing methods such as line search and trust region.

Talk 2: Adaptive Stochastic Algorithms for Nonconvex Constrained Optimization
Speaker: Baoyu Zhou
Abstract: In this talk, we will discuss some recent works on the design, analysis, and implementation of a class of efficient algorithms for solving stochastic optimization problems with deterministic nonlinear nonconvex constraints. Those optimization problems arise in a plethora of science and engineering applications including physics-informed learning, PDE-constrained optimization, machine learning fairness, and optimal power flow. We are especially interested in the case where the problem's feasible region is difficult to detect and projection-type methods are intractable. The theoretical results and numerical performance demonstrate the efficiency and efficacy of our proposed algorithms.

Talk 3: Variance-Reduced First-Order Methods for Constrained Stochastic Optimization
Speaker: Zhaosong Lu
Abstract: We study a class of deterministically constrained stochastic optimization problems. Existing methods typically aim to find an approximate stochastic stationary point, where the expected violations of both the constraints and first-order stationarity are nearly satisfied. However, such approximate solutions can lead to significant constraint violations. To address this issue, we propose single-loop variance-reduced stochastic first-order methods. In our approach, the stochastic gradient of the stochastic component is computed using either a truncated recursive scheme or a truncated Polyak momentum scheme for variance reduction, while the gradient of the deterministic component is computed exactly. Under suitable assumptions, our proposed methods not only achieve new sample and first-order operation complexity but also produce stronger approximate stochastic stationary points that more reliably satisfy the constraints compared to existing methods.

Speakers
avatar for Miaolan Xie

Miaolan Xie

Assistant professor, Purdue University
Name: Miaolan Xie. Title: Assistant Professor of Stochastic Optimization and Continuous Optimization. Affiliation: Purdue University
BZ

Baoyu Zhou

ZL

Zhaosong Lu

Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 158 3518 Trousdale Pkwy, 158, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1U: Recent Advances in Fixed-Point Methods and Stability in Optimization Problems
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Recent Advances in Fixed-Point Methods and Stability in Optimization Problems
Chair: Chao Ding
Cluster: Fixed Points and Variational Inequalities

Talk 1: HPR-QP: An implementation of an HPR method for solving quadratic programming
Speaker: Kaihuang Chen
Abstract: This talk introduces HPR-QP, a solver based on a Halpern Peaceman-Rachford (HPR) method for solving quadratic programming (QP). The iteration complexity of the algorithm is $O(1/k)$ in terms of the Karush-Kuhn-Tucker residual and the objective error. We compare the performance of HPR-QP and other solvers on extensive QP datasets.
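As background, the Halpern anchoring that gives the method its name takes a nonexpansive operator $T$ (here the Peaceman-Rachford operator for the QP) and an anchor point $x^0$, and iterates $x^{k+1} = \lambda_{k+1} x^0 + (1-\lambda_{k+1})\,T(x^k)$ with $\lambda_{k+1} = \frac{1}{k+2}$; this anchoring is what yields $O(1/k)$ rates on the fixed-point residual. The QP-specific splitting, preconditioning, and implementation details of HPR-QP are described in the talk and are not reproduced here.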

Talk 2: Characterizations of Tilt-Stable Local Minimizers of a Class of Matrix Optimization Problems
Speaker: Shiwei Wang
Abstract: As a fundamental perturbation property, tilt stability has been widely studied, as it characterizes the difficulty of a problem and reveals the good behavior of multiplier methods for certain problems. In this talk, we mainly focus on establishing a new characterization of tilt stability via the newly proposed quadratic bundle. By calculating the explicit form of the minimal quadratic bundle of polyhedral spectral functions, we further obtain the equivalence between tilt stability and the strong second-order sufficient condition under nondegeneracy for general composite optimization problems.

Talk 3: On the ergodic convergence properties of the Peaceman-Rachford method and their applications in solving linear programming
Speaker: Guojun Zhang
Abstract: In this talk, we study the ergodic convergence properties of the Peaceman-Rachford (PR) method with semi-proximal terms for solving convex optimization problems (COPs). For the first time, we establish the global convergence of the ergodic sequence of the PR method with semi-proximal terms by leveraging the theory of the degenerate proximal point method. This result represents a significant departure from previous studies on the non-ergodic convergence of the PR method, which typically require strong convexity or monotonicity conditions that are not generally satisfied in COPs. Moreover, we demonstrate an ergodic iteration complexity of $O(1/k)$ of the PR method with semi-proximal terms, measured by the objective error and the Karush–Kuhn–Tucker residual using the $\varepsilon$-subdifferential. Based on these convergence properties, we introduce EPR-LP, using the ergodic sequence of the PR method with semi-proximal terms for solving linear programming (LP) problems. EPR-LP incorporates an adaptive restart strategy and dynamic penalty parameter updates for efficiency and robustness. Extensive numerical experiments on LP benchmark datasets, executed on a high-performance GPU, show that our Julia-based solver outperforms the award-winning solver PDLP at a tolerance level of $10^{-8}$.

Speakers
CD

Chao Ding

KC

Kaihuang Chen

SW

Shiwei Wang

GZ

Guojun Zhang

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 108 3501 Trousdale Pkwy, 108, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1V: Stochastic Gradient Descent (SGD) and Variants
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Stochastic Gradient Descent (SGD) and Variants
Chair: Melinda Hagedorn
Cluster: n/a

Talk 1: Optimized convergence of stochastic gradient descent by weighted averaging
Speaker: Melinda Hagedorn
Abstract: Under mild assumptions, stochastic gradient methods asymptotically achieve an optimal rate of convergence if the arithmetic mean of all iterates is returned as an approximate optimal solution. However, in the absence of stochastic noise, the arithmetic mean of all iterates converges considerably more slowly to the optimal solution than the iterates themselves. Also in the presence of noise, when the stochastic gradient method is terminated after a finite number of steps, the arithmetic mean is not necessarily the best possible approximation to the unknown optimal solution. This paper aims at identifying optimal strategies in a particularly simple case: the minimization of a strongly convex function with i.i.d. noise terms and termination after a finite number of steps. Explicit formulas for the stochastic error and the optimality error are derived in dependence of certain parameters of the SGD method. The aim was to choose parameters such that both the stochastic error and the optimality error are reduced compared to arithmetic averaging. This aim could not be achieved; however, by allowing a slight increase of the stochastic error it was possible to select the parameters such that the optimality error is significantly reduced. This reduction of the optimality error has a strong effect on the approximate solution generated by the stochastic gradient method when only a moderate number of iterations is used or when the initial error is large. The numerical examples confirm the theoretical results and suggest that a generalization to non-quadratic objective functions may be possible. This paper was written together with Professor Florian Jarre and has been published in Optimization Methods and Software 39 (2024), no. 4, pp. 699–724.
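As an illustration of the idea, the following sketch returns a weighted average of SGD iterates in place of the arithmetic mean. The weight sequence is left as an input: uniform weights recover arithmetic averaging, while the parameter choices derived in the paper are not reproduced here and would be supplied by the caller.

    import numpy as np

    def sgd_weighted_average(grad, x0, stepsize, weights):
        """SGD that returns a weighted average of its iterates.

        grad(x) returns a stochastic gradient at x; weights is a nonnegative
        sequence whose length fixes the number of steps.  Uniform weights
        recover the usual arithmetic mean; the optimal weights derived in the
        paper are not reproduced here.
        """
        x = np.asarray(x0, dtype=float).copy()
        avg = np.zeros_like(x)
        total = 0.0
        for w in weights:
            x = x - stepsize * grad(x)
            avg += w * x
            total += w
        return avg / total

For instance, weights that grow with the iteration index put more mass on late iterates, mimicking the noiseless regime in which the iterates themselves converge faster than their arithmetic mean.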

Talk 2: Linear Convergence Rate in Convex Setup is Possible! Gradient Descent Method Variants under (L0, L1)-Smoothness
Speaker: Aleksandr Lobanov
Abstract: The gradient descent (GD) method is a fundamental and likely the most popular optimization algorithm in machine learning (ML), with a history traced back to a paper by Cauchy in 1847. In this paper, we provide an improved convergence analysis of gradient descent and its variants under generalized $(L_0,L_1)$-smoothness. In particular, we show that GD has the following convergence behavior in the convex setup: as long as $\|\nabla f(x^k)\| \geq \frac{L_0}{L_1}$ the algorithm converges linearly, and once it approaches the solution $x^*$, i.e., when $\|\nabla f(x^k)\| < \frac{L_0}{L_1}$, it has the standard sublinear rate. Moreover, we show that this convergence behavior is also common to its variants using different types of oracle: Normalized Gradient Descent and Clipped Gradient Descent (when the oracle has access to the full gradient $\nabla f(x)$); Random Coordinate Descent (when the oracle has access only to a gradient component $\nabla_{i} f(x)$); and Random Coordinate Descent with Order Oracle (when the oracle has access only to the comparison value of the objective function $\text{sign}[f(y) - f(x)]$). In addition, we analyze the convergence behavior of GD in the strongly convex setup. Finally, we validate our theoretical results via numerical experiments. https://arxiv.org/pdf/2412.17050
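A minimal sketch of one stepsize choice that exhibits the two regimes described above is given below, using $\eta_k = 1/(L_0 + L_1\|\nabla f(x_k)\|)$; this is a commonly analyzed rule under $(L_0,L_1)$-smoothness and is not necessarily the exact variant from the paper.

    import numpy as np

    def generalized_smooth_gd(grad, x0, L0, L1, iters=1000):
        """Gradient descent with stepsize 1 / (L0 + L1 * ||grad||).

        One commonly analyzed stepsize under (L0, L1)-smoothness, shown only
        to illustrate the two regimes from the abstract: for large gradients
        the step behaves like normalized GD, for small gradients like plain
        GD with stepsize about 1/L0.  The paper's exact variants may differ.
        """
        x = np.asarray(x0, dtype=float).copy()
        for _ in range(iters):
            g = grad(x)
            gnorm = float(np.linalg.norm(g))
            if gnorm == 0.0:
                break
            eta = 1.0 / (L0 + L1 * gnorm)
            x = x - eta * g
        return x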

Talk 3: High Probability Guarantees for Random Reshuffling
Speaker: Hengxu Yu
Abstract: We consider the stochastic gradient method with random reshuffling (RR) for tackling smooth nonconvex optimization problems. RR finds broad applications in practice, notably in training neural networks. In this work, we provide high probability first-order and second-order complexity guarantees for this method. First, we establish a high probability first-order sample complexity result for driving the Euclidean norm of the gradient (without taking expectation) below a required accuracy. Our derived complexity matches the best existing in-expectation one up to a logarithmic term while imposing no additional assumptions nor changing RR's updating rule. We then propose a simple and computable stopping criterion for RR (denoted as RR-sc). This criterion is guaranteed to be triggered after a finite number of iterations, enabling us to prove a high probability first-order complexity guarantee for the last iterate. Second, building on the proposed stopping criterion, we design a perturbed random reshuffling method (p-RR) that involves an additional randomized perturbation procedure near stationary points. We derive that p-RR provably escapes strict saddle points and establish a high probability second-order complexity result, without requiring any sub-Gaussian tail-type assumptions on the stochastic gradient errors. The fundamental ingredient in deriving the aforementioned results is a new concentration property for sampling without replacement in RR, which could be of independent interest. Finally, we conduct numerical experiments on neural network training to support our theoretical findings. The full preprint paper, which is under revision for SIOPT, can be found at https://arxiv.org/abs/2311.11841
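For concreteness, a bare-bones sketch of random reshuffling is given below: one pass over the $n$ component gradients per epoch, in a freshly drawn random order. The stopping criterion RR-sc and the perturbation step of p-RR proposed in the talk are not included.

    import numpy as np

    def random_reshuffling(component_grads, x0, epochs, stepsize, seed=None):
        """Random reshuffling: one pass over all n component gradients per
        epoch, in a freshly sampled order (sampling without replacement).

        component_grads is a list of callables, component_grads[i](x) being
        the gradient of the i-th component.  The stopping criterion RR-sc and
        the perturbation step of p-RR from the talk are not included.
        """
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float).copy()
        n = len(component_grads)
        for _ in range(epochs):
            for i in rng.permutation(n):
                x = x - stepsize * component_grads[i](x)
        return x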

Speakers
avatar for Melinda Hagedorn

Melinda Hagedorn

PhD student, Heinrich Heine University Düsseldorf
Name: Melinda Hagedorn. Degrees: Master's degrees in Mathematics and Physics. Affiliation: Heinrich Heine University Düsseldorf, Germany. Melinda Hagedorn is a PhD student in Mathematical Optimization under the supervision of Prof. Florian Jarre, research associate and teaching assistant... Read More →
AL

Aleksandr Lobanov

HY

Hengxu Yu

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 110 3501 Trousdale Pkwy, 110, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1W: Adaptive and Accelerated First-Order Methods
Monday July 21, 2025 10:30am - 11:45am PDT
Session: Adaptive and Accelerated First-Order Methods
Chair: Wenzhi Gao
Cluster: n/a

Talk 1: Gradient Descent as a Collaborative Game
Speaker: Wenzhi Gao
Abstract: We introduce a framework to accelerate the convergence of gradient-based methods with online learning. The framework learns to update the stepsize in gradient descent with online learning and provably accelerates gradient-based methods. A key insight is to view gradient descent as a collaborative game between the stepsize scheduler and the optimization landscape -- both players working together for faster convergence. We also discuss implications of the framework, including global and local convergence properties and several extensions. Numerical experiments on deterministic convex and nonconvex problems demonstrate the promising performance of our method. Reference: https://arxiv.org/pdf/2411.01803
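To give a flavor of stepsize learning, the sketch below uses the classical hypergradient-descent update, which adjusts the stepsize online using the inner product of consecutive gradients. This is only one simple online rule and is not the scheduler proposed in the talk; it is included to illustrate how a stepsize can be treated as the decision of an online learner.

    import numpy as np

    def gd_with_online_stepsize(grad, x0, eta0, beta, iters=1000):
        """Gradient descent whose stepsize is updated by an online rule.

        Uses the classical hypergradient-descent update
        eta <- eta + beta * <g_k, g_{k-1}>, which is only one simple way of
        learning a stepsize online; it is not the scheduler proposed in the
        talk and carries none of its guarantees.
        """
        x = np.asarray(x0, dtype=float).copy()
        eta = float(eta0)
        g_prev = np.zeros_like(x)
        for _ in range(iters):
            g = grad(x)
            eta = max(eta + beta * float(g @ g_prev), 0.0)  # online stepsize update
            x = x - eta * g
            g_prev = g
        return x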

Talk 2: An Adaptive and Parameter-Free Nesterov's Accelerated Gradient Method
Speaker: Jaewook J. Suh
Abstract: In this talk, we introduce AdaNAG, an adaptive accelerated gradient method based on Nesterov's accelerated gradient (NAG). The algorithm is line-search-free, parameter-free, and achieves the accelerated convergence rates $f(x_k) - f_\star = O(1/k^2)$ and $\min_{i\in\{1, ... ,k\}} \|\nabla f(x_i)\|^2 = O(1/k^3)$ for an $L$-smooth convex function $f$. We provide a Lyapunov analysis for the convergence proof of AdaNAG, which additionally enables us to propose a novel adaptive gradient descent (GD) method, AdaGD. AdaGD achieves the non-ergodic convergence rate $f(x_k) - f_\star = O(1/k)$, like the original GD. Motivated by the relationship between the parameter choice and the convergence guarantee of AdaGD, we obtain a generalized AdaNAG that provides a practically useful variant of AdaNAG. We provide numerical results showing that our method outperforms other recently proposed adaptive methods in certain scenarios.

Talk 3: Stochastic Gradient Methods with Block Coordinate Optimistic Stepsizes
Speaker: Tao Jiang
Abstract: Ill-conditioning is a major challenge for optimization with first-order methods. This is especially the case for stochastic optimization, where preconditioners in the classical sense are hard to construct due to the nature of stochastic gradients. We propose a block-coordinate stepsize rule that can effectively combat ill-conditioning as well as inhomogeneous noise in the stochastic setting. Our method is motivated by minimizing the expected distance to an optimal point during each iteration. Specifically, we use the optimistic stepsizes as if the expected search directions (e.g., stochastic gradients with or without momentum) along each coordinate always point to the optimal point. These stepsizes rely on online estimates of the second-moments of the coordinate-wise search directions. The popular Adam algorithm can be interpreted as a heuristic for such an estimation. Compared with Adam, our method requires fewer hyperparameters, obtains similar or better performance, and is numerically more stable.
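Since the abstract interprets Adam as a heuristic estimator of the coordinate-wise second moments, the standard Adam recursion is sketched below as the baseline being improved upon; the optimistic stepsize rule itself is not reproduced here.

    import numpy as np

    def adam_coordinate_steps(grad, x0, alpha=1e-3, b1=0.9, b2=0.999,
                              eps=1e-8, iters=1000):
        """Coordinate-wise stepsizes from online first/second-moment estimates.

        This is the standard Adam recursion, included because the abstract
        reads Adam as a heuristic estimator of coordinate-wise second moments;
        the optimistic stepsize rule proposed in the talk is not shown here.
        """
        x = np.asarray(x0, dtype=float).copy()
        m = np.zeros_like(x)   # first-moment estimate
        v = np.zeros_like(x)   # second-moment estimate
        for k in range(1, iters + 1):
            g = grad(x)
            m = b1 * m + (1 - b1) * g
            v = b2 * v + (1 - b2) * g * g
            m_hat = m / (1 - b1 ** k)       # bias correction
            v_hat = v / (1 - b2 ** k)
            x = x - alpha * m_hat / (np.sqrt(v_hat) + eps)  # coordinate-wise step
        return x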

Speakers
WG

Wenzhi Gao

Ph.D. student, Stanford University
Name: Wenzhi Gao. Second-year Ph.D. student at Stanford ICME, working on large-scale numerical optimization, first-order methods, and online decision-making problems.
avatar for Jaewook J. Suh

Jaewook J. Suh

Name: Jaewook J. Suh. Affiliation: Rice University
TJ

Tao Jiang

Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 112 3501 Trousdale Pkwy, 112, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1X
Monday July 21, 2025 10:30am - 11:45am PDT
Monday July 21, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 215 3501 Trousdale Pkwy, 215, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 1Y
Monday July 21, 2025 10:30am - 11:45am PDT
Monday July 21, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 200 3518 Trousdale Pkwy, 200, Los Angeles, CA 90089

11:45am PDT

Lunch 1 (provided)
Monday July 21, 2025 11:45am - 1:15pm PDT
Mediterranean Buffet
Monday July 21, 2025 11:45am - 1:15pm PDT
USC Founder's / Hutton Park 3551 Trousdale Pkwy, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2A: Advances in Solving Large-Scale Problems: Accelerated Methods and Sharp Analyses (II)
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Advances in Solving Large-Scale Problems: Accelerated Methods and Sharp Analyses (II)
Chair: Liwei Jiang & Sen Na
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: Online Learning Guided Quasi-Newton Methods: Improved Global Non-Asymptotic Guarantees
Speaker: Ruichen Jiang
Abstract: Quasi-Newton methods are popular iterative algorithms known for their superior practical performance over gradient-descent-type methods. However, existing theoretical results for this class of algorithms fail to fully justify their observed advantages. In this talk, we discuss our recent efforts to address this issue. Specifically, in the strongly convex setting, we propose the first “globally” convergent quasi-Newton method that achieves an explicit “non-asymptotic superlinear” rate. We show that the rate presented for our method is provably faster than gradient descent after at most $O(d)$ iterations, where $d$ is the problem dimension. Additionally, in the convex setting, we present an accelerated variant of our proposed method that provably outperforms the accelerated gradient method and converges at a rate of $O(\min\{1/k^2, \sqrt{d \log k }/ k^{2.5}\})$, where $k$ is the number of iterations. To attain these results, we diverge from conventional approaches and construct our quasi-Newton methods based on the Hybrid Proximal Extragradient framework and its accelerated variants. Furthermore, a key algorithmic concept in our methods is an online learning framework for updating the Hessian approximation matrices. Specifically, we relate our method's convergence rate to the regret of a specific online convex optimization problem in the matrix space and choose the sequence of Hessian approximation matrices to minimize its overall regret.

Talk 2: Accelerated stochastic approximation with state-dependent noise
Speaker: Tianjiao Li
Abstract: We consider a class of stochastic smooth convex optimization problems under rather general assumptions on the noise in the stochastic gradient observation. As opposed to the classical problem setting in which the variance of noise is assumed to be uniformly bounded, herein we assume that the variance of stochastic gradients is related to the “sub-optimality” of the approximate solutions delivered by the algorithm. Such problems naturally arise in a variety of applications, in particular, in the well-known generalized linear regression problem in statistics. However, to the best of our knowledge, none of the existing stochastic approximation algorithms for solving this class of problems attain optimality in terms of the dependence on accuracy, problem parameters, and mini-batch size. We discuss two non-Euclidean accelerated stochastic approximation routines—stochastic accelerated gradient descent (SAGD) and stochastic gradient extrapolation (SGE)—which carry a particular duality relationship. We show that both SAGD and SGE, under appropriate conditions, achieve the optimal convergence rate, attaining the optimal iteration and sample complexities simultaneously. However, corresponding assumptions for the SGE algorithm are more general; they allow, for instance, for efficient application of the SGE to statistical estimation problems under heavy tail noises and discontinuous score functions. We also discuss the application of the SGE to problems satisfying quadratic growth conditions, and show how it can be used to recover sparse solutions. Finally, we report on some simulation experiments to illustrate numerical performance of our proposed algorithms in high-dimensional settings.

Talk 3: SQP for physics-informed machine learning
Speaker: Qi Wang
Abstract: A methodology for physics-informed machine learning is presented, which incorporates prior information in the training problem through hard constraints, rather than the typical modern practice of using soft constraints (i.e., regularization terms or penalty methods). The methodology is based on a recently proposed stochastic-gradient-based SQP algorithm and is extended to use Adam-type step computation in the presence of hard constraints. The effectiveness of the method is demonstrated through numerical experiments on physics-informed learning problems.
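To make the soft-versus-hard-constraint distinction concrete, here is a hedged sketch in our own notation (not taken from the talk): $\theta$ denotes the network parameters, $\ell$ the data-fit loss, and $c(\theta)$ the physics residuals (e.g., PDE residuals at collocation points).

$$
\text{soft (penalty):}\quad \min_{\theta}\ \mathbb{E}\big[\ell(h_\theta(x),y)\big] + \lambda\,\|c(\theta)\|^2,
\qquad
\text{hard (SQP setting):}\quad \min_{\theta}\ \mathbb{E}\big[\ell(h_\theta(x),y)\big]\ \ \text{s.t.}\ \ c(\theta)=0.
$$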

Speakers
avatar for Ruichen Jiang

Ruichen Jiang

PhD student, UT Austin
Hi, I am a PhD student in the Department of ECE at UT Austin, advised by Aryan Mokhtari. My current research interests focus on convex and non-convex optimization, particularly in using online learning techniques to design optimization methods. I am always happy to chat about min-max... Read More →
avatar for Tianjiao Li

Tianjiao Li

PhD student, Georgia Institute of Technology
Name: Tianjiao Li Title: PhD student Affiliation: Georgia Institute of Technology Bio: Tianjiao Li is a fifth-year PhD candidate in the School of Industrial and Systems Engineering at Georgia Institute of Technology, advised by Prof. George Lan and Prof. Ashwin Pananjady. His research... Read More →
QW

Qi Wang

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 101 3501 Trousdale Pkwy, 101, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2B: Deterministic and Stochastic Methods for Optimization and Games - Part I
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Deterministic and Stochastic Methods for Optimization and Games - Part I
Chair: Gesualdo Scutari
Cluster: Multi-agent Optimization and Games

Talk 1: On the computation of quasi-Nash equilibria under uncertainty
Speaker: Zhuoyu Xiao
Abstract: Motivated by applications in network congestion games and Cournot games, we consider the computation of either Nash or quasi-Nash equilibria in static stochastic noncooperative games afflicted by either nonconvexity or non-monotonicity. We consider sampled variants of gradient and best-response schemes and show that, under specified conditions, both schemes generate sequences that converge to quasi-Nash equilibria almost surely. We also provide non-asymptotic rate statements in some cases. Time permitting, we briefly discuss distributed extensions of both schemes in networked settings. Numerical experiments are also provided to support our theoretical results.

Talk 2: Iteratively Regularized Gradient Tracking Methods for Distributed Optimal Equilibrium Seeking
Speaker: Farzad Yousefian
Abstract: We consider a class of distributed constrained optimization problems where the constraint set is characterized by the solution set of a distributed monotone variational inequality problem. This problem is motivated by the need for estimation of the efficiency of equilibria in Nash games. First, we consider solving this problem over directed networks. We develop an iteratively regularized distributed gradient tracking method where the agents employ a push-pull protocol to communicate over the network. Second, we consider a stochastic variant of this problem over undirected networks and develop an iteratively regularized distributed stochastic gradient tracking method. For both algorithms, we establish the convergence of the generated iterates by the agents to the optimal equilibrium and derive new convergence rate statements. We validate the two proposed methods and present preliminary numerical results for computing the optimal equilibrium in a Cournot competition.

Talk 3: Clipped-Stochastic Methods for Generalized Smooth Stochastic Variational Inequalities
Speaker: Angelia Nedich
Abstract: We focus on solving a stochastic variational inequality (SVI) problem under a relaxed smoothness assumption for a class of structured non-monotone operators. The SVI problem has attracted significant interest in the machine learning community due to its immediate application to adversarial training and multi-agent reinforcement learning. In many such applications, the resulting operators do not satisfy the smoothness assumption. To address this issue, we focus on the generalized smoothness assumption and consider two well-known stochastic methods with clipping, namely, the projection and Korpelevich methods. For these clipped methods, we provide the first almost-sure convergence results, without making any assumptions on the boundedness of either the stochastic operator or the stochastic samples, together with in-expectation convergence rate results under the relaxed smoothness assumption.

Speakers
GS

Gesualdo Scutari

FY

Farzad Yousefian

AN

Angelia Nedich

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 201 3501 Trousdale Pkwy, 201, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2C: Recent Advances in Theory and Algorithms for Multiagent Systems
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Recent Advances in Theory and Algorithms for Multiagent Systems
Chair: Andrew Liu
Cluster: Multi-agent Optimization and Games

Talk 1: Approximate Global Convergence of Independent Learning in Multi-Agent Systems
Speaker: Zaiwei Chen
Abstract: Independent learning (IL), despite being a popular approach in practice to achieve scalability in large-scale multi-agent systems, usually lacks global convergence guarantees. In this paper, we study two representative algorithms, independent Q-learning and independent natural actor-critic, within value-based and policy-based frameworks and provide the first finite-sample analysis for approximate global convergence. Our results indicate that IL can achieve global convergence up to a fixed error, which arises from the dependence among agents and characterizes the fundamental limit of IL in attaining global convergence. To establish the result, we develop a novel approach for analyzing IL by constructing a separable Markov decision process (MDP) for convergence analysis and then bounding the gap due to the model difference between the separable MDP and the original one. Moreover, we conduct numerical experiments using a synthetic MDP and an electric vehicle charging example to demonstrate our results and the practical applicability of IL.

Talk 2: Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies
Speaker: Alex Deweese
Abstract: Many multi-agent systems in practice are decentralized and have dynamically varying dependencies, yet such systems have received little theoretical analysis in the literature. In this paper, we propose and theoretically analyze a decentralized model with dynamically varying dependencies called the Locally Interdependent Multi-Agent MDP. This model can represent problems in many disparate domains such as cooperative navigation, obstacle avoidance, and formation control. Despite the intractability that general partially observable multi-agent systems suffer from, we propose three closed-form policies that are theoretically near-optimal in this setting and are scalable to compute and store. Consequently, we reveal a fundamental property of Locally Interdependent Multi-Agent MDPs: the partially observable decentralized solution is exponentially close to the fully observable solution with respect to the visibility radius. We then discuss extensions of our closed-form policies to further improve tractability. We also provide simulations to investigate some long-horizon behaviors of our closed-form policies.

Talk 3: Hybrid Mean-Field Control and Mean-Field Equilibrium: Theories, Algorithms and Applications
Speaker: Andrew Liu
Abstract: In this talk, we introduce a hybrid multiagent modeling framework that combines Mean Field Control (MFC) and Mean Field Equilibrium (MFE). A perfect example of this framework is the operation of multiple virtual power plants (VPPs) or aggregators, each applying an MFC algorithm to manage the distributed energy resources (DERs) within their portfolios. These aggregators participate in the wholesale energy market by bidding on behalf of the DERs they represent, navigating the dynamic and uncertain market environment. Traditional game-theoretic approaches fall short in capturing the complexity of repeated and dynamic interactions under such uncertainties. Hence, we leverage the MFG approach to study these agent interactions and the resulting market dynamics. The MFC framework empowers each aggregator to determine optimal control policies despite uncertainties in solar output, demand fluctuations, and price volatility. Simultaneously, the MFE framework models strategic interactions between aggregators and other market participants, enabling a scalable approach for large systems. We establish the existence of a strong Nash equilibrium within this hybrid structure and propose a reinforcement learning-based algorithm to help aggregators learn and optimize their strategies over time. Crucially, this prescriptive approach facilitates control automation, enabling the integration of advanced AI and machine learning techniques at the grid edge, to optimize resource management and achieve system-wide benefits. We validate this framework through simulations of the Oahu Island electricity grid, showing that the combination of energy storage and mean-field learning significantly reduces price volatility and yields stable market outcomes. This work demonstrates the power and flexibility of the hybrid MFC-MFE approach, offering a robust foundation for scalable, automated decision-making in energy markets and beyond.

Speakers
ZC

Zaiwei Chen

AD

Alex Deweese

AL

Andrew Liu

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 208 3501 Trousdale Pkwy, 208, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2D: Methods for Large-Scale Nonlinear Optimization II
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Methods for Large-Scale Nonlinear Optimization II
Chair: Mike O'Neill
Cluster: Nonlinear Optimization

Talk 1: Stochastic-Gradient-based Interior-Point Methods
Speaker: Frank E. Curtis
Abstract: I will discuss some of our recent work on stochastic-gradient-based interior-point algorithms for solving constrained optimization problems, such as those arising in informed machine learning. The algorithms are single-loop in the sense that they do not require an inner criterion for updating the barrier parameter; rather, the barrier parameter is decreased according to a prescribed sequence. Convergence guarantees are attained in both deterministic and stochastic settings. The algorithms exhibit good practical performance in comparison to projected-gradient-based methods.
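As a rough illustration of the single-loop idea described above (not the speakers' algorithm), the sketch below treats the special case of bound constraints x >= 0: a stochastic gradient is combined with the log-barrier gradient, and the barrier parameter follows a prescribed decreasing sequence. The schedules and the interiority safeguard are our own illustrative choices.

```python
import numpy as np

def stochastic_barrier_gd(grad_est, x0, steps=1000, alpha=1e-2, mu0=1.0):
    """Minimal single-loop sketch: stochastic gradient steps on the barrier function
    f(x) - mu_k * sum(log x) for min f(x) s.t. x >= 0, with mu_k following a
    prescribed decreasing sequence.  Illustration only."""
    x = np.asarray(x0, dtype=float)
    for k in range(steps):
        mu_k = mu0 / np.sqrt(k + 1.0)            # prescribed barrier schedule (assumption)
        g = grad_est(x) - mu_k / x               # stochastic gradient of the barrier function
        step = alpha / np.sqrt(k + 1.0)          # diminishing step size (assumption)
        x = np.maximum(x - step * g, 0.1 * x)    # crude safeguard: stay in the interior
    return x

# Toy usage: minimize E||x - (a + noise)||^2 over x >= 0, where a has a negative entry.
rng = np.random.default_rng(0)
a = np.array([1.0, -0.5, 2.0])
x_hat = stochastic_barrier_gd(lambda x: 2.0 * (x - (a + 0.1 * rng.standard_normal(3))),
                              x0=np.ones(3))
print(np.round(x_hat, 2))  # close to the projection of a onto x >= 0, i.e. about [1, 0, 2]
```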

Talk 2: Fast unconstrained optimization via Hessian Averaging and Adaptive Gradient Sampling Methods
Speaker: Raghu Bollapragada
Abstract: In this talk, we discuss minimizing finite-sum and expectation objective functions using Hessian-averaging-based subsampled Newton methods. These methods accommodate gradient inexactness and maintain fixed per-iteration Hessian approximation costs. Recent work (Na et al. 2023) showed that Hessian averaging achieves fast $\mathcal{O}\left(\sqrt{\tfrac{\log k}{k}}\right)$ local superlinear convergence for strongly convex functions, but requires gradient exactness and strong convexity, limiting practical use. To address this, we propose Hessian-averaged methods with adaptive-sampling strategies allowing gradient inexactness. For finite-sum problems, we use deterministic sampling, yielding global linear and sublinear convergence for strongly convex and nonconvex functions. We derive an improved local superlinear rate of $\mathcal{O}\left(\tfrac{1}{k}\right)$. For expectation problems, we use stochastic sampling and derive global linear/sublinear rates and a local superlinear rate of $\mathcal{O}\left(\tfrac{1}{\sqrt{k}}\right)$. Additionally, we introduce scalable methods like the diagonally-averaged Newton (Dan) method for large-scale problems. Numerical results show that Hessian averaging enhances convergence and achieves state-of-the-art performance on challenging tasks like CIFAR100 classification with ResNets.
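A minimal sketch of the Hessian-averaging idea follows (illustration only; the adaptive gradient-sampling and globalization rules discussed in the talk are omitted, and `grad_fn`/`hess_sample_fn` are assumed user-supplied estimators).

```python
import numpy as np

def hessian_averaged_newton(grad_fn, hess_sample_fn, x0, iters=50, reg=1e-6):
    """Sketch of Hessian-averaged subsampled Newton: each iteration draws one
    subsampled Hessian and folds it into a running average, so the per-iteration
    Hessian cost stays fixed while the averaged matrix becomes more accurate."""
    x = np.asarray(x0, dtype=float)
    H_bar = np.eye(x.size)
    for k in range(iters):
        H_bar = (k * H_bar + hess_sample_fn(x)) / (k + 1)   # uniform Hessian averaging
        d = np.linalg.solve(H_bar + reg * np.eye(x.size), grad_fn(x))
        x = x - d                                           # unit Newton-type step (assumption)
    return x

# Toy usage on a noisy quadratic 0.5 x^T A x - b^T x with inexact gradients and Hessians.
rng = np.random.default_rng(0)
A, b = np.diag([1.0, 5.0]), np.array([1.0, 1.0])
x_hat = hessian_averaged_newton(lambda x: A @ x - b + 0.01 * rng.standard_normal(2),
                                lambda x: A + 0.1 * np.diag(rng.standard_normal(2)),
                                x0=np.zeros(2))
print(np.round(x_hat, 2))  # close to the true solution [1.0, 0.2]
```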

Talk 3: A Stochastic Objective-Function-Free Adaptive Regularization Method with Optimal Complexity
Speaker: Sadok Jerad
Abstract: A fully stochastic second-order adaptive-regularization method for unconstrained non-convex optimization is presented which never computes the objective-function value, but yet achieves the optimal complexity bound for finding first-order critical points. The method is fully adaptive and the inexactness conditions required for convergence depend on the history of past steps. Numerical experiments on large binary classification problems illustrate the potential of the new method.

Speakers
avatar for Frank E. Curtis

Frank E. Curtis

Professor, Lehigh University
Name: Frank E. Curtis, Ph.D. Title: Professor Affiliation: Lehigh University Bio: Please see my website. Fun Fact: My wife and I have been together for 20 years and she's never seen me without a beard.
RB

Raghu Bollapragada

SJ

Sadok Jerad

Postdoctoral Research Associate, University of Oxford
Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 210 3501 Trousdale Pkwy, 210, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2E: Efficient Optimization Methods for LLMs (Part I)
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Efficient Optimization Methods for LLMs (Part I)
Chair: Ruoyu Sun
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: Adam-mini: Use Fewer Learning Rates To Gain More
Speaker: Ruoyu Sun
Abstract: TBD

Talk 2: GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Speaker: Zhangyang Wang
Abstract: TBD

Talk 3: LoRA-GA: Low-Rank Adaptation with Gradient Approximation
Speaker: Jian Li
Abstract: TBD

Speakers
RS

Ruoyu Sun

ZW

Zhangyang Wang

JL

Jian Li

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 212 3501 Trousdale Pkwy, 212, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2F: Quantum Linear Algebra and Optimization (Part 1)
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Quantum Linear Algebra and Optimization (Part 1)
Chair: Mohammadhossein Mohammadisiahroudi
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: Quantum Computing-based Sensitivity Analysis for PDE-constrained Optimization
Speaker: Sicheng He
Abstract: Quantum computing is an emerging paradigm offering significant speed-ups for solving specific mathematical problems. In recent years, optimization and scientific computing researchers have developed quantum algorithms that demonstrate a complexity advantage for large-scale problems. A key area of focus has been to leverage quantum linear algebra techniques to solve linear systems that arise in optimization and scientific computing applications. We propose quantum computing-based direct and adjoint methods for implicit sensitivity analysis in PDE-constrained optimization. The proposed quantum approaches achieve exponential speed-up in complexity with respect to the problem dimension, i.e., the number of state variables, compared to classical methods. Notably, in the quantum computing framework, both the direct and adjoint methods exhibit similar computational complexity, a departure from their classical counterparts. We demonstrate the proposed method using a simple heat transfer problem implemented with the IBM Qiskit simulators.

Talk 2: Recent Advances in Quantum Interior Point Methods
Speaker: Tamas Terlaky
Abstract: Quantum Interior Point Methods (QIPMs) have recently emerged as a potential approach to accelerating the solution of large-scale conic optimization problems by leveraging quantum linear system algorithms for solving the Newton systems in IPMs. However, one of the significant challenges of QIPMs is the inexact and noisy nature of quantum solvers. In this talk, we discuss recent advancements in the design of efficient QIPMs that effectively manage errors. We introduce novel reformulations of the Newton system that enable maintaining feasibility despite inexact Newton directions. Additionally, we employ iterative refinement techniques to enhance solution accuracy while operating under limited precision. Our proposed QIPMs achieve the best-known iteration complexity, offering a significant step forward in the practical realization of quantum-accelerated optimization.

Talk 3: Quantum Approaches to Mixed Integer PDE-Constrained Optimization
Speaker: Adrian Harkness
Abstract: Mixed-integer PDE-constrained optimization (MIPDECO) problems arise in applications like gas and power networks or turbine placement. These problems combine the combinatorial complexity of integer programming with the large-scale linear systems of PDE-constrained optimization. This work investigates quantum computing methods for solving MIPDECO problems, specifically with binary control variables and a knapsack constraint. By using a first-discretize-then-optimize approach, we derive a binary quadratic optimization (BQO) formulation. We then explore two quantum algorithmic strategies based on the Quantum Approximate Optimization Algorithm (QAOA): a constraint-enforcing approach using custom mixer Hamiltonians, and a penalty-based method embedding constraints into the objective function. We discuss penalty parameter selection for feasibility and compare simulations of both quantum approaches. Our results illustrate the potential advantages of quantum methods for solving PDE-constrained optimization problems.

Speakers
MM

Mohammadhossein Mohammadisiahroudi

SH

Sicheng He

TT

Tamas Terlaky

Monday July 21, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 156 3518 Trousdale Pkwy, 156, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2G: Recent progresses in derivative-free optimization I
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Recent progresses in derivative-free optimization I
Chair: Giampaolo Liuzzi
Cluster: Derivative-free Optimization

Talk 1: Worst-case complexity analysis of derivative-free methods for multi-objective optimization
Speaker: Giampaolo Liuzzi
Abstract: In this work, we consider unconstrained multiobjective optimization problems where objective function values can only be obtained by querying a black box. The main aim of the paper is to give worst-case complexity bounds for derivative-free multi-objective optimization methods that adopt a linesearch expansion technique. We show that the considered methods enjoy the same worst-case complexity bounds recently proved for a directional multisearch method. Furthermore, exploiting the expansion technique, we are also able to give a further complexity result concerning the number of iterations with a measure of stationarity above a prefixed tolerance.
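To make the linesearch expansion mechanism concrete, here is a hedged single-objective sketch (the talk concerns the multiobjective case and its worst-case complexity; sufficient-decrease conditions and the exact expansion rule may differ from this illustration).

```python
import numpy as np

def direct_search_expansion(f, x0, alpha0=1.0, theta=0.5, gamma=2.0,
                            tol=1e-8, max_iter=10_000):
    """Single-objective sketch of directional direct search with a linesearch
    expansion step: after a successful poll direction, keep expanding the step
    while the objective keeps improving.  Illustration only."""
    x = np.asarray(x0, dtype=float)
    alpha = alpha0
    dirs = np.vstack([np.eye(x.size), -np.eye(x.size)])   # coordinate poll set
    for _ in range(max_iter):
        if alpha < tol:
            break
        success = False
        for d in dirs:
            if f(x + alpha * d) < f(x):
                step = alpha
                while f(x + gamma * step * d) < f(x + step * d):   # expansion linesearch
                    step *= gamma
                x = x + step * d
                success = True
                break
        if not success:
            alpha *= theta                                 # contract after an unsuccessful poll
    return x

print(np.round(direct_search_expansion(lambda z: float(np.sum((z - 1.0) ** 2)),
                                       np.zeros(3)), 3))   # about [1, 1, 1]
```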

Talk 2: Exploring complexity bounds of model based trust region derivative free methods
Speaker: Katya Scheinberg
Abstract: Model-based trust region derivative-free methods pioneered by Powell rely on interpolation models to approximate the objective function in a trust region. The quality of this approximation dictates algorithmic progress and is, in turn, dictated by the geometry of the sample set. The methods are designed to trade off carefully between "exploration" and "exploitation", i.e. between seeking progress and improving sample set geometry. While these methods have been very successful in practice and have been shown to converge, their complexity analysis has been incomplete, especially in terms of dependence on the dimension. We will present an improved complexity analysis for different variants of these methods.

Talk 3: Revisiting the convergence analysis of derivative-free trust region and direct search
Speaker: Cunxin Huang
Abstract: Derivative-free trust region and direct search are two popular classes of derivative-free optimization methods. In this paper, we propose a unified new perspective for the convergence analysis of these two classes of methods. Specifically, we find that the behavior of an algorithm-determined series determines the asymptotic convergence; this generalizes existing results under both deterministic and randomized settings.

Speakers
GL

Giampaolo Liuzzi

KS

Katya Scheinberg

Professor, Georgia Institute of Technology
Name: Katya Scheinberg Title: Coca-Cola Foundation Chair and Professor Affiliation: H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA Bio: Katya Scheinberg is a Coca-Cola Foundation Chair and Professor in the H. Milton Stewart... Read More →
CH

Cunxin Huang

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 114 3501 Trousdale Pkwy, 114, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2H: Optimization as the Engine of Generative AI - II
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Optimization as the Engine of Generative AI - II
Chair: Renyuan Xu
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: Designing and Optimizing Biological Molecules with Multimodal Stochastic Interpolant Generative Models
Speaker: Ge Liu
Abstract: The design and optimization of biological molecules, such as proteins and peptides, offer transformative potential in medicine, materials science, and synthetic biology. Traditional methods such as directed evolution often struggle to explore the vast and complex molecular landscape efficiently. In addition, molecule design problems inherently involve both discrete and continuous variables (e.g., protein sequences and 3D structures) and operate on non-Euclidean manifolds to model geometric information (e.g., rotational group SO(3)). Generative modeling has emerged as a powerful framework for biological molecule design. In this talk, I will present recent advances in SDE/ODE-based stochastic interpolant generative models, such as diffusion and flow-matching, that enable precise and controllable generation of biological molecules across multiple modalities. First, I will introduce Statistical Flow Matching (SFM), a novel generative framework leveraging the Riemannian geometry of statistical manifolds that enables efficient generation of discrete data. SFM has demonstrated strong performance on biological sequence design (DNA, protein) and generalizes to text and image domains. Next, I will introduce OC-Flow, a theoretically grounded training-free optimal control framework for guiding flow-matching generative models on both Euclidean and non-Euclidean manifolds. By formulating generative sampling as an optimal control problem, OC-Flow enables effective guided sampling for solving a diverse set of inverse problems across computer vision, chemical molecule, and peptide design tasks, achieving controlled generation of molecules with optimized properties and energies. This talk will provide new perspectives on how stochastic interpolant generative models can bridge optimization, machine learning, and biomolecular engineering, paving the way for next-generation protein design.

Talk 2: Panel
Speaker: Ahmad Beirami
Abstract: Ahmad Beirami (Google DeepMind) and Renyuan Xu (Stanford University) will host a panel on the interactions between Optimization and Generative AI.

Talk 3: Panel
Speaker: Renyuan Xu
Abstract: Ahmad Beirami (Google DeepMind) and Renyuan Xu (Stanford University) will host a panel on the interactions between Optimization and Generative AI.

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 116 3501 Trousdale Pkwy, 116, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2I: Frontiers of Optimization for Machine Learning - Part II
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Frontiers of Optimization for Machine Learning - Part II
Chair: Vivak Patel
Cluster: Nonlinear Optimization

Talk 1: Online Statistics Inference via Matrix-Free Newton Methods
Speaker: Sen Na
Abstract: Given the ubiquity of streaming data, online algorithms have been widely used for parameter estimation, with second-order methods particularly standing out for their efficiency and robustness. In this talk, we introduce an online sketched Newton method that leverages a randomized sketching technique to perform an approximate Newton step in each iteration, thereby eliminating the computational bottleneck of second-order methods. While existing studies have established the asymptotic normality of sketched Newton methods, a consistent estimator of the limiting covariance matrix remains an open problem. We propose a fully online covariance matrix estimator that is constructed entirely from the Newton iterates and requires no matrix factorization. Compared to covariance estimators for first-order online methods, our estimator for second-order methods is batch-free. We establish the consistency and convergence rate of our estimator, and coupled with asymptotic normality results, we can then perform online statistical inference for the model parameters based on sketched Newton methods. We also discuss the extension of our estimator to constrained problems, and demonstrate its superior performance on regression problems as well as benchmark problems in the CUTEst set.

Talk 2: Optimization in Combinatorial and Non-Convex ML: Positive and Negative Results
Speaker: Jean Honorio
Abstract: Several modern machine learning (ML) problems are combinatorial and non-convex, for which theoretical guarantees are quite limited. My long-term research goal is to uncover the general foundations of ML and optimization that drive empirical success. I aim to develop a set of optimization-theoretic frameworks and tools to bridge the aforementioned gaps, to further our understanding of continuous (possibly non-convex) relaxations of combinatorial problems, as well as our knowledge of non-convexity. In this talk, I first focus on invex (non-convex) optimization problems, with some motivation from exploratory research on fairness and mixed linear regression. I present a generalization of gradient descent for the family of invex (non-convex) problems, which provably converges to the global minimum in polynomial time. I also show that, for general non-convex problems, any gradient-based algorithm requires an exponential number of gradients to find the global minimum. Second, when stochastic gradients are biased, how can we obtain convergence to the global minimum of a complex convex function? I propose a provably convergent algorithm that requires more computational effort as the algorithm progresses through several gradient descent iterations. Interestingly, more complex algorithms do not converge in this regime. Third, I discuss a difficult combinatorial problem over directed graphs with acyclicity constraints. Interestingly, using the statistical principle of identifiability, one can reduce the search space, and propose provably correct sequential optimization algorithms. Finally, I focus on problems with high-order relationships, usually formulated as tensor optimization problems. I propose a convex conic form relaxation. To this end, I carefully define the Caratheodory symmetric tensor cone, and discuss its benefits in optimization.

Talk 3: Hessian-aware scaling of the gradient directions
Speaker: Fred Roosta
Abstract: Gradient descent is the primary workhorse for the optimisation of large-scale, nonconvex problems in machine learning. However, its performance is heavily dependent on step size selection. Due to a lack of natural scaling, this necessitates costly line searches or heuristic guess-and-check methods for step size selection. We propose an efficient scaling of gradient descent using a scalar informed by Hessian information. By carefully considering the curvature along the gradient direction, we demonstrate that Hessian-aware scaled gradient directions provide a local unit step size guarantee, even in the nonconvex setting. We extend this result to scenarios where the Hessian and gradient are stochastic. Additionally, we show global convergence of Hessian-aware scaled gradient descent under a significant weakening of the typical Lipschitz gradient smoothness assumption. We validate our results on large-scale machine learning problems and demonstrate that, through alternating scalings, we obtain an algorithm that rapidly converges across various problems.
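One natural instance of "curvature along the gradient direction" is the scalar step $\alpha_k = \|g_k\|^2 / (g_k^\top H_k g_k)$, computable from a single Hessian-vector product. The sketch below illustrates that idea on a quadratic; it is not the speakers' exact method, and their nonconvex and stochastic safeguards are omitted.

```python
import numpy as np

def curvature_scaled_gd(grad_fn, hvp_fn, x0, iters=100, eps=1e-12):
    """Hedged sketch of Hessian-aware step-size scaling: alpha = ||g||^2 / (g^T H g),
    the exact minimizer of the local quadratic model along the gradient direction,
    using only a Hessian-vector product hvp_fn(x, v)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad_fn(x)
        curv = float(g @ hvp_fn(x, g))            # curvature along the gradient direction
        alpha = float(g @ g) / max(curv, eps)     # crude guard against tiny/negative curvature
        x = x - alpha * g
    return x

# Toy usage on a strongly convex quadratic 0.5 x^T A x - b^T x.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 2.0])
x_hat = curvature_scaled_gd(lambda x: A @ x - b, lambda x, v: A @ v, np.zeros(2))
print(np.round(x_hat, 4), np.round(np.linalg.solve(A, b), 4))  # the two should agree
```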

Speakers
VP

Vivak Patel

SN

Sen Na

JH

Jean Honorio

FR

Fred Roosta

Monday July 21, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 100 3518 Trousdale Pkwy, 100, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2J: Optimization in Control - Algorithms and Applications
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Optimization in Control - Algorithms and Applications
Chair: Arvind Raghunathan
Cluster: Interplay Between Continuous and Discrete Optimization

Talk 1: Optimization based control of networked systems
Speaker: Sridharakumar Narasimhan
Abstract: Process industries often use multiple units in series-parallel combinations, e.g., heat exchangers, pumps, compressors, fuel cells/solar cells, etc. Optimal operation of such a network involves determining the appropriate load of each unit by minimizing a cost function (e.g., power consumed, current drawn, heat loss) while satisfying safety constraints and meeting overall constraints on flow, power, etc. This results in a constrained optimization problem. Real-time optimization or optimal control requires such a problem to be solved in real time using measurements with uncertain plant models. In this work, we present a methodology for optimal control of a network of equipment using the structure of the optimal solution. In many networked systems, optimality requires that the gradients of the cost function with respect to the manipulated variables are equal. Hence, the optimal loads in the different branches of the network are manipulated such that the optimality conditions are satisfied. This is demonstrated using numerical simulations of a fuel cell stack.
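The "gradients are equal" optimality structure referred to above is the standard resource-allocation condition; in our notation (not the speaker's), with unit loads $x_i$, unit cost functions $f_i$, and total demand $D$,

$$
\min_{x_1,\dots,x_n}\ \sum_{i=1}^{n} f_i(x_i) \quad \text{s.t.}\quad \sum_{i=1}^{n} x_i = D,
$$

and stationarity of the Lagrangian at an interior optimum gives $f_i'(x_i^\star) = \lambda$ for every unit $i$, i.e., equal marginal costs across all branches of the network.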

Talk 2: Strong Disjunctive Cuts for MIP formulations of Optimal Control of Piecewise Affine Systems
Speaker: Prachi Shah
Abstract: Hybrid systems governed by piecewise affine dynamics are widely used in modern control applications, yet their optimal control remains a computational challenge due to weak relaxations of traditional mixed-integer programming (MIP) formulations. In this work, we present a novel methodology that introduces strong disjunctive cuts derived from the convex hull of a subproblem restricted to consecutive time intervals. Our approach tightens the linear relaxation of any choice of MIP formulation while keeping the cut-generation complexity independent of the overall time horizon. Comprehensive computational experiments on benchmark problems demonstrate that this strategy not only yields tighter bounds at the root node but also reduces the solution time of the MIP.

Talk 3: On global convergence of MPEC methods for Optimal Control of Hybrid Dynamical Systems
Speaker: Saif Kazi
Abstract: Optimal control of a Hybrid Dynamical System is a difficult problem because of unknown non-differentiable points or switches in the solution of discontinuous ODEs. The optimal control problem for such hybrid dynamical systems can be reformulated into a dynamic complementarity system (DCS) problem. Subsequently, the differential equation system is further reformulated using numerical discretization schemes such as Implicit Runge-Kutta (IRK) or Orthogonal Collocation methods for higher-order accurate numerical solutions. Since the solutions are non-differentiable, a moving finite element with switch detection method is implemented to ensure higher-order accuracy, along with different reformulations for the non-smooth complementarity constraints, such as relaxation-based formulations or the $\ell_1$-penalty formulation. In this paper, we analyze the global convergence properties of such reformulations and introduce a mixed active-set based strategy to converge to true optimal solutions and escape spurious stationary points.

Speakers
SN

Sridharakumar Narasimhan

PS

Prachi Shah

SK

Saif Kazi

Research Scientist, Los Alamos National Laboratory
Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 102 3501 Trousdale Pkwy, 102, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2K: Recent advances in matrix optimization
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Recent advances in matrix optimization
Chair: Chao Ding
Cluster: Conic and Semidefinite Optimization

Talk 1: A quadratically convergent semismooth Newton method for nonlinear semidefinite programming without subdifferential regularity
Speaker: Fuxiaoyue Feng
Abstract: The non-singularity of generalized Jacobians of the Karush-Kuhn-Tucker (KKT) system is crucial for local convergence analysis of semismooth Newton methods. In this talk, we present a new approach that challenges this conventional requirement. Our discussion revolves around a methodology that leverages some newly developed variational properties, effectively bypassing the necessity for non-singularity of all elements in the generalized Jacobian. Quadratic convergence results of our Newton methods are established without relying on commonly assumed subdifferential regularity conditions. This discussion may offer fresh insights into semismooth Newton methods, potentially paving the way for designing robust and efficient second-order algorithms for general nonsmooth composite optimizations.

Talk 2: On efficient and scalable computation of the nonparametric maximum likelihood estimator in mixture models
Speaker: Yangjing Zhang
Abstract: The nonparametric maximum likelihood estimation (NPMLE) is a classic and important method to estimate the mixture models from finite observations. In this talk, we propose an efficient and scalable semismooth Newton based augmented Lagrangian method (ALM). By carefully exploring the structure of the ALM subproblem, we show that the computational cost of the generalized Hessian (second order information) is independent of the number of grid points. Extensive numerical experiments are conducted to show the effectiveness of our approach. 
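For context, the discretized NPMLE targeted by such solvers is the convex program below (the standard Kiefer-Wolfowitz formulation over a fixed grid; our notation, not necessarily the talk's): given observations $y_1,\dots,y_n$, grid points $\theta_1,\dots,\theta_m$, and component densities $\varphi(\cdot;\theta_j)$, the mixing weights solve

$$
\max_{w}\ \sum_{i=1}^{n} \log\Big( \sum_{j=1}^{m} \varphi(y_i;\theta_j)\, w_j \Big)
\quad \text{s.t.}\quad w \ge 0,\ \ \sum_{j=1}^{m} w_j = 1.
$$

The ALM and semismooth Newton machinery in the talk exploits the structure of this program so that the cost of the second-order information does not grow with the number of grid points.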

Talk 3: An Accelerated Proximal ADMM for ODC of Uncertain Systems
Speaker: Xinyuan Zhao
Abstract: To ensure the system stability of the H2-guaranteed cost optimal decentralized control (ODC) problem, we formulate an approximate semidefinite programming (SDP) problem that leverages the block diagonal structure of the decentralized controller's gain matrix. To minimize data storage requirements and enhance computational efficiency, we employ the Kronecker product to vectorize the SDP problem into a conic programming (CP) problem. We then propose a proximal alternating direction method of multipliers (PADMM) to solve the dual of the resulting CP problem. By using the equivalence between the semi-proximal ADMM and the (partial) proximal point algorithm, we identify the non-expansive operator of PADMM, enabling the use of Halpern fixed-point iteration to accelerate the algorithm. Finally, we demonstrate that the sequence generated by the proposed accelerated PADMM exhibits a fast convergence rate for the Karush-Kuhn-Tucker residual. Numerical experiments confirm that the accelerated algorithm outperforms the well-known COSMO, MOSEK, and SCS solvers in efficiently solving large-scale CP problems, particularly those arising from H2-guaranteed cost ODC problems.
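For reference, Halpern's fixed-point iteration anchors every step to the starting point $x^0$; with the non-expansive operator $T$ identified for PADMM, a common choice of anchoring weights (our illustration, not necessarily the exact constants used in the talk) is

$$
x^{k+1} \;=\; \frac{1}{k+2}\, x^{0} \;+\; \frac{k+1}{k+2}\, T(x^{k}), \qquad k = 0,1,2,\dots,
$$

and this anchoring is what accelerates the decay of the fixed-point (and hence KKT) residual.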

Speakers
CD

Chao Ding

FF

Fuxiaoyue Feng

avatar for Yangjing Zhang

Yangjing Zhang

assistant professor, Chinese Academy of Sciences
I am currently an assistant professor in the Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences. My current research is focused on large scale sparse optimization problems, the design of efficient algorithms for statistical models and graphical models... Read More →
XZ

Xinyuan Zhao

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 118 3501 Trousdale Pkwy, 118, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2L: Decision-Aware Optimization
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Decision-Aware Optimization
Chair: Vishal Gupta
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: Contextual Linear Optimization with Bandit Feedback
Speaker: Yichun Hu
Abstract: Contextual linear optimization (CLO) uses predictive contextual features to reduce uncertainty in random cost coefficients and thereby improve average-cost performance. An example is stochastic shortest path with random edge costs (e.g., traffic) and contextual features (e.g., lagged traffic, weather). Existing work on CLO assumes the data has fully observed cost coefficient vectors, but in many applications we can only see the realized cost of a historical decision, that is, just one projection of the random cost coefficient vector, to which we refer as bandit feedback. We study a class of offline learning algorithms for CLO with bandit feedback, which we term induced empirical risk minimization (IERM), where we fit a predictive model to directly optimize downstream performance of the policy it induces. We show a fast-rate regret bound for IERM that allows for misspecified model classes and flexible choices of the optimization estimate, and we develop computationally tractable surrogate losses. A byproduct of our theory, of independent interest, is a fast-rate regret bound for IERM with full feedback and a misspecified policy class. We compare the performance of different modeling choices numerically using a stochastic shortest path example and provide practical insights from the empirical results.

Talk 2: Learning Uncertainty Sets in Dynamic Robust Optimization
Speaker: Irina Wang
Abstract: We present a data-driven technique to automatically learn uncertainty sets in dynamic decision making under uncertainty. We formulate the learning problem as a control design problem where the control policy involves solving a robust optimization problem parametrized by the past disturbances, as well as the parameters of the uncertainty set. We propose a learning procedure to dynamically predict the parameters of the uncertainty set to minimize a closed-loop performance metric while satisfying probabilistic guarantees of constraint satisfaction. Our approach allows for uncertain data that is correlated across time periods, and can learn a wide range of commonly used uncertainty sets. By modeling our training problem objective and constraints using coherent risk metrics, we derive finite sample probabilistic guarantees of constraint satisfaction in multi-stage settings.

Talk 3: Surrogates for Decision-Aware Learning: Beyond the Linear Setting
Speaker: Vishal Gupta
Abstract: Designing surrogates that exploit downstream optimization structures is one of the key approaches to decision-aware learning. However, most work to date is either restricted to the linear contextual optimization setting or is largely heuristic with few theoretical performance guarantees. By extending recent work on using directional gradients to approximate decision loss, we show how to design surrogates with provable performance guarantees for nonlinear settings. This approach provides a natural recipe for attacking non-parametric and high-dimensional settings.

Speakers
YH

Yichun Hu

Assistant Professor, Cornell University
IW

Irina Wang

VG

Vishal Gupta

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 119 3501 Trousdale Pkwy, 119, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2M: Applications of polynomial optimization to data analysis II
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Applications of polynomial optimization to data analysis II
Chair: Yong Sheng Soh
Cluster: Conic and Semidefinite Optimization

Talk 1: Semidefinite Network Games
Speaker: Antonios Varvitsiotis
Abstract: Network games are an important class of games that model agent interactions in networked systems, where players are situated at the nodes of a graph and their payoffs depend on the actions taken by their neighbors. We extend the classical framework by considering a game model where the strategies are positive semidefinite matrices having trace one. These (continuous) games can serve as a simple model of quantum strategic interactions. We focus on the zero-sum case, where the sum of all players’ payoffs is equal to zero. We establish that in this class of games, Nash equilibria can be characterized as the projection of a spectrahedron, that is, the feasible region of a semidefinite program. Furthermore, we demonstrate that determining whether a game is a semidefinite network game is equivalent to deciding if the value of a semidefinite program is zero. Beyond the zero-sum case, we characterize Nash equilibria as the solutions of a semidefinite linear complementarity problem.

Talk 2: Sum of squares hierarchy for the Gromov-Wasserstein Problem
Speaker: Yong Sheng Soh
Abstract: The Gromov-Wasserstein (GW) problem is an extension of the classical optimal transport problem that allows one to compute distances between probability distributions specified over incomparable metric spaces. Broadly speaking, to get around the lack of a natural notion of distance between objects residing in different metric spaces, the GW problem computes the minimum of a suitably defined objective taken over all possible embeddings of the input metric spaces to a common space. This process leaves us with solving a non-convex quadratic programming instance. In this talk, we discuss the ideas of the sum-of-squares hierarchy applied to solving the GW problem. As a note, the central object of interest in the GW problem is a probability distribution, and we describe the necessary language in which ideas of polynomial optimization carry through to distributions.
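For reference, the discrete GW problem behind this discussion is the non-convex quadratic program (standard formulation, our notation): for point clouds $\{x_i\}$ with weights $\mu$ and $\{y_j\}$ with weights $\nu$,

$$
\mathrm{GW}^2(\mu,\nu) \;=\; \min_{\pi \in \Pi(\mu,\nu)} \; \sum_{i,k,j,l} \big( d_X(x_i,x_k) - d_Y(y_j,y_l) \big)^2 \, \pi_{ij}\, \pi_{kl},
\qquad
\Pi(\mu,\nu) = \{\pi \ge 0 : \pi\mathbf{1} = \mu,\ \pi^{\top}\mathbf{1} = \nu\}.
$$

The sum-of-squares hierarchy discussed in the talk relaxes (a distributional view of) this quadratic program.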

Talk 3: On the geometric and computational complexity of polynomial bilevel optimization
Speaker: Quoc-Tung Le
Abstract: Bilevel optimization is an important mathematical tool to model phenomena in many domains, such as economic game theory, decision science and machine learning, to name but a few. Despite its importance, efficient and scalable algorithms for bilevel optimization are mostly developed for the case where the lower-level problem is (strongly) convex, which is unrealistic for many practical tasks. In the quest to understand more general bilevel problems, we relax the lower-level strong convexity and consider polynomial bilevel optimization, i.e., polynomial objective functions and constraints. We focus on the worst-case analysis of this class of problems, from both geometric and computational viewpoints. Our analysis suggests that even the algebraic rigidity of polynomials does not exclude extreme pathologies induced by the bilevel optimization. More specifically, we demonstrate that any semi-algebraic function can be represented as the objective of a polynomial bilevel problem. This discovery implies that solving polynomial bilevel optimization is equivalent to optimizing general semi-algebraic functions. We obtain other sharp variations of this result by considering relevant properties of the lower problem, such as convexity or compactness of the feasible set. In addition, we show the $\Sigma_2^p$-hardness of polynomial bilevel optimization, characterizing polynomial bilevel problems as vastly more challenging than NP-complete problems (under reasonable hardness assumptions).

Speakers
AV

Antonios Varvitsiotis

YS

Yong Sheng Soh

QL

Quoc-Tung Le

Monday July 21, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 157 3518 Trousdale Pkwy, 157, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2N: Optimization for Robotics II
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Optimization for Robotics II
Chair: Panos Patrinos
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: Graphs of Convex Sets with Applications to Robot Motion Planning
Speaker: Tobia Marcucci
Abstract: In this talk we introduce a novel modeling and computational framework for joint discrete and continuous decision making. We consider graphs where each vertex is associated with a convex optimization problem, and each edge couples two problems through additional convex costs and constraints. We call these Graphs of Convex Sets (GCS). Many classical problems in graph theory are naturally generalized to GCS, yielding a new class of problems at the interface of combinatorial and convex optimization with a wide variety of applications. For the solution of these problems, we present a unified technique that leverages perspective operators to formulate tight convex relaxations and strong mixed-integer formulations. In the second part of the presentation, we focus on the shortest-path problem in GCS and its application to robot motion planning. We present early experiments from Amazon Robotics, where our framework enables warehouse robots to move packages between bins nearly twice as fast as the current motion-planning solution.

Talk 2: Deep-Learning Aided Optimization for Decision-Making
Speaker: Evangelos Theodorou
Abstract: Optimization problems in robotics are typically nonlinear and nonconvex, while their scalability can range from a few to millions of state and control variables depending on the use case, the robotic system, and the task under consideration. In this talk I will present a new class of algorithms for distributed optimization of multi-agent systems with emphasis on robotics applications. The primary goal is to outperform existing algorithms in terms of convergence speed, optimality and scaling. To do so we will draw connections between iterative optimization algorithms and model-based deep learning approaches. These connections will allow us to develop neural network architectures for learning to optimize that are interpretable, scalable and come with generalization guarantees. The interpretability arises from treating each iteration of an optimization method as a layer, with the corresponding tuning parameters treated as learnable parameters. Training of such architectures takes place in a supervised as well as semi-supervised learning fashion. We will show a range of applications of such neural network architectures including large-scale distributed optimal control, model predictive control, and network flow problems. The proposed architectures not only improve performance but also address a long-standing problem in industry and academia related to interpretability of neural network architectures when deployed to the real world.

Talk 3: Safe treatment of infeasible convex optimization problems via the augmented Lagrangian
Speaker: Roland Andrews
Abstract: This work focuses on constrained convex optimization problems. The augmented Lagrangian method is a popular algorithm designed to tackle such problems by solving sequences of unconstrained optimization problems. It is practically efficient and offers strong theoretical guarantees under minimal assumptions, provided that the feasible set associated with the constraints is non-empty. However, the infeasible setting for constrained optimization problems has only recently started to attract attention. This issue is particularly relevant in areas such as optimal control (e.g., Model Predictive Control) and machine learning (e.g., neural networks using convex optimization layers), where infeasibility frequently arises. Recent studies have approached this problem under various assumptions. In this work, we analyze the general case, relying solely on convexity as the key assumption. Our approach leverages the classical relationship between the augmented Lagrangian algorithm and the dual proximal point algorithm.

Speakers
avatar for Tobia Marcucci

Tobia Marcucci

Assistant Professor, University of California, Santa Barbara
Tobia Marcucci is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of California, Santa Barbara... Read More →
ET

Evangelos Theodorou

RA

Roland Andrews

Monday July 21, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 256 3518 Trousdale Pkwy, 256, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2O: Optimization in Global Health
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Optimization in Global Health
Chair: Aleksandr Aravkin
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: Large-scale Kernel Regression in Complex Global Health Estimation
Speaker: Peng Zheng
Abstract: We present an efficient approach to large-scale kernel regression with applications to estimating global mortality and causes of death. We highlight the computational elements of the approach, particularly how they interact with the problem scale and with technical requirements such as constraints and aggregated observations.
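
As a point of reference only, the sketch below shows a bare-bones kernel ridge regression fit in NumPy; the scale, constraints, and aggregated observations handled in the talk are not represented, and all data here are synthetic.

    import numpy as np

    def rbf_kernel(X, Y, length_scale=0.2):
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * length_scale ** 2))

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(200, 1))                 # synthetic locations
    f_true = np.sin(2 * np.pi * X[:, 0])
    y = f_true + 0.1 * rng.standard_normal(200)          # noisy direct observations

    K = rbf_kernel(X, X)
    alpha = np.linalg.solve(K + 1e-2 * np.eye(200), y)   # ridge-regularized kernel fit
    f_hat = K @ alpha
    print("train RMSE:", np.sqrt(np.mean((f_hat - f_true) ** 2)))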

Talk 2: Joint estimation of Prevalence, Sensitivity, and Specificity
Speaker: Nora Gilbertson
Abstract: The lack of perfect tests is a classic problem in epidemiology and must be overcome to understand the prevalence and burden of disease. Multiple imperfect tests are typically available, with partial information on their diagnostic properties (such as sensitivity and specificity). We present a joint inversion approach that yields improved estimates of location-specific prevalence and of the diagnostic properties of multiple tests, using all available information across multiple locations and multiple imperfect tests. We explain the approach and show results on both simulated cases and schistosomiasis data.
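
To make the role of sensitivity and specificity concrete, here is a tiny illustration of the single-test identity that the joint, multi-test inversion generalizes (the Rogan-Gladen correction); all numbers below are made up.

    # expected positivity under an imperfect test, then inversion to recover prevalence
    sens, spec = 0.90, 0.95                                   # assumed test properties
    prev_true = 0.20
    p_obs = prev_true * sens + (1 - prev_true) * (1 - spec)   # observed positive rate
    prev_hat = (p_obs + spec - 1) / (sens + spec - 1)         # Rogan-Gladen correction
    print(p_obs, prev_hat)                                    # 0.22 observed, 0.20 recovered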

Talk 3: Fast optimization approaches for raking
Speaker: Ariane Ducellier
Abstract: Raking is a classic problem in survey science, where available granular estimates are updated so that their aggregations across particular dimensions match available constraints from independent sources. We formulate raking as an optimization problem, and show how to efficiently solve complex raking examples in multiple dimensions with both direct and aggregated observations. The approach leverages duality theory, and intermediate results together with the implicit function theorem allow us to efficiently estimate asymptotic uncertainty of the raked estimates.
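
For orientation, the snippet below shows classical two-dimensional raking via iterative proportional fitting; the talk's optimization formulation with duality-based uncertainty covers far more general settings (aggregated observations, several dimensions). All numbers are invented.

    import numpy as np

    est = np.array([[10., 20.], [30., 40.]])    # granular estimates
    row_targets = np.array([35., 65.])          # known margins from independent sources
    col_targets = np.array([45., 55.])

    for _ in range(100):
        est *= (row_targets / est.sum(axis=1))[:, None]   # rescale rows to match margins
        est *= (col_targets / est.sum(axis=0))[None, :]   # rescale columns to match margins

    print(est.sum(axis=1), est.sum(axis=0))     # both agree with the target margins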

Speakers
PZ

Peng Zheng

NG

Nora Gilbertson

AD

Ariane Ducellier

Monday July 21, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 258 3518 Trousdale Pkwy, 258, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2P: Techniques of PDE Optimization in Machine Learning
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Techniques of PDE Optimization in Machine Learning
Chair: Anton Schiela
Cluster: PDE-constrained Optimization

Talk 1: SA-NODEs and the Universal Approximation of Dynamical Systems
Speaker: Lorenzo Liverani
Abstract: In this talk, I will introduce the framework of semi-autonomous neural ordinary differential equations (SA-NODEs), a variation of vanilla NODEs employing fewer parameters. This is achieved by making the coefficients of the SA-NODEs independent of time. Despite this apparent simplification, I will demonstrate that SA-NODEs retain all the strong approximation properties of Vanilla NODEs, both from a theoretical and a numerical perspective. Specifically, SA-NODEs are able to learn the global flow of a dynamical system and track the entire trajectory over a finite (but arbitrary) time horizon. I will conclude the talk by presenting several numerical experiments, showing that SA-NODEs perform well for various systems and significantly outperform vanilla NODEs. This is joint work with Z. Li, K. Liu, and E. Zuazua.
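
Purely as a structural illustration (not the authors' implementation), an SA-NODE right-hand side uses time-independent coefficients, as in the forward-Euler integration below; training the parameters is omitted and all values are arbitrary.

    import numpy as np

    def sa_node_rhs(x, W, A, b):
        return W @ np.tanh(A @ x + b)              # coefficients do not depend on time

    def integrate(x0, W, A, b, T=1.0, steps=100):
        x, dt = x0.copy(), T / steps
        for _ in range(steps):
            x = x + dt * sa_node_rhs(x, W, A, b)   # forward Euler through the learned flow
        return x

    rng = np.random.default_rng(0)
    d, h = 2, 16
    W = 0.1 * rng.standard_normal((d, h))
    A = rng.standard_normal((h, d))
    b = np.zeros(h)
    print(integrate(np.array([1.0, -1.0]), W, A, b))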

Talk 2: Preconditioned Gradient Methods for Optimizing Neural Networks with Hilbert Space Layers
Speaker: Frederik Koehne
Abstract: Optimization problems in the context of machine learning typically involve optimization variables that are operators between Hilbert spaces. In gradient-based methods, selecting an appropriate inner product on this space of linear operators is fundamental to obtaining meaningful search directions. We review the natural inner product on the space of Hilbert–Schmidt operators and demonstrate its efficient application in computing gradients for the transition matrices in artificial neural networks. This approach ensures that structural information from network layers is incorporated into optimization updates. We present the theoretical foundations, discretization details, and numerical results, confirming that the solutions obtained retain the expected structural properties.

Talk 3: ProxSTORM: A Stochastic Trust Region Algorithm for Nonsmooth Optimization
Speaker: Aurya Javeed
Abstract: This talk is about minimizing a smooth term plus a convex nonsmooth term. We present a stochastic proximal Newton trust region algorithm that assumes models and estimates of the objective are sufficiently accurate, sufficiently often. Like STORM (work on stochastic optimization with random models), we use facts about martingales to prove our algorithm is globally convergent with probability one.

Speakers
AS

Anton Schiela

LL

Lorenzo Liverani

FK

Frederik Koehne

AJ

Aurya Javeed

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 106 3501 Trousdale Pkwy, 106, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2Q: New Riemannian Optimization Applications
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: New Riemannian Optimization Applications
Chair: Jiang Hu
Cluster: Optimization on Manifolds

Talk 1: Retraction-free optimization over the Stiefel manifold with application to the LoRA fine-tuning
Speaker: Jiang Hu
Abstract: Optimization over the Stiefel manifold has played a significant role in various machine learning tasks. Many existing algorithms either use a retraction operator to keep each iterate on the manifold, or solve an unconstrained quadratically penalized problem. The retraction operator in the former corresponds to orthonormalization of matrices and can be computationally costly for large-scale matrices. The latter approach usually requires an unknown, large penalty parameter. To address these issues, we propose a retraction-free and penalty-parameter-free algorithm, which lands on the manifold. Moreover, our convergence theory allows a constant step size, which improves the result of converging to a neighborhood in \citep{ablin2022fast}. A key component of the analysis is the convex-like property of the quadratic penalty of the Stiefel manifold, which enables us to explicitly characterize the constant penalty parameter. As an application, we introduce a new algorithm, Manifold-LoRA, which employs the landing technique and a carefully designed step size strategy to accelerate low-rank adaptation (LoRA) in fine-tuning large language models. Numerical experiments on benchmark datasets demonstrate the efficiency of our proposed method.
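
To illustrate the "landing" idea (no retraction; a penalty-gradient term draws the iterates back to the Stiefel manifold), here is a generic sketch in the spirit of Ablin and Peyré's landing field applied to a toy quadratic. It is not the Manifold-LoRA algorithm from the talk, and the step sizes are arbitrary.

    import numpy as np

    def skew(M):
        return 0.5 * (M - M.T)

    rng = np.random.default_rng(0)
    n, p = 20, 5
    C = rng.standard_normal((n, n)); C = C + C.T
    C /= np.linalg.norm(C, 2)                              # normalize so the steps below are safe
    X = np.linalg.qr(rng.standard_normal((n, p)))[0]       # start on the Stiefel manifold St(n, p)
    eta, lam = 0.05, 1.0

    for _ in range(1000):
        G = 2 * C @ X                                      # Euclidean gradient of trace(X^T C X)
        relative_grad = skew(G @ X.T) @ X                  # component moving roughly along the manifold
        penalty_grad = X @ (X.T @ X - np.eye(p))           # pulls the iterate back toward St(n, p)
        X = X - eta * (relative_grad + lam * penalty_grad)

    print(np.linalg.norm(X.T @ X - np.eye(p)))             # small: near-feasible without any retraction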

Talk 2: Optimal Tensor Network Disentanglement via Manifold Optimization
Speaker: Chao Yang
Abstract: A tensor network can be disentangled by performing a unitary gauge transformation within the network so that the transformed network can be approximated by a low-rank decomposition. Seeking a unitary transformation that minimizes the truncation error is equivalent to solving a constrained optimization problem in which the optimal solution lies on a Stiefel manifold. We describe the objective function for achieving disentanglement and show how the problem can be solved by a Riemannian Newton's method. We also discuss practical issues such as the choice of a starting guess, the stopping criterion, and how the gradient and Hessian can be computed efficiently.

Talk 3: A projected semismooth Newton method for a class of nonconvex composite programs with strong prox-regularity
Speaker: Jiayuan Wu
Abstract: This paper aims to develop a Newton-type method to solve a class of nonconvex composite programs. In particular, the nonsmooth part is possibly nonconvex. To tackle the nonconvexity, we develop a notion of strong prox-regularity which is related to the singleton property and Lipschitz continuity of the associated proximal operator, and we verify it for various classes of functions, including weakly convex functions, indicator functions of proximally smooth sets, and two specific sphere-related nonconvex nonsmooth functions. The problem class we consider thus covers smooth optimization problems on manifolds and certain composite optimization problems on manifolds. For the latter, the proposed algorithm is the first second-order-type method. Combining this with the semismoothness of the proximal operator, we design a projected semismooth Newton method to find a root of the natural residual induced by the proximal gradient method. Due to the possible nonconvexity of the feasible domain, an extra projection is added to the usual semismooth Newton step, and new criteria are proposed for switching between the projected semismooth Newton step and the proximal step. Global convergence is then established under the strong prox-regularity. Based on the BD regularity condition, we establish local superlinear convergence. Numerical experiments demonstrate the effectiveness of our proposed method compared with state-of-the-art ones.

Speakers
JH

Jiang Hu

CY

Chao Yang

JW

Jiayuan Wu

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 214 3501 Trousdale Pkwy, 214, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2R: Theoretical and computational advances in nonconvex optimization via convex relaxations and distributionally robust optimization
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Theoretical and computational advances in nonconvex optimization via convex relaxations and distributionally robust optimization
Chair: E. Alper Yildirim
Cluster: Global Optimization

Talk 1: Semidefinite liftings for the complex cut polytope
Speaker: Lennart Sinjorgo
Abstract: We consider the complex cut polytope: the convex hull of Hermitian rank-one matrices xx*, where the elements of the n-dimensional vector x are complex m-th roots of unity. These polytopes find applications in MAX-3-CUT, digital communication, and, more generally, complex quadratic programming. For m = 2, the complex cut polytope corresponds to the well-known real cut polytope. We provide an exact description of the complex cut polytope for m = n = 3 and investigate second-order semidefinite liftings of the complex cut polytope. For such second-order liftings, we show a method for reducing the size of the matrix without weakening the approximation. We support our theoretical findings with numerical experiments.

Talk 2: Novel and Tractable Convex Relaxations of Standard Quadratic Optimization Problems under Cardinality Constraints
Speaker: E. Alper Yildirim
Abstract: Standard quadratic optimization problems (StQPs) provide a versatile modelling tool in a multitude of applications such as mathematical finance, machine learning (clustering) and modelling in biosciences (e.g. selection and ecology). In this talk, we consider StQPs under an additional cardinality (sparsity) constraint which, even for convex objectives, yields NP-hard problems. One motivation to study StQPs under such sparsity restrictions is the high-dimensional portfolio selection problem with too many assets to handle, in particular in the presence of transaction costs. We present novel computational approaches to this relevant but difficult problem, involving modern conic optimization techniques, along with significant dimensional reduction, which is essential for the tractability of these methods when the problem size grows. In addition, we propose a particular generation procedure that systematically avoids too easy instances. We present extensive computational results demonstrating the versatility and strength of the proposed relaxations.

Talk 3: Distributionally robust standard quadratic optimization with Wasserstein ambiguity
Speaker: Daniel de Vicente
Abstract: The standard quadratic optimization problem (StQP) consists of minimizing a quadratic form over the standard simplex. If the quadratic form is neither convex nor concave, the StQP is NP-hard. This problem has many interesting applications ranging from portfolio optimization to machine learning. Sometimes, the data matrix is uncertain but some information about its distribution can be inferred, e.g., the first two moments or else a reference distribution (typically, the empirical distribution after sampling). In distributionally robust optimization, the goal is to minimize the worst case over all possible distributions in an ambiguity set defined by the above-mentioned characteristics. We will explore two versions: the distributionally robust stochastic StQP, focusing on expectations, and the distributionally robust chance-constrained StQP, both with an ambiguity set based upon maximal Wasserstein distance to the sampled distribution.

Speakers
avatar for Lennart Sinjorgo

Lennart Sinjorgo

PhD Student, Tilburg University
Lennart Sinjorgo is a PhD student in Operations Research at Tilburg University, interested in semidefinite programming.
EA

E. Alper Yildirim

DD

Daniel de Vicente

Monday July 21, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 154 3518 Trousdale Pkwy, 154, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2S: Robust Machine Learning
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Robust Machine Learning
Chair: Zhengyuan Zhou
Cluster: Optimization For Data Science

Talk 1: Robust Reinforcement Learning from Corrupted Human Feedback
Speaker: Zixuan Zhang
Abstract: Reinforcement learning from human feedback (RLHF) provides a principled framework for aligning AI systems with human preference data. For various reasons, e.g., personal bias, context ambiguity, lack of training, etc., human annotators may give incorrect or inconsistent preference labels. To tackle this challenge, we propose a robust RLHF approach, $R^3M$, which models potentially corrupted preference labels as sparse outliers. Accordingly, we formulate robust reward learning as an l1-regularized maximum likelihood estimation problem. Computationally, we develop an efficient alternating optimization algorithm, which incurs only negligible computational overhead compared with the standard RLHF approach. Theoretically, we prove that under proper regularity conditions, $R^3M$ can consistently learn the underlying reward and identify outliers, provided that the number of outlier labels scales sublinearly with the preference sample size. Furthermore, we remark that $R^3M$ is versatile and can be extended to various preference optimization methods, including direct preference optimization (DPO). Our experiments on robotic control and natural language generation with large language models (LLMs) show that $R^3M$ improves robustness of the reward against several types of perturbations to the preference data.
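
The sparse-outlier idea can be illustrated on a much simpler toy problem than RLHF: below, a linear model with a few grossly corrupted responses is fit by alternating a least-squares step with soft-thresholding of per-sample outlier offsets. This is only an analogy for the l1-regularized formulation, not the $R^3M$ implementation, and all data are synthetic.

    import numpy as np

    rng = np.random.default_rng(1)
    n, d = 300, 10
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    s_true = np.zeros(n); s_true[:15] = 5.0                  # 5% of samples grossly corrupted
    y = X @ w_true + s_true + 0.1 * rng.standard_normal(n)

    w, s, lam = np.zeros(d), np.zeros(n), 1.0
    for _ in range(50):
        w = np.linalg.lstsq(X, y - s, rcond=None)[0]         # refit weights with outliers removed
        r = y - X @ w
        s = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)    # soft-threshold the residuals

    print("weight error:", np.linalg.norm(w - w_true))
    print("flagged outliers:", np.count_nonzero(np.abs(s) > 1e-8))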

Talk 2: Robust Online Control with Model Misspecification
Speaker: Xinyi Chen
Abstract: We study online control of an unknown nonlinear dynamical system that is approximated by a time-invariant linear system with model misspecification. Our study focuses on robustness, a measure of how much deviation from the assumed linear approximation can be tolerated by a controller while maintaining finite ℓ2-gain. A basic methodology to analyze robustness is via the small gain theorem. However, as an implication of recent lower bounds on adaptive control, this method can only yield robustness that is exponentially small in the dimension of the system and its parametric uncertainty. The work of Cusumano and Poolla shows that much better robustness can be obtained, but the control algorithm is inefficient, taking exponential time in the worst case. In this paper we investigate whether there exists an efficient algorithm with provable robustness beyond the small gain theorem. We demonstrate that for a fully actuated system, this is indeed attainable. We give an efficient controller that can tolerate robustness that is polynomial in the dimension and independent of the parametric uncertainty; furthermore, the controller obtains an ℓ2-gain whose dimension dependence is near optimal.

Talk 3: Approximations to worst-case data dropping: unmasking failure modes
Speaker: Jenny Huang
Abstract: A data analyst might worry about generalization if dropping a very small fraction of data points from a study could change its substantive conclusions. Finding the worst-case data subset to drop poses a combinatorial optimization problem. To overcome this intractability, recent works propose using additive approximations, which treat the contribution of a collection of data points as the sum of their individual contributions, and greedy approximations, which iteratively select the point with the highest impact to drop and re-run the data analysis without that point [Broderick et al., 2020, Kuschnig et al., 2021]. We identify that, even in a setting as simple as OLS linear regression, many of these approximations can break down in realistic data arrangements. Several of our examples reflect masking, where one outlier may hide or conceal the effect of another outlier. Based on the failures we identify, we provide recommendations for users and suggest directions for future improvements.
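
A minimal version of the additive approximation in the OLS setting (not the authors' code, and with made-up data): score each point by its leave-one-out effect on the slope, sum the top-k scores, and compare with actually refitting.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    x = rng.standard_normal(n)
    y = 0.5 * x + rng.standard_normal(n)

    def slope(xv, yv):
        X = np.column_stack([np.ones_like(xv), xv])
        return np.linalg.lstsq(X, yv, rcond=None)[0][1]

    base = slope(x, y)
    # leave-one-out influence of each point on the fitted slope
    infl = np.array([slope(np.delete(x, i), np.delete(y, i)) - base for i in range(n)])
    k = 5
    drop = np.argsort(infl)[:k]                   # points whose removal lowers the slope most
    additive = base + infl[drop].sum()            # additive approximation to the refit slope
    exact = slope(np.delete(x, drop), np.delete(y, drop))
    print(base, additive, exact)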

Speakers
ZZ

Zhengyuan Zhou

XC

Xinyi Chen

JH

Jenny Huang

Monday July 21, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 155 3518 Trousdale Pkwy, 155, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2T: Recent Advances in Stochastic Optimization: Complexity, Adaptivity, and Nonsmooth Extensions (II)
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Recent Advances in Stochastic Optimization: Complexity, Adaptivity, and Nonsmooth Extensions (II)
Chair: Sen Na & Zhaosong Lu
Cluster: Nonlinear Optimization

Talk 1: On the convergence of policy gradient methods for stochastic nonlinear dynamical systems
Speaker: Sungho Shin
Abstract: We analyze the local convergence of policy gradient methods for stochastic nonlinear dynamical systems. Under several technical assumptions, we show that the policy gradient algorithm converges to the optimal policy.

Talk 2: Improved complexity of proximal bundle methods and new insights on bundle management
Speaker: Andy Sun
Abstract: The proximal bundle algorithm (PBA) is a fundamental and computationally effective algorithm for solving optimization problems with nonsmooth components. In this talk, we will first investigate a composite setting where one function is smooth and the other is piecewise linear. We present a novel complexity analysis of PBA and derive an O(\epsilon^{-0.8}) iteration complexity, improving upon the known O(\epsilon^{-2}) guarantee. Our analysis also sheds new light on bundle management strategies. Computational experiments on two-stage robust optimization and support vector machines demonstrate the effectiveness of the new insights.

Talk 3: An Optimal Single-Loop Algorithm for Convex Finite-Sum Coupled Compositional Stochastic Optimization
Speaker: Tianbao Yang
Abstract: We will talk about a class of convex Finite-Sum Coupled Compositional Stochastic Optimization (cFCCO) problems with many applications, including group distributionally robust optimization (GDRO), learning with imbalanced data, reinforcement learning, and learning to rank. We will present an efficient single-loop primal-dual block-coordinate proximal algorithm, dubbed ALEXR. This algorithm leverages block-coordinate stochastic mirror ascent updates for the dual variable and stochastic proximal gradient descent updates for the primal variable. We establish the convergence rates of ALEXR in both convex and strongly convex cases under smoothness and non-smoothness conditions of involved functions, which not only improve the best rates in previous works on smooth cFCCO problems but also expand the realm of cFCCO for solving more challenging non-smooth problems such as the dual form of GDRO. Finally, we present lower complexity bounds to demonstrate that the convergence rates of ALEXR are optimal among first-order block-coordinate stochastic algorithms for the considered class of cFCCO problems.

Speakers
SS

Sungho Shin

AS

Andy Sun

TY

Tianbao Yang

Monday July 21, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 158 3518 Trousdale Pkwy, 158, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2U: Advances in Domain-Specific Languages for Optimization
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Advances in Domain-Specific Languages for Optimization
Chair: Steven Diamond
Cluster: Computational Software

Talk 1: Ten Years of CVXPY
Speaker: William Zhang
Abstract: The CVXPY project has been under development for over ten years. The initial motivation was to reproduce in Python the functionality of the CVX MATLAB package. However, early CVXPY design decisions diverged from CVX, with a greater emphasis on modular problem construction, expression trees, and parameterized problems. We explore how these initial design decisions led to later research on object-oriented optimization, matrix-free optimization, and differentiable optimization. We conclude with an overview of upcoming CVXPY features.
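
For readers unfamiliar with the package, a minimal CVXPY problem built from an expression tree looks like the following (random data, default solver):

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((30, 10))
    b = rng.standard_normal(30)

    x = cp.Variable(10, nonneg=True)                           # decision variable
    prob = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)))  # nonnegative least squares
    prob.solve()
    print(prob.status, round(prob.value, 3))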

Talk 2: CvxLean, a convex optimization modeling framework based on the Lean 4 proof assistant
Speaker: Paul Jackson
Abstract: A proof assistant is a computer environment in which mathematical definitions, theorems and proofs can be expressed, developed and checked in a completely formal way. CvxLean at heart is a realization of the disciplined convex programming paradigm in the proof assistant Lean 4.  As with other DCP frameworks, optimization problems can be initially expressed in a language of atomic functions.  Rules are implemented for automatically inferring problems are convex, and automatic transformations are supported for reducing problems to conic form.  Currently CvxLean uses Mosek to solve these reduced problems. Unlike other frameworks, with CvxLean, the convexity rules and transformation rewrites all must be formally proven to be correct, giving the user extra confidence in the reduction process.  Also, initial problems do not have to be directly expressed in terms of atomic functions and they do not have to satisfy the rules for inferring convexity.  Rather, users can draw on a range of standard definitions available in Lean's mathematical library.  If this is done, standard Lean machinery can be used to formally manipulate problems into recognizably-convex forms using the atomic functions. These manipulations can be tedious to guide, and recent work has explored using e-graph rewriting to discover them automatically.

Talk 3: Disciplined Saddle Programming
Speaker: Philipp Schiele
Abstract: We consider convex-concave saddle point problems, and more generally convex optimization problems we refer to as saddle problems, which include the partial supremum or infimum of convex-concave saddle functions. Saddle problems arise in a wide range of applications, including game theory, machine learning, and finance. It is well known that a saddle problem can be reduced to a single convex optimization problem by dualizing either the convex (min) or concave (max) objectives, reducing a min-max problem into a min-min (or max-max) problem. Carrying out this conversion by hand can be tedious and error prone. In this paper we introduce disciplined saddle programming (DSP), a domain specific language (DSL) for specifying saddle problems, for which the dualizing trick can be automated. The language and methods are based on recent work by Juditsky and Nemirovski, who developed the idea of conic-representable saddle point programs, and showed how to carry out the required dualization automatically using conic duality. Juditsky and Nemirovski's conic representation of saddle problems extends Nesterov and Nemirovski's earlier development of conic representable convex problems; DSP can be thought of as extending disciplined convex programming (DCP) to saddle problems. Just as DCP makes it easy for users to formulate and solve complex convex problems, DSP allows users to easily formulate and solve saddle problems. Our method is implemented in an open-source package, also called DSP.

Speakers
SD

Steven Diamond

PJ

Paul Jackson

PS

Philipp Schiele

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 108 3501 Trousdale Pkwy, 108, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2V: Nonsmooth stochastic optimization and variational inequalities
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Nonsmooth stochastic optimization and variational inequalities
Chair: Wei Bian
Cluster: Fixed Points and Variational Inequalities

Talk 1: Dynamic Stochastic Approximation Jacobi-Type ADMM Method for Two-stage Stochastic Generalized Nash Equilibrium Problems
Speaker: Hailin Sun
Abstract: This paper studies a specific class of two-stage stochastic generalized Nash equilibrium problems (SGNEPs), where each player engages in a two-stage sequential decision-making process in a random environment: first, they make a decision in the current stage and compete with one another, followed by making a decision in the future stage. This type of two-stage SGNEPs is widely found in fields such as production and manufacturing, transportation logistics, and portfolio management. From the perspective of solving the problem, the main difference between two-stage SGNEPs and single-stage SGNEPs is the need to handle the optimal value function of the second-stage problem, which does not have an explicit expression. At present, there is no effective algorithm to address these challenges. To overcome this difficulty, an accelerated primal-dual method (APDM) is proposed in the paper to obtain an approximate $\epsilon$-subgradient of the second-stage optimal value function, achieving a convergence rate of $O\left(\frac{1}{\sqrt{N}}\right)$. Using this approximate $\epsilon$-subgradient along with a variance reduction technique, a dynamic stochastic approximation Jacobi-type Alternating Direction Method of Multipliers (DSA-JADMM) method is proposed and applied to solve two-stage SGNEPs. This algorithm represents an inexact stochastic version of the Jacobi-type ADMM, as it computes an approximate $\epsilon$-subgradient for the second stage randomly at each iteration using APDM. It is also demonstrated that the algorithm can converge to a weak $\epsilon$-variational equilibrium point of two-stage SGNEPs with a convergence rate of $O\left(\frac{1}{\sqrt{K}}\right)$, which is a special type of Nash equilibrium point. To validate the effectiveness of the DSA-JADMM method, preliminary numerical experiments are conducted. These experiments demonstrate the advantages and superior performance of the proposed method.

Talk 2: Nonsmooth convex-concave saddle point problems with cardinality penalties
Speaker: Wei Bian
Abstract: We focus on a class of convexly constrained nonsmooth convex-concave saddle point problems with cardinality penalties. Although such nonsmooth, nonconvex-nonconcave, and discontinuous min-max problems may not have a saddle point, we show that they have a local saddle point and a global minimax point, and that some local saddle points have the lower-bound properties. We define a class of strong local saddle points based on the lower-bound properties for stability of variable selection. Moreover, we give a framework to construct continuous relaxations of the discontinuous min-max problems based on convolution, such that they have the same saddle points as the original problems. We also establish the relations between the continuous relaxation problems and the original problems regarding local saddle points, global minimax points, local minimax points, and stationary points.

Talk 3: An Augmented Lagrangian Method for Training Recurrent Neural Networks
Speaker: Chao Zhang
Abstract: Recurrent Neural Networks (RNNs) are widely used to model sequential data in a wide range of areas, such as natural language processing, speech recognition, machine translation, and time series analysis. In this paper, we model the training process of RNNs with the ReLU activation function as a constrained optimization problem with a smooth nonconvex objective function and piecewise smooth nonconvex constraints. We prove that any feasible point of the optimization problem satisfies the no nonzero abnormal multiplier constraint qualification (NNAMCQ), and any local minimizer is a Karush-Kuhn-Tucker (KKT) point of the problem. Moreover, we propose an augmented Lagrangian method (ALM) and design an efficient block coordinate descent (BCD) method to solve the subproblems of the ALM. The update of each block of the BCD method has a closed-form solution, and the stopping criterion for the inner loop is easy to check and is satisfied in finitely many steps. Moreover, we show that the BCD method generates a directional stationary point of the subproblem. Furthermore, we establish the global convergence of the ALM to a KKT point of the constrained optimization problem. Compared with state-of-the-art algorithms, numerical results demonstrate the efficiency and effectiveness of the ALM for training RNNs.

Speakers
HS

Hailin Sun

WB

Wei Bian

CZ

Chao Zhang

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 110 3501 Trousdale Pkwy, 110, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2W: Variance-Related Issues in Stochastic Optimization Methods
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Variance-Related Issues in Stochastic Optimization Methods
Chair: Yue Wu
Cluster: nan

Talk 1: Some Unified Theory for Variance Reduced Prox-Linear Methods
Speaker: Yue Wu
Abstract: This work considers the nonconvex, nonsmooth problem of minimizing a composite objective of the form f(g(x))+h(x) where the inner mapping g is a smooth finite summation or expectation amenable to variance reduction. In such settings, prox-linear methods can enjoy variance-reduced speed-ups despite the existence of nonsmoothness. We provide a unified convergence theory applicable to a wide range of common variance-reduced vector and Jacobian constructions. Our theory (i) only requires operator norm bounds on Jacobians (whereas prior works used potentially much larger Frobenius norms), (ii) provides state-of-the-art high probability guarantees, and (iii) allows inexactness in proximal computations.

Talk 2: An Accelerated Variance Reduced Extra-Point Approach to Finite-Sum Hemivariational Inequality Problem
Speaker: Kevin Huang
Abstract: In this paper, we develop stochastic variance reduced algorithms for solving a class of finite-sum hemivariational inequality (HVI) problem. In this HVI problem, the associated function is assumed to be differentiable, and both the vector mapping and the function are of finite-sum structure. We propose two algorithms to solve the cases when the vector mapping is either merely monotone or strongly monotone, while the function is assumed to be convex. We show how to apply variance reduction in the proposed algorithms when such an HVI problem has a finite-sum structure, and the resulting accelerated gradient complexities can match the best bound established for finite-sum VI problem, as well as the bound given by the direct Katyusha for finite-sum optimization respectively, in terms of the corresponding parameters such as (gradient) Lipschitz constants and the sizes of the finite-sums. We demonstrate the application of our algorithms through solving a finite-sum constrained finite-sum optimization problem and provide preliminary numerical results. Archival version: https://doi.org/10.48550/arXiv.2211.03269

Talk 3: Adaptive stochastic optimization algorithms for problems with biased oracles
Speaker: Yin Liu
Abstract: Motivated by multiple emerging applications, e.g., stochastic composition optimization, we consider a general optimization problem where the gradient of the objective is only available through a biased stochastic oracle whose bias magnitude can be controlled by a parameter; however, lower bias requires higher computation. Without exploiting a specific bias decay structure, we propose a couple of adaptive and nonadaptive stochastic algorithms to solve the underlying problem. We analyze the nonasymptotic performance of the proposed algorithms in the nonconvex regime. The numerical performance of the proposed methods will be presented for three applications: composition optimization, policy optimization for infinite-horizon Markov decision processes, and distributionally robust optimization.

Speakers
YW

Yue Wu

KH

Kevin Huang

YL

Yin Liu

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 112 3501 Trousdale Pkwy, 112, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2X: Proximal Methods and Splitting Algorithms
Monday July 21, 2025 1:15pm - 2:30pm PDT
Session: Proximal Methods and Splitting Algorithms
Chair: Mikhail Solodov
Cluster: nan

Talk 1: Asymmetric Variable Metric Proximal Point Methods, with Relations Between Relative-Errors and Summable-Errors
Speaker: Mikhail Solodov
Abstract: For the problem of finding zeroes of maximally monotone operators, we introduce a general scheme allowing both symmetric and asymmetric metrics in proximal point algorithms. Our approach subsumes as special cases various proximal point methods in the literature. Moreover, it brings some new insights into the relationship between hybrid proximal methods with bounded relative-error approximations and inexact proximal methods with error-summability-type conditions. In particular, introducing asymmetric metrics made it possible to cast summable errors in the framework of relative ones, asymptotically. Global convergence is established under standard assumptions. In addition, a certain generalization of Fejér monotonicity is presented to obtain the local linear convergence rate under a calmness condition on the operator.

Talk 2: A Full Splitting Algorithm for Fractional Programs with Structured Numerators and Denominators
Speaker: Min Tao
Abstract: In this paper, we consider a class of nonconvex and nonsmooth fractional programming problems, which involve the sum of a convex, possibly nonsmooth function composed with a linear operator and a differentiable, possibly nonconvex function in the numerator and a convex, possibly nonsmooth function composed with a linear operator in the denominator. These problems have applications in various fields, including CT reconstruction and sparse signal recovery. We propose an adaptive full splitting proximal subgradient algorithm with an extrapolated step that addresses the challenge of evaluating the composition in the numerator by decoupling the linear operator from the nonsmooth component. We specifically evaluate the nonsmooth function using its proximal operator, while the linear operator is assessed through forward evaluations. Furthermore, the smooth component in the numerator is evaluated through its gradient, the nonsmooth component in the denominator is managed using its subgradient, and the linear operator in the denominator is also assessed through forward evaluations. We demonstrate subsequential convergence toward an approximate lifted stationary point and ensure global convergence under the Kurdyka-Łojasiewicz property, all achieved without relying on any full-row-rank assumptions regarding the linear operators. We provide further discussions on the tightness of the convergence results of the proposed algorithm and its related variants, and also on the reasoning behind aiming for an approximate lifted stationary point. This is exemplified by constructing a scenario illustrating that the algorithm could diverge when seeking exact solutions. Lastly, we present a practical version of the algorithm incorporating a nonmonotone line search, significantly enhancing its convergence performance. Our theoretical findings are validated through simulations involving limited-angle CT reconstruction and the robust Sharpe ratio minimization problem.

Talk 3: Inexact Proximal Point Algorithms for Zeroth-Order Global Optimization
Speaker: Minxin Zhang
Abstract: This work concerns the zeroth-order global minimization of continuous nonconvex functions with a unique global minimizer and possibly multiple local minimizers. We formulate a theoretical framework for inexact proximal point (IPP) methods for global optimization, establishing convergence guarantees under mild assumptions when either deterministic or stochastic estimates of proximal operators are used. The quadratic regularization in the proximal operator and the scaling effect of a positive parameter create a concentrated landscape of an associated Gibbs measure that is practically effective for sampling. The convergence of the expectation under the Gibbs measure is established, providing a theoretical foundation for evaluating proximal operators inexactly using sampling-based methods such as Monte Carlo (MC) integration. In addition, we propose a new approach based on tensor train (TT) approximation. This approach employs a randomized TT cross algorithm to efficiently construct a low-rank TT approximation of a discretized function using a small number of function evaluations, and we provide an error analysis for the TT-based estimation. We then propose two practical IPP algorithms, TT-IPP and MC-IPP. The TT-IPP algorithm leverages TT estimates of the proximal operators, while the MC-IPP algorithm employs MC integration to estimate the proximal operators. Both algorithms are designed to adaptively balance efficiency and accuracy in inexact evaluations of proximal operators. The effectiveness of the two algorithms is demonstrated through experiments on diverse benchmark functions and various applications. Reference: Zhang, M., Han, F., Chow, Y. T., Osher, S., & Schaeffer, H. (2024). Inexact Proximal Point Algorithms for Zeroth-Order Global Optimization. arXiv preprint arXiv:2412.11485.
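
As a rough illustration of estimating a proximal operator by sampling (in the spirit of the Monte Carlo step described above, but not the authors' code), one can weight candidate points by a Gibbs factor built from the prox objective; the test function and parameters below are arbitrary.

    import numpy as np

    def f(y):                                   # a 1-D nonconvex test function
        return y**2 + 1.5 * np.sin(3 * y)**2

    rng = np.random.default_rng(0)
    x, sigma, t = 2.0, 0.5, 0.05                # prox center, scaling, temperature
    y = x + np.sqrt(sigma) * rng.standard_normal(20000)   # candidate samples around x
    g = f(y) + (y - x)**2 / (2 * sigma)         # prox objective at the candidates
    w = np.exp(-(g - g.min()) / t)              # Gibbs weights (shifted for numerical stability)
    print((w * y).sum() / w.sum())              # weighted mean concentrates near prox_{sigma f}(x)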

Speakers
MS

Mikhail Solodov

IMPA
http://www.impa.br/~optim/solodov.html
avatar for Min Tao

Min Tao

Professor, Nanjing University
Min Tao's primary research interests lie in the theory and applications of optimization algorithms, with a particular focus on first-order methods and their applications in machine learning. Her representative works have been published in leading journals such as the SIAM Journal... Read More →
MZ

Minxin Zhang

Monday July 21, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 215 3501 Trousdale Pkwy, 215, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 2Y
Monday July 21, 2025 1:15pm - 2:30pm PDT
Monday July 21, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 200 3518 Trousdale Pkwy, 200, Los Angeles, CA 90089

2:30pm PDT

Coffee & Snack Break (Provided)
Monday July 21, 2025 2:30pm - 3:00pm PDT
Monday July 21, 2025 2:30pm - 3:00pm PDT
TBA

3:00pm PDT

Parallel Semi-Plenary Talk 1A
Monday July 21, 2025 3:00pm - 4:00pm PDT
Speakers
LX

Lin Xiao

Lin Xiao is a Research Scientist at Facebook AI Research (FAIR) in Seattle, Washington. He received his BE from Beijing University of Aeronautics and Astronautics (Beihang University) and his PhD from Stanford University, and was a postdoctoral fellow in the Center for the Mathematics of... Read More →
Monday July 21, 2025 3:00pm - 4:00pm PDT
Taper Hall (THH) 101 3501 Trousdale Pkwy, 101, Los Angeles, CA 90089

3:00pm PDT

Parallel Semi-Plenary Talk 1B
Monday July 21, 2025 3:00pm - 4:00pm PDT
Speakers
JB

Jérôme Bolte

Jérôme Bolte is a Full Professor at the Toulouse School of Economics and holds a Chair in Artificial Intelligence at the Artificial Intelligence Institute of Toulouse (ANITI). He studied pure and applied mathematics before completing a degree in mathematics and then a doctorate... Read More →
Monday July 21, 2025 3:00pm - 4:00pm PDT
Taper Hall (THH) 201 3501 Trousdale Pkwy, 201, Los Angeles, CA 90089

4:00pm PDT

Break
Monday July 21, 2025 4:00pm - 4:15pm PDT
Monday July 21, 2025 4:00pm - 4:15pm PDT
TBA

4:15pm PDT

Parallel Sessions 3A: Advances in Solving Large-Scale Problems: Accelerated Methods and Sharp Analyses (I)
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Advances in Solving Large-Scale Problems: Accelerated Methods and Sharp Analyses (I)
Chair: Sen Na & Liwei Jiang
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: Implicit Preconditioning in Stochastic Linear System Solvers
Speaker: Michał Dereziński
Abstract: In this talk, I will present new algorithms and convergence guarantees for solving linear systems via sketch-and-project, a framework which unifies many known iterative methods that use randomized sub-sampling and sketching, including randomized Kaczmarz, coordinate descent, and others. Our new results uncover a connection between stochastic iterative solvers and sketching-based randomized preconditioning algorithms: Whenever the spectral structure of a linear system is amenable to constructing a strong preconditioner via low-rank approximation, then one can construct a stochastic solver based on sketch-and-project that will implicitly take advantage of this spectral structure. In particular, I will show how this leads to solving an n x n linear system with at most k large (outlying) singular values in ~O( n^2 + nk^2 ) arithmetic operations, which is faster than the ~O( n^2 k ) cost of constructing a good preconditioner for a deterministic iterative solver such as conjugate gradient.
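
A minimal instance of sketch-and-project is randomized Kaczmarz with single-row sampling, sketched below on a synthetic consistent system; the spectral preconditioning connection from the talk is not shown here.

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 500, 50
    A = rng.standard_normal((m, n))
    x_true = rng.standard_normal(n)
    b = A @ x_true

    probs = (A**2).sum(axis=1); probs /= probs.sum()    # sample rows by squared norm
    x = np.zeros(n)
    for _ in range(20000):
        i = rng.choice(m, p=probs)
        a = A[i]
        x += (b[i] - a @ x) / (a @ a) * a               # project onto the i-th hyperplane
    print(np.linalg.norm(x - x_true))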

Talk 2: The radius of statistical efficiency
Speaker: Mateo Díaz
Abstract: Classical results in asymptotic statistics show that the Fisher information matrix controls the difficulty of estimating a statistical model from observed data. In this work, we introduce a companion measure of robustness of an estimation problem: the radius of statistical efficiency (RSE) is the size of the smallest perturbation to the problem data that renders the Fisher information matrix singular. We compute RSE up to numerical constants for a variety of test bed problems, including principal component analysis, generalized linear models, phase retrieval, bilinear sensing, and matrix completion. In all cases, the RSE quantifies the compatibility between the covariance of the population data and the latent model parameter. Interestingly, we observe a precise reciprocal relationship between RSE and the intrinsic complexity/sensitivity of the problem instance, paralleling the classical Eckart–Young theorem in numerical analysis.

Talk 3: Low rank approximation for faster optimization
Speaker: Madeleine Udell
Abstract: Low rank structure is pervasive in real-world datasets. This talk shows how to accelerate the solution of fundamental computational problems, including eigenvalue decomposition, linear system solves, composite convex optimization, and stochastic optimization (including deep learning), by exploiting this low rank structure. We present a simple method based on randomized numerical linear algebra for efficiently computing approximate top eigendecompositions, which can be used to replace large matrices (such as Hessians and constraint matrices) with low rank surrogates that are faster to apply and invert. The resulting solvers for linear systems (NystromPCG), composite convex optimization (NysADMM), and stochastic optimization (SketchySGD and PROMISE) demonstrate strong theoretical and numerical support, outperforming state-of-the-art methods in terms of speed and robustness to hyperparameters.
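
The kind of low-rank surrogate mentioned in the talk can be built with a textbook randomized Nyström approximation, sketched below on a synthetic PSD matrix with a few large eigenvalues; this is generic code, not the speakers' solvers.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 300, 20
    B = rng.standard_normal((n, 10))
    A = B @ B.T + 1e-2 * np.eye(n)                  # PSD with 10 large (outlying) eigenvalues

    Omega = rng.standard_normal((n, k))             # random test matrix
    Y = A @ Omega                                   # sketch of A
    A_nys = Y @ np.linalg.solve(Omega.T @ Y, Y.T)   # Nystrom approximation (A Omega)(Omega^T A Omega)^{-1}(A Omega)^T
    err = np.linalg.norm(A - A_nys, 2) / np.linalg.norm(A, 2)
    print("relative spectral error:", err)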

Speakers
MD

Michał Dereziński

MD

Mateo Díaz

avatar for Madeleine Udell

Madeleine Udell

Postdoctoral Fellow, Caltech Center for the Mathematics of Information
Madeleine Udell is a postdoctoral fellow at Caltech's Center for the Mathematics of Information, hosted by Joel Tropp. She will be joining Cornell as an Assistant Professor in the School of Operations Research and Information Engineering in July 2016. Her research focus is on modeling... Read More →
Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 101 3501 Trousdale Pkwy, 101, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3B: Deterministic and Stochastic Methods for Optimization and Games- Part II
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Deterministic and Stochastic Methods for Optimization and Games- Part II
Chair: Angelia Nedich
Cluster: Multi-agent Optimization and Games

Talk 1: Frequentist Guarantees of Distributed (Non)-Bayesian Inference
Speaker: Bohan Wu
Abstract: Motivated by the need to analyze large, decentralized datasets, distributed Bayesian inference has become a critical research area across multiple fields, including statistics, electrical engineering, and economics. This paper establishes frequentist properties, such as posterior consistency, asymptotic normality, and posterior contraction rates, for the distributed (non-)Bayesian inference problem among agents connected via a communication network. Our results show that, under appropriate assumptions on the communication graph, distributed Bayesian inference retains parametric efficiency while enhancing robustness in uncertainty quantification. We also explore the trade-off between statistical efficiency and communication efficiency by examining how the design and size of the communication graph impact the posterior contraction rate. Furthermore, we extend our analysis to time-varying graphs and apply our results to exponential family models, distributed logistic regression, and decentralized detection models.

Talk 2: Decentralized high-dimensional inference over mesh networks: a unified perspective
Speaker: Marie Maros
Abstract: We consider the problem of solving high-dimensional statistical inference problems over a network of agents (with no coordinating agent) who have exclusive access to a fraction of the total available samples. In the high-dimensional setting, the problem dimension is much larger than the total number of available samples, making the problem ill-conditioned. Despite this, we empirically observe that obtaining a statistically meaningful solution is possible with many existing decentralized schemes, given that the underlying parameter to estimate lies in a low dimensional subspace. Our observations challenge the existing theories in two key ways: (i) most decentralized schemes do not break down as the problem dimensionality increases, and (ii) decentralized schemes that are expected to behave like one another behave very differently in high dimensions. To understand the behavior of decentralized optimization methods in high-dimensional inference we introduce a unified framework and analysis, allowing us to develop an understanding of the mechanisms enabling dimension independent performance of decentralized schemes.

Talk 3: Toward Parameter-free Decentralized Optimization
Speaker: Gesualdo Scutari
Abstract: We study the minimization of (locally strongly) convex, locally smooth functions over a network of agents without a centralized server. Existing decentralized algorithms require knowledge of problem and network parameters, such as the Lipschitz constant of the global gradient and/or network connectivity, for hyperparameter tuning. Agents usually cannot access this information, leading to conservative selections, slow convergence, or divergence. We introduce a decentralized algorithm that eliminates the need for specific parameter tuning. Our approach employs an operator splitting technique with a novel variable metric, enabling a local backtracking line-search to adaptively select the stepsize without global information or extensive communications. This results in favorable convergence guarantees and dependence on optimization and network parameters compared to existing nonadaptive methods. Theoretical findings are supported by numerical experiments.

Speakers
AN

Angelia Nedich

MM

Marie Maros

Assistant Professor, Texas A&M University
GS

Gesualdo Scutari

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 201 3501 Trousdale Pkwy, 201, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3C: Methods for Large-Scale Nonlinear Optimization III
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Methods for Large-Scale Nonlinear Optimization III
Chair: Baoyu Zhou
Cluster: Nonlinear Optimization

Talk 1: A Two Stepsize SQP Method for Nonlinear Equality Constrained Stochastic Optimization
Speaker: Michael O'Neill
Abstract: We develop a Sequential Quadratic Optimization (SQP) algorithm for minimizing a stochastic objective function subject to deterministic equality constraints. The method utilizes two different stepsizes, one which exclusively scales the component of the step corrupted by the variance of the stochastic gradient estimates and a second which scales the entire step. We prove that this stepsize splitting scheme has a worst-case complexity result which improves over the best known result for this class of problems. In terms of approximately satisfying the constraint violation, this complexity result matches that of deterministic SQP methods, up to constant factors, while matching the known optimal rate for stochastic SQP methods to approximately minimize the norm of the gradient of the Lagrangian. We also propose and analyze multiple variants of our algorithm. One of these variants is based upon popular adaptive gradient methods for unconstrained stochastic optimization while another incorporates a safeguarded line search along the constraint violation. Preliminary numerical experiments show competitive performance against a state of the art stochastic SQP method. In addition, in these experiments, we observe an improved rate of convergence in terms of the constraint violation, as predicted by the theoretical results.
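For context, a hedged sketch of the kind of iteration the abstract describes (one plausible reading; the exact splitting and stepsize rules used in the talk may differ): the stochastic SQP step solves a quadratic subproblem with linearized equality constraints and is decomposed into a normal component, determined by the deterministic constraints, and a tangential component, which carries the stochastic gradient noise; one stepsize scales only the noise-corrupted component and a second scales the entire step:

\[
d_k = \arg\min_{d}\ \bar g_k^{\top} d + \tfrac12\, d^{\top} H_k d \quad \text{s.t.}\quad c(x_k) + J(x_k)\, d = 0,
\]
\[
d_k = v_k + u_k,\qquad v_k \in \operatorname{range}\bigl(J(x_k)^{\top}\bigr),\quad u_k \in \operatorname{null}\bigl(J(x_k)\bigr),\qquad
x_{k+1} = x_k + \alpha_k\bigl(v_k + \beta_k u_k\bigr),
\]

where \(\bar g_k\) is a stochastic estimate of the objective gradient and \(H_k\) is a positive-definite model Hessian.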

Talk 2: A Proximal-Stochastic-Gradient Method for Regularized Equality Constrained Problems
Speaker: Daniel P. Robinson
Abstract: I present an algorithm of the proximal-stochastic-gradient variety for minimizing the sum of a nonconvex loss function and a convex regularization function subject to nonlinear equality constraints. Motivation for the algorithm is provided, along with a theoretical analysis and preliminary numerical results.

Talk 3: Randomized Feasibility-Update Algorithms For Variational Inequality Problems
Speaker: Abhishek Chakraborty
Abstract: This paper considers a variational inequality (VI) problem arising from a game among multiple agents, where each agent aims to minimize its own cost function subject to a constraint set represented as the intersection of a (possibly infinite) number of convex functional level sets. A direct projection-based approach or Lagrangian-based techniques for such a problem can be computationally expensive, if not impossible, to implement. To deal with the problem, we consider randomized methods that avoid the projection step on the whole constraint set by employing random feasibility updates. In particular, we propose and analyze such random methods for solving VIs based on the projection method, Korpelevich method, and Popov method. We establish the almost sure convergence of the methods and also provide their convergence rate guarantees. We illustrate the performance of the methods in simulations for two-agent games.
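To make the random feasibility-update idea concrete, here is a minimal sketch applied to the basic projection method: instead of projecting onto the full intersection, each iteration projects onto the halfspace given by one randomly drawn constraint. The operator, constraints, stepsize, and iteration count are illustrative assumptions; the Korpelevich and Popov variants and the convergence analysis from the talk are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 50

# Illustrative strongly monotone affine operator F(x) = A x + b
M = rng.standard_normal((n, n))
A = M.T @ M / n + np.eye(n)
b = rng.standard_normal(n)
F = lambda x: A @ x + b

# Constraint set: intersection of m halfspaces g_i(x) = a_i^T x - c_i <= 0
a = rng.standard_normal((m, n))
c = np.abs(rng.standard_normal(m)) + 0.5

x, gamma = np.zeros(n), 0.05
for k in range(5000):
    y = x - gamma * F(x)                  # operator (projection-method) step
    i = rng.integers(m)                   # random feasibility update: pick one constraint
    viol = a[i] @ y - c[i]
    if viol > 0:                          # project onto the violated halfspace only
        y = y - (viol / (a[i] @ a[i])) * a[i]
    x = y

print("max constraint violation:", float(np.max(a @ x - c)))
```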

Speakers
BZ

Baoyu Zhou

MO

Michael O'Neill

avatar for Daniel P. Robinson

Daniel P. Robinson

Name: Daniel P. Robinson
Title: Associate Professor
Affiliation: Lehigh University
Bio: Daniel P. Robinson received his Ph.D. from the University of California at San Diego in 2007. He spent the next three years working with Nicholas I. M. Gould and Jorge Nocedal as a Postdoctoral Researcher... Read More →
avatar for Abhishek Chakraborty

Abhishek Chakraborty

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 208 3501 Trousdale Pkwy, 208, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3D: Semidefinite Programming - Theory and Algorithms
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Semidefinite Programming - Theory and Algorithms
Chair: Hao Hu
Cluster: Conic and Semidefinite Optimization

Talk 1: Tight error bounds for log-determinant cones without constraint qualifications
Speaker: Ting Kei Pong
Abstract: The log-determinant cone is the closure of the hypograph of the perspective function of the log-determinant function; it can be viewed as a spectral analogue of the exponential cone. In this talk, without requiring any constraint qualifications, we establish tight error bounds for the log-determinant cone. This error bound is obtained using the recently developed framework based on one-step facial residual functions, which has been successfully applied to deriving error bounds for the exponential cone. This is a joint work with Ying Lin, Scott B. Lindstrom and Bruno F. Lourenço.
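For reference, one common way to write this cone (an assumption about notation, following how it is typically defined in conic-solver documentation rather than in the talk itself) is

\[
K_{\log\det} \;=\; \operatorname{cl}\Bigl\{(t,s,X)\in\mathbb{R}\times\mathbb{R}_{++}\times\mathbb{S}^{n}_{++} \;:\; t \le s\,\log\det(X/s)\Bigr\},
\]

i.e., the closure of the hypograph of the perspective of \(\log\det\); for \(n=1\) it reduces, up to a reordering of coordinates, to the exponential cone.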

Talk 2: Primal-dual interior point algorithms for nonsymmetric convex conic optimization
Speaker: Anita Varga
Abstract: In this talk, we present primal-dual interior point methods for nonsymmetric conic optimization. The proposed methods are of the path following type; we discuss different strategies for initialization and different neighborhoods to ensure fast convergence. We will also discuss some applications including sums-of-squares optimization, where the proposed methods can outperform the conventional semidefinite programming approach. We also share numerical results to illustrate the practical performance of our algorithms.

Talk 3: Ramana's exact dual for semidefinite programming, and elementary row operations
Speaker: Gabor Pataki
Abstract: Thirty years ago, in a seminal paper Ramana derived an exact dual for Semidefinite Programming (SDP). Ramana's dual has the following remarkable features: i) it assumes feasibility of the primal, but it does not make any regularity assumptions, such as strict feasibility ii) its optimal value is the same as the optimal value of the primal, so there is no duality gap. iii) it attains its optimal value when it is finite iv) it yields a number of complexity results in SDP, which are fundamental, and to date are still the best known. For example, it proves that SDP feasibility in the Turing model is not NP-complete, unless NP = co-NP. In this work we give a fairly complete analysis of Ramana's dual. First, we connect it to a seemingly very different way of inducing strong duality: reformulating the SDP into a rank revealing form using mostly elementary row operations. Second, we completely characterize the feasible set of Ramana's dual. As a corollary, we obtain a short and transparent derivation of Ramana's dual, which we hope is accessible to both the optimization and the theoretical computer science community. Our approach is combinatorial in the following sense: i) we use a minimum amount of continuous optimization theory ii) we show that feasible solutions in Ramana's dual are identified with regular facial reduction sequences, i.e., essentially discrete structures.

Speakers
TK

Ting Kei Pong

Professor, The Hong Kong Polytechnic University
Name: Dr. Ting Kei Pong
Title: Professor
Affiliation: The Hong Kong Polytechnic University
Bio: Please see my webpage
Fun Fact: :)
AV

Anita Varga

Postdoctoral Researcher, North Carolina State University
Name: Dr. Anita Varga
Title: Postdoctoral Researcher
Affiliation: North Carolina State University
Bio:
GP

Gabor Pataki

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 210 3501 Trousdale Pkwy, 210, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3E: Efficient Optimization Methods for LLMs (Part II)
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Efficient Optimization Methods for LLMs (Part II)
Chair: Xiao Li
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: BAdam: A Memory Efficient Block Coordinate Descent Method for LLM Training
Speaker: Xiao Li
Abstract: TBD

Talk 2: Pretraining of an LLM at KAUST: Mistakes and Learned Lessons
Speaker: Francesco Orabona
Abstract: TBD

Talk 3: Benchmarking Neural Network Training Algorithms
Speaker: Frank Schneider
Abstract: TBD

Speakers
avatar for Xiao Li

Xiao Li

Assistant Professor, The Chinese University of Hong Kong, Shenzhen
Name: Dr. Xiao Li
Title: Assistant Professor, School of Data Science
Affiliation: The Chinese University of Hong Kong, Shenzhen
Bio: I work on (continuous) optimization. Currently, my specific focus is on developing efficient optimization methods for large language models.
FO

Francesco Orabona

avatar for Frank Schneider

Frank Schneider

PostDoc, University of Tübingen
Name: Dr. Frank Schneider
Title: Postdoctoral Researcher
Affiliation: University of Tübingen, Germany
Bio: I’m a postdoctoral researcher at the University of Tübingen in the Methods of Machine Learning group, led by Philipp Hennig. I’m also a chair of the Algorithms working... Read More →
Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 212 3501 Trousdale Pkwy, 212, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3F: Advances in commercial solvers
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Advances in commercial solvers
Chair: Robert Luce
Cluster: Computational Software

Talk 1: A Whole New Look for CONOPT
Speaker: Steven Dirkse
Abstract: Following GAMS' recent acquisition of CONOPT from ARKI Consulting & Development A/S, this presentation delves into the continuous evolution of this robust nonlinear optimization solver, emphasizing the significant advancements introduced in the latest release and the strategic implications of the new ownership. Recent updates have optimized the active set method at CONOPT’s core, leading to measurable performance improvements across a diverse set of test cases. These enhancements boost computational efficiency, stability and accuracy. The latest iteration of CONOPT introduces new APIs for C++ and Python, opening up new possibilities for a clean, efficient, and robust integration into various software environments and projects requiring nonlinear optimization. Finally, we will demonstrate the practical application of providing derivatives to CONOPT, an important step that is often necessary to achieve the best possible performance.

Talk 2: Continuous Optimization in MATLAB
Speaker: Shengchao Lin
Abstract: This presentation highlights MATLAB's optimization capabilities. Key features include the broad applications of the Optimization and Global Optimization Toolboxes, problem-based optimization setup for better modeling, and code generation and deployment of optimization algorithms. These improvements demonstrate MATLAB's evolving role as a powerful platform for optimization and modeling in engineering and science.

Talk 3: The Latest Developments in the Knitro Optimization Solver
Speaker: Richard Waltz
Abstract: Knitro was originally developed in the 1990s as an interior-point algorithm for general nonlinear, non-convex optimization.  Over the years, Knitro evolved into a more general optimization toolbox that includes an Active-Set LP-based solver, a Sequential Quadratic Programming (SQP) solver, specialized LP, QP, and SOCP solvers, a branch-and-bound solver for mixed integer programming, and multi-start heuristics for global optimization.  To add to this toolbox of algorithms, we have recently started developing first-order methods that do not require any matrix factorizations.  The hope is that these might be able to provide useful solutions to extremely large-scale models where the factorizations in interior-point methods become too expensive.  In this talk we will present some of this work, primarily based on Augmented Lagrangian (AL) type methods.

Speakers
RL

Robert Luce

SD

Steven Dirkse

SL

Shengchao Lin

Software Engineer, MathWorks
RW

Richard Waltz

Monday July 21, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 156 3518 Trousdale Pkwy, 156, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3G: Special Session in Honor of Suvrajeet Sen: Sampling-based Methods in Stochastic Programming
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Special Session in Honor of Suvrajeet Sen: Sampling-based Methods in Stochastic Programming
Chair: Harsha Gangammanavar
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: A Stochastic Decomposition Method for Multistage Distributionally Robust Optimization under Streaming Data
Speaker: Tian Xia
Abstract: TBD

Talk 2: TBD
Speaker: Di Zhang
Abstract: TBD

Talk 3: TBD
Speaker: Harsha Gangammanavar
Abstract: TBD

Speakers
DZ

Di Zhang

HG

Harsha Gangammanavar

Associate Professor, Southern Methodist University
Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 114 3501 Trousdale Pkwy, 114, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3H: Recent Advances in Large-scale Optimization I
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Recent Advances in Large-scale Optimization I
Chair: Salar Fattahi
Cluster: Nonlinear Optimization

Talk 1: Algorithms for nonconvex nonsmooth optimization
Speaker: Jong-Shi Pang
Abstract: This talk will be focused on algorithms for nonconvex nonsmooth optimization.

Talk 2: Discrete Optimization methods for compressing foundation models
Speaker: Rahul Mazumder
Abstract: Foundation models have achieved remarkable performance across various domains, but their large model sizes lead to high computational costs (storage, inference latency, memory, etc). Neural network pruning, roughly categorized as unstructured and structured, aims to reduce these costs by removing less-important parameters while retaining model utility as much as possible. Depending upon available hardware, different types of pruning approaches are useful. In this talk, I will discuss discrete optimization methods to address such problems. At a high-level, these are related to cardinality constrained least squares problems involving billions of variables; and require the development of large-scale algorithms that can run on GPUs.
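At the core of the abstract is cardinality-constrained least squares. Below is a minimal sketch of one standard heuristic for that subproblem, iterative hard thresholding; the problem sizes are toy values and the talk's GPU-scale pruning algorithms for billion-variable models are not reproduced here.

```python
import numpy as np

def iht_cardinality_ls(X, y, k, steps=500, lr=None):
    """Iterative hard thresholding for  min ||X w - y||^2  s.t.  ||w||_0 <= k."""
    n, d = X.shape
    if lr is None:
        # 1 / Lipschitz constant of the gradient (2 * spectral norm of X squared)
        lr = 1.0 / (2 * np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y)
        w = w - lr * grad
        idx = np.argsort(np.abs(w))[:-k]   # zero out all but the k largest-magnitude entries
        w[idx] = 0.0
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w_true = np.zeros(50)
w_true[:5] = rng.standard_normal(5)
y = X @ w_true + 0.01 * rng.standard_normal(200)
w_hat = iht_cardinality_ls(X, y, k=5)
print("recovered support:", np.flatnonzero(w_hat))
```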

Talk 3: A Parametric Approach for Solving Convex Quadratic Optimization with Indicators Over Trees
Speaker: Salar Fattahi
Abstract: In this talk, we discuss convex quadratic optimization problems involving indicator variables, each associated with a continuous variable, particularly focusing on scenarios where the matrix defining the quadratic term is positive definite and its sparsity pattern corresponds to the adjacency matrix of a tree graph. We introduce a graph-based dynamic programming algorithm that solves this problem in quadratic time and memory. Central to our algorithm is a precise parametric characterization of the cost function across various nodes of the graph corresponding to distinct variables. Our computational experiments conducted on both synthetic and real-world datasets demonstrate the superior performance of our proposed algorithm compared to existing algorithms and state-of-the-art mixed-integer optimization solvers. An important application of our algorithm is in the real-time inference of Gaussian hidden Markov models from data affected by outlier noise. Using a real on-body accelerometer dataset, we solve instances of this problem with over 30,000 variables in under a minute, and its online variant within milliseconds on a standard computer.

Speakers
JP

Jong-Shi Pang

RM

Rahul Mazumder

SF

Salar Fattahi

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 116 3501 Trousdale Pkwy, 116, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3I: Large-scale optimization for data science
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Large-scale optimization for data science
Chair: Jason Altschuler
Cluster: Optimization For Data Science

Talk 1: Preconditioning for Linear Regression and Kernel Methods: The State of Play
Speaker: Ethan Epperly
Abstract: Simple models like linear regression and kernel methods continue to be powerful tools for learning from data. For large-scale problems, the state-of-art algorithms for these models use iterative methods with randomized preconditioning. This talk surveys the best-known preconditioners for these models, discusses recent advances, and describes open problems.
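As one concrete instance of the preconditioners the talk surveys, the sketch below builds a randomized Nyström-type preconditioner for kernel ridge regression and uses it inside conjugate gradients. The kernel, sizes, regularization, and rank are illustrative assumptions, and this single construction is only a slice of the survey.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

rng = np.random.default_rng(0)
n, lam, rank = 1000, 1e-2, 50

# Gaussian kernel matrix on random 3-d points (illustrative sizes and bandwidth)
X = rng.standard_normal((n, 3))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq)
y = rng.standard_normal(n)

# Randomized low-rank approximation  K ~= U diag(s) U^T
Omega = rng.standard_normal((n, rank))
Q, _ = np.linalg.qr(K @ Omega)
s, V = np.linalg.eigh(Q.T @ K @ Q)
U = Q @ V

# Preconditioner:  P^{-1} r = U diag(1/(s+lam)) U^T r + (r - U U^T r) / lam
def apply_precond(r):
    Ur = U.T @ r
    return U @ (Ur / (s + lam)) + (r - U @ Ur) / lam

A = LinearOperator((n, n), matvec=lambda v: K @ v + lam * v)
M = LinearOperator((n, n), matvec=apply_precond)
alpha, info = cg(A, y, M=M, maxiter=200)
print("CG exit flag (0 means converged):", info)
```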

Talk 2: Balancing Regret and Runtime: Faster Iterative Projections over Submodular Base Polytopes
Speaker: Jai Moondra
Abstract: Optimization algorithms like projected Newton’s method, FISTA, and Mirror Descent achieve near-optimal regret bounds (e.g., O(sqrt(T)) for Online Mirror Descent) but face high computational costs due to Bregman projections at each iteration. By contrast, conditional gradient methods perform linear optimization at each step, achieving faster runtimes but at the expense of suboptimal regret bounds (e.g., O(T^⅔) for Online Frank-Wolfe). Motivated by this runtime-regret trade-off, we propose efficient iterative projection techniques for closely spaced points over submodular base polytopes, a widely applicable structure. Our approach, using both continuous and discrete perspectives, leads to significant runtime improvements in Online Mirror Descent, achieving up to several orders of magnitude in speed-ups in numerical experiments. For cardinality-based submodular polytopes, we further reduce Bregman projection costs by a factor of Omega(n/log n) in n-dimensions. Joint work with Hassan Mortagy and Swati Gupta.

Talk 3: TBD
Speaker: Babak Hassibi
Abstract: TBD

Speakers
JA

Jason Altschuler

EE

Ethan Epperly

JM

Jai Moondra

BH

Babak Hassibi

Monday July 21, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 100 3518 Trousdale Pkwy, 100, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3J: Frontiers of Optimization for Machine Learning - Part III
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Frontiers of Optimization for Machine Learning - Part III
Chair: Fred Roosta
Cluster: Nonlinear Optimization

Talk 1: Randomized Techniques for Fast and Scalable Operator
Speaker: Michael Mahoney
Abstract: The approximation of linear operators and their inverses lies at the heart of many optimization and machine learning algorithms. This work improves the efficiency and scalability of these tasks by integrating two central computational tools of recent decades-- randomization and preconditioning. A particular focus is placed on addressing large-scale applications on modern hardware architectures with stringent communication and memory constraints. Notably, the proposed methods are designed to enable effective recycling of computations when dealing with sequences of operators with similar intrinsic properties, a scenario frequently arising in iterative optimization algorithms.
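A minimal example of the randomization half of that toolbox, the basic randomized range-finder plus small SVD, is sketched below; it is generic background rather than the talk's specific recycling or preconditioning constructions, and the test matrix is an illustrative assumption.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, power_iters=2, rng=None):
    """Basic randomized range-finder + small SVD (a generic sketch)."""
    rng = np.random.default_rng(rng)
    m, n = A.shape
    Y = A @ rng.standard_normal((n, rank + oversample))
    for _ in range(power_iters):           # power iterations sharpen the range estimate
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)
    Ub, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank]

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 20)) @ rng.standard_normal((20, 300))  # nearly low-rank test matrix
A += 1e-3 * rng.standard_normal(A.shape)
U, s, Vt = randomized_svd(A, rank=20)
print("relative error:", np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))
```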

Talk 2: Approximation and Preconditioning
Speaker: Matus Telgarsky
Abstract: This talk will demonstrate a duality-based proof technique for establishing that coordinate and gradient descent follow specific paths (and not just limit points) for linear and deep network classifiers.

Talk 3: Example Selection for Distributed Learning
Speaker: Christopher de Sa
Abstract: Training example order in SGD has long been known to affect convergence rate. Recent results show that accelerated rates are possible in a variety of cases for permutation-based sample orders, in which each example from the training set is used once before any example is reused. This talk will cover a line of work in my lab on decentralized learning and sample-ordering schemes. We will discuss the limits of the classic gossip algorithm and random-reshuffling schemes and explore how both can be improved to make SGD converge faster both in theory and in practice with little overhead.
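For reference, the permutation-based ("random reshuffling") ordering mentioned above amounts to the loop sketched below on a toy least-squares problem; the decentralized/gossip extensions from the talk are not reproduced, and the problem and stepsize are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 256, 10
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.01 * rng.standard_normal(n)

w, lr = np.zeros(d), 0.01
for epoch in range(100):
    perm = rng.permutation(n)              # each example used exactly once per epoch
    for i in perm:
        grad = 2.0 * (X[i] @ w - y[i]) * X[i]
        w -= lr * grad
print("distance to w_true:", np.linalg.norm(w - w_true))
```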

Speakers
FR

Fred Roosta

MM

Michael Mahoney

MT

Matus Telgarsky

CD

Christopher de Sa

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 102 3501 Trousdale Pkwy, 102, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3K: Recent advances in federated learning
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Recent advances in federated learning
Chair: Minseok Ryu
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: FedSpaLLM: Federated Pruning of Large Language Models
Speaker: Yijiang Li
Abstract: Federated Learning (FL) has gained significant interest for training AI models in a distributed computing environment, benefiting from its ability to preserve the privacy of the participating parties' sensitive data. However, challenges remain in effectively handling participating parties with heterogeneous computational power, such as edge devices. In this work, we propose a federated framework that involves an adaptive global pruning scheme to enable collaborative training of large models, such as LLMs, on parties with heterogeneous computational power.

Talk 2: Balancing uneven client participation in asynchronous Federated Learning
Speaker: Charikleia Iakovidou
Abstract: Asynchronous communication is a popular approach for speeding up the convergence of Federated Learning (FL) in the presence of slowly updating clients. Existing asynchronous FL methods typically provide convergence guarantees under the assumption that each client is equally likely to participate in a global aggregation. In practice, however, due to variations in computation or communication capabilities, clients may update the server at different frequencies. We demonstrate theoretically that under uneven client participation and non-iid local data, vanilla asynchronous FedAvg cannot achieve convergence to an arbitrarily small neighborhood of the optimum of the global loss function, even when a diminishing stepsize sequence is adopted. We introduce AREA, a new asynchronous FL method that employs a memory-based correction mechanism for balancing uneven client participation, and supports a wide variety of deterministic and stochastic aggregation protocols. Without the strong assumptions of bounded maximum client delay and bounded gradients, we establish theoretically optimal convergence rates for AREA for (i) strongly convex and smooth functions, (ii) convex and nonsmooth functions, and (iii) nonconvex and smooth functions.

Talk 3: Federated Communication-Efficient Multi-Objective Optimization
Speaker: Baris Askin
Abstract: We study a federated version of multi-objective optimization (MOO), where a single model is trained to optimize multiple objective functions. MOO has been extensively studied in the centralized setting but is less explored in federated or distributed settings. We propose FedCMOO, a novel communication-efficient federated multi-objective optimization (FMOO) algorithm that improves the error convergence performance of the model compared to existing approaches. Unlike prior works, the communication cost of FedCMOO does not scale with the number of objectives, as each client sends a single aggregated gradient, obtained using randomized SVD (singular value decomposition), to the central server. We provide a convergence analysis of the proposed method for smooth non-convex objective functions under milder assumptions than in prior work. In addition, we introduce a variant of FedCMOO that allows users to specify a preference over the objectives in terms of a desired ratio of the final objective values. Through extensive experiments, we demonstrate the superiority of our proposed method over baseline approaches.

Speakers
YL

Yijiang Li

CI

Charikleia Iakovidou

BA

Baris Askin

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 118 3501 Trousdale Pkwy, 118, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3L: Bilevel Optimization for Inverse Problems Part 1
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Bilevel Optimization for Inverse Problems Part 1
Chair: Juan Carlos de los Reyes
Cluster: PDE-constrained Optimization

Talk 1: A descent algorithm for the optimal control of ReLU neural network informed PDEs based on approximate directional derivatives
Speaker: Michael Hintermuller
Abstract: We propose and analyze a numerical algorithm for solving a class of optimal control problems for learning-informed semilinear partial differential equations. The latter is a class of PDEs with constituents that are in principle unknown and are approximated by nonsmooth ReLU neural networks. We first show that a direct smoothing of the ReLU network with the aim to make use of classical numerical solvers can have certain disadvantages, namely potentially introducing multiple solutions for the corresponding state equation. This motivates us to devise a numerical algorithm that treats directly the nonsmooth optimal control problem, by employing a descent algorithm inspired by a bundle-free method. Several numerical examples are provided and the efficiency of the algorithm is shown.

Talk 2: Differential estimates for fast first-order multilevel nonconvex optimisation
Speaker: Tuomo Valkonen
Abstract: PDE constraints appear in inverse imaging problems as physical models for measurements, while bilevel optimisation can be used for optimal experimental design and parameter learning. Such problems have been traditionally very expensive to solve, but recently, effective single-loop approaches have been introduced, both in our work, as well as in the machine learning community. In this talk, we discuss a simple gradient estimation formalisation for very general single-loop methods that include primal-dual methods for the inner problem, and conventional iterative solvers (Jacobi, Gauss–Seidel, conjugate gradients) for the adjoint problem and PDE constraints.

Talk 3: Deep Equilibrium Models for Poisson Inverse Problems via Mirror Descent
Speaker: Christian Daniele
Abstract: Inverse problems in imaging arise in a wide range of scientific and engineering applications, including medical imaging, astrophysics, and microscopy. These problems are inherently ill-posed, requiring advanced regularization techniques and optimization strategies to achieve stable and accurate reconstructions. In recent years, hybrid approaches that combine deep learning and variational methods have gained increasing attention. Well-established techniques include Algorithmic Unrolling, Plug-and-Play methods, and Deep Equilibrium Models. The latter are networks with fixed points, which are trained to match data samples from a training dataset. In this work, we focus on Deep Equilibrium Models to learn a data-driven regularization function for Poisson inverse problems, using the Kullback-Leibler divergence as the data fidelity term. To effectively handle this fidelity term, we employ Mirror Descent as the underlying optimization algorithm. We discuss theoretical guarantees of convergence, even in non-convex settings, incorporating a backtracking strategy, along with key aspects of training this class of models. To validate our approach, we evaluate its performance on a deblurring task with different kernels and varying levels of Poisson noise. Authors: Luca Calatroni, Silvia Villa, Samuel Vaiter, Christian Daniele
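As background on why Mirror Descent pairs naturally with the Kullback-Leibler fidelity (a standard identity, not the talk's specific deep-equilibrium training scheme): with the negative-entropy mirror map \(h(x)=\sum_i x_i\log x_i\), one mirror-descent step on a smooth fidelity \(F\) is

\[
x^{k+1} \in \arg\min_{x\ge 0}\ \langle \nabla F(x^k), x\rangle + \tfrac{1}{\gamma_k}\, D_h(x,x^k)
\qquad\Longleftrightarrow\qquad
x^{k+1}_i = x^k_i\,\exp\!\bigl(-\gamma_k\,[\nabla F(x^k)]_i\bigr),
\]

a multiplicative update that keeps the iterates strictly positive, as required by Poisson likelihoods.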

Speakers
JC

Juan Carlos de los Reyes

MH

Michael Hintermuller

TV

Tuomo Valkonen

MODEMAT & University of Helsinki
Nonsmooth optimisation, bilevel optimisation, inverse problems, variational analysis, optimisation in measure spaces. 
Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 119 3501 Trousdale Pkwy, 119, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3M: Optimization Applications in Energy Systems
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Optimization Applications in Energy Systems
Chair: Sungho Shin
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: Multiperiod Optimization for Power Grid Applications
Speaker: Mihai Anitescu
Abstract: There has been a growing interest in solving multi-period AC OPF problems, as the increasingly fluctuating electricity market requires operation planning over multiple periods. These problems, formerly deemed intractable, are now becoming technologically feasible to solve thanks to the advent of high-memory GPU hardware and accelerated NLP tools. This study evaluates the capability of the ExaModels.jl and MadNLP.jl tools for GPU-centered nonlinear programming to tackle previously unsolvable multi-period AC OPF instances. Our numerical experiments, run on an NVIDIA GH200, demonstrate that we can solve a multi-period OPF instance with more than 10 million variables up to 10^{-4} precision in less than 10 minutes.

Talk 2: Optimal Power Flow Under Constraint-Informed Uncertainty
Speaker: Anirudh Subramanyam
Abstract: Chance-constrained optimization has emerged as a promising framework for managing uncertainties in power systems. This work advances its application to DC Optimal Power Flow (DC-OPF) problems, developing a novel approach to uncertainty modeling. Current methods tackle these problems by first modeling random variables using high-dimensional statistical distributions that scale with the number of system buses, followed by deriving convex reformulations of the probabilistic constraints. We propose an alternative methodology that uses the probabilistic constraints themselves to inform the structure of uncertainty, enabling significant dimensionality reduction. Rather than learning joint distributions of wind generation forecast errors across all units, we model two key distributions: system-wide aggregate forecast errors and individual unit errors weighted by transmission line flow sensitivities. We evaluate our approach under both Gaussian and non-Gaussian uncertainty distributions, demonstrating improvements over state-of-the-art in both statistical accuracy and optimization performance.
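For background, the classical reformulation underlying Gaussian chance-constrained DC-OPF (a standard identity, not the constraint-informed modeling of the talk itself) turns an individual linear chance constraint into a second-order cone constraint:

\[
\mathbb{P}\bigl(a^{\top}\xi \le b\bigr) \ge 1-\varepsilon,\quad \xi\sim\mathcal{N}(\mu,\Sigma)
\qquad\Longleftrightarrow\qquad
a^{\top}\mu + \Phi^{-1}(1-\varepsilon)\,\sqrt{a^{\top}\Sigma\, a} \;\le\; b,
\]

which is convex when \(a\) and \(b\) depend affinely on the decision variables and \(\varepsilon \le 1/2\).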

Talk 3: Characterizing marginal value of storage in distribution grid operations
Speaker: Dirk Lauinger
Abstract: Electricity distribution companies invest in storage to shave peak load and reduce investments into substation and distribution line upgrades. In deregulated electricity markets, storage assets owned by distribution companies are not allowed to participate in electricity markets, which leads to the assets sitting idle most of the time. Contracting for storage could provide investors with additional value streams, distribution companies with cheaper storage, and rate payers with reduced prices. We integrate contracted storage into distribution company investment planning problems and find that peak shaving reduces profits from market participation by about 1% in a Massachusetts case study. Capital investment savings from contracted storage more than compensate for this reduction. Both distribution companies and storage investors could thus benefit from contracted storage.

Speakers
SS

Sungho Shin

MA

Mihai Anitescu

AS

Anirudh Subramanyam

Assistant Professor, Pennsylvania State University
DL

Dirk Lauinger

Monday July 21, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 157 3518 Trousdale Pkwy, 157, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3N: Adjustable Robust Optimization: Theories and Algorithms
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Adjustable Robust Optimization: Theories and Algorithms
Chair: Ahmadreza Marandi
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: A semi-infinite Benders’ cut approach for adjustable robust optimization
Speaker: Ayse Nur Arslan
Abstract: In this talk we consider two-stage linear adjustable robust optimization problems with continuous recourse. These problems have been the subject of exact solution algorithms, notably Benders decomposition and constraint-and-column generation (CCG) approaches. Here, we present an alternative decomposition approach resting on a novel reformulation of the problem using semi-infinite Benders’ cuts. We argue that this approach enjoys the same quality of dual bounds as the CCG approach while requiring fewer separation problems to be solved. We additionally study the formulation and solution of separation problems under different assumptions on the form of the uncertainty set and the feasibility of the recourse problem. We perform a detailed numerical study that showcases the superior performance of our proposed approach and compares the performance of different formulations for the separation problem.

Talk 2: Robust Bilevel Optimization with Wait-and-See Follower: A Column-and-Constraint Generation Approach
Speaker: Henri Lefebvre
Abstract: Bilevel optimization is a classical framework for modeling hierarchical decision-making processes. Typically, it is assumed that all input parameters for both the leader and the follower are known when the leader makes a decision. However, in many real-world applications, the leader must decide without fully anticipating the follower's response due to uncertainties in the follower's problem. In this talk, we address robust bilevel optimization problems in which the follower adopts a ``wait-and-see'' approach. Thus, the leader decides without knowledge of the explicit realization of the uncertainty, then the uncertainty realizes in a worst-case manner, and afterward the follower's decisions are made. For this challenging problem class, we discuss mathematical properties and present a corresponding solution approach based on column-and-constraint generation. The convergence of the proposed algorithm is discussed along with its practical implementation including numerical results. We finally outline potential research directions.

Talk 3: The Value of Flexibility in Robust Supply Chain Network Design
Speaker: Amir Ardestani-Jaafari
Abstract: A supply chain network design problem (SCNDP) involves making long-term and mostly irreversible strategic decisions, requiring the utilization of various sources of flexibility and demand information. The cost efficiency of this process hinges, to a large extent, on how, among other factors, flexibilities from strategic and operational perspectives are tailored and how demand information is leveraged. In this paper, we investigate five distinct policies for our SCNDP, stemming from incorporating new flexibilities at both the strategic and operational levels. We commence with a localized production model where local production satisfies the demand. We then extend the model to cases where the production capacity in one location can be utilized to meet the demand of other nodes (Policy II). Furthermore, the capacity can be shared among open facilities if capacity-sharing links have already been arranged (Policy III). In contrast to these policies, where the set capacity in the first stage serves as a surrogate for production amount in the second stage, we allow production decisions to be postponed until after the realization of demand, leading to a make-to-order rather than a make-to-stock production strategy (Policies IV and V). To analyze each of these policies, we develop a two-stage robust optimization framework for which we introduce a novel computationally efficient exact method (based on Column-and-Constraint Generation (C&CG)) and multiple approximation techniques (based on Linear Decision Rules). These techniques effectively solve realistically sized instances of the problem under varying demand uncertainty budgets, enabling us to derive managerial insights that would otherwise be unattainable. We demonstrate, among other results, that (i) the existence of capacity-sharing links among facilities yields (a) the highest percentage of cost-saving due to the pooling effect for all uncertainty budget values and (b) a significantly reduced shortage probability, and (ii) production postponement brings only a marginal improvement in the results, suggesting that upstream supply chain connections (capacity-sharing links among facilities) are more critical than postponing production decisions, especially for moderate budgets of uncertainty. Finally, we apply our most effective policies to a real-world case study to contextualize these concepts, quantify their values, and formulate our design recommendations.

Speakers
AN

Ayse Nur Arslan

avatar for Henri Lefebvre

Henri Lefebvre

Post-doc, Trier University
Henri is currently a post-doctoral researcher at Trier University (Germany) in the "Nonlinear Optimization" group. He earned his PhD at the University of Bologna (DEI) under the supervision of Michele Monaci and Enrico Malaguti where he studied optimization problems under uncertainty... Read More →
AA

Amir Ardestani-Jaafari

Monday July 21, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 256 3518 Trousdale Pkwy, 256, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3O: Advances in Discrete Optimization: From Relaxation to Convex Hull Exactness
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Advances in Discrete Optimization: From Relaxation to Convex Hull Exactness
Chair: Akang Wang
Cluster: Interplay Between Continuous and Discrete Optimization

Talk 1: ROS: A GNN-based Relax-Optimize-and-Sample Framework for Max-k-Cut Problems
Speaker: Yeqing Qiu
Abstract: The Max-k-Cut problem is a fundamental combinatorial optimization challenge that generalizes the classic NP-complete Max-Cut problem. While relaxation techniques are commonly employed to tackle Max-k-Cut, they often lack guarantees of equivalence between the solutions of the original problem and its relaxation. To address this issue, we introduce the Relax-Optimize-and-Sample (ROS) framework. In particular, we begin by relaxing the discrete constraints to the continuous probability simplex form. Next, we pre-train and fine-tune a graph neural network model to efficiently optimize the relaxed problem. Subsequently, we propose a sampling-based construction algorithm to map the continuous solution back to a high-quality Max-k-Cut solution. By integrating geometric landscape analysis with statistical theory, we establish the consistency of function values between the continuous solution and its mapped counterpart. Extensive experimental results on random regular graphs and the Gset benchmark demonstrate that the proposed ROS framework effectively scales to large instances with up to 20,000 nodes in just a few seconds, outperforming state-of-the-art algorithms. Furthermore, ROS exhibits strong generalization capabilities across both in-distribution and out-of-distribution instances, underscoring its effectiveness for large-scale optimization tasks.
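To illustrate just the final sampling step (the simplex relaxation and the GNN optimization stages are not reproduced), the sketch below rounds a row-stochastic matrix P to a Max-k-Cut assignment by sampling labels and keeping the best cut found; the toy graph and uniform P are illustrative assumptions.

```python
import numpy as np

def sample_max_k_cut(P, edges, weights, num_samples=200, rng=None):
    """Round a relaxed solution P (n x k, rows on the probability simplex) to a
    Max-k-Cut assignment by sampling labels and keeping the best cut value."""
    rng = np.random.default_rng(rng)
    n, k = P.shape
    best_val, best_assign = -np.inf, None
    for _ in range(num_samples):
        assign = np.array([rng.choice(k, p=P[i]) for i in range(n)])
        val = sum(w for (u, v), w in zip(edges, weights) if assign[u] != assign[v])
        if val > best_val:
            best_val, best_assign = val, assign
    return best_assign, best_val

# Toy instance: a 5-cycle with unit weights, k = 3, uniform relaxed solution
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
weights = [1.0] * len(edges)
P = np.full((5, 3), 1.0 / 3.0)
assign, val = sample_max_k_cut(P, edges, weights, rng=0)
print("cut value:", val, "assignment:", assign)
```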

Talk 2: Solving Sparse & High-Dimensional-Output Regression via Compression
Speaker: Guanyi Wang
Abstract: Multi-Output Regression (MOR) has been widely used in scientific data analysis for decision-making. Unlike traditional regression models, MOR aims to simultaneously predict multiple real-valued outputs given an input. However, the increasing dimensionality of the outputs poses significant challenges regarding interpretability and computational scalability for modern MOR applications. As a first step to address these challenges, this paper proposes a Sparse & High-dimensional-Output REgression (SHORE) model by incorporating additional sparsity requirements to resolve the output interpretability, and then designs a computationally efficient two-stage framework capable of handling SHORE with provable accuracy via compression on outputs. Theoretically, we show that the proposed framework is computationally scalable while maintaining the same order of training loss and prediction loss before-and-after compression under arbitrary or relatively weak sample set conditions. Empirically, numerical results further validate the theoretical findings, showcasing the efficiency and accuracy of the proposed framework.

Talk 3: On the Exactness of Partial Convexification
Speaker: Rui Chen
Abstract: The idea of partial convexification is very commonly used in discrete optimization and global optimization (e.g., Dantzig-Wolfe decomposition, cutting planes from substructures etc.). We provide sufficient conditions and necessary conditions under which partial convexification gives the exact convex hull. Our results generalize the simplex lemma (folklore), and several known convexification results.

Speakers
GW

Guanyi Wang

RC

Rui Chen

Monday July 21, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 258 3518 Trousdale Pkwy, 258, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3P: Fast Algorithms for Data Science
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Fast Algorithms for Data Science
Chair: Shota Takahashi
Cluster: Optimization For Data Science

Talk 1: Sparsity Constrained and Sparsity Regularized Piecewise Linear Minimization: Optimality Conditions and Algorithms
Speaker: Mustafa Celebi Pinar
Abstract: Motivated by applications such as sparsity-regularized boosting, where boosting algorithms maximize the soft margin of a produced linear combination of base hypotheses (given an oracle for producing base hypotheses) while dealing with a training set of plus/minus labeled examples, we study optimality conditions for the resulting piecewise-linear, sparsity-imposed minimization problems. Different concepts of optimality are defined and developed. Algorithms are proposed and tested. Numerical results will be reported.

Talk 2: Adaptive first-order method for nonconvex optimization derived from vanishing damping continuous-time dynamics
Speaker: Kansei Ushiyama
Abstract: We propose a new first-order algorithm for nonconvex functions with Lipschitz continuous gradients and Hessian matrices. Existing first-order methods use momentum to achieve the lowest known computational complexity for finding a stationary point. The limitation of these methods is that they either require the knowledge of parameters, including Lipschitz constants, or rely on the restart strategy that resets the momentum and can slow down the algorithm. Our method has the lowest known complexity, does not require the knowledge of parameters, and uses a strategy other than restart that does not reset the momentum. This algorithm is derived from a continuous-time algorithm that can be interpreted as a dynamics with vanishing damping. We show numerically that our method works efficiently for some problems.
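A representative example of such continuous-time dynamics (the standard vanishing-damping ODE; the talk's particular discretization and parameter-free strategy are not reproduced here) is

\[
\ddot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t) + \nabla f\bigl(x(t)\bigr) = 0, \qquad \alpha > 0,
\]

whose discretizations recover Nesterov-type momentum methods.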

Talk 3: Accelerated Convergence of Frank–Wolfe Algorithms with Adaptive Bregman Step-Size Strategy
Speaker: Shota Takahashi
Abstract: We propose a Frank–Wolfe (FW) algorithm with an adaptive Bregman step-size strategy for smooth adaptable (weakly-) convex functions. This means that the gradient of the objective function is not necessarily Lipschitz continuous and we only require the smooth adaptable property. Compared to existing FW algorithms, our assumptions are thus less restrictive. We establish convergence guarantees in various settings, such as sublinear to linear convergence rates depending on the assumptions. Assuming that the objective function is weakly convex, we also provide both local sublinear and local linear convergence in terms of the primal gap under the (local) quadratic growth condition. We also propose a variant of the away-step FW algorithm using Bregman distances and establish its global linear convergence for convex optimization and its local linear convergence for nonconvex optimization under the (local) quadratic growth condition over polytopes. Numerical experiments demonstrate that our proposed FW algorithms outperform existing methods. This talk is based on joint works with Akiko Takeda and Sebastian Pokutta.
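For orientation, the plain Frank–Wolfe template over the probability simplex is sketched below with the standard 2/(t+2) step size; the adaptive Bregman step-size strategy and away-step variant from the talk are not reproduced, and the toy objective is an illustrative assumption.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, steps=200):
    """Plain Frank-Wolfe over the probability simplex with step size 2/(t+2)."""
    x = x0.copy()
    for t in range(steps):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0              # linear minimization oracle on the simplex
        x += (2.0 / (t + 2.0)) * (s - x)   # convex combination stays in the simplex
    return x

# Toy problem: min ||A x - b||^2 over the simplex
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
grad = lambda x: 2 * A.T @ (A @ x - b)
x = frank_wolfe_simplex(grad, np.full(10, 0.1))
print("objective:", np.linalg.norm(A @ x - b) ** 2, "sum(x) =", x.sum())
```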

Speakers
MC

Mustafa Celebi Pinar

KU

Kansei Ushiyama

ST

Shota Takahashi

Assistant Professor, The University of Tokyo
Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 106 3501 Trousdale Pkwy, 106, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3Q: Fast Algorithmic Frameworks for Semidefinite Programming and Applications
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Fast Algorithmic Frameworks for Semidefinite Programming and Applications
Chair: Ling Liang
Cluster: Conic and Semidefinite Optimization

Talk 1: Convex relaxation for quantum many-body physics
Speaker: Yuehaw Khoo
Abstract: In this talk, we explore adaptations of semidefinite programming relaxations for solving many-body physics problems. Our approach transforms a high-dimensional PDE problem into a convex optimization problem, setting it apart from traditional non-convex methods that rely on nonlinear re-parameterizations of the solution. For quantum mechanical systems, we present a convex program to obtain the ground state in terms of its moments. We further introduce a near-linear time algorithm for solving the convex program using hierarchical matrices.

Talk 2: Fast and Certifiable Trajectory Optimization
Speaker: Shucheng Kang
Abstract: We propose semidefinite trajectory optimization (STROM), a framework that computes fast and certifiably optimal solutions for nonconvex trajectory optimization problems defined by polynomial objectives and constraints. STROM employs sparse second-order Lasserre's hierarchy to generate semidefinite program (SDP) relaxations of trajectory optimization. Different from existing tools (e.g., YALMIP and SOSTOOLS in Matlab), STROM generates chain-like multiple-block SDPs with only positive semidefinite (PSD) variables, and does so two orders of magnitude faster. Underpinning STROM is cuADMM, the first ADMM-based SDP solver implemented in CUDA and running on GPUs. cuADMM builds upon the symmetric Gauss-Seidel ADMM algorithm and leverages GPU parallelization to speed up solving sparse linear systems and projecting onto PSD cones. In five trajectory optimization problems (inverted pendulum, cart pole, vehicle landing, flying robot, and car back-in), cuADMM computes optimal trajectories (with certified suboptimality below 1%) in minutes (when other solvers take hours or run out of memory) and seconds (when others take minutes). Further, when warmstarted by data-driven initialization in the inverted pendulum problem, cuADMM delivers real-time performance: it provides certifiably optimal trajectories in 0.66 seconds even though the SDP has 49,500 variables and 47,351 constraints.
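The per-iteration workhorse of ADMM-based SDP solvers like the one described here is the Euclidean projection onto the PSD cone; a plain CPU sketch follows (cuADMM performs this on GPUs), with illustrative names.

import numpy as np

def project_psd(S):
    # Projection of a symmetric matrix onto the PSD cone: eigendecompose and
    # clip the negative eigenvalues at zero.
    S = 0.5 * (S + S.T)                       # symmetrize against round-off
    w, V = np.linalg.eigh(S)
    return (V * np.maximum(w, 0.0)) @ V.T

X = np.random.default_rng(1).standard_normal((5, 5))
X_psd = project_psd(X)
print(np.linalg.eigvalsh(X_psd).min())        # no eigenvalue below (numerical) zero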

Talk 3: Exploring chordal sparsity in semidefinite programming with sparse plus low-rank data matrices
Speaker: Tianyun Tang
Abstract: Semidefinite programming (SDP) problems are challenging to solve because of their high dimensionality. However, sparse SDP problems with small tree-width are known to be relatively easier to solve because: (1) they can be decomposed into smaller multi-block SDP problems through chordal conversion; (2) they have low-rank optimal solutions. In this work, we study more general SDP problems whose coefficient matrices have sparse plus low-rank (SPLR) structure. We develop a unified framework to convert such problems into sparse SDP problems with bounded tree-width. Based on this, we derive rank bounds for SDP problems with SPLR structure, which are tight in the worst case.

Speakers
LL

Ling Liang

YK

Yuehaw Khoo

SK

Shucheng Kang

TT

Tianyun Tang

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 214 3501 Trousdale Pkwy, 214, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3R: Manifold optimization with special metrics
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Manifold optimization with special metrics
Chair: André Uschmajew
Cluster: Optimization on Manifolds

Talk 1: Optimal transport barycenter via nonconvex concave minimax optimization
Speaker: Xiaohui Chen
Abstract: The optimal transport barycenter is a fundamental notion of averaging that extends from the Euclidean space to the Wasserstein space of probability distributions. Computation of the unregularized barycenter for discretized probability distributions on point clouds is a challenging task when the domain dimension $d > 1$. Most practical algorithms approximating the barycenter problem are based on entropic regularization. In this work, we introduce a nearly linear time $O(m \log{m})$ primal-dual algorithm for computing the exact barycenter when the input probability density functions are discretized on an $m$-point grid. The success of our Wasserstein-Descent $\dot{\mathbb{H}}^1$-Ascent (WDHA) algorithm hinges on alternating between two different yet closely related Wasserstein and Sobolev optimization geometries for the primal barycenter and dual Kantorovich potential subproblems. Under reasonable assumptions, we establish the convergence rate and iteration complexity of the proposed algorithm to a stationary point when the step size is appropriately chosen for the gradient updates. Superior computational efficacy and approximation accuracy over the existing Wasserstein gradient descent and Sinkhorn's algorithms are demonstrated on 2D synthetic and real data.

Talk 2: Information geometry of operator scaling
Speaker: Tasuku Soma
Abstract: Matrix scaling is a classical problem with a wide range of applications. It is known that the Sinkhorn algorithm for matrix scaling is interpreted as alternating e-projections from the viewpoint of classical information geometry. Recently, a generalization of matrix scaling to completely positive maps called operator scaling has been found to appear in various fields of mathematics and computer science, and the Sinkhorn algorithm has been extended to operator scaling. In this talk, we discuss operator scaling from the viewpoint of quantum information geometry. For example, the operator Sinkhorn algorithm is shown to coincide with alternating e-projections with respect to the symmetric logarithmic derivative metric, which is a Riemannian metric on the space of quantum states relevant to quantum estimation theory.

Talk 3: Operator Sinkhorn iteration with overrelaxation
Speaker: André Uschmajew
Abstract: We propose accelerated versions of the operator Sinkhorn iteration for operator scaling using successive overrelaxation. We analyze the local convergence rates of these accelerated methods via linearization, which allows us to determine the asymptotically optimal relaxation parameter based on Young's SOR theorem. Based on the Hilbert metric on positive definite cones, we also obtain a global convergence result for a geodesic version of overrelaxation in a specific range of relaxation parameters. Numerical experiments demonstrate that the proposed methods outperform the original operator Sinkhorn iteration in certain applications.
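As background for both Sinkhorn talks, a sketch of the classical matrix Sinkhorn iteration with multiplicative overrelaxation (omega = 1 recovers the plain iteration); the operator, i.e. completely-positive-map, case treated in the talks is analogous but not shown. Names and test data are illustrative.

import numpy as np

def sinkhorn_overrelaxed(K, r, c, omega=1.3, iters=500, tol=1e-10):
    # Seek diagonal scalings u, v so that diag(u) K diag(v) has row sums r and
    # column sums c; each scaling update is overrelaxed with exponent omega.
    u, v = np.ones_like(r), np.ones_like(c)
    P = K.copy()
    for _ in range(iters):
        u = u ** (1.0 - omega) * (r / (K @ v)) ** omega
        v = v ** (1.0 - omega) * (c / (K.T @ u)) ** omega
        P = u[:, None] * K * v[None, :]
        if np.abs(P.sum(1) - r).max() + np.abs(P.sum(0) - c).max() < tol:
            break
    return P

rng = np.random.default_rng(2)
K = rng.random((4, 6)) + 0.1                  # strictly positive kernel
P = sinkhorn_overrelaxed(K, np.full(4, 0.25), np.full(6, 1.0 / 6.0))
print(P.sum(axis=1), P.sum(axis=0))           # marginals approach r and c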

Speakers
XC

Xiaohui Chen

TS

Tasuku Soma

Associate Professor, The Institute of Statistical Mathematics
AU

André Uschmajew

Monday July 21, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 154 3518 Trousdale Pkwy, 154, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3S: Manifold optimization with special metrics
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Manifold optimization with special metrics
Chair: Max Pfeffer
Cluster: Optimization on Manifolds

Talk 1: Riemannian optimization methods for ground state computations of multicomponent Bose-Einstein condensates
Speaker: Tatjana Stykel
Abstract: In this talk, we address the computation of ground states of multicomponent Bose-Einstein condensates by solving the underlying energy minimization problem on the infinite-dimensional generalized oblique manifold. First, we discuss the existence and uniqueness of a ground state with non-negative components and its connection to the coupled Gross-Pitaevskii eigenvector problem. Then, we study the Riemannian structure of the generalized oblique manifold by introducing several Riemannian metrics and computing important geometric tools such as orthogonal projections and Riemannian gradients. This allows us to develop the Riemannian gradient descent methods based on different metrics. Exploiting first- and second-order information of the energy functional for the construction of appropriate metrics makes it possible to incorporate preconditioning into Riemannian optimization, which significantly improves the performance of the optimization schemes. A collection of numerical experiments demonstrates the computational efficiency of the proposed methods. (Joint work with R. Altmann, M. Hermann, and D. Peterseim)
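A toy sketch of Riemannian gradient descent on the unit sphere for a Rayleigh quotient, the linear analogue of a single-component ground-state computation; the generalized oblique manifold and preconditioned metrics of the talk are not modeled, and the names and data are illustrative.

import numpy as np

def riemannian_gd_sphere(A, x0, step=0.05, iters=1000):
    # Minimize x^T A x on the unit sphere: project the Euclidean gradient onto the
    # tangent space, take a step, and retract by normalization.
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        egrad = 2.0 * A @ x                   # Euclidean gradient
        rgrad = egrad - (x @ egrad) * x       # tangent-space projection at x
        x = x - step * rgrad
        x /= np.linalg.norm(x)                # retraction back to the sphere
    return x

rng = np.random.default_rng(3)
M = rng.standard_normal((20, 20))
A = 0.5 * (M + M.T)
x = riemannian_gd_sphere(A, rng.standard_normal(20))
print(x @ A @ x, np.linalg.eigvalsh(A)[0])    # Rayleigh quotient approaches the smallest eigenvalue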

Talk 2: Approximating maps into manifolds with multiple tangent spaces
Speaker: Hang Wang
Abstract: A manifold-valued function maps a Euclidean domain into a manifold. Approximating a manifold-valued function from input-output samples consists of modeling the relationship between an output on a Riemannian manifold and the Euclidean input vector. In this talk, I will present algorithms for building a surrogate model to approximate either a known or an unknown manifold-valued function. The proposed methods are based on pullbacks to multiple tangent spaces and the Riemannian center of mass, thereby relying on Riemannian optimization algorithms. The effectiveness of this scheme will be illustrated with numerical experiments for a few model problems.

Talk 3: The injectivity radii of the Stiefel manifold under a one-parameter family of deformation metrics
Speaker: Ralf Zimmermann
Abstract: The injectivity radius of a manifold is an important quantity, both from a theoretical point of view and in terms of numerical applications. It is the largest possible radius within which all geodesics are unique and length-minimizing. In consequence, it is the largest possible radius within which calculations in Riemannian normal coordinates are well-defined. A matrix manifold that arises frequently in a wide range of practical applications is the compact Stiefel manifold of orthogonal p-frames in the Euclidean n-space. Its geometry may be considered under a one-parameter family of deformation metrics. We observe that the associated geodesics are space curves of constant Frenet curvatures. In combination with tight sectional curvature bounds, this allows us to determine the injectivity radius of the Stiefel manifold for a large subset of the one-parameter family of metrics that includes the Euclidean metric.

Speakers
TS

Tatjana Stykel

HW

Hang Wang

RZ

Ralf Zimmermann

Monday July 21, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 155 3518 Trousdale Pkwy, 155, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3T: Variational Analysis: Theory and Applications I
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Variational Analysis: Theory and Applications I
Chair: Walaa Moursi
Cluster: Nonsmooth Optimization

Talk 1: TBA
Speaker: Heinz Bauschke
Abstract: TBA

Talk 2: TBA
Speaker: Radu Bot
Abstract: TBA

Talk 3: TBA
Speaker: Xianfu Wang
Abstract: TBA

Speakers
WM

Walaa Moursi

HB

Heinz Bauschke

RB

Radu Bot

XW

Xianfu Wang

Monday July 21, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 158 3518 Trousdale Pkwy, 158, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3U: Recent Advances on PDE-constrained optimization packages and libraries: Part II
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Recent Advances on PDE-constrained optimization packages and libraries: Part II
Chair: Umberto Villa
Cluster: Computational Software

Talk 1: hIPPYlib: An Extensible Software Framework for Large-Scale Inverse Problems Governed by PDEs
Speaker: Noemi Petra
Abstract: We present an extensible software framework, hIPPYlib, for solution of large-scale deterministic and Bayesian inverse problems governed by partial differential equations (PDEs) with infinite-dimensional parameter fields (which are high-dimensional after discretization). hIPPYlib overcomes the prohibitive nature of Bayesian inversion for this class of problems by implementing state-of-the-art scalable algorithms for PDE-based inverse problems that exploit the structure of the underlying operators, notably the Hessian of the log-posterior. The key property of the algorithms implemented in hIPPYlib is that the solution of the deterministic and linearized Bayesian inverse problem is computed at a cost, measured in linearized forward PDE solves, that is independent of the parameter dimension. The mean of the posterior is approximated by the MAP point, which is found by minimizing the negative log-posterior. This deterministic nonlinear least-squares optimization problem is solved with an inexact matrix-free Newton-CG method. The posterior covariance is approximated by the inverse of the Hessian of the negative log posterior evaluated at the MAP point. The construction of the posterior covariance is made tractable by invoking a low-rank approximation of the Hessian of the log-likelihood. hIPPYlib makes all of these advanced algorithms easily accessible to domain scientists and provides an environment that expedites the development of new algorithms. Authors: Umberto Villa (UT Austin), Noemi Petra (UC Merced), Omar Ghattas (UT Austin)

Talk 2: SimPEG: an open-source framework for simulation and parameter estimation in geophysics
Speaker: Lindsey Heagy
Abstract: Geophysical data can provide insights about the subsurface in a range of applications. A few examples include locating critical minerals, monitoring geologic storage of CO2, managing groundwater, and characterizing changes to permafrost. The geophysical inverse problem is posed as a PDE-constrained optimization problem where we aim to fit the observed data and incorporate additional information that may include petrophysical, geologic, and geochemical measurements, as well as additional geophysical data sets. We started the SimPEG project with the aim of accelerating research and education in geophysics and enabling researchers to build upon and contribute to a modular toolbox for solving problems in geophysics. At the core is a framework for finite volume forward simulations and gradient-based inversions. SimPEG currently supports simulation and inversion of gravity, magnetic, electrical and electromagnetic data. In this talk, I will provide an overview of how we have broken down inverse problems in geophysics into modular components and how working in an open-source paradigm has facilitated our research, collaborations with industry, and dissemination of educational resources in classrooms and for humanitarian projects.

Talk 3: TBA
Speaker: TBA
Abstract: TBA

Speakers
NP

Noemi Petra

LH

Lindsey Heagy

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 108 3501 Trousdale Pkwy, 108, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3V: Derivative-free optimization for special classes of problems I
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Derivative-free optimization for special classes of problems I
Chair: Clément Royer
Cluster: Derivative-free Optimization

Talk 1: A derivative-free algorithm for continuous submodular optimization
Speaker: Clément Royer
Abstract: Submodular functions are a classical concept of discrete optimization that can also be extended to the continuous setting. In particular, the class of continuous submodular functions encompasses some nonconvex functions arising in natural language processing, which partly explains renewed interest in this topic in recent years. In this talk, we propose a derivative-free algorithm for submodular optimization over compact sets, adapted from a classical framework for bound-constrained derivative-free optimization. By leveraging properties of submodular functions, we obtain complexity guarantees for this method that represent a significant improvement over guarantees in the general, nonconvex setting. We then investigate the practical behavior of our method on our problems of interest.

Talk 2: The cosine measure of a function
Speaker: Gabriel Jarry-Bolduc
Abstract: The cosine measure of a set of vectors is a valuable tool in derivative-free optimization for judging the quality of a set of vectors. It indicates how uniformly the set of vectors covers the space R^n. A set of vectors is a positive spanning set of R^n if and only if its cosine measure is greater than zero. An important property of positive spanning sets is that whenever the gradient of a function at a point is well-defined and nonzero, the set contains at least one descent direction (ascent direction) of the function at that point. This is not necessarily true if the gradient is equal to the zero vector or if the gradient does not exist. To characterize these two cases, the novel concept of the cosine measure of a function is introduced in this talk. It provides an infimum on the value of the cosine measure of a set of vectors guaranteed to contain a descent direction of the function at the point of interest. It is shown how to theoretically compute the cosine measure of a function for popular classes of nonsmooth functions.
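For concreteness, a Monte-Carlo sketch of the cosine measure of a set of vectors (the talk extends the notion from sets to functions); sampling only finitely many unit directions yields an upper bound on the true value. Names are illustrative.

import numpy as np

def cosine_measure_estimate(D, n_samples=100000, seed=0):
    # cm(D) = min over unit u of max over d in D of cos(angle between u and d).
    # The minimum is taken only over sampled directions, so this over-estimates cm(D).
    rng = np.random.default_rng(seed)
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)   # normalize columns of D
    U = rng.standard_normal((D.shape[0], n_samples))
    U /= np.linalg.norm(U, axis=0, keepdims=True)       # random unit directions
    return (Dn.T @ U).max(axis=0).min()

# The coordinate directions and their negatives form a positive spanning set of R^3;
# its exact cosine measure is 1/sqrt(3), about 0.577.
D = np.hstack([np.eye(3), -np.eye(3)])
print(cosine_measure_estimate(D))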

Talk 3: Using resilient positive spanning sets to deal with stragglers
Speaker: Sébastien Kerleau
Abstract: Positive spanning sets (PSSs) are families of vectors that span a given linear space through non-negative linear combinations. Such sets are of particular interest for their use in derivative-free optimization algorithms. In that context, the cost of determining the value of the objective function at a given point can be particularly expensive, taking up to weeks for a single function evaluation. Although time can partly be saved by conducting multiple function evaluations in parallel, the issue of dealing with stragglers - function evaluations that take significantly longer than others - remains to be solved. With that goal in mind, this talk will study a subclass of PSSs whose properties can allow for an improvement of the classical DFO algorithms, making them resilient to the presence of stragglers. The talk will end with numerical experiments studying the efficiency of the new-found resilient algorithm.

Speakers
CR

Clément Royer

Associate professor, Université Paris Dauphine-PSL
Clément W. Royer is an associate professor of computer science at Université Paris Dauphine-PSL. Clément received his Ph.D. from the University of Toulouse, France, and was then a postdoctoral research associate at the Wisconsin Institute of Discovery, University of Wisconsin-Madison... Read More →
GJ

Gabriel Jarry-Bolduc

SK

Sébastien Kerleau

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 110 3501 Trousdale Pkwy, 110, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3W: Quantum Linear Algebra and Optimization (Part 2)
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Quantum Linear Algebra and Optimization (Part 2)
Chair: Mohammadhossein Mohammadisiahroudi
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: Quantum Linear Algebra for Interior Point Methods
Speaker: Mohammadhossein Mohammadisiahroudi
Abstract: Quantum computing has recently emerged as a promising avenue for accelerating large-scale optimization. One notable direction is the use of quantum linear system algorithms to enhance the efficiency of Interior Point Methods. However, a key challenge in Quantum Interior Point Methods (QIPMs) lies in the tomography process, which extracts a classical representation of the Newton system solution from a quantum state. In this talk, we introduce a novel hybrid iterative approach that significantly improves the precision of Newton system solutions, achieving exponential accuracy gains. Additionally, we explore the integration of advanced quantum linear algebra techniques, such as quantum matrix-vector and matrix-matrix multiplications, to further accelerate QIPMs and enhance their practical feasibility.

Talk 2: Quantum Multiplicative Weights SOCP Solver
Speaker: Maria Isabel Franco Garrido
Abstract: Second-order cone programming (SOCP) is a key optimization framework that extends linear and quadratic programming and can be seen as a subset of semidefinite programming (SDP). Its wide applicability in fields such as machine learning, portfolio optimization, robotics, engineering design, and power systems underscores the critical need for faster and scalable solvers. State-of-the-art methods, such as interior-point methods (IPMs), are efficient and widely used in modern solvers; however, as these problems are large-scale, the development of more optimized solvers remains an important area of interest. In this work, we explore multiplicative weights (MW)-based methods as an alternative approach for solving SOCPs, addressing a gap in the classical optimization literature while also investigating quantum speedups. Although MW methods have been successfully applied to linear and semidefinite programs, their potential for SOCPs has remained largely unexplored. We analyze the complexity of our proposed algorithms in terms of query complexity, assuming oracle access to problem data. Our results demonstrate a quadratic quantum speed-up over classical implementations, contingent on the availability of quantum random access memory (QRAM), a common assumption in prior asymptotic analyses.

Talk 3: Exponentially Better Bounds for Quantum Optimization via Dynamical Simulation
Speaker: Sophia Simon
Abstract: In this talk, we present several quantum algorithms for continuous optimization that do not require any gradient estimation. Instead, we encode the optimization problem into the dynamics of a physical system and then leverage existing Hamiltonian simulation algorithms to efficiently simulate the time evolution in a coherent manner. This allows us, in certain cases, to obtain exponentially better query upper bounds relative to the best known upper bounds for optimization schemes based on gradient descent which utilize the quantum computer only for estimating gradients of the objective function.

Speakers
MM

Mohammadhossein Mohammadisiahroudi

Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 112 3501 Trousdale Pkwy, 112, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3X: Mechanism and Pricing Designs in Stochastic Decision Making
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Mechanism and Pricing Designs in Stochastic Decision Making
Chair: Helene Le Cadre

Talk 1: Incentive Design in Nonsmooth Games
Speaker: Helene Le Cadre
Abstract: Considering the growing trend towards integrated human-in-the-loop systems, incorporating irrational behaviors into game-theoretic models that closely reflect human beings' attitudes towards risk is of high relevance. Ultimately, understanding how agents with different risk preferences interact can better inform the mechanism designer and provide guidelines on how to effectively steer agents towards improved collective and individual outcomes. To this end, we study non-cooperative stochastic games where agents display irrational behaviors in response to underlying risk factors. Our formulation incorporates Prospect Theory (PT), a behavioral model used to describe agents' risk attitudes. We show that the resulting nonconvex nonsmooth game admits equilibria and we quantify the suboptimality induced by irrational behaviors. We then extend our PT-based game to an incentive-design problem formulated as a decision-dependent learning game, enabling us to cope with the multiplicity of solutions of the lower-level problem. In this setting, we provide a distributed algorithm with provable convergence, allowing the incentives to adapt dynamically to the information received in a feedback loop. The results are applied to a local energy community involving strategic end users exposed to two-part tariffs.

Talk 2: Distributionally Fair Two-stage Stochastic Programming by Bilevel Optimization
Speaker: Yutian He
Abstract: Two-stage stochastic programming (TSSP) is a fundamental framework for decision-making under uncertainty, where a first-stage decision is made before uncertainty is realized, followed by scenario-dependent second-stage decisions. While most of the TSSP literature focuses on cost minimization, fairness considerations in decision-making have largely been overlooked. Recently, Ye et al. (2025) studied a one-stage stochastic program subject to a distributional fairness constraint, but a similar development in the two-stage setting is still unavailable. In this work, we propose two models of TSSP under distributional fairness constraints: one where the first- and second-stage decision-makers collaborate to ensure fairness, and another where only the first-stage decision-maker aims to ensure fairness, while the second-stage decision-maker only aims at minimizing cost. To solve these models, we approximate the expectations by sample averages and then reformulate the problems as mixed-integer nonlinear programs. For large instances, we further develop an alternating minimization method to solve our problems efficiently, providing faster solutions.

Talk 3: Competitive Demand Learning: A Non-cooperative Pricing Algorithm with Coordinated Price Experimentation
Speaker: Yu-Ching Lee
Abstract: We consider a periodic equilibrium pricing problem for multiple firms over a planning horizon of $T$ periods. At each period, firms set their selling prices and receive stochastic demand from consumers. Firms do not know their underlying demand curves, but they wish to set selling prices that maximize total revenue under competition. Hence, they have to conduct price experiments so that the observed demand data are informative enough to support price decisions. However, uncoordinated price updating can render the demand information gathered by price experimentation less informative or inaccurate. We design a nonparametric learning algorithm to facilitate coordinated dynamic pricing, in which competing firms estimate their demand functions based on observations and adjust their pricing strategies in a prescribed manner. We show that the pricing decisions, determined by the estimated demand functions, converge to the underlying equilibrium as time progresses. We obtain a bound on the revenue difference of order $\mathcal{O}(F^2T^{3/4})$ and a regret bound of order $\mathcal{O}(F\sqrt{T})$ with respect to the number of competing firms $F$ and the horizon $T$. We also develop a modified algorithm to handle the situation where some firms may have knowledge of the demand curve.

Speakers
HL

Helene Le Cadre


Yutian He

University of Iowa
Monday July 21, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 215 3501 Trousdale Pkwy, 215, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 3Y: Machine Learning - Privacy and Learning Theory
Monday July 21, 2025 4:15pm - 5:30pm PDT
Session: Machine Learning - Privacy and Learning Theory
Chair: Devansh Gupta

Talk 1: Inherent Privacy of Zeroth Order Projected Gradient Descent
Speaker: Devansh Gupta
Abstract: Differentially private zeroth-order optimization methods have recently gained popularity in private fine-tuning of machine learning models due to their reduced memory requirements. Current approaches for privatizing zeroth-order methods rely on adding Gaussian noise to the estimated zeroth-order gradients. However, since the search direction in zeroth-order methods is inherently random, researchers including Tang et al. [1] and Zhang et al. [2] have raised an important question: is the inherent noise in zeroth-order estimators sufficient to ensure the overall differential privacy of the algorithm? This work settles this question for a class of oracle-based optimization algorithms where the oracle returns zeroth-order gradient estimates. In particular, we show that for a fixed initialization, there exist strongly convex objective functions such that running (Projected) Zeroth-Order Gradient Descent (ZO-GD) is not differentially private. Furthermore, we show that even with random initialization and without revealing intermediate iterates, the privacy loss in ZO-GD can grow superlinearly with the number of iterations when minimizing convex objective functions. [1] Tang, X., Panda, A., Nasr, M., Mahloujifar, S., and Mittal, P. (2024). Private fine-tuning of large language models with zeroth-order optimization. [2] Zhang, L., Li, B., Thekumparampil, K. K., Oh, S., and He, N. (2024a). DPZero: Private fine-tuning of language models without backpropagation. In International Conference on Machine Learning. PMLR.
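A sketch of the kind of oracle-based scheme studied in the talk: projected zeroth-order gradient descent with a two-point gradient estimator. The sketch itself makes no privacy claim, and its names and test objective are illustrative.

import numpy as np

def zo_projected_gd(f, x0, radius=1.0, mu=1e-4, step=0.05, iters=300, seed=0):
    # Two-point estimator g = d/(2*mu) * (f(x + mu*u) - f(x - mu*u)) * u for a
    # random unit direction u, followed by Euclidean projection onto a ball.
    rng = np.random.default_rng(seed)
    d = x0.size
    x = x0.copy()
    for _ in range(iters):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)
        g = d / (2.0 * mu) * (f(x + mu * u) - f(x - mu * u)) * u
        x -= step * g
        nrm = np.linalg.norm(x)
        if nrm > radius:                      # projection onto the ball of the given radius
            x *= radius / nrm
    return x

f = lambda x: np.sum((x - 0.3) ** 2)          # strongly convex test objective
print(zo_projected_gd(f, np.zeros(5)))        # iterates approach 0.3 * ones(5)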

Talk 2: Near-Optimal and Tractable Estimation under Shift-Invariance
Speaker: Dmitrii Ostrovskii
Abstract: In the 1990s, Arkadi Nemirovski asked the following question: how hard is it to estimate a sequence of length $N$ satisfying an unknown linear recurrence relation of order $S$ and observed in i.i.d. Gaussian noise? The class of all such sequences is parametric but extremely rich: it contains all exponential polynomials with total degree $S$, including harmonic oscillations with $s$ arbitrary frequencies. Geometrically, this class corresponds to the projection onto $\mathbb{R}^N$ of the union of all shift-invariant subspaces of $\mathbb{R}^{\mathbb{Z}}$ of dimension $S$. In this work, we show that the statistical complexity of this class, as measured by the squared minimax radius of the $(1-P)$-confidence Euclidean ball, is nearly the same as for the class of $S$-sparse signals, namely $(S\log(N) + \log(1/P)) \log^2(S) \log(N/S)$ up to a constant factor. Moreover, the corresponding near-minimax estimator is tractable, and it can be used to build a test statistic with a near-minimax detection threshold in the associated detection problem. These statistical results rest upon an approximation-theoretic one: we show that finite-dimensional shift-invariant subspaces admit compactly supported reproducing kernels whose Fourier spectra have the smallest possible $p$-norms, simultaneously for all $p \geq 1$.

Talk 3: On stochastic mirror descent under relative nonconvexity and nonsmoothness
Speaker: Philip Thompson
Abstract: In this talk, we review recent convergence analyses of the stochastic mirror descent method and present a novel convergence analysis within the framework of relative variation (e.g., Lipschitzness, smoothness) and relative notions of the PL inequality. We also present some applications in machine learning.
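A minimal sketch of mirror descent on the probability simplex with the negative-entropy mirror map (multiplicative-weights updates), the kind of scheme whose relative smoothness/PL analysis the talk discusses; names and data are illustrative.

import numpy as np

def entropic_mirror_descent(grad, x0, step=0.1, iters=500):
    # Mirror descent with mirror map sum(x*log(x)) reduces to the multiplicative
    # update x <- x * exp(-step * grad f(x)), renormalized to the simplex.
    x = x0.copy()
    for _ in range(iters):
        x = x * np.exp(-step * grad(x))
        x /= x.sum()                          # Bregman projection back onto the simplex
    return x

# Example: minimize <c, x> over the simplex; the optimum puts all mass on argmin(c).
c = np.array([0.9, 0.2, 0.5, 0.7])
print(entropic_mirror_descent(lambda x: c, np.full(4, 0.25)))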

Speakers
DG

Devansh Gupta

PhD Student, University of Southern California
I am a Computer Science PhD student at University of Southern California in the Theory Group and Optimization for Data Driven Science where I am advised by Meisam Razaviyayn and Vatsal Sharan. I am broadly interested in working on fundamental problems in Optimization and Differential... Read More →
DO

Dmitrii Ostrovskii

PT

Philip Thompson

Monday July 21, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 200 3518 Trousdale Pkwy, 200, Los Angeles, CA 90089

5:30pm PDT

Break
Monday July 21, 2025 5:30pm - 5:45pm PDT
Monday July 21, 2025 5:30pm - 5:45pm PDT
TBA

5:45pm PDT

Best Paper Session
Monday July 21, 2025 5:45pm - 7:15pm PDT
Finalists (in alphabetical order):

Guy Kornowski for the paper "Oracle Complexity in Nonsmooth Nonconvex Optimization", co-authored with Ohad Shamir.
Abstract: It is well-known that given a smooth, bounded-from-below, and possibly nonconvex function, standard gradient-based methods can find $\epsilon$-stationary points (with gradient norm less than $\epsilon$) in $\mathcal{O}(1/\epsilon^2)$ iterations. However, many important nonconvex optimization problems, such as those associated with training modern neural networks, are inherently not smooth, making these results inapplicable. In this paper, we study nonsmooth nonconvex optimization from an oracle complexity viewpoint, where the algorithm is assumed to be given access only to local information about the function at various points. We provide two main results: First, we consider the problem of getting \emph{near} $\epsilon$-stationary points. This is perhaps the most natural relaxation of \emph{finding} $\epsilon$-stationary points, which is impossible in the nonsmooth nonconvex case. We prove that this relaxed goal cannot be achieved efficiently, for any distance and $\epsilon$ smaller than some constants. Our second result deals with the possibility of tackling nonsmooth nonconvex optimization by reduction to smooth optimization: Namely, applying smooth optimization methods on a smooth approximation of the objective function. For this approach, we prove under a mild assumption an inherent trade-off between oracle complexity and smoothness: On the one hand, smoothing a nonsmooth nonconvex function can be done very efficiently (e.g., by randomized smoothing), but with dimension-dependent factors in the smoothness parameter, which can strongly affect iteration complexity when plugging into standard smooth optimization methods. On the other hand, these dimension factors can be eliminated with suitable smoothing methods, but only by making the oracle complexity of the smoothing process exponentially large.

Naoki Marumo for the paper "Parameter-free Accelerated Gradient Descent for nonconvex optimization", co-authored with Akiko Takeda.
Abstract: We propose a new first-order method for minimizing nonconvex functions with a Lipschitz continuous gradient and Hessian. The proposed method is an accelerated gradient descent with two restart mechanisms and finds a solution where the gradient norm is less than $\epsilon$ in $O(\epsilon^{-7/4})$ function and gradient evaluations. Unlike existing first-order methods with similar complexity bounds, our algorithm is parameter-free because it requires no prior knowledge of problem-dependent parameters, e.g., the Lipschitz constants and the target accuracy $\epsilon$. The main challenge in achieving this advantage is estimating the Lipschitz constant of the Hessian using only first-order information. To this end, we develop a new Hessian-free analysis based on two technical inequalities: a Jensen-type inequality for gradients and an error bound for the trapezoidal rule. Several numerical results illustrate that the proposed method performs comparably to existing algorithms with similar complexity bounds, even without parameter tuning.
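For context, a sketch of accelerated gradient descent with a simple function-value restart; the paper's contribution is precisely to avoid hand-tuned constants, so the explicit step size here is an assumption of the sketch, and all names are illustrative.

import numpy as np

def agd_with_restart(f, grad, x0, step, iters=500):
    # Nesterov acceleration; whenever the objective value increases, the momentum
    # is reset (a standard heuristic restart, not the paper's mechanisms).
    x = x0.copy()
    y = x0.copy()
    theta = 1.0
    fx = f(x)
    for _ in range(iters):
        x_new = y - step * grad(y)
        theta_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * theta ** 2))
        y = x_new + (theta - 1.0) / theta_new * (x_new - x)
        if f(x_new) > fx:                     # restart: drop the momentum
            y = x_new.copy()
            theta_new = 1.0
        x, theta, fx = x_new, theta_new, f(x_new)
    return x

# Illustrative smooth objective with a mildly nonconvex term.
f = lambda x: 0.5 * np.sum(x ** 2) + np.sum(np.log1p(x ** 2))
g = lambda x: x + 2.0 * x / (1.0 + x ** 2)
print(f(agd_with_restart(f, g, np.full(8, 3.0), step=0.3)))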

Lai Tian for the paper "Testing Approximate Stationarity Concepts for Piecewise Affine Functions", co-authored with Anthony Man-Cho So.
Abstract: We study the basic computational problem of detecting approximate stationary points for continuous piecewise affine (PA) functions. Our contributions span multiple aspects, including complexity, regularity, and algorithms. Specifically, we show that testing first-order approximate stationarity concepts, as defined by commonly used generalized subdifferentials, is computationally intractable unless $\mathrm{P}=\mathrm{NP}$. To facilitate computability, we consider a polynomial-time solvable relaxation by abusing the convex subdifferential sum rule and establish a tight characterization of its exactness. Furthermore, addressing an open issue motivated by the need to terminate the subgradient method in finite time, we introduce the first oracle-polynomial-time algorithm to detect so-called near-approximate stationary points for PA functions. A notable byproduct of our development in regularity is the first necessary and sufficient condition for the validity of an equality-type (Clarke) subdifferential sum rule. Our techniques revolve around two new geometric notions for convex polytopes and may be of independent interest in nonsmooth analysis. Moreover, some corollaries of our work on complexity and algorithms for stationarity testing address open questions in the literature. To demonstrate the versatility of our results, we complement our findings with applications to a series of structured piecewise smooth functions, including $\rho$-margin-loss SVM, piecewise affine regression, and nonsmooth neural networks.

Speakers
NM

Naoki Marumo


Guy Kornowski

Weizmann Institute of Science
Talk title: Dimension dependence in nonconvex optimization. Bio: Guy Kornowski is a PhD student at the Weizmann Institute of Science, advised by Prof. Ohad Shamir. During his PhD he interned at Apple ML Research, where he worked with Kunal Talwar and Vitaly Feldman. His research focuses... Read More →
Monday July 21, 2025 5:45pm - 7:15pm PDT
Taper Hall (THH) 101 3501 Trousdale Pkwy, 101, Los Angeles, CA 90089
 