Thursday, July 24

8:15am PDT

Auditorium Doors Open for Seating
Thursday July 24, 2025 8:15am - 8:45am PDT
USC Bovard Auditorium 3551 Trousdale Pkwy, Los Angeles, CA 90089

8:45am PDT

Last Day Remarks
Thursday July 24, 2025 8:45am - 9:00am PDT
USC Bovard Auditorium 3551 Trousdale Pkwy, Los Angeles, CA 90089

9:00am PDT

Plenary 4
Thursday July 24, 2025 9:00am - 10:00am PDT
Speakers
Alexandre d'Aspremont
After dual PhDs from Ecole Polytechnique and Stanford University in optimisation and finance, followed by a postdoc at U.C. Berkeley, Alexandre d'Aspremont joined the faculty at Princeton University as an assistant then associate professor with joint appointments at the ORFE department...
Thursday July 24, 2025 9:00am - 10:00am PDT
USC Bovard Auditorium 3551 Trousdale Pkwy, Los Angeles, CA 90089

10:00am PDT

Coffee & Snack Break (Provided)
Thursday July 24, 2025 10:00am - 10:30am PDT
TBA

10:30am PDT

Parallel Sessions 10A: AI Meets Optimization (Part 3)
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: AI Meets Optimization (Part 3)
Chair: Wotao Yin
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: TBD
Speaker: Ming Jin
Abstract: TBD

Talk 2: TBD
Speaker: Pascal Van Hentenryck
Abstract: TBD

Talk 3: TBD
Speaker: Samy Wu Fung
Abstract: TBD

Speakers
Ming Jin
Pascal Van Hentenryck
Samy Wu Fung
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 101 3501 Trousdale Pkwy, 101, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10B: Special Session in Honor of Suvrajeet Sen: Stochastic Mixed-Integer Programming
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Special Session in Honor of Suvrajeet Sen: Stochastic Mixed-Integer Programming
Chair: Lewis Ntaimo
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: On Disjunctive Decomposition for Stochastic Mixed-Integer Programming: A Reflection on Algorithm Development and Applications
Speaker: Lewis Ntaimo
Abstract: Two-stage stochastic mixed-integer programming (SMIP) involves making discrete decisions in the face of future uncertainty and has many applications in science and engineering. However, solving SMIP is very challenging, mainly due to its nonconvexity and large-scale nature. In this talk, we reflect on the development of disjunctive decomposition for SMIP initiated by Suvrajeet Sen. Disjunctive decomposition relies on generating disjunctive cutting planes that are shared among scenarios to sequentially convexify the nonconvex expected recourse function. We review the basic theory and derivation of a class of disjunctive decomposition algorithms, and illustrate the algorithms using simple numerical examples. Finally, we discuss computer implementation and application of the algorithms to solving standard SMIP problems.
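
For orientation, the generic two-stage SMIP targeted by this line of work can be written (in standard textbook notation, which may differ from the speaker's) as

\[
\min_{x \in X} \; c^\top x + \mathbb{E}_{\omega}\!\left[ Q(x,\omega) \right],
\qquad
Q(x,\omega) = \min_{y \in Y} \left\{ q(\omega)^\top y \;:\; W y \ge r(\omega) - T(\omega)\, x \right\},
\]

where X and Y encode the integrality restrictions on the first- and second-stage variables; the integrality of y is what makes the expected recourse function nonconvex and motivates the scenario-shared disjunctive cuts described above.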

Talk 2: An Asymptotically Optimal Coordinate Descent Algorithm for Learning Bayesian Networks from Gaussian Models
Speaker: Simge Kucukyavuz
Abstract: We study the problem of learning Bayesian networks from continuous observational data, generated according to a linear Gaussian structural equation model. We consider an ℓ0-penalized maximum likelihood estimator for this problem which is known to have favorable statistical properties but is computationally challenging to solve, especially for medium-sized Bayesian networks. We propose a new coordinate descent algorithm to approximate this estimator and prove several remarkable properties of our procedure: the algorithm converges to a coordinate-wise minimum, and despite the non-convexity of the loss function, as the sample size tends to infinity, the objective value of the coordinate descent solution converges to the optimal objective value of the ℓ0-penalized maximum likelihood estimator. Finite-sample optimality and statistical consistency guarantees are also established. To the best of our knowledge, our proposal is the first coordinate descent procedure endowed with optimality and statistical guarantees in the context of learning Bayesian networks. Numerical experiments on synthetic and real data demonstrate that our coordinate descent method can obtain near-optimal solutions while being scalable.
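
For reference only, here is a minimal coordinate-descent sketch for a generic ℓ0-penalized least-squares objective; it illustrates the closed-form hard-thresholding coordinate update, but it does not handle the Bayesian-network (DAG) structure and carries none of the paper's guarantees.

import numpy as np

def l0_coordinate_descent(X, y, lam, n_iter=100):
    # Cyclic coordinate descent for 0.5*||y - X b||^2 + lam * ||b||_0.
    # Each coordinate update has a closed form: take the unpenalized
    # minimizer and keep it only if it improves the objective by more than lam.
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)            # ||x_j||^2 for every column
    r = y - X @ b                            # current residual
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * b[j]           # remove coordinate j's contribution
            bj = (X[:, j] @ r) / col_sq[j]   # unpenalized coordinate minimizer
            b[j] = bj if 0.5 * col_sq[j] * bj ** 2 > lam else 0.0
            r = r - X[:, j] * b[j]           # add the (possibly zeroed) contribution back
    return b

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
beta_true = np.zeros(20); beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(50)
print(np.nonzero(l0_coordinate_descent(X, y, lam=1.0))[0])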

Talk 3: A Stochastic Diversion Path Problem
Speaker: Cole Smith
Abstract: We examine a stochastic network optimization problem in which the goal is to modify arc lengths so that a specified path (the “diversion path”) will be optimal with sufficiently high probability. Given modification costs for each arc in the network, the objective in a deterministic form of the problem would be to minimize the sum of modification costs needed to guarantee that the diversion path is optimal. In the stochastic version, each arc length is an independent, uniformly-distributed random variable. (The lower bound on the arc lengths is nonnegative.) Given a parameter 0 < tau
Speakers
Lewis Ntaimo
Professor and Department Head, Texas A&M University

Simge Küçükyavuz
Chair and David A. and Karen Richards Sachs Professor, Northwestern University
Simge Küçükyavuz is Chair and David A. and Karen Richards Sachs Professor in the Industrial Engineering and Management Sciences Department at Northwestern...

Cole Smith
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 201 3501 Trousdale Pkwy, 201, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10C: Mixed-Integer Programming-Based Techniques for the Global Optimization of MINLPs
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Mixed-Integer Programming-Based Techniques for the Global Optimization of MINLPs
Chair: Rohit Kannan
Cluster: Interplay Between Continuous and Discrete Optimization

Talk 1: Discrete nonlinear functions: formulations and applications
Speaker: Taotao He
Abstract: This paper examines nonlinear optimization problems that incorporate discrete decisions. We introduce new, improved formulation techniques that take advantage of the simplotope structure present in the domain of the binarization variables. Our technique identifies new polynomially solvable instances of the price promotion problem initially studied by Cohen et al. (2021) and allows us to develop a linear programming (LP) formulation for an inventory optimization problem under a choice model proposed by Boada-Collado and Martinez-de Albeniz (2020). The techniques we develop rely on ideal formulations for submodular and fractional compositions of discrete functions, improving prior formulations for bilinear products suggested by Adams and Henry (2012). Submodular compositions also generalize L-natural functions over bounded domains, and our construction provides new insights into Lovász-extension-based formulations for this class of problems while expanding the class of nonlinear discrete optimization problems amenable to linear programming-based techniques.

Talk 2: Quadratic Cuts for Non-Convex MINLP in Practice
Speaker: Adrian Göß
Abstract: It is only half the job to find a good solution for a mathematical optimization problem, as one needs to verify its quality by specifying a dual bound. When it comes to mixed-integer nonlinear programming (MINLP), strong prerequisites such as constraint qualifications appear suitable, but may be difficult to verify computationally. In practice, solvers apply local refinement or convexification strategies to retrieve tight dual bounds. However, these concepts require appropriate big-M formulations, generate new sub-problems, or struggle to represent non-convex characteristics with high accuracy, all of which can lead to long running times. As an alternative, we aim to leverage recent advances in mixed-integer quadratically-constrained programming (MIQCP) and propose a global approximation of constraint functions by paraboloids, i.e., univariate quadratic terms. The approximation is retrieved as a solution to a mixed-integer linear programming (MIP) problem. Further, for each nonlinear constraint function, we solve such MIPs and determine a small number of paraboloids approximating it from either side. Replacing the nonlinearities with the corresponding quadratic functions leads to a quadratically-constrained relaxation of the original problem. Solving the MIQCP relaxation then yields a dual bound whose tightness depends on the approximation guarantee of the paraboloids. In summary, this approach enables solvers that are explicitly tailored for quadratic constraints to solve MINLPs to global optimality.
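
One possible reading of the construction, in my own (assumed) notation: for each nonlinear constraint g(x) ≤ 0 over a box domain, fit separable quadratics ("paraboloids") p and P that sandwich g, then replace g by the lower approximation to obtain a quadratically-constrained relaxation:

\[
p(x) = \gamma + \sum_{j} \left( \alpha_j x_j + \beta_j x_j^2 \right) \;\le\; g(x) \;\le\; P(x) \quad \text{on } [\ell, u],
\qquad
\{ x : g(x) \le 0 \} \subseteq \{ x : p(x) \le 0 \}.
\]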

Talk 3: Learning to Accelerate the Global Optimization of QCQPs: Strong Partitioning and an End-to-End Graph ML-Based Approach
Speaker: Rohit Kannan
Abstract: We learn optimal instance-specific heuristics for the global minimization of nonconvex quadratically-constrained quadratic programs (QCQPs). Specifically, we consider partitioning-based convex mixed-integer programming relaxations for nonconvex QCQPs and propose the novel problem of strong partitioning, which aims to optimally partition variable domains without sacrificing global optimality. Given that solving this max-min strong partitioning problem exactly can be highly challenging, we design a local optimization method that leverages the generalized gradients of the value function in the inner minimization problem. However, solving the strong partitioning problem to local optimality can still be computationally expensive. To address this, we propose a simple AdaBoost regression-based machine learning (ML) approximation for homogeneous families of QCQPs. We conduct a detailed computational study on randomly generated QCQP families, including instances of the pooling problem, using the open-source global solver Alpine. Numerical experiments demonstrate that our AdaBoost regression-based approximation of strong partitioning reduces Alpine’s solution time by a factor of 2 to 4.5 on average, with maximum reduction factors ranging from 10 to 200 across different QCQP families. Additionally, we design a novel end-to-end loss function to train a neural network-based policy that directly maps QCQP instance features to corresponding partitioning points, eliminating the need for an expert strong partitioning policy. We also demonstrate how to apply the graph scattering transform to train a single neural network model that generalizes across QCQP families of various sizes. Our results show that this graph scattering-based end-to-end neural network learns a partitioning policy that significantly outperforms the AdaBoost-based approach and generalizes well to larger QCQPs beyond those encountered in training.

Speakers
Taotao He
Adrian Göß
Rohit Kannan
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 208 3501 Trousdale Pkwy, 208, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10D: Optimization Meets Generative AI: Insights and New Designs
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Optimization Meets Generative AI: Insights and New Designs
Chair: Minshuo Chen
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: Fine Tuning And Guidance of Diffusion Models
Speaker: Wenpin Tang
Abstract: The past decade has witnessed the success of generative modeling (e.g. GANs, VAEs,...) in creating high quality samples in a wide variety of data modalities. In the first part of this talk, I will briefly introduce the recently developed diffusion models from a continuous-time perspective. Then in the second part, I will discuss three different approaches to fine-tune the diffusion models: conditioning (classifier guidance), stochastic control and reinforcement learning. Each of these approaches will lead to a nice theory with a few application fields. If time permits, I will also discuss the DPO (Direct preference optimization) approach to fine-tuning text-to-image models.

Talk 2: How Does Gradient Descent Learn Features -- A Local Analysis for Regularized Two-Layer Neural Networks
Speaker: Mo Zhou
Abstract: The ability of learning useful features is one of the major advantages of neural networks. Although recent works show that neural networks can operate in a neural tangent kernel (NTK) regime that does not allow feature learning, many works also demonstrate the potential for neural networks to go beyond NTK regime and perform feature learning. Recently, a line of work highlighted the feature learning capabilities of the early stages of gradient-based training. In this paper we consider another mechanism for feature learning via gradient descent through a local convergence analysis. We show that once the loss is below a certain threshold, gradient descent with a carefully regularized objective will capture ground-truth directions. Our results demonstrate that feature learning not only happens at the initial gradient steps, but can also occur towards the end of training.

Talk 3: Theoretical Implications of Training And Sampling Diffusion Models
Speaker: Yuqing Wang
Abstract: Most existing theoretical investigations of the accuracy of diffusion models, albeit significant, assume the score function has been approximated to a certain accuracy, and then use this a priori bound to control the error of generation. In this talk, I will present a quantitative understanding of the whole generation process, i.e., both training and sampling. More precisely, the analysis gives a non-asymptotic convergence guarantee for denoising score matching under gradient descent, together with a refined sampling error analysis for variance-exploding models. The combination of these two results yields a full error analysis, which elucidates (again, but this time theoretically) how to design the training and sampling processes for effective generation. For instance, our theory implies a preference toward noise distribution and loss weighting in training that qualitatively agree with the ones used in Karras et al. 2022. It also provides perspectives on the choices of time and variance schedules in sampling: when the score is well trained, the design in Song et al. 2021 is preferable, but when it is less trained, the design in Karras et al. 2022 becomes preferable.

Speakers
Minshuo Chen
Wenpin Tang
Mo Zhou
Yuqing Wang
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 210 3501 Trousdale Pkwy, 210, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10E: Data-Driven Optimization with Structured Models
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Data-Driven Optimization with Structured Models
Chair: Krishnakumar Balasubramanian
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: Feature Learning in Two-layer Neural Networks under Structured Data
Speaker: Murat Erdogdu
Abstract: We study the effect of gradient-based optimization on feature learning in two-layer neural networks. We consider a setting where the number of samples is of the same order as the input dimension and show that, when the input data is isotropic, gradient descent always improves upon the initial random features model in terms of prediction risk, for a certain class of targets. Further leveraging the practical observation that data often contains additional structure, i.e., the input covariance has non-trivial alignment with the target, we prove that the class of learnable targets can be significantly extended, demonstrating a clear separation between kernel methods and two-layer neural networks in this regime. We additionally consider sparse settings and show that pruning methods can lead to optimal sample complexity.

Talk 2: Control, Transport and Sampling: Towards Better Loss Design
Speaker: Qijia Jiang
Abstract: Leveraging connections between diffusion-based sampling, optimal transport, and stochastic optimal control through their shared links to the Schrodinger bridge problem, we propose novel objective functions that can be used to transport ν to μ, consequently sample from the target μ, via optimally controlled dynamics. We highlight the importance of the pathwise perspective and the role various optimality conditions on the path measure can play for the design of valid training losses, the careful choice of which offer numerical advantages in implementation. Basing the formalism on Schrodinger bridge comes with the additional practical capability of baking in inductive bias when it comes to Neural Network training.

Talk 3: Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models
Speaker: Mladen Kolar
Abstract: In this work, we consider solving optimization problems with a stochastic objective and deterministic equality constraints. We propose a Trust-Region Sequential Quadratic Programming method to find both first- and second-order stationary points. Our method utilizes a random model to represent the objective function, which is constructed from stochastic observations of the objective and is designed to satisfy proper adaptive accuracy conditions with a high but fixed probability. To converge to first-order stationary points, our method computes a gradient step in each iteration defined by minimizing a quadratic approximation of the objective subject to a (relaxed) linear approximation of the problem constraints and a trust-region constraint. To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix, as well as a second-order correction step to address the potential Maratos effect, which arises due to the nonlinearity of the problem constraints. Such an effect may impede the method from moving away from saddle points. Both gradient and eigen step computations leverage a novel parameter-free decomposition of the step and the trust-region radius, accounting for the proportions among the feasibility residual, optimality residual, and negative curvature. We establish global almost sure first- and second-order convergence guarantees for our method, and present computational results on CUTEst problems, regression problems, and saddle-point problems to demonstrate its superiority over existing line-search-based stochastic methods. Joint work with Yuchen Fang, Sen Na, and Michael W Mahoney.
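
For context, a standard trust-region SQP gradient-step subproblem (a generic template, not necessarily the exact subproblem used in the talk) reads

\[
\min_{d} \; \bar g_k^\top d + \tfrac{1}{2} d^\top \bar H_k d
\quad \text{s.t.} \quad
c(x_k) + \nabla c(x_k)^\top d = r_k, \qquad \|d\| \le \Delta_k,
\]

where \bar g_k and \bar H_k are stochastic estimates built from the random model, c collects the equality constraints, r_k relaxes the linearization so the subproblem remains feasible within the trust region, and \Delta_k is the trust-region radius.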

Speakers
Krishnakumar Balasubramanian
Murat Erdogdu
Qijia Jiang
Mladen Kolar
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 212 3501 Trousdale Pkwy, 212, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10F: Optimisation and machine learning in energy
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Optimisation and machine learning in energy
Chair: Hongyu Zhang
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: A Computationally Efficient Cutting Plane Modelling to Generate Alternatives Algorithm
Speaker: Michael Lau
Abstract: Contemporary macro-energy systems modelling is characterized by the need to represent energy systems' strategic and operational decisions with high temporal and spatial resolution, which provides more accurate results than more abstracted models. This drive towards greater fidelity, however, conflicts with a push towards greater model representation of inherent complexity in decision-making, including methods like Modelling to Generate Alternatives. Modelling to Generate Alternatives aims to map the feasible space of a model within a cost slack by varying investment parameters without changing the operational constraints, a process which frequently requires hundreds of solutions. For large, highly representative energy system models this is impossible with traditional methods, leading researchers to reduce complexity with either more zonal or temporal aggregation. This research presents a new solution method for Modelling to Generate Alternatives-type problems. Using cutting-plane methods based on a reformulation of Benders decomposition, we break down the problem structure into a strategic master problem and operational subproblems and pass information between master problems to accelerate convergence with each new objective. We find that our new solution method is several times faster and requires less memory than existing parallelized monolithic Modelling to Generate Alternatives solution methods, enabling rapid computation of a greater number of solutions to highly resolved models.

Talk 2: Multi-timescale stochastic programming with application in power systems
Speaker: Yihang Zhang
Abstract: We introduce a multi-timescale stochastic programming framework for decision-making under multi-timescale uncertainty. Aggregated state variables are used to coordinate decisions across timescales, similar to the role of state variables in a multistage problem. Based on this setup, we describe instantiation strategies that use either multi-horizon scenario trees (to model multi-lag dependence on a timescale) or a specialized value function to fully exploit independence of the randomness within the timescale. We develop decomposition algorithms (price-directive or resource-directive) to incrementally approximate and solve the resulting problem. In addition to techniques used in multistage problems, we describe solution-reusing heuristics to accelerate the solution process by leveraging similarities between subproblems.

Talk 3: A Deep Generative Learning Approach for Two-stage Adaptive Robust Optimization
Speaker: Aron Brenner
Abstract: Two-stage adaptive robust optimization (ARO) is a powerful approach for planning under uncertainty, balancing first-stage decisions with recourse decisions made after uncertainty is realized. To account for uncertainty, modelers typically define a simple uncertainty set over which potential outcomes are considered. However, classical methods for defining these sets unintentionally capture a wide range of unrealistic outcomes, resulting in overly-conservative and costly planning in anticipation of unlikely contingencies. In this work, we introduce AGRO, a solution algorithm that performs adversarial generation for two-stage adaptive robust optimization using a variational autoencoder. AGRO generates high-dimensional contingencies that are simultaneously adversarial and realistic, improving the robustness of first-stage decisions at a lower planning cost than standard methods. To ensure generated contingencies lie in high-density regions of the uncertainty distribution, AGRO defines a tight uncertainty set as the image of “latent" uncertainty sets under the VAE decoding transformation. Projected gradient ascent is then used to maximize recourse costs over the latent uncertainty sets by leveraging differentiable optimization methods. We demonstrate the cost-efficiency of AGRO by applying it to both a synthetic production-distribution problem and a real-world power system expansion setting. We show that AGRO outperforms the standard column-and-constraint algorithm by up to 1.8% in production-distribution planning and up to 11.6% in power system expansion.

Speakers
Michael Lau
Yihang Zhang
Aron Brenner
Thursday July 24, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 156 3518 Trousdale Pkwy, 156, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10G: Novel First-Order Methods via Performance Estimation II
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Novel First-Order Methods via Performance Estimation II
Chair: Alex Wang
Cluster: Conic and Semidefinite Optimization

Talk 1: Computer-assisted design of complexity lower bounds for accelerated composite optimization methods
Speaker: Uijeong Jang
Abstract: The accelerated composite optimization method FISTA (Beck, Teboulle 2009) is suboptimal by a constant factor, and we present a new method OptISTA that improves FISTA by a constant factor of 2. The performance estimation problem (PEP) has recently been introduced as a new computer-assisted paradigm for designing optimal first-order methods. In this work, we establish the exact optimality of OptISTA with a lower-bound construction that extends the semi-interpolated zero-chain construction (Drori, Taylor 2022) to the double-function setup of composite optimization. By establishing exact optimality, our work concludes the search for the fastest first-order methods, with respect to the performance measure of worst-case function value suboptimality, for the proximal, projected-gradient, and proximal-gradient setups involving a smooth convex function and a closed proper convex function.

Talk 2: Automated tight Lyapunov analysis for first-order methods
Speaker: Manu Upadhyaya
Abstract: We present a methodology for establishing the existence of quadratic Lyapunov inequalities for a wide range of first-order methods used to solve convex optimization problems. In particular, we consider (i) classes of optimization problems of finite-sum form with (possibly strongly) convex and possibly smooth functional components, (ii) first-order methods that can be written as a linear system on state-space form in feedback interconnection with the subdifferentials of the functional components of the objective function, and (iii) quadratic Lyapunov inequalities that can be used to draw convergence conclusions. We present a necessary and sufficient condition for the existence of a quadratic Lyapunov inequality within a predefined class of Lyapunov inequalities, which amounts to solving a small-sized semidefinite program. We showcase our methodology on several first-order methods that fit the framework. Most notably, our methodology allows us to significantly extend the region of parameter choices that allow for duality gap convergence in the Chambolle-Pock method.

Talk 3: Accelerating Proximal Gradient Descent via Silver Stepsizes
Speaker: Jinho Bok
Abstract: Surprisingly, recent work has shown that gradient descent can be accelerated without using momentum -- just by judiciously choosing stepsizes. An open question raised by several papers is whether this phenomenon of stepsize-based acceleration holds more generally for constrained and/or composite convex optimization via projected and/or proximal versions of gradient descent. We answer this in the affirmative by proving that the silver stepsize schedule yields analogously accelerated rates in these settings. These rates are conjectured to be asymptotically optimal among all stepsize schedules, and match the silver convergence rate of vanilla gradient descent (Altschuler and Parrilo, 2023), namely O(ε^{−log_ρ 2}) for smooth convex optimization and O(κ^{log_ρ 2}log(1/ε)) under strong convexity, where ε is the precision, κ is the condition number, and ρ=1+sqrt(2) is the silver ratio. The key technical insight is the combination of recursive gluing -- the technique underlying all analyses of gradient descent accelerated with time-varying stepsizes -- with a certain Laplacian-structured sum-of-squares certificate for the analysis of proximal point updates. Joint work with Jason M. Altschuler.
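
The base iteration being accelerated is plain proximal gradient descent with a time-varying stepsize schedule h_k (the silver schedule prescribes the particular h_k); in the abstract's notation the resulting rates are

\[
x_{k+1} = \mathrm{prox}_{h_k g}\!\left( x_k - h_k \nabla f(x_k) \right),
\qquad
O\!\left( \varepsilon^{-\log_\rho 2} \right) \text{ (smooth convex)},
\quad
O\!\left( \kappa^{\log_\rho 2} \log(1/\varepsilon) \right) \text{ (strongly convex)},
\quad
\rho = 1 + \sqrt{2}.
\]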

Speakers
Alex Wang
Uijeong Jang

Manu Upadhyaya
PhD student, Lund University

Jinho Bok
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 114 3501 Trousdale Pkwy, 114, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10H: Advances in Bilevel Optimization: Algorithms and Applications
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Advances in Bilevel Optimization: Algorithms and Applications
Chair: Mingyi Hong & Prashant Khanduri
Cluster: Nonlinear Optimization

Talk 1: Barrier Function for Bilevel Optimization with Coupled Lower-Level Constraints: Formulation, Approximation and Algorithms
Speaker: Xiaotian Jiang
Abstract: In this paper, we consider bilevel optimization problems where the lower level has coupled constraints, i.e., constraints that depend on both the upper- and lower-level variables. In particular, we consider two settings for the lower-level problem. The first is when the objective is strongly convex and the constraints are convex with respect to the lower-level variable; the second is when the lower level is a linear program. We propose to utilize a barrier function reformulation to translate the problem into an unconstrained problem. By developing a series of new techniques, we prove that both the hyperfunction value and hypergradient of the barrier-reformulated problem (uniformly) converge to those of the original problem under minimal assumptions. Further, to overcome the non-Lipschitz smoothness of the hyperfunction and lower-level problems arising from the barrier reformulation, we design an adaptive algorithm that ensures a non-asymptotic convergence guarantee. We also design an algorithm that converges to a stationary point of the original problem asymptotically under certain assumptions. The proposed algorithms require minimal assumptions, and to our knowledge, they are the first with convergence guarantees when the lower-level problem is a linear program.
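
A generic log-barrier reformulation of the coupled lower level, in the spirit described above (my notation, not necessarily the paper's), is

\[
\min_{x} \; F\!\left( x, y_\mu(x) \right),
\qquad
y_\mu(x) = \arg\min_{y} \; f(x, y) - \mu \sum_{i} \log\!\left( -g_i(x, y) \right),
\]

so the coupled lower-level constraints g_i(x, y) ≤ 0 are absorbed into an unconstrained interior problem whose hyperfunction value and hypergradient approximate the original ones as the barrier parameter μ → 0.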

Talk 2: A Discretization Approach for Low-Dimensional Non-Convex Bilevel Optimization
Speaker: Prashant Khanduri
Abstract: Bilevel optimization has become an invaluable tool in areas like machine learning and operations research. Most modern bilevel algorithms are developed for problems with strongly convex and/or unconstrained lower-level tasks. In this work, we deal with a class of problems where the lower-level task is non-convex and constrained. To solve this class of challenging problems, we propose a novel discretized value-function-based approach wherein the value function and the respective problem reformulation are discretized. Under the proposed approach, the value function’s optimization problem is transformed into an easier convex optimization task via discretization. We tackle the discretized problem using a penalty approach and establish the relation between the KKT points, and the local and global minima of the two problems. We develop a gradient descent-based algorithm to solve the penalty reformulated problem. We establish finite-time convergence guarantees for the developed algorithm. Finally, we perform numerical experiments to confirm the effectiveness of the proposed method.

Talk 3: An Accelerated Gradient Method for Convex Simple Bilevel Optimization
Speaker: Jincheng Cao
Abstract: In this paper, we focus on simple bilevel optimization problems, where we minimize a convex smooth objective function over the optimal solution set of another convex smooth constrained optimization problem. We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem using a cutting plane approach and employs an accelerated gradient-based update to reduce the upper-level objective function over the approximated solution set. We measure the performance of our method in terms of suboptimality and infeasibility errors and provide non-asymptotic convergence guarantees for both error criteria. Specifically, when the feasible set is compact, we show that our method requires at most $\mathcal{O}(\max\{1/\sqrt{\epsilon_{f}}, 1/\epsilon_g\})$ iterations to find a solution that is $\epsilon_f$-suboptimal and $\epsilon_g$-infeasible. Moreover, under the additional assumption that the lower-level objective satisfies the $r$-th Hölderian error bound, we show that our method achieves an iteration complexity of $\tilde{\mathcal{O}}(\max\{\epsilon_{f}^{-\frac{2r-1}{2r}},\epsilon_{g}^{-\frac{2r-1}{2r}}\})$, which matches the optimal complexity of single-level convex constrained optimization when $r=1$.
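
The problem class in question ("simple" bilevel) is

\[
\min_{x} \; f(x)
\quad \text{s.t.} \quad
x \in \arg\min_{z \in Z} \; g(z),
\]

with f and g convex and smooth and Z convex; \epsilon_f measures suboptimality in f and \epsilon_g measures violation of lower-level optimality (infeasibility).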

Speakers
Xiaotian Jiang
Prashant Khanduri
Jincheng Cao
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 116 3501 Trousdale Pkwy, 116, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10I: Advances in Variational Inequalities and Saddle Point Problems
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Advances in Variational Inequalities and Saddle Point Problems
Chair: Afrooz Jalilzadeh
Cluster: Multi-agent Optimization and Games

Talk 1: Convergence Analysis of Stochastic Quasi-Variational Inequalities
Speaker: Afrooz Jalilzadeh
Abstract: While Variational Inequality (VI) is a well-established mathematical framework that subsumes Nash equilibrium and saddle-point problems, less is known about its extension, Quasi-Variational Inequalities (QVI). QVI allows for cases where the constraint set changes as the decision variable varies allowing for a more versatile setting. In this talk, we propose extra-gradient and gradient-based methods for solving Stochastic Quasi-Variational Inequalities (SQVI) and establish a rigorous convergence rate analysis for these methods. Our approach not only advances the theoretical understanding of SQVI but also demonstrates its practical applicability. Specifically, we highlight its effectiveness in reformulating and solving problems such as generalized Nash Equilibrium, bilevel optimization, and saddle-point problems with coupling constraints.
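
For reference, the deterministic template behind the stochastic problem: a quasi-variational inequality asks for

\[
x^\star \in K(x^\star)
\quad \text{such that} \quad
\langle F(x^\star), \, y - x^\star \rangle \ge 0 \quad \forall \, y \in K(x^\star),
\]

which reduces to a standard VI when the constraint map K(\cdot) is a fixed set; in the stochastic version, F is an expectation, F(x) = \mathbb{E}[F(x,\xi)].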

Talk 2: Linear Complementarity Systems for the Morning Commute Problem with Ridesharing and Dynamic Pricing
Speaker: Wei Gu
Abstract: The emerging ridesharing services and relevant infrastructures such as High-Occupancy Toll (HOT) lanes provide more flexibility for travelers, more opportunities for sustainable transportation systems, and, at the same time, more challenges for the classical morning commute problem. To capture traffic dynamics, we propose a modified bottleneck model that avoids time-delayed terms in computing travel times and maintains desired mathematical properties. Then we develop a general mathematical modeling framework for the morning commute problem with ridesharing, including travel modes, infrastructures, and operators. Formulated as Linear Complementarity Systems (LCS), the proposed model simultaneously captures travelers’ departure time choice, lane choice between the HOT lane and the general purpose lane, as well as mode choice between ridesharing and solo driving. We show solution existence for the LCS-based general modeling framework. To approximate the proposed continuous-time model, a discrete-time model is generated using an implicit time discretization scheme, with the theoretical guarantee to converge back to the original continuous-time form. Analytical solutions for dynamic prices, including drivers’ incomes, passengers’ payments, and HOT lane toll charges, are derived to balance the various demands of travelers, operators, and society. The proposed models and dynamic prices are validated in numerical examples. Results show that ridesharing simultaneously benefits travelers, operators, and society toward urban sustainability: smaller travel costs, positive net cash flow and toll collection for ridesharing and HOT lane operators, and better system performance.

Talk 3: Simultaneous Learning and Optimization for Misspecified Saddle Point Problems
Speaker: Erfan Yazdandoost Hamedani
Abstract: With recent technological advancements and data growth, there is increasing interest in optimization problems where key parameters are unknown or misspecified. A common approach, "estimate-then-optimize", involves learning the unknown parameter by optimizing a secondary objective function in a preliminary estimation stage. However, such methods lack asymptotic convergence guarantees, as the parameter is typically estimated within finite time, often resulting in suboptimal performance in the optimization phase. This highlights the need for methods that simultaneously handle learning and optimization. In this talk, we address a class of misspecified convex-concave saddle point (SP) problems, where the objective function contains an unknown vector of parameters, which can be learned through a parallel SP problem. We propose an accelerated primal-dual algorithm, analyzing its convergence guarantee and complexity bound based on the parameter estimate. Additionally, we demonstrate the algorithm's overall complexity under various assumptions on the secondary objective function.

Speakers
Afrooz Jalilzadeh

Wei Gu
PhD Candidate, University of Southern California
Generalized Traffic Equilibrium with Ride-hailing and Customer Waiting: We develop a generalized traffic equilibrium model that considers ride-hailing services provided by Transportation Network Companies (TNCs, e.g., Uber and Lyft) and accounts for customer waiting. The generalized...

Erfan Yazdandoost Hamedani
Thursday July 24, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 100 3518 Trousdale Pkwy, 100, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10J: Randomized algorithms beyond minimization
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Randomized algorithms beyond minimization
Chair: Laurent Condat
Cluster: Fixed Points and Variational Inequalities

Talk 1: Splitting the forward-backward algorithm
Speaker: Emanuele Naldi
Abstract: In this talk, we introduce an extension of the forward-backward splitting method, designed to address monotone inclusion problems involving sums of multiple maximal monotone operators, with some of them being cocoercive operators. Our approach builds upon recent developments in splitting methods and offers two significant innovations. First, we generalize the classical forward-backward and Davis-Yin algorithms by incorporating flexible, arbitrary network architectures, making our method well-suited for distributed optimization problems. Second, we relax the common assumption of uniform cocoercivity, allowing the use of larger, more adaptive step sizes based on individual cocoercivity constants, which enables us to accelerate convergence in some instances. We provide a comprehensive convergence rate analysis and demonstrate the practical benefits of this enhanced flexibility through numerical experiments. We investigate and test further the methods introducing also stochasticity in the proposed algorithm.
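
As background, the classical two-operator forward-backward iteration that the talk generalizes is x_{k+1} = J_{γA}(x_k − γ B x_k). Below is a minimal runnable sketch with A = ∂(λ‖·‖₁) (resolvent = soft-thresholding) and B the gradient of a least-squares term; this is an illustrative baseline, not the multi-operator, network-structured method of the talk.

import numpy as np

# Forward-backward splitting for 0 in A(x) + B(x), with
# A = subdifferential of lam*||x||_1 and B = gradient of 0.5*||M x - b||^2 (cocoercive).
rng = np.random.default_rng(0)
M = rng.standard_normal((40, 100))
b = rng.standard_normal(40)
lam = 0.1
gamma = 1.0 / np.linalg.norm(M, 2) ** 2      # step size below 2/L, with L = ||M||_2^2

x = np.zeros(100)
for _ in range(500):
    forward = x - gamma * M.T @ (M @ x - b)  # forward (gradient) step on B
    x = np.sign(forward) * np.maximum(np.abs(forward) - gamma * lam, 0.0)  # backward step: prox of gamma*lam*||.||_1
print("nonzeros:", np.count_nonzero(x))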

Talk 2: Solving Hidden Monotone Variational Inequalities with Surrogate Losses
Speaker: Gauthier Gidel
Abstract: Deep learning has proven to be effective in a wide variety of loss minimization problems. However, many applications of interest, like minimizing projected Bellman error and min-max optimization, cannot be modelled as minimizing a scalar loss function but instead correspond to solving a variational inequality (VI) problem. This difference in setting has caused many practical challenges as naive gradient-based approaches from supervised learning tend to diverge and cycle in the VI case. In this talk, I will introduce a principled surrogate-based approach compatible with deep learning to solve VIs. I will show that our surrogate-based approach has three main benefits: (1) under assumptions that are realistic in practice (when hidden monotone structure is present, interpolation, and sufficient optimization of the surrogates), it guarantees convergence, (2) it provides a unifying perspective of existing methods, and (3) is amenable to existing deep learning optimizers like ADAM. I will demonstrate that this surrogate-based approach is effective in min-max optimization and minimizing projected Bellman error. Furthermore, in the deep reinforcement learning case, we propose a novel variant of TD(0) which is more compute and sample efficient.

Talk 3: A Simple Finite-Time Analysis of Temporal Difference Learning with Linear Function Approximation
Speaker: Aritra Mitra
Abstract: We study the finite-time convergence of TD learning with linear function approximation under Markovian sampling. Existing proofs for this setting either assume a projection step in the algorithm to simplify the analysis, or require a fairly intricate argument to ensure stability of the iterates. We ask: Is it possible to retain the simplicity of a projection-based analysis without actually performing a projection step in the algorithm? Our main contribution is to show this is possible via a novel two-step argument. In the first step, we use induction to prove that under a standard choice of a constant step-size α, the iterates generated by TD learning remain uniformly bounded in expectation. In the second step, we establish a recursion that mimics the steady-state dynamics of TD learning up to a bounded perturbation on the order of O(α^2) that captures the effect of Markovian sampling. Combining these pieces leads to an overall approach that considerably simplifies existing proofs. We conjecture that our inductive proof technique will find applications in the analyses of more complex stochastic approximation algorithms, and provide some examples of such applications.
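
A minimal runnable sketch of the algorithm being analyzed, TD(0) with linear function approximation, a constant step size, and no projection step; the Markov chain, features, and rewards below are arbitrary stand-ins chosen for illustration.

import numpy as np

rng = np.random.default_rng(1)
n_states, d = 10, 4
gamma, alpha = 0.9, 0.01                                 # discount factor and constant step size
P = rng.random((n_states, n_states))
P /= P.sum(axis=1, keepdims=True)                        # row-stochastic transition matrix
phi = rng.standard_normal((n_states, d)) / np.sqrt(d)    # feature map, roughly unit norm
reward = rng.standard_normal(n_states)

theta = np.zeros(d)
s = 0
for _ in range(50000):                                   # single Markovian trajectory
    s_next = rng.choice(n_states, p=P[s])
    td_error = reward[s] + gamma * phi[s_next] @ theta - phi[s] @ theta
    theta += alpha * td_error * phi[s]                   # TD(0) update, no projection
    s = s_next
print("learned weights:", theta)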

Speakers
Laurent Condat
Senior Research Scientist, King Abdullah University of Science and Technology (KAUST)
Laurent Condat received a PhD in applied mathematics in 2006 from Grenoble Institute of Technology, Grenoble, France. After a postdoc in the Helmholtz Zentrum Muenchen, Munich, Germany, he was hired in 2008 as a permanent researcher by the French National Center for Scientific Research...

Emanuele Naldi
Gauthier Gidel
Aritra Mitra
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 102 3501 Trousdale Pkwy, 102, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10K: Nonconvex methods and SDPs
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Nonconvex methods and SDPs
Chair: Florentin Goyens
Cluster: Nonlinear Optimization

Talk 1: Higher-Order Newton Methods with Polynomial Work per Iteration
Speaker: Amir Ali Ahmadi
Abstract: We present generalizations of Newton’s method that incorporate derivatives of an arbitrary order $d$ but maintain a polynomial dependence on dimension in their cost per iteration. At each step, our $d$th-order method uses semidefinite programming to construct and minimize a sum-of-squares-convex approximation to the $d$th-order Taylor expansion of the function we wish to minimize. We prove that our $d$th-order method has local convergence of order $d$. This results in lower oracle complexity compared to the classical Newton method. We show on numerical examples that basins of attraction around local minima can get larger as $d$ increases. Under additional assumptions, we present a modified algorithm, again with polynomial cost per iteration, which is globally convergent and has local convergence of order $d$. Co-authors: Abraar Chaudhry and Jeffrey Zhang.
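
Schematically (in assumed notation), each iteration minimizes an sos-convex model $s_d$ of the $d$th-order Taylor expansion, the model itself being constructed by semidefinite programming, which is what keeps the per-iteration cost polynomial in the dimension:

\[
x_{k+1} \in \arg\min_{y} \; s_d(y; x_k),
\qquad
s_d(\cdot\,; x_k) \ \text{sos-convex}, \quad
s_d(y; x_k) \approx \sum_{j=0}^{d} \frac{1}{j!} \nabla^j f(x_k)[y - x_k]^j .
\]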

Talk 2: Sharp Global Guarantees for Nonconvex Low-rank Recovery in the Noisy Overparameterized Regime
Speaker: Richard Zhang
Abstract: Rank overparameterization was recently shown to make nonconvex low-rank matrix recovery increasingly benign under the restricted isometry property (RIP), by converting all local minima into global minima that exactly recover the ground truth. But this does not fully explain the practical success of overparameterization, because real algorithms can still become trapped at nonstrict saddle points (approximate second-order points with arbitrarily small negative curvature) even when all local minima are global. Moreover, the result does not accommodate noisy measurements, but it is unclear whether such an extension is even possible, in view of the many discontinuous and unintuitive behaviors already known for the overparameterized regime. In this paper, we introduce a novel proof technique that unifies, simplifies, and strengthens two previously competing approaches, one based on escape directions and the other based on the inexistence of counterexamples, to provide sharp global guarantees in the noisy overparameterized regime. We show, once local minima have been converted into global minima through slight overparameterization, that near-second-order points achieve the same minimax-optimal recovery bounds (up to small constant factors) as significantly more expensive convex approaches. Our results are sharp with respect to the noise level and the solution accuracy, and hold for both the symmetric parameterization $XX^T$, as well as the asymmetric parameterization $UV^T$ under a balancing regularizer; we demonstrate that the balancing regularizer is indeed necessary.

Talk 3: HALLaR: A New Solver for Solving Huge SDPs
Speaker: Arnesh Sujanani
Abstract: This talk presents a new first-order method for solving large-scale semidefinite programs (SDPs) with bounded domain. It is an inexact augmented Lagrangian (AL) method where the AL subproblems are solved by a hybrid low-rank method. In contrast to the classical low-rank method, the new method finds a near-optimal solution (with provable complexity bounds) of SDP instances satisfying strong duality. Computational results comparing the new method to state-of-the-art solvers on several large SDP instances show that the former finds higher accurate solutions in substantially less CPU time than the latter ones. For example, in less than 20 minutes, our method can solve (on a personal laptop) a maximum stable set SDP with 1 million vertices and 10 million edges within 1e-5 relative accuracy. This is joint work with Renato D.C. Monteiro and Diego Cifuentes.

Speakers
Florentin Goyens
Postdoc, UCLouvain
Researcher in numerical optimization at UCLouvain in Belgium

Amir Ali Ahmadi
Richard Zhang

Arnesh Sujanani
Postdoctoral Fellow
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 118 3501 Trousdale Pkwy, 118, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10L: Optimization and Statistics at Scale
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Optimization and Statistics at Scale
Chair: Mateo Diaz
Cluster: Optimization For Data Science

Talk 1: A superlinearly convergent subgradient method for sharp semismooth problems
Speaker: Vasilis Charisopoulos
Abstract: Nonsmooth optimization problems appear throughout machine learning and signal processing. However, standard first-order methods for nonsmooth optimization can be slow for "poorly conditioned" problems. In this talk, I will present a locally accelerated first-order method that is less sensitive to conditioning and achieves superlinear (i.e., double-exponential) convergence near solutions for a broad family of problems. The algorithm is inspired by Newton's method for solving nonlinear equations.

Talk 2: Commutator Stepsize Schedule
Speaker: Henry Shugart
Abstract: TBD

Talk 3: The radius of statistical efficiency
Speaker: Mateo Diaz
Abstract: Classical results in asymptotic statistics show that the Fisher information matrix controls the difficulty of estimating a statistical model from observed data. In this work, we introduce a companion measure of robustness of an estimation problem: the radius of statistical efficiency (RSE) is the size of the smallest perturbation to the problem data that renders the Fisher information matrix singular. We compute the RSE up to numerical constants for a variety of test bed problems, including principal component analysis, generalized linear models, phase retrieval, bilinear sensing, and matrix completion. In all cases, the RSE quantifies the compatibility between the covariance of the population data and the latent model parameter. Interestingly, we observe a precise reciprocal relationship between the RSE and the intrinsic complexity/sensitivity of the problem instance, paralleling the classical Eckart–Young theorem in numerical analysis.
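
Schematically (my notation, following the description above),

\[
\mathrm{RSE} \;=\; \inf \left\{ \|\Delta\| \;:\; \text{the Fisher information of the instance with data perturbed by } \Delta \text{ is singular} \right\},
\]

so a small RSE means a nearby problem instance is statistically degenerate.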

Speakers
Vasilis Charisopoulos

Henry Shugart

Mateo Diaz
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 119 3501 Trousdale Pkwy, 119, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10M: Advances in min-max optimization algorithms for machine learning
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Advances in min-max optimization algorithms for machine learning
Chair: Ahmet Alacaoglu
Cluster: Optimization For Data Science

Talk 1: How to make the gradient descent-ascent converge to local minimax optima
Speaker: Donghwan Kim
Abstract: Can we effectively train a generative adversarial network (GAN) (or equivalently, optimize a minimax problem), similarly to how we successfully train a classification neural network (or equivalently, minimize a function) using gradient methods? Currently, the answer is 'No'. The remarkable success of gradient descent in minimization is supported by theoretical results; under mild conditions, gradient descent converges to a local minimizer, and almost surely avoids strict saddle points. However, comparable theoretical support for minimax optimization is currently lacking. This talk will discuss recent progress in addressing this gap using dynamical systems theory. Specifically, this talk will present new variants of gradient descent-ascent that, under mild conditions, converge to local minimax optima, where the existing gradient descent-ascent methods fail to do so.
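As a point of reference for what the new variants need to fix, here is a minimal sketch of plain simultaneous gradient descent-ascent for min_x max_y f(x, y); on the bilinear example in the comment it is known to cycle or diverge rather than settle at the minimax point. The gradient callables and step sizes are assumptions for illustration.

```python
import numpy as np

def gda(grad_x, grad_y, x0, y0, eta_x=1e-2, eta_y=1e-2, iters=1000):
    """Plain simultaneous gradient descent-ascent for min_x max_y f(x, y).
    This is the baseline the talk improves on, not the proposed variant."""
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    for _ in range(iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - eta_x * gx, y + eta_y * gy   # descent in x, ascent in y
    return x, y

# bilinear example f(x, y) = x * y, where plain GDA spirals away from (0, 0)
# x, y = gda(lambda x, y: y, lambda x, y: x, x0=1.0, y0=1.0)
```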

Talk 2: Parameter-free second-order methods for min-max optimization
Speaker: Ali Kavis
Abstract: In this talk, I will present adaptive, line-search-free second-order methods with an optimal rate of convergence for solving convex-concave min-max problems. By means of an adaptive step size, our algorithms feature a simple update rule that requires solving only one linear system per iteration, eliminating the need for line-search or backtracking mechanisms. Specifically, we base our algorithms on the optimistic method and appropriately combine it with second-order information. Moreover, distinct from common adaptive schemes, we define the step size recursively as a function of the gradient norm and the prediction error in the optimistic update. We first analyze a variant where the step size requires knowledge of the Lipschitz constant of the Hessian. Under the additional assumption of Lipschitz continuous gradients, we further design a parameter-free version by tracking the Hessian Lipschitz constant locally and ensuring the iterates remain bounded. We also evaluate the practical performance of our algorithm by comparing it to existing second-order algorithms for minimax optimization. Inspired by the adaptive design of the step size, we propose a heuristic initialization rule that performs competitively across different problems and scenarios and eliminates the need to fine-tune the step size.

Talk 3: TBD
Speaker: Jelena Diakonikolas
Abstract: TBD

Speakers
Ahmet Alacaoglu

Donghwan Kim

Ali Kavis
Thursday July 24, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 157 3518 Trousdale Pkwy, 157, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10N: Nonsmooth PDE Constrained Optimization: Algorithms, Analysis and Applications Part 3
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Nonsmooth PDE Constrained Optimization: Algorithms, Analysis and Applications Part 3
Chair: Harbir Antil
Cluster: PDE-constrained Optimization

Talk 1: An Adaptive Inexact Trust-Region Method for PDE-Constrained Optimization with Regularized Objectives
Speaker: Robert Baraldi
Abstract: We introduce an inexact trust-region method for efficiently solving regularized optimization problems governed by PDEs. In particular, we consider the class of problems in which the objective is the sum of a smooth, nonconvex function and a nonsmooth, convex function. Such objectives are pervasive in the literature, with examples being basis pursuit, inverse problems, and topology optimization. The inclusion of nonsmooth regularizers and constraints is critical, as they often preserve physical properties or promote sparsity in the control. Enforcing these properties efficiently is critical given the computationally intensive nature of solving PDEs. A common family of methods that can obtain accurate solutions with considerably smaller mesh sizes is adaptive finite element routines; they are critical in reducing solution error as well as mitigating the numerical cost of solving the PDE. Our adaptive trust-region method solves the regularized objective while automatically refining the mesh for the PDE. Our method increases the accuracy of the gradient and objective via local error estimators and our criticality measure. We present numerical results on problems in control.

Talk 2: The SiMPL method for density-based topology optimization
Speaker: Dohyun Kim
Abstract: We introduce Sigmoidal mirror descent with a projected latent variable (SiMPL), a novel first-order optimization method for density-based topology optimization. SiMPL ensures pointwise bound-preserving design updates and faster convergence than other popular first-order topology optimization methods. By leveraging the (negative) Fermi-Dirac entropy, we define a non-symmetric Bregman divergence that facilitates a simple yet effective update rule with the help of a so-called latent variable. SiMPL produces a sequence of pointwise-feasible iterates even when high-order finite elements are used in the discretization. Numerical experiments demonstrate that the method outperforms other popular first-order optimization algorithms. We also present mesh- and order-independent convergence along with possible extensions of this method.
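A schematic of the kind of latent-variable update described here: designs are parameterized as rho = sigmoid(psi), so box bounds hold by construction, and a scalar shift restoring a volume constraint is found by bisection. This is an illustrative sketch under those assumptions, not necessarily the exact SiMPL update or step-size rule.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def latent_mirror_step(psi, grad_rho, alpha, vol_frac, tol=1e-10):
    """Schematic latent-variable (mirror-descent) design update:
    densities rho = sigmoid(psi) stay strictly inside (0, 1) by construction.
    The shift mu enforcing mean(rho) = vol_frac is found by bisection;
    this is an illustrative assumption, not necessarily the exact SiMPL rule."""
    base = psi - alpha * grad_rho            # unconstrained latent step
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if sigmoid(base + mu).mean() > vol_frac:
            hi = mu                          # mean volume is increasing in mu
        else:
            lo = mu
    psi_new = base + 0.5 * (lo + hi)
    return psi_new, sigmoid(psi_new)
```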

Talk 3: Two-level Discretization Scheme for Total Variation in Integer Optimal Control
Speaker: Paul Manns
Abstract: We advance the discretization of the dual formulation of the total variation term with Raviart-Thomas functions, which is known from the literature for convex problems. Due to our integrality constraints, the previous analysis is no longer applicable because, when considering a Γ-convergence-type argument, the recovery sequences generally need to attain non-integer, that is, infeasible, values. We overcome this problem by embedding a finer discretization of the input functions. A superlinear coupling of the mesh sizes implies an averaging on the coarser Raviart-Thomas mesh, which makes it possible to recover the total variation of integer-valued limit functions with integer-valued, discretized input functions. In turn, we obtain a Γ-convergence-type result and convergence rates under additional regularity assumptions.

Speakers
Harbir Antil

Robert Baraldi

Dohyun Kim
Paul Manns

TU Dortmund University
Paul completed his PhD at the Institute for Mathematical Optimization at Technical University of Braunschweig in 2019. Afterwards, he joined the Mathematics and Computer Science Division of Argonne National Laboratory as James H. Wilkinson Fellow in Scientific Computing. In September 2021, Paul moved to TU Dortmund University as assistant professor. Paul's research focus lies on mixed-integer optimization in infinite dimensions, in particular, appropriate regularization techniques and trust-region algorithms.
Thursday July 24, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 256 3518 Trousdale Pkwy, 256, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10O: Optimization on Manifolds
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Optimization on Manifolds
Chair: Maurício Silva Louzeiro
Cluster: Optimization on Manifolds

Talk 1: Variational Problems and Duality on Manifolds
Speaker: Anton Schiela
Abstract: Variational problems play a fundamental role in many areas of applied mathematics, and form the basis of both theoretical results and numerical algorithms. Problems of this class also arise in the context of manifolds and have important applications there. However, to formulate them in an adequate way, refined concepts of differential geometry, in particular a well-developed duality theory, are required. In this talk we will give an introduction to these concepts and elaborate on how they can be applied to variational problems, using the example of harmonic mappings. We will also describe some implications of these concepts for the design of numerical solution algorithms.

Talk 2: Newton's method for nonlinear mappings into vector bundles
Speaker: Laura Weigl
Abstract: We consider Newton's method for finding zeros of nonlinear mappings from a manifold $\mathcal X$ into a vector bundle $\mathcal E$. In this setting a connection on $\mathcal E$ is required to render the Newton equation well defined, and a retraction on $\mathcal X$ is needed to compute a Newton update. As applications we will discuss the solution of variational problems involving mappings between manifolds, and, in particular, the numerical computation of geodesics under force fields.

Talk 3: An Adaptive Cubic Regularization quasi-Newton Method on Riemannian Manifolds
Speaker: Maurício Silva Louzeiro
Abstract: A quasi-Newton method with cubic regularization is designed for solving Riemannian unconstrained nonconvex optimization problems. The proposed algorithm is fully adaptive, requiring at most $\mathcal{O}(\epsilon_g^{-3/2})$ iterations to achieve a gradient smaller than a given $\epsilon_g$, and at most $\mathcal{O}(\max\{ \epsilon_g^{-3/2}, \epsilon_H^{-3} \})$ iterations to reach a second-order stationary point. Notably, the proposed algorithm remains applicable even when the gradient and Hessian of the objective function are unknown. Numerical experiments, with the gradient and Hessian approximated by forward finite differences, illustrate the theoretical results and provide a numerical comparison.
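Since the experiments approximate the gradient and Hessian by forward finite differences, a minimal (Euclidean) sketch of those approximations is given below; on a manifold one would difference along tangent directions and retract, which is omitted here. The step sizes h are placeholder values.

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward finite-difference approximation of the (Euclidean) gradient of f."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f0) / h
    return g

def fd_hessian(f, x, h=1e-4):
    """Forward-difference Hessian approximation built from gradient differences."""
    x = np.asarray(x, dtype=float)
    g0 = fd_gradient(f, x, h)
    H = np.zeros((x.size, x.size))
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        H[:, i] = (fd_gradient(f, x + e, h) - g0) / h
    return 0.5 * (H + H.T)    # symmetrize the approximation
```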

Speakers
Anton Schiela

Laura Weigl

PhD Student, University of Bayreuth

Maurício Silva Louzeiro
Thursday July 24, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 258 3518 Trousdale Pkwy, 258, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10P: Recent Algorithmic Advances in Multiagent Optimization
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Recent Algorithmic Advances in Multiagent Optimization
Chair: Prashant Khanduri & Haibo Yang
Cluster: Multi-agent Optimization and Games

Talk 1: Theory on Mixture-of-Experts in Continual Learning
Speaker: Sen Lin
Abstract: Continual learning (CL) has garnered significant attention because of its ability to adapt to new tasks that arrive over time. Catastrophic forgetting (of old tasks) has been identified as a major issue in CL, as the model adapts to new tasks. The Mixture-of-Experts (MoE) model has recently been shown to effectively mitigate catastrophic forgetting in CL, by employing a gating network to sparsify and distribute diverse tasks among multiple experts. However, there is a lack of theoretical analysis of MoE and its impact on the learning performance in CL. This paper provides the first theoretical results to characterize the impact of MoE in CL via the lens of overparameterized linear regression tasks. We establish the benefit of MoE over a single expert by proving that the MoE model can diversify its experts to specialize in different tasks, while its router learns to select the right expert for each task and balance the loads across all experts. Our study further suggests an intriguing fact that the MoE in CL needs to terminate the update of the gating network after sufficient training rounds to attain system convergence, which is not needed in the existing MoE studies that do not consider the continual task arrival. Furthermore, we provide explicit expressions for the expected forgetting and overall generalization error to characterize the benefit of MoE in the learning performance in CL. Interestingly, adding more experts requires additional rounds before convergence, which may not enhance the learning performance. Finally, we conduct experiments on both synthetic and real datasets to extend these insights from linear models to deep neural networks (DNNs), which also shed light on the practical algorithm design for MoE in CL.

Talk 2: Multimodal Federated Learning and Communication Compression
Speaker: Aritra Dutta
Abstract: Multi-modal transformers mark significant progress in different domains, but siloed high-quality data hinders their further improvement. To remedy this, federated learning (FL) has emerged as a promising privacy-preserving paradigm for training models without direct access to the raw data held by different clients. Despite its potential, a considerable research direction regarding unpaired unimodal clients and the transformer architecture in FL remains unexplored. To fill this gap, we first explore a transfer multi-modal federated learning (MFL) scenario within the vision-language domain, where clients possess data of various modalities distributed across different datasets. We systematically evaluate the performance of existing methods when a transformer architecture is utilized and introduce a novel framework called Federated modality complementary and collaboration (FedCola) by addressing the in-modality and cross-modality gaps among clients. To address FedCola's communication bottleneck during training, we applied lossy compression schemes and found that, unlike in unimodal FL, using a single compression operator throughout training severely degrades performance. The second part of this talk is dedicated to designing an adaptive compressor scheduler for multimodal FL training.

Talk 3: Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Speaker: Haibo Yang
Abstract: Federated Learning (FL) offers a promising framework for collaborative and privacy-preserving machine learning across distributed data sources. However, the substantial communication costs associated with FL significantly challenge its efficiency. Specifically, in each communication round, the communication costs scale linearly with the model's dimension, which presents a formidable obstacle, especially in large model scenarios. Despite various communication-efficient strategies, the intrinsic dimension-dependent communication cost remains a major bottleneck for current FL implementations. This paper proposes a novel dimension-free communication algorithm, DeComFL, which leverages zeroth-order optimization techniques and reduces the communication cost from $\mathcal{O}(d)$ to $\mathcal{O}(1)$ by transmitting only a constant number of scalar values between clients and the server in each round, regardless of the dimension $d$ of the model parameters. Theoretically, for non-convex functions, we prove that our algorithm achieves state-of-the-art rates, which show a linear speedup in the number of clients and local steps under standard assumptions. With an additional low-effective-rank assumption, we can further show that the convergence rate is independent of the model dimension $d$ as well. Empirical evaluations, encompassing both classic deep learning training and large language model fine-tuning, demonstrate significant reductions in communication overhead. Notably, DeComFL achieves this by transmitting only around 1MB of data in total between the server and a client to fine-tune a model with billions of parameters.
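The O(1)-per-round communication rests on a shared-randomness trick common to zeroth-order methods: if the server and each client regenerate the same random direction from a shared seed, a client only needs to return one scalar finite-difference value. A minimal single-round sketch under that assumption (not the full DeComFL protocol, whose local steps and rates follow the paper):

```python
import numpy as np

def client_zo_scalar(local_loss, w, seed, mu=1e-3):
    """Client side: evaluate a two-point zeroth-order estimate and return ONE scalar."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(w.shape)                 # same direction as the server, via the seed
    return (local_loss(w + mu * u) - local_loss(w - mu * u)) / (2.0 * mu)

def server_zo_round(w, client_scalars, seed, lr=1e-2):
    """Server side: rebuild the shared direction from the seed, average the scalars."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(w.shape)
    g_scalar = float(np.mean(client_scalars))        # aggregated directional derivative
    return w - lr * g_scalar * u                     # full-dimensional update, O(1) comms
```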

Speakers
Sen Lin

Aritra Dutta
Haibo Yang

Assistant Professor, Rochester Institute of Technology
Name: Dr. Haibo Yang. Title: Assistant Professor of Distributed Optimization & Federated Learning. Affiliation: Rochester Institute of Technology. Bio: Dr. Haibo Yang builds intelligent systems that know when to communicate, when to compute, and when to just quietly wait for the next client...
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 106 3501 Trousdale Pkwy, 106, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10Q: Advanced Computational Solvers 1
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Advanced Computational Solvers 1
Chair: Nai-Yuan Chiang
Cluster: Computational Software

Talk 1: Simultaneous approaches for estimation and optimization of hybrid dynamic systems
Speaker: Carl Laird
Abstract: Hybrid modeling is becoming increasingly popular in science and engineering applications where it benefits from first-principles domain knowledge and increasing availability of data to train and validate machine learning components. Notable recent methodologies include the use of neural networks within DAE (differential algebraic equation) systems, namely Neural ODEs and Universal Differential Equations (UDE), which are powerful tools that enable effective hybrid modeling of systems with complex dynamics. In this presentation, we demonstrate the integration of machine learning models with simultaneous approaches for estimation and optimization of hybrid dynamic systems. We investigate both algorithmic aspects for improving computational performance and integration with modeling languages.

Talk 2: Recent Progress in the Cardinal Optimizer
Speaker: Nils-Christian Kempke
Abstract: The Cardinal Optimizer (COPT) is a mathematical optimization solver for large-scale optimization problems. It includes high-performance solvers for LP, MIP, conic programming, and more. In this talk, we present recent advances in COPT, including our work on GPU-accelerated optimization. We highlight some key computational and theoretical improvements we have recently integrated and evaluate their impact on the performance of our solver. We provide an overview of the performance numbers of the latest COPT release for all supported problem classes.

Talk 3: Latest Developments and Improvements in the MindOpt Interior Point Optimizer
Speaker: Kuoling Huang
Abstract: We focus on practical improvements in the MindOpt interior point optimizer, with a particular emphasis on enhancements in the sparse solver. We discuss various algorithmic techniques, such as advanced Cholesky factorization and reordering strategies, that enhance numerical stability and efficiency. Key numerical results on large-scale optimization problems will be presented.

Speakers
Carl Laird

Nils-Christian Kempke

Kuoling Huang
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 214 3501 Trousdale Pkwy, 214, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10R: Quantum Methods for Optimization and Sampling
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Quantum Methods for Optimization and Sampling
Chair: Jiaqi Leng
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: Optimizing random local Hamiltonians by dissipation
Speaker: Alexander Dalzell
Abstract: Markov chain Monte Carlo (MCMC) is a versatile category of numerical methods that has been fruitfully applied to many instances of classical constraint optimization problems. Recent work has developed a quantum analogue of MCMC—quantum Gibbs sampling algorithms—which can be applied to quantum optimization problems, that is, the task of preparing a low-energy quantum state of a local Hamiltonian. However, since quantum computers do not yet exist, these quantum algorithms cannot yet be tested empirically. In this work, we provide analytical guarantees for a simplified version of the Gibbs sampling algorithm on a wide class of random quantum optimization problems. Specifically, we study the problem of preparing a quantum state that optimizes a local Hamiltonian on a system of either quantum spins or fermions consisting of random all-to-all, k-local interactions. We prove that the simplified Gibbs sampling algorithm achieves a Ω(1/k)-fraction approximation of the optimum energy, giving an exponential improvement on the k-dependence over the prior best (both classical and quantum) algorithmic guarantees. Combined with existing circuit lower bound for such states, our results suggest that finding low-energy states for sparsified (quasi)local spin and fermionic models is quantumly easy but classically nontrivial. This further indicates that quantum Gibbs sampling may be a suitable metaheuristic for optimization problems. This is based on joint work with Joao Basso and Chi-Fang Chen (https://arxiv.org/pdf/2411.02578)

Talk 2: Quantum Langevin Dynamics for Optimization
Speaker: Zherui Chen
Abstract: We initiate the study of utilizing Quantum Langevin Dynamics (QLD) to solve optimization problems, particularly those non-convex objective functions that present substantial obstacles for traditional gradient descent algorithms. Specifically, we examine the dynamics of a system coupled with an infinite heat bath. This interaction induces both random quantum noise and a deterministic damping effect on the system, which nudge the system towards a steady state that hovers near the global minimum of the objective function. We theoretically prove the convergence of QLD in convex landscapes, demonstrating that the average energy of the system can approach zero in the low-temperature limit with an exponential decay rate correlated with the evolution time. Numerically, we first show the energy dissipation capability of QLD by retracing its origins to spontaneous emission. Furthermore, we conduct a detailed discussion of the impact of each parameter. Finally, based on observations from comparing QLD with the classical Fokker-Planck-Smoluchowski equation, we propose a time-dependent QLD by making temperature and ℏ time-dependent parameters, which can be theoretically proven to converge better than the time-independent case and also outperforms a series of state-of-the-art quantum and classical optimization algorithms in many non-convex landscapes.

Talk 3: Quantum Acceleration of Gibbs Sampling for Continuous Potentials
Speaker: Jiaqi Leng
Abstract: Realizing a random variable corresponding to a Gibbs distribution σ ∝ e^{−βV(x)} with a continuous potential V(x) is a prominent task in computer science and engineering. However, sampling from a high-dimensional Gibbs distribution is in general intractable due to the exponential blowup of computational complexity, a challenge known as the curse of dimensionality. While quantum computers can efficiently process high-dimensional data, existing quantum algorithms for Gibbs sampling exhibit sub-optimal dimension dependence and potentially slow convergence rates, leaving substantial room for quantum advantage in the classical sampling task. In this paper, by reformulating the invariant measure of certain diffusion processes as the ground state of quantum Hamiltonians, we propose a novel approach to accelerating Gibbs sampling on quantum computers using the standard quantum numerical linear algebra toolbox. We apply our framework to two commonly used diffusion processes in the literature: Langevin Diffusion (LD) and Replica Exchange Langevin Diffusion (RELD). By quantizing LD, we obtain a quantum algorithm that matches the best-known classical methods. While RELD has been widely used in practice, its rigorous analysis remains challenging in the classical literature. Our new method bypasses the technical difficulties in the classical convergence analysis of RELD, yielding a quantum algorithm that outperforms all known results in terms of dimension dependence and convergence rate.

Speakers
Alexander Dalzell

Zherui Chen

Jiaqi Leng
Thursday July 24, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 154 3518 Trousdale Pkwy, 154, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10S: Advances in Modeling and Optimization for MDP and Optimal Control
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Advances in Modeling and Optimization for MDP and Optimal Control
Chair: Yan Li & Minda Zhao
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: Beyond absolute continuity: a new class of dynamic risk measures
Speaker: Jincheng Yang
Abstract: The modern theory of risk measures copes with uncertainty by considering multiple probability measures. While it is often assumed that a reference probability measure exists, under which all relevant probability measures are absolutely continuous, there are examples where this assumption does not hold, such as certain distributional robust functionals. In this talk, we introduce a novel class of dynamic risk measures that do not rely on this assumption. We will discuss its convexity, coherence, and time consistency properties.

Talk 2: TBD
Speaker: Yashaswini Murthy
Abstract: TBD

Talk 3: Landscape of Policy Optimization for Finite Horizon MDPs with General State and Action
Speaker: Minda Zhao
Abstract: Policy gradient methods are widely used in reinforcement learning. Yet, the nonconvexity of policy optimization imposes significant challenges in understanding the global convergence of policy gradient methods. For a class of finite-horizon Markov Decision Processes (MDPs) with general state and action spaces, we develop a framework that provides a set of easily verifiable assumptions to ensure the Kurdyka-Łojasiewicz (KŁ) condition of the policy optimization. Leveraging the KŁ condition, policy gradient methods converge to the globally optimal policy with a non-asymptotic rate despite nonconvexity. Our results find applications in various control and operations models, including entropy-regularized tabular MDPs, Linear Quadratic Regulator (LQR) problems, stochastic inventory models, and stochastic cash balance problems, for which we show an $\epsilon$-optimal policy can be obtained using a sample size in $\tilde{\mathcal{O}}(\epsilon^{-1})$ and polynomial in terms of the planning horizon by stochastic policy gradient methods. Our result establishes the first sample complexity for multi-period inventory systems with Markov-modulated demands and stochastic cash balance problems in the literature.

Speakers
Jincheng Yang

Yashaswini Murthy

Minda Zhao
Thursday July 24, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 155 3518 Trousdale Pkwy, 155, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10T: Optimization on Manifolds
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Optimization on Manifolds
Chair: Christopher Criscitiello
Cluster: Optimization on Manifolds

Talk 1: Estimation of barycenters in convex domains of metric spaces
Speaker: Victor-Emmanuel Brunel
Abstract: In this talk, we study the statistical and algorithmic aspects of barycenter estimation in convex domains of metric spaces. For data that lie in a non-linear metric space, such as a sphere, barycenters are the natural extension of means, and their estimation is a fundamental statistical problem. Here, we assume that the space satisfies a curvature upper bound in Alexandrov’s sense: Then, barycenters are minimizers of geodesically convex functions, yielding good statistical and algorithmic guarantees.

Talk 2: Non-Euclidean Motion Planning with Graphs of Geodesically-Convex Sets
Speaker: Thomas Cohn
Abstract: Mathematical optimization is a central tool for planning efficient, collision-free trajectories for robots. However, such trajectory optimization methods may get stuck in local minima due to inherent nonconvexities in the optimization landscape. The use of mixed-integer programming to encapsulate these nonconvexities and find globally optimal trajectories has recently shown great promise. One such tool is the Graph of Convex Sets framework, which yields tight convex relaxations and efficient approximation strategies that greatly reduce runtimes. These approaches were previously limited to Euclidean configuration spaces, but many robotic configuration spaces of interest are best represented as smooth manifolds. We introduce Graphs of Geodesically-Convex Sets, the natural generalization of GCS to Riemannian manifolds. We leverage a notion of "local" geodesic convexity to avoid the limitations that would otherwise arise from compact manifolds and closed geodesics, which often arise in robot configuration spaces. We demonstrate that this optimization framework encompasses the motion planning problems of interest, and examine when it can and cannot be solved efficiently. In the case of zero-curvature, we are able to reduce the problem to a mixed-integer convex program that can be solved efficiently. On the other hand, positive curvature makes the Riemannian distance function not even locally convex. In the general case, we describe a piecewise-linear approximation strategy to obtain approximate solutions to shortest path problems on manifolds of arbitrary curvature. We also specifically compare different parametrizations of the Lie groups SO(3) and SE(3). We present experiments demonstrating our methodology on various robot platforms, including producing efficient collision-free trajectories for a PR2 bimanual mobile manipulator.

Talk 3: Leveraging Manifold Structure for Fast Algorithms
Speaker: Brighton Ancelin
Abstract: In today's data-driven world, many problems require the processing of high-dimensional data. Methods like Principal Component Analysis (PCA) may help reduce the dimensionality of such problems, but may be ineffective or inefficient when data lies on non-linear manifolds. Some common manifolds even exhibit a natural Euclidean embedding dimension that far exceeds their intrinsic dimension, highlighting the need for more sophisticated approaches. In this talk, we will explore techniques for effectively handling data on such manifolds by operating on this lower intrinsic dimension. Such techniques can lead to much faster and more memory-efficient algorithms. We will first examine a specific problem on Grassmannian manifolds, then generalize to the broader class of quotient manifolds. Attendees may expect to gain a better understanding of how to view their structured data, and how to more efficiently process it.

Speakers
Christopher Criscitiello

Victor-Emmanuel Brunel

Thomas Cohn

PhD Candidate, Massachusetts Institute of Technology
I'm a PhD candidate at MIT, working with Professor Russ Tedrake in the Robot Locomotion Group. My research focus is robot motion planning, primarily using tools from optimization and differential geometry.

Brighton Ancelin
Thursday July 24, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 158 3518 Trousdale Pkwy, 158, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10U: Approximating derivatives in derivative-free optimization
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Approximating derivatives in derivative-free optimization
Chair: Geovani Grapiglia
Cluster: Derivative-free Optimization

Talk 1: TRFD: A derivative-free trust-region method based on finite differences for composite nonsmooth optimization
Speaker: Geovani Grapiglia
Abstract: In this work we present TRFD, a derivative-free trust-region method based on finite differences for minimizing composite functions of the form $f(x)=h(F(x))$, where $F$ is a black-box function assumed to have a Lipschitz continuous Jacobian, and $h$ is a known convex Lipschitz function, possibly nonsmooth. The method approximates the Jacobian of $F$ via forward finite differences. We establish an upper bound for the number of evaluations of $F$ that TRFD requires to find an $\epsilon$-approximate stationary point. For L1 and Minimax problems, we show that our complexity bound reduces to $\mathcal{O}(n\epsilon^{-2})$ for specific instances of TRFD, where $n$ is the number of variables of the problem. Assuming that $h$ is monotone and that the components of $F$ are convex, we also establish a worst-case complexity bound, which reduces to $\mathcal{O}(n\epsilon^{-1})$ for Minimax problems. Numerical results are provided to illustrate the relative efficiency of TRFD in comparison with existing derivative-free solvers for composite nonsmooth optimization.
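The forward finite-difference Jacobian approximation at the heart of TRFD costs n extra evaluations of F per model; a minimal sketch of that building block is below (the trust-region model h(F(x_k) + J_k s) and its solution are not shown, and the step size h is a placeholder).

```python
import numpy as np

def fd_jacobian(F, x, h=1e-7):
    """Forward finite-difference approximation of the Jacobian of F: R^n -> R^m,
    using n extra evaluations of F on top of F(x)."""
    x = np.asarray(x, dtype=float)
    F0 = np.asarray(F(x), dtype=float)
    J = np.zeros((F0.size, x.size))
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        J[:, i] = (np.asarray(F(x + e), dtype=float) - F0) / h
    return J
```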

Talk 2: Sparse gradients in derivative-free optimization
Speaker: Daniel McKenzie
Abstract: In this talk I’ll survey some of my recent work in extending DFO to the high-dimensional regime, primarily by exploiting sparsity in gradients to construct good gradient approximations cheaply.

Talk 3: How sampling distribution affect gradient-free optimization
Speaker: Liyuan Cao
Abstract: Gradient-free optimization is often used to optimize functions where only function values are accessible. It primarily refers to gradient descent methods where the "gradients" are directional derivatives estimated via finite differences using randomly sampled points. The sampling distribution is often chosen as Gaussian, uniform on the sphere, Haar, or uniform along the coordinate directions. This choice has been shown to affect the algorithms' numerical efficiency. This talk presents a theoretical analysis of the algorithms' performance under different sampling distributions. The results provide guidance on how to choose the distribution.
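The estimators being compared differ only in how the random direction is drawn; the sketch below makes that explicit for the Gaussian, sphere, and coordinate choices. Scaling constants (which differ across distributions and matter for unbiasedness) are deliberately omitted, which is precisely why the choice of distribution affects performance.

```python
import numpy as np

def sample_direction(n, dist, rng):
    """Random direction u for a two-point finite-difference gradient estimate."""
    if dist == "gaussian":
        return rng.standard_normal(n)
    if dist == "sphere":
        u = rng.standard_normal(n)
        return u / np.linalg.norm(u)
    if dist == "coordinate":
        u = np.zeros(n)
        u[rng.integers(n)] = 1.0
        return u
    raise ValueError(dist)

def two_point_estimate(f, x, dist="sphere", mu=1e-5, samples=20, seed=0):
    """Average of directional finite differences along random directions.
    Distribution-dependent scaling constants are omitted in this sketch."""
    rng = np.random.default_rng(seed)
    g = np.zeros(x.size)
    for _ in range(samples):
        u = sample_direction(x.size, dist, rng)
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / samples
```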

Speakers
Geovani Grapiglia

Daniel McKenzie

Liyuan Cao
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 108 3501 Trousdale Pkwy, 108, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10V: Modeling Oriented Software and Methods (2)
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Modeling Oriented Software and Methods (2)
Chair: Jean-Paul Watson
Cluster: Computational Software

Talk 1: SNoGloDE: Structural Nonlinear Global Decomposition for Parallel Solution of large-Scale NLPs in Python
Speaker: Georgia Stinchfield
Abstract: TBD

Talk 2: Automated Generation and Modeling of Piecewise-linear Functions in Pyomo
Speaker: Bashar Ammari
Abstract: TBD

Talk 3: TBD
Speaker: TBD TBD
Abstract: TBD

Speakers
Jean-Paul Watson

Georgia Stinchfield

Bashar Ammari

TBD
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 110 3501 Trousdale Pkwy, 110, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10W: Parallel and Distributed Optimization
Thursday July 24, 2025 10:30am - 11:45am PDT
Session: Parallel and Distributed Optimization
Chair: Michel Lahmann
Cluster: N/A

Talk 1: Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
Speaker: Artavazd Maranjyan
Abstract: Asynchronous Stochastic Gradient Descent (Asynchronous SGD) is a cornerstone method for parallelizing learning in distributed machine learning. However, its performance suffers under arbitrarily heterogeneous computation times across workers, leading to suboptimal time complexity and inefficiency as the number of workers scales. While several Asynchronous SGD variants have been proposed, recent findings by Tyurin & Richtárik (NeurIPS 2023) reveal that none achieve optimal time complexity, leaving a significant gap in the literature. In this paper, we propose Ringmaster ASGD, a novel Asynchronous SGD method designed to address these limitations and tame the inherent challenges of Asynchronous SGD. We establish, through rigorous theoretical analysis, that Ringmaster ASGD achieves optimal time complexity under arbitrarily heterogeneous and dynamically fluctuating worker computation times. This makes it the first Asynchronous SGD method to meet the theoretical lower bounds for time complexity in such scenarios.

Talk 2: GPU-Based Complete and Rigorous Search for Continuous Global Optimization and Constraint Satisfaction
Speaker: Guanglu Zhang
Abstract: Continuous optimization is of significance in many science and engineering fields. In many practical applications, it is desirable, and sometimes indispensable, to find the global optimum when solving continuous nonconvex optimization problems. However, popular gradient-based methods (e.g., interior point methods) and heuristic methods (e.g., genetic algorithms) often become trapped in local optima and do not take rounding errors into account, leading to inconsistent or incorrect assumptions and applications. Several methods using interval analysis have been introduced to enclose the guaranteed global optimum for continuous nonconvex optimization, but these interval methods are significantly more computationally expensive than gradient-based methods and heuristic methods and therefore are not commonly adopted in practical applications. Importantly, the unprecedented computational power brought by modern Graphics Processing Units (GPUs) offers new opportunities to overcome the computational bottleneck in global optimization using interval analysis. However, existing interval methods cannot utilize the computational power of modern GPUs, because they are designed for CPU-based sequential computing while modern GPUs are designed for massively parallel computing with unique compute and memory architecture. To address this gap, we introduce a novel GPU-based, complete, and rigorous method that can efficiently enclose the guaranteed global optimum based on user-specified tolerances for continuous nonconvex optimization problems, especially large-scale continuous nonconvex optimization problems. Using interval analysis, coupled with the computational power and architecture of the GPU, the method iteratively rules out the suboptimal regions in the search domain where the global optimum cannot exist and leaves a finite set of regions where the global optimum must exist. The method is validated by enclosing the global optimum of 10 multimodal benchmark test functions with scalable dimensions, including the Levy function, Ackley function, and Griewank function. These test functions represent grand challenges of global optimization, and enclosing the guaranteed global optimum of these test functions with more than 80 variables has not been reported in the literature. Our method successfully encloses the global optimum for each of these ten test functions with up to 10,000 variables using a server with one GPU in reasonable computation time, demonstrating a linear increase in the total number of iterations and an approximately quadratic increase in computation time with the number of variables in these optimization problems. The method is also applied to solve several practical engineering optimization problems effectively and efficiently. Corresponding publication: Zhang, G., Feng, W., & Cagan, J. (2024). A GPU-Based Parallel Region Classification Method for Continuous Constraint Satisfaction Problems. Journal of Mechanical Design, 146(4), 041705.
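The core pruning loop (discard any box whose rigorous interval lower bound exceeds the best verified upper bound) can be sketched in a few lines, assuming the user supplies an interval extension of the objective returning a guaranteed [lower, upper] enclosure on a box. The GPU parallelization, rounding control, and region-classification details of the talk are not reflected in this serial sketch.

```python
import numpy as np

def interval_bnb(bound_fn, box, tol=1e-6, max_iter=100000):
    """Serial sketch of interval branch-and-bound for global minimization.
    bound_fn(lo, hi) must return a rigorous (lb, ub) enclosure of f on [lo, hi]
    (an interval extension supplied by the user).  Returns an enclosure of the
    global minimum value and the surviving boxes."""
    lo, hi = (np.asarray(b, dtype=float) for b in box)
    work = [(lo, hi)]
    best_ub = bound_fn(lo, hi)[1]
    for _ in range(max_iter):
        work = [(l, h) for (l, h) in work if bound_fn(l, h)[0] <= best_ub]   # prune
        if all(np.max(h - l) <= tol for l, h in work):
            break
        new_work = []
        for l, h in work:
            i = int(np.argmax(h - l))                     # bisect the widest edge
            mid = 0.5 * (l[i] + h[i])
            for cl, ch in ((l.copy(), np.where(np.arange(l.size) == i, mid, h)),
                           (np.where(np.arange(l.size) == i, mid, l), h.copy())):
                clb, cub = bound_fn(cl, ch)
                best_ub = min(best_ub, cub)               # verified upper bound
                if clb <= best_ub:
                    new_work.append((cl, ch))
        work = new_work
    glb = min((bound_fn(l, h)[0] for l, h in work), default=best_ub)
    return (glb, best_ub), work

# example: f(x) = sum(x**2) on [-1, 2]^2 with its natural interval extension
# def bound_sq(l, h):
#     lo = np.where(l * h <= 0, 0.0, np.minimum(l**2, h**2)).sum()
#     return lo, np.maximum(l**2, h**2).sum()
# (lb, ub), boxes = interval_bnb(bound_sq, (np.array([-1., -1.]), np.array([2., 2.])))
```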

Talk 3: A parallelizable ADMM approach for block-structured quadratic programs
Speaker: Michel Lahmann
Abstract: Block-structured quadratic programs (QPs) arise frequently in the context of optimal control problems. The goal of our study is to provide a fast solution to these QPs, which is crucial for the successful application of optimal control algorithms to many real-world problems. Existing ADMM-based QP solvers usually split the problem into an equality-constrained QP and a projection onto a box. In general, this is a proven and widely used method. Nevertheless, splitting the problem in this way does not allow the block structure of QPs arising from optimal control problems to be optimally utilized. With our new splitting of the problem, we are able to utilize this block structure and solve parts of the problem in parallel. We will introduce the resulting method and show numerical results.
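For reference, the standard ADMM split mentioned in the abstract (an equality-type QP solve plus a projection onto the box) looks as follows for a simple box-constrained QP min 0.5 x'Px + q'x, lb <= x <= ub; the talk's new block-structured splitting is different and not shown. The penalty rho and iteration count are illustrative.

```python
import numpy as np

def admm_box_qp(P, q, lb, ub, rho=1.0, iters=500):
    """Standard ADMM split for  min 0.5 x'Px + q'x  s.t.  lb <= x <= ub:
    a linear-system (QP) solve in x plus a projection onto the box in z.
    P is assumed symmetric positive semidefinite.  Classical split only."""
    n = q.size
    L = np.linalg.cholesky(P + rho * np.eye(n))     # factor once, reuse every iteration
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                                 # scaled dual variable
    for _ in range(iters):
        rhs = -q + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))   # (P + rho I) x = rhs
        z = np.clip(x + u, lb, ub)                  # projection onto the box
        u = u + x - z                               # dual update
    return z
```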

Speakers
Artavazd Maranjyan
Guanglu Zhang

Research Scientist, Carnegie Mellon University
Name: Dr. Guanglu Zhang. Title: GPU-Based Complete and Rigorous Search for Continuous Global Optimization and Constraint Satisfaction. Bio: Dr. Zhang is a research scientist at Carnegie Mellon University. His research in mathematical optimization employs interval analysis, coupled with...
Michel Lahmann
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 112 3501 Trousdale Pkwy, 112, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10X
Thursday July 24, 2025 10:30am - 11:45am PDT
Thursday July 24, 2025 10:30am - 11:45am PDT
Taper Hall (THH) 215 3501 Trousdale Pkwy, 215, Los Angeles, CA 90089

10:30am PDT

Parallel Sessions 10Y
Thursday July 24, 2025 10:30am - 11:45am PDT
Thursday July 24, 2025 10:30am - 11:45am PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 200 3518 Trousdale Pkwy, 200, Los Angeles, CA 90089

11:45am PDT

Lunch 4 (provided)
Thursday July 24, 2025 11:45am - 1:15pm PDT
American BBQ Buffet
Thursday July 24, 2025 11:45am - 1:15pm PDT
USC Founder's / Hutton Park 3551 Trousdale Pkwy, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11A: First order methods in machine learning
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: First order methods in machine learning
Chair: Yaoliang Yu
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: Independently-Normalized SGD for Generalized-Smooth Nonconvex Optimization
Speaker: Yufeng Yang
Abstract: Recent studies have shown that many nonconvex machine learning problems meet a so-called generalized-smooth condition that extends beyond traditional smooth nonconvex optimization. However, the existing algorithms designed for generalized-smooth nonconvex optimization encounter significant limitations in both their design and convergence analysis. In this work, we first study deterministic generalized-smooth nonconvex optimization and analyze the convergence of normalized gradient descent under the generalized Polyak-Lojasiewicz condition. Our results provide a comprehensive understanding of the interplay between gradient normalization and function geometry. Then, for stochastic generalized-smooth nonconvex optimization, we propose an independently-normalized stochastic gradient descent algorithm, which leverages independent sampling, gradient normalization, and clipping to achieve the standard sample complexity under relaxed assumptions. Experiments demonstrate the fast convergence of our algorithm.
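As a rough illustration of the ingredients named above (independent sampling, gradient normalization, clipping), a generic normalized-and-clipped step built from two independently sampled minibatch gradients might look like the sketch below; the precise combination and constants of the proposed algorithm follow the paper.

```python
import numpy as np

def normalized_clipped_sgd_step(w, grad_batch_1, grad_batch_2, lr=1e-2, clip=1.0):
    """One generic step using two independently sampled minibatch gradients:
    one sets the direction, the other (its norm) sets the normalization scale,
    and the normalized step is clipped.  Illustrative sketch only."""
    g_dir = grad_batch_1(w)                                  # direction from batch 1
    scale = max(np.linalg.norm(grad_batch_2(w)), 1e-12)      # independent normalization
    step = g_dir / scale
    step_norm = np.linalg.norm(step)
    if step_norm > clip:                                     # clip the normalized step
        step = step * (clip / step_norm)
    return w - lr * step
```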

Talk 2: Why does the Adam algorithm work so well for training large language models?
Speaker: Mark Schmidt
Abstract: The success of the Adam optimizer on a wide array of architectures has made it the default in settings where stochastic gradient descent (SGD) performs poorly. However, it is unclear why the gap between Adam and SGD is often big for large language models (LLMs) but small for computer vision benchmarks. Recent work proposed that Adam works better for LLMs due to heavy-tailed noise in the stochastic gradients. We show evidence that the noise is not a major factor in the performance gap between SGD and Adam. Instead, we show that a key factor is the class imbalance found in language tasks. In particular, the large number of low-frequency classes causes SGD to converge slowly but has a smaller effect on Adam and sign descent. We show that a gap between SGD and Adam can be induced by adding a large number of low-frequency classes to computer vision models or even to linear models. We further prove in a simple setting that gradient descent converges slowly while sign descent does not.

Talk 3: On Games with Conflicting Interests
Speaker: Baoxiang Wang
Abstract: To investigate differentiable games (that have strategy spaces in $\mathbb{R}^n$), we decompose the game into components where the dynamic is well understood. We show that any differentiable game can be decomposed as a direct sum of three parts: the first decomposition as an exact potential part, a near vector potential part, and a non-strategic part; the second as a near potential part, an exact vector potential part, and a non-strategic part. A potential part coincides with potential games described by Monderer and Shapley (1996), known as pure cooperative games. A vector potential game, on the other hand, represents a game with purely conflicting interests. We show that the individual gradient field is divergence-free, in which case the gradient descent dynamic may either be divergent or recurrent. When the divergence-free game is finite, including harmonic games and important classes of zero-sum games, we show that optimistic variants of classical no-regret learning algorithms converge to an $\epsilon$-approximate Nash equilibrium at a rate of $O(1/\epsilon^2)$.

Speakers
Yaoliang Yu

Mark Schmidt
Baoxiang Wang

Assistant professor, The Chinese University of Hong Kong, Shenzhen
Name: Baoxiang Wang. Title: Assistant professor. Affiliation: The Chinese University of Hong Kong, Shenzhen. Bio: Baoxiang Wang is an assistant professor at School of Data Science, The Chinese University of Hong Kong, Shenzhen. Baoxiang works on reinforcement learning and learning theory...
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 101 3501 Trousdale Pkwy, 101, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11B: Learning, Robustness, and Fairness
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Learning, Robustness, and Fairness
Chair: Cagil Kocyigit
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: Fairness in Federated Learning
Speaker: Daniel Kuhn
Abstract: We study group fairness regularizers in federated learning with the aim to find a globally fair model in a distributed fashion. Distributed training poses unique challenges because the fairness regularizers based on probability metrics popular in centralized training cannot be decomposed across clients. To circumvent this challenge, we propose a function-tracking scheme for the global fairness regularizer based on a Maximum Mean Discrepancy (MMD), which incurs a small communication overhead. The proposed function-tracking scheme can readily be incorporated into most federated learning algorithms while maintaining rigorous convergence guarantees, as we will exemplify in the context of FedAvg. When enforcing differential privacy, the kernel-based MMD regularization allows for easy analysis via a change of kernel, leveraging an intuitive interpretation of kernel convolution. Numerical experiments validate our theoretical findings.
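The MMD quantity at the core of the regularizer is easy to write down; below is a minimal biased estimate of squared MMD with a Gaussian RBF kernel between, say, the model outputs for two demographic groups. The function-tracking scheme, FedAvg integration, and privacy mechanics are not shown, and the bandwidth sigma is a placeholder.

```python
import numpy as np

def rbf_mmd2(a, b, sigma=1.0):
    """Biased estimate of squared MMD between samples a and b (arrays of shape
    (n,) or (n, d)) with a Gaussian RBF kernel.  Intended as a group-fairness
    regularizer on model outputs; the federated tracking scheme is not shown."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]

    def k(x, y):
        d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
        return np.exp(-d2 / (2.0 * sigma ** 2))

    return k(a, a).mean() + k(b, b).mean() - 2.0 * k(a, b).mean()
```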

Talk 2: Robust Offline Policy Learning Under Covariate Shifts
Speaker: Phebe Vayanos
Abstract: We study the problem of distribution shifts in offline policy learning, where the policy training distribution is different from the deployment distribution and may lead to harmful or sub-optimal policy actions at deployment. In real world applications, changes to an allocation system can cause shifts in measured covariates, such as wording changes in survey questions that elicit different responses from individuals experiencing homelessness. As a result, a non-robust allocation policy may incorrectly over or under allocate resources based on the original offline data distribution. Adopting a Wasserstein distributionally robust approach, we learn an allocation policy that is not restricted to any functional form and robust to potential covariate shifts in the population of allocatees.

Talk 3: Learning Robust Risk Scores under Unobserved Confounders
Speaker: Cagil Kocyigit
Abstract: We study the problem of learning robust risk scores from observational data in the presence of unobserved confounders. In the absence of unobserved confounders, a well-known approach to adjust for confounding is inverse probability weighting (IPW) of the data. However, in the presence of unobserved confounders, estimating these weights is challenging, even in large data regimes. We formulate a robust maximum likelihood problem with the objective of maximizing the worst-case likelihood in view of all possible weights within an uncertainty set, which we construct by drawing inspiration from sensitivity analysis in observational data settings. We formulate this problem as a convex optimization problem by leveraging duality techniques rooted in robust optimization. Numerical experiments show that our robust estimates outperform both IPW and direct estimation methods on synthetic data designed to reflect realistic scenarios.

Speakers
Daniel Kuhn

Phebe Vayanos

Cagil Kocyigit

Assistant Professor, University of Luxembourg
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 201 3501 Trousdale Pkwy, 201, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11C: Recent Advances in Stochastic First-Order Methods
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Recent Advances in Stochastic First-Order Methods
Chair: Tianjiao Li
Cluster: Optimization Under Uncertainty and Data-driven Optimization

Talk 1: Auto-conditioned gradient methods for nonconvex stochastic optimization
Speaker: Guanghui Lan
Abstract: We present a novel class of projected gradient methods, termed the auto-conditioned projected gradient (AC-PG) method, for minimizing a smooth but not necessarily convex function over a compact convex set. AC-PG does not require the input of the Lipschitz constant of the objective function or any line search procedure, but can still achieve the best-known iteration complexity associated with the projected gradient method for finding an approximate stationary point of the problem. The key idea is to estimate the Lipschitz constant using first-order information gathered from the previous iterations, and to show that the error caused by underestimating the Lipschitz constant can be properly controlled. We then generalize AC-PG for stochastic nonconvex optimization and derive the well-known sample complexity for stochastic projected gradient descent for a class of nonconvex stochastic optimization problems.
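A minimal sketch of the line-search-free idea described above, in the spirit of auto-conditioned methods: estimate a local Lipschitz constant from first-order information at previous iterates and use its reciprocal as the stepsize of a projected gradient step. The box constraint, test function, and specific update rule below are illustrative assumptions, not the authors' AC-PG.

```python
import numpy as np

def project_box(x, lo=-1.0, hi=1.0):
    # projection onto a simple compact convex set (a box)
    return np.clip(x, lo, hi)

def ac_pg_like(grad, x0, iters=200, L0=1.0):
    """Projected gradient with a Lipschitz estimate built from
    consecutive gradient differences (no line search, no known L)."""
    x, g, L = x0, grad(x0), L0
    for _ in range(iters):
        x_new = project_box(x - g / L)
        g_new = grad(x_new)
        step = np.linalg.norm(x_new - x)
        if step > 0:
            # local curvature estimate ||g(x+) - g(x)|| / ||x+ - x||
            L = max(L, np.linalg.norm(g_new - g) / step)
        x, g = x_new, g_new
    return x

# toy smooth objective f(x) = 0.5 * x^T A x with gradient A x
A = np.diag(np.linspace(0.1, 10.0, 20))
x_out = ac_pg_like(lambda x: A @ x, x0=np.ones(20))
print(np.linalg.norm(A @ x_out))   # gradient norm is driven toward zero
```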

Talk 2: A Novel Catalyst Scheme for Stochastic Minimax Optimization
Speaker: Yan Li
Abstract: We present a proximal-point-based catalyst scheme for simple first-order methods applied to convex minimization and convex-concave minimax problems. In particular, for smooth and (strongly) convex minimization problems, the proposed catalyst scheme, instantiated with a simple variant of the stochastic gradient method, attains the optimal rate of convergence in terms of both deterministic and stochastic errors. For smooth and strongly-convex-strongly-concave minimax problems, the catalyst scheme attains the optimal rate of convergence for deterministic and stochastic errors up to a logarithmic factor. To the best of our knowledge, this is the first time such convergence has been attained by stochastic first-order methods in the literature. We obtain this result by designing and catalyzing a novel variant of the stochastic extragradient method for solving smooth and strongly monotone variational inequalities, which may be of independent interest.

Talk 3: Bridging constrained stochastic optimization and statistical learning
Speaker: Wenlong Mou
Abstract: First-order stochastic optimization algorithms and complexity-based statistical risk bounds form the foundation of modern machine learning theory. When seeking optimal parameters within a constrained set, the achievable statistical risk depends on the complexity of the constraint set rather than on the ambient dimension. However, existing first-order stochastic optimization methods often fail to adapt to this complexity, making it challenging to attain instance-dependent optimal risk. In this talk, I will discuss recent advances in constrained stochastic optimization. We examine the complexity of stochastic gradient methods and their variance-reduced counterparts, exploring how these algorithms respond to the geometric structures of constraint sets. I will also present new instance-dependent optimality guarantees for a class of variance-reduced algorithms, highlighting how these results provide concrete guidance on the choice of parameters.

Speakers

Tianjiao Li

PhD student, Georgia Institute of Technology
Tianjiao Li is a fifth-year PhD candidate in the School of Industrial and Systems Engineering at Georgia Institute of Technology, advised by Prof. George Lan and Prof. Ashwin Pananjady. His research…

Guanghui Lan

Yan Li

Wenlong Mou
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 208 3501 Trousdale Pkwy, 208, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11D: Nonmonotone inclusions and nonconvex optimization
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Nonmonotone inclusions and nonconvex optimization
Chair: Dimitri Papadimitriou
Cluster: Nonlinear Optimization

Talk 1: Proximal iterations for solving nonmonotone inclusions and applications
Speaker: Dimitri Papadimitriou
Abstract: In a real Hilbert space H, the monotone inclusion problem aims at finding a zero of a set-valued maximally monotone operator A. The term "warped proximal iteration" was recently introduced as a generalization of the proximal point algorithm for finding a zero of a maximally monotone operator A acting on H. Nevertheless, the maximal monotonicity of A restricts the applicability of this algorithm to the class of convex optimization problems as well as operator splitting methods for composite monotone inclusions. Solving general nonmonotone inclusions, i.e., inclusions where the operator A is nonmonotone, remains an open and challenging research problem. For this purpose, the notion of r-(co)hypomonotonicity has been introduced to guarantee the convergence of the generated sequence. From this perspective, the first objective of this work is to extend the definition of r-hypomonotonicity. The second is to establish the convergence properties of the warped proximal iteration as well as its various applications to solving (constrained) nonconvex optimization problems. In particular, we focus on finding Karush-Kuhn-Tucker (KKT) points for a class of nonconvex quadratic programming (QP) problems with equality constraints. Numerical results illustrate the performance of our method against state-of-the-art QP solvers.

Talk 2: Qualitative Analysis and Boosted DCA for Generalized Multi-Source Weber Problems
Speaker: Tuyen Tran
Abstract: In this talk, we first investigate fundamental qualitative properties of the generalized multi-source Weber problem formulated using the Minkowski gauge function. Then, we apply Nesterov's smoothing and the adaptive Boosted Difference of Convex functions Algorithm (BDCA) to solve both the unconstrained and constrained versions of the generalized multi-source Weber problem. These algorithms are tested in MATLAB with real and artificial data sets. We conduct a comprehensive evaluation of the new algorithms and provide insights into their efficiency.

Talk 3: A Primal-Dual Proximal Outer Approximation algorithm
Speaker: Vu Bang
Abstract: In this talk we propose a Proximal Outer Approximation (PrOA) algorithm for solving Mixed Integer Quadratically Constrained Quadratic Programs (MI-QCQP). The proposed sequential method works as follows: starting from the integer-feasible point obtained by solving the continuous relaxation of the original problem, at each iteration it first solves, by means of a primal-dual proximal algorithm, an NLP problem to obtain a constraint-feasible pair that is also used to compute an upper bound on the optimal value; it then obtains a new integer-feasible pair by solving an MIQP that iteratively improves first-order approximations of the nonlinear constraints, and computes a lower bound from this new pair. The integer-feasible point obtained is then used to solve the NLP problem of the next iteration. Convergence and computational properties are analyzed and compared, together with numerical experiments on classical QCQPs such as the QKP, the congestion-constrained FLP, and the generalized multi-source Weber problem.

Speakers
Dimitri Papadimitriou

Tuyen Tran

Vu Bang
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 210 3501 Trousdale Pkwy, 210, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11E: Quantum Computing and discrete-continuous optimization
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Quantum Computing and discrete-continuous optimization
Chair: David Bernal Neira
Cluster: Interplay Between Continuous and Discrete Optimization

Talk 1: Interior-point and first-order methods for quantum entanglement detection
Speaker: Javier Pena
Abstract: We describe two different algorithmic approaches to tackle the following central problem in quantum information science: given a bipartite quantum state, detect if the state is entangled by constructing a suitable "entanglement witness". The crux for both approaches is to solve a challenging convex feasibility problem based on a widely known symmetric extension criterion for entanglement detection. We show that a judicious formulation of the convex feasibility problem can be naturally tackled via both interior-point methods and the Frank-Wolfe method. We will discuss the tradeoffs of these two computational approaches.

Talk 2: Prospects for quantum speedups in convex integer programming
Speaker: Brandon Augustino
Abstract: Integer programming constitutes a fundamental class of problems in computer science with countless real-world applications, and their NP-hardness makes them an attractive candidate for large quantum speedups. The most well-known quantum approaches to solve discrete optimization problems either rely on quantum speedups for unstructured search, or rely on the use of penalty functions to incorporate constraints. Unsurprisingly, these techniques do not lead to quantum algorithms that offer speedups over the state-of-the-art classical IP algorithms, which search the feasible region in a highly structured manner. In this talk we discuss the challenges and potential associated with designing faster quantum algorithms for the convex integer programming problem. From a technical standpoint, we discuss new quantum approaches to enumerating lattice points in convex bodies, which is at the core of the quantum IP algorithm we analyze.

Talk 3: Continuous optimization for unconventional continuous computing: Accelerating Continuous Variable Coherent Ising Machines with momentum updates
Speaker: David Bernal Neira
Abstract: The Coherent Ising Machine (CIM) leverages continuous dynamics and physical annealing principles for solving complex Ising problems. Our work introduces modifications to the Continuous Variable CIM (CV-CIM) by integrating momentum and Adam-based updates, targeting accelerated convergence for continuous optimization. Standard CV-CIMs operate with gradient-based updates, which, although effective, can be trapped by local minima or suffer from poor problem conditioning. By enhancing the CIM dynamics with momentum and adaptive techniques, our approach significantly improves convergence rates and solution diversity. Through numerical experiments on benchmark quadratic programming problems, we demonstrate that momentum-CV-CIM and Adam-CV-CIM variations outperform standard CV-CIM in convergence speed and robustness, especially under challenging, poorly conditioned instances. These findings suggest that incorporating classical optimization techniques like momentum and Adam into CV-CIM systems not only enhances computational efficiency but also extends the practical applicability of CIMs for a broader class of continuous optimization problems. This work underscores the potential for hybrid approaches that marry classical optimization strategies with non-conventional computing architectures, promoting more reliable and scalable solutions across diverse fields in optimization. The results are available in https://arxiv.org/abs/2401.12135
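A heavily simplified illustration of the momentum idea on a box-relaxed Ising energy $x^\top J x$; this is not a CV-CIM simulator, and the couplings, step sizes, and rounding step below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
J = rng.standard_normal((n, n))
J = (J + J.T) / 2                        # symmetric Ising couplings
np.fill_diagonal(J, 0.0)
x0 = rng.uniform(-0.1, 0.1, n)           # common soft-spin initialization

def descend(momentum=0.0, eta=0.01, steps=2000):
    """Plain vs. heavy-ball gradient dynamics on the box-relaxed Ising
    energy x^T J x; the momentum variant often reaches lower energies
    within the same number of steps."""
    x, v = x0.copy(), np.zeros(n)
    for _ in range(steps):
        g = 2 * J @ x                    # gradient of the relaxed energy
        v = momentum * v - eta * g
        x = np.clip(x + v, -1.0, 1.0)    # amplitude (box) constraint
    return np.sign(x) @ J @ np.sign(x)   # energy of the rounded spins

print("gradient only:", descend(momentum=0.0))
print("with momentum:", descend(momentum=0.9))
```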

Speakers
Javier Pena

Brandon Augustino

David Bernal Neira
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 212 3501 Trousdale Pkwy, 212, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11F: SDPs and their applications across mathematics
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: SDPs and their applications across mathematics
Chair: Christoph Spiegel
Cluster: Conic and Semidefinite Optimization

Talk 1: Semidefinite programming bounds for packing in geometrical graphs and hypergraphs
Speaker: Fernando Mario de Oliveira Filho
Abstract: Packing problems in geometry, like the sphere-packing problem, can be modeled as independent-set problems in infinite graphs. The Lovász theta number, a semidefinite programming upper bound for the independence number of a graph, can often be extended to such infinite graphs, providing some of the best upper bounds for several geometrical parameters. Perhaps the most famous such application is Viazovska's solution of the sphere-packing problem in dimension 8. In this talk I will discuss the problem of packing regular polygons on the surface of the 2-dimensional sphere and how the Lovász theta number can be used to give upper bounds. I will also discuss the problem of finding large almost-equiangular sets, which is a hypergraph extension of packing.
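For reference, the Lovász theta number of a finite graph $G=(V,E)$ is the optimal value of the following standard semidefinite program (the talk works with an extension to infinite geometrical graphs, which is not reproduced here):

```latex
\vartheta(G) \;=\; \max \left\{ \langle J, X \rangle \;:\; X \succeq 0,\ \operatorname{tr}(X) = 1,\ X_{ij} = 0 \ \text{for all } \{i,j\} \in E \right\},
```

where $J$ is the all-ones matrix. One has $\alpha(G) \le \vartheta(G)$, which is the sense in which the theta number upper-bounds independence (packing) numbers.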

Talk 2: TBD
Speaker: Pablo A. Parrilo
Abstract: TBD

Talk 3: Shattering triples with permutations
Speaker: Jan Volec
Abstract: Given a 6-tuple of permutations of [n], we say that a triple of points T of [n] is totally shattered if each of the six possible relative orderings of T appears in exactly one of the permutations. We set F(n) to be the maximum fraction of triples that can be totally shattered by a 6-tuple of permutations of [n]. In 2023, Johnson and Wickes showed that the limit of F(n) is at least 17/42 and at most 11/14, and they asked to determine the limit. In this talk, we use the flag algebra method to prove that the limit of F(n) is 10/21. This is a joint work with Daniela Opocenska.

Speakers
Fernando Mario de Oliveira Filho

Pablo A. Parrilo

Jan Volec
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 156 3518 Trousdale Pkwy, 156, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11G: Bilevel Programming and applications
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Bilevel Programming and applications
Chair: Luce Brotcorne
Cluster: Optimization Applications (Communication, Energy, Health, ML, ...)

Talk 1: Optimal Electric Vehicle Charging with Dynamic Pricing, Customer Preferences and Power Peak Reduction
Speaker: Luce Brotcorne
Abstract: We consider a provider of electric vehicle charging stations that operates a network of charging stations and uses time-varying pricing to maximize profit and reduce the impact on the electric grid. We propose a bilevel model with a single leader and multiple disjoint followers. The provider (leader) sets the price of charging for each station at each time slot, and ensures there is enough energy to charge. The charging choice of each customer is represented by a combination of a preference list of (station, time) pairs and a reserve price. The proposed model thus accounts for the heterogeneity of customers with respect to price sensitivity. We define a single-level reformulation based on a reformulation approach from the literature on product line optimization, and we report computational results that highlight the efficiency of the new reformulation and the potential for reducing peaks on the electricity grid.

Talk 2: A Benders decomposition algorithm for the Minimum Spanning Tree Interdiction Problem
Speaker: Frederic Semet
Abstract: In this presentation, we consider the Minimum Spanning Tree Interdiction (MSTI) problem. This problem is a two-player game between a network operator and an interdictor. The former aims to determine a Minimum Spanning Tree (MST) in a network. Constrained by a budget, the latter seeks to change the network topology to increase the weight of an MST. Two types of interdiction are considered: total and partial interdiction. A totally interdicted edge is considered absent, while the weight of a partially interdicted edge is augmented by a predefined amount. The interdictor's budget is modeled as a knapsack constraint. In the first part of the presentation, a mathematical formulation is introduced and valid inequalities are proposed to strengthen the model. Since a commercial solver can only solve instances of limited size, we describe a Benders decomposition algorithm in the second part. The MSTI formulation is decomposed into a Master Problem (MP) and a subproblem. A family of constraints for the MP, called optimality constraints, is introduced and lifted. A family of valid inequalities is also proposed to tighten the linear relaxation of the MP. The subproblem, parameterized by the solution of the MP, consists only of the computation of an MST. Computational results show that instances with up to 200 nodes can be solved to optimality.

Talk 3: TBD
Speaker: Pablo Escalona
Abstract: TBD

Speakers
Luce Brotcorne

Frederic Semet
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 114 3501 Trousdale Pkwy, 114, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11H: Distributed Learning for Machine Learning
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Distributed Learning for Machine Learning
Chair: Hoi To Wai
Cluster: Multi-agent Optimization and Games

Talk 1: Why does batch normalization damage federated learning on non-i.i.d. data?
Speaker: Yanmeng Wang
Abstract: As a promising distributed learning paradigm, federated learning (FL) involves training deep neural network (DNN) models at the network edge while protecting the privacy of the edge clients. To train a large-scale DNN model, batch normalization (BN) has been regarded as a simple and effective means to accelerate the training and improve the generalization capability. However, recent findings indicate that BN can significantly impair the performance of FL in the presence of non-i.i.d. data. While several FL algorithms have been proposed to address this issue, their performance still falls significantly short of the centralized scheme. Furthermore, none of them has provided a theoretical explanation of how BN damages FL convergence. In this work, we present the first convergence analysis to show that, under non-i.i.d. data, the mismatch between the local and global statistical parameters in BN causes a gradient deviation between the local and global models, which, as a result, slows down and biases the FL convergence. In view of this, we develop a new FL algorithm tailored to BN, called FedTAN, which is capable of achieving robust FL performance under a variety of data distributions via iterative layer-wise parameter aggregation. Comprehensive experimental results demonstrate the superiority of the proposed FedTAN over existing baselines for training BN-based DNN models.
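The local-versus-global statistics mismatch described above can be seen directly on a toy non-i.i.d. split: each client's batch-normalization statistics differ sharply from those of the pooled data. The numbers below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# two clients with non-i.i.d. feature distributions
client_1 = rng.normal(loc=-2.0, scale=1.0, size=(500, 1))
client_2 = rng.normal(loc=+2.0, scale=1.0, size=(500, 1))
pooled = np.vstack([client_1, client_2])

def bn_stats(x):
    # the (mean, variance) pair a BN layer would estimate from this data
    return x.mean(), x.var()

print("client 1 BN stats:", bn_stats(client_1))   # mean ~ -2, var ~ 1
print("client 2 BN stats:", bn_stats(client_2))   # mean ~ +2, var ~ 1
print("global  BN stats :", bn_stats(pooled))     # mean ~ 0, var ~ 5
# Normalizing local batches with local statistics therefore produces
# activations inconsistent with the global model's BN parameters,
# which biases the aggregated gradients.
```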

Talk 2: Fast Two-Time-Scale Stochastic Gradient Method with Applications in Reinforcement Learning
Speaker: Thinh Doan
Abstract: Two-time-scale optimization is a framework that abstracts a range of policy evaluation and policy optimization problems in reinforcement learning (RL). Akin to bi-level optimization under a particular type of stochastic oracle, the two-time-scale optimization framework has an upper level objective whose gradient evaluation depends on the solution of a lower level problem, which is to find the root of a strongly monotone operator. In this work, we propose a new method for solving two-time-scale optimization that achieves significantly faster convergence than prior art. The key idea of our approach is to leverage an averaging step to improve the estimates of the operators in both lower and upper levels before using them to update the decision variables. These additional averaging steps eliminate the direct coupling between the main variables, enabling the accelerated performance of our algorithm. We characterize the finite-time convergence rates of the proposed algorithm under various conditions of the underlying objective function, including strong convexity, convexity, the Polyak-Lojasiewicz condition, and general non-convexity. These rates significantly improve over the best-known complexity of the standard two-time-scale stochastic approximation algorithm. When applied to RL, we show how the proposed algorithm specializes to novel online sample-based methods that surpass or match the performance of the existing state of the art. Finally, we support our theoretical results with numerical simulations in RL.

Talk 3: Two-timescale Stochastic Primal-dual Algorithms for Decentralized Optimization with Communication Compression
Speaker: Hoi-To Wai
Abstract: This talk presents a two-timescale compressed primal-dual (TiCoPD) algorithm for decentralized optimization. The algorithm is built upon a primal-dual optimization framework derived from a majorization-minimization procedure. The latter naturally suggests that the agents share a compressed difference term during the iteration. Furthermore, the TiCoPD algorithm incorporates a fast timescale mirror sequence for agent consensus on nonlinearly compressed terms with noise, together with a slow timescale primal-dual recursion for optimizing the objective function. We show that the TiCoPD algorithm converges with a constant step size. It also finds an $O(1/\sqrt{T})$-stationary solution after $T$ iterations. Numerical experiments on decentralized training of a neural network validate the efficacy of the TiCoPD algorithm.

Speakers
Thinh Doan

Hoi-To Wai
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 116 3501 Trousdale Pkwy, 116, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11I: Privacy Preserving Collaborative Learning
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Privacy Preserving Collaborative Learning
Chair: Sai Praneeth Karimireddy
Cluster: Optimization for Emerging Technologies (LLMs, Quantum Computing, ...)

Talk 1: Efficient Distributed Optimization under Heavy-Tailed Noise
Speaker: Tian Li
Abstract: In distributed learning, to mitigate communication overhead, local updates are often applied before global aggregation, resulting in a nested optimization approach with inner and outer steps. However, heavy-tailed stochastic gradient noise remains a significant challenge, particularly in attention-based models, hindering effective training. In this work, we propose TailOPT, an efficient framework designed to address heavy-tailed noise by leveraging adaptive optimization and novel clipping techniques. We establish convergence guarantees for the TailOPT framework under heavy-tailed noise with potentially unbounded gradient variance and local updates. Among its variants, we propose a memory- and communication-efficient instantiation (named Bi^2Clip) that performs coordinate-wise clipping from both above and below at both the inner and outer optimizers. Bi^2Clip brings about benefits of adaptive optimization (e.g., Adam) without the cost of maintaining or transmitting additional gradient statistics. Empirically, TailOPT, including Bi^2Clip, demonstrates superior performance on several language tasks and models compared with state-of-the-art methods.
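A minimal sketch of coordinate-wise clipping from both above and below, which is how the abstract describes the Bi^2Clip update; the thresholds and the gradient vector below are made-up illustrations, not the paper's parameters.

```python
import numpy as np

def bi_clip(grad, low=1e-3, high=1.0):
    """Clip each coordinate's magnitude into [low, high] while keeping
    its sign: large (heavy-tailed) entries are capped and tiny entries
    are lifted, mimicking an adaptive preconditioner without storing or
    transmitting second-moment statistics."""
    mag = np.clip(np.abs(grad), low, high)
    return np.sign(grad) * mag

g = np.array([5.0, -0.2, 1e-6, -3.0])
print(bi_clip(g))   # [ 1.    -0.2    0.001 -1.  ]
```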

Talk 2: Parameter-Efficient Federated Learning Algorithms for Large-Scale Models
Speaker: Kibaek Kim
Abstract: Federated learning (FL) is a new collaborative training paradigm for training large-scale models across distributed data sources without sharing raw data. However, applying FL to large models presents significant challenges in terms of communication efficiency and resource utilization. In this work, we introduce novel parameter-efficient algorithms tailored to FL with large models. Our approach optimizes model updates by reducing the number of parameters communicated across clients. We will present preliminary results based on the new algorithms.

Talk 3: Data-Centric ML Needs Statistics and Game Theory
Speaker: Sai Praneeth Karimireddy
Abstract: Data is the most important factor determining the quality of an AI system. Data-centric ML is an emerging research direction that constructs metrics to quantify the usefulness of data (data valuation / data attribution). However, we show that existing methods do not properly take into account the randomness inherent in ML training. We also show that they are not fit to be used as a basis for compensation in data markets since they may not be incentive-compatible.

Speakers
Tian Li

Kibaek Kim

Sai Praneeth Karimireddy
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 100 3518 Trousdale Pkwy, 100, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11J: Advances in first-order methods for large-scale optimization
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Advances in first-order methods for large-scale optimization
Chair: Jiawei Zhang
Cluster: Nonlinear Optimization

Talk 1: Acceleration by Random Stepsizes
Speaker: Jason Altschuler
Abstract: We show that for separable convex optimization, random stepsizes fully accelerate Gradient Descent (GD). Specifically, using iid stepsizes from the Arcsine distribution improves the iteration complexity from $O(k)$ to $O(\sqrt{k})$, where $k$ is the condition number. No momentum or other modifications to GD are required. This result is incomparable to the (deterministic) Silver Stepsize Schedule, which does not require separability but only achieves partial acceleration $O(k^{\log_{1+\sqrt{2}} 2}) \approx O(k^{0.78})$. Our starting point is a conceptual connection to potential theory: the variational characterization for the distribution of stepsizes with fastest convergence rate mirrors the variational characterization for the distribution of charged particles with minimal logarithmic potential energy. The Arcsine distribution solves both variational characterizations due to a remarkable "equalizing property" which in the physical context amounts to a constant potential over space, and in the optimization context amounts to an identical convergence rate for all quadratic functions. A key technical insight is that martingale arguments extend this phenomenon to all separable convex functions.
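One plausible instantiation of this construction (our reading for illustration, not necessarily the authors' exact scheme): on a separable quadratic with curvatures in $[\mu, L]$, draw each stepsize as the reciprocal of an Arcsine-distributed point of $[\mu, L]$, i.e., a rescaled Beta(1/2, 1/2) sample. The problem data below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, L, n, iters = 0.01, 1.0, 50, 400

a = np.linspace(mu, L, n)          # curvatures of a separable quadratic
x = np.ones(n)                     # f(x) = 0.5 * sum_i a_i * x_i**2

for _ in range(iters):
    lam = mu + (L - mu) * rng.beta(0.5, 0.5)   # Arcsine sample in [mu, L]
    x -= (1.0 / lam) * (a * x)                 # plain GD step, random stepsize

print(np.linalg.norm(x))   # typically far smaller than after 400 steps with stepsize 1/L
```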

Talk 2: Last Iterate Convergence of Incremental Methods and Applications in Continual Learning
Speaker: Xufeng Cai
Abstract: Incremental gradient and incremental proximal methods are a fundamental class of optimization algorithms used for solving finite sum problems, broadly studied in the literature. Yet, without strong convexity, their convergence guarantees have primarily been established for the ergodic (average) iterate. Motivated by applications in continual learning, we obtain the first convergence guarantees for the last iterate of both incremental gradient and incremental proximal methods, in general convex smooth (for both) and convex Lipschitz (for the proximal variants) settings. Our oracle complexity bounds for the last iterate nearly match (i.e., match up to a square-root-log or a log factor) the best known oracle complexity bounds for the average iterate, for both classes of methods. We further obtain generalizations of our results to weighted averaging of the iterates with increasing weights and for randomly permuted ordering of updates. We study incremental proximal methods as a model of continual learning with generalization and argue that a large amount of regularization is crucial to preventing catastrophic forgetting. Our results generalize last iterate guarantees for incremental methods compared to the state of the art, as such results were previously known only for overparameterized linear models, which correspond to convex quadratic problems with infinitely many solutions.

Talk 3: Unveiling Spurious Stationarity and Hardness Results for Bregman Proximal-Type Algorithms
Speaker: Jiajin Li
Abstract: Bregman proximal-type algorithms, such as mirror descent, are popular in optimization and data science for effectively exploiting problem structures and optimizing them under tailored geometries. However, most existing convergence results rely on the gradient Lipschitz continuity of the kernel, which unfortunately excludes most commonly used cases, such as the Shannon entropy. In this work, we reveal a fundamental limitation of these methods: spurious stationary points inevitably arise when the kernel is not gradient Lipschitz. The existence of these spurious stationary points leads to an algorithm-dependent hardness result: Bregman proximal-type algorithms cannot escape from a spurious stationary point within any finite number of iterations when initialized from that point, even in convex settings. This limitation is discovered through the lack of a well-defined stationarity measure based on Bregman divergence for non-gradient Lipschitz kernels. Although some extensions attempt to address this issue, we demonstrate that they still fail to reliably distinguish between stationary and non-stationary points for such kernels. Our findings underscore the need for new theoretical tools and algorithms in Bregman geometry, paving the way for further research.
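For concreteness, the Shannon-entropy kernel mentioned above yields the classical exponentiated-gradient (mirror descent) update on the probability simplex, and its gradient is not Lipschitz near the boundary, which is exactly the regime the talk's hardness results concern. A minimal sketch with a made-up linear objective:

```python
import numpy as np

def mirror_descent_simplex(grad, x0, eta=0.1, iters=200):
    """Bregman proximal step with the Shannon entropy kernel
    h(x) = sum_i x_i log x_i, i.e. multiplicative / exponentiated
    gradient updates on the probability simplex."""
    x = x0.copy()
    for _ in range(iters):
        x = x * np.exp(-eta * grad(x))   # Bregman proximal (mirror) step
        x /= x.sum()                     # renormalize onto the simplex
    return x

# toy objective f(x) = <c, x> over the simplex
c = np.array([3.0, 1.0, 2.0])
x0 = np.full(3, 1.0 / 3.0)
print(mirror_descent_simplex(lambda x: c, x0))  # mass concentrates on argmin of c
```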

Speakers
Jiawei Zhang

Jason Altschuler

Xufeng Cai

Jiajin Li
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 102 3501 Trousdale Pkwy, 102, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11K: Nonsmooth PDE Constrained Optimization: Algorithms, Analysis and Applications Part 2
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Nonsmooth PDE Constrained Optimization: Algorithms, Analysis and Applications Part 2
Chair: Robert Baraldi
Cluster: PDE-constrained Optimization

Talk 1: An Inexact Trust-Region Algorithm for Nonsmooth Risk-Averse Optimization
Speaker: Drew Kouri
Abstract: Many practical problems require the optimization of systems (e.g., differential equations) with uncertain inputs such as noisy problem data, unknown operating conditions, and unverifiable modeling assumptions. In this talk, we formulate these problems as infinite-dimensional, risk-averse stochastic programs for which we minimize a quantification of risk associated with the system performance. For many popular risk measures, the resulting risk-averse objective function is not differentiable, significantly complicating the numerical solution of the optimization problem. Unfortunately, traditional methods for nonsmooth optimization converge slowly (e.g., sublinearly) and consequently are often intractable for problems in which the objective function and any derivative information is expensive to evaluate. To address this challenge, we introduce a novel trust-region algorithm for solving large-scale nonsmooth risk-averse optimization problems. This algorithm is motivated by the primal-dual risk minimization algorithm and employs smooth approximate risk measures at each iteration. In addition, this algorithm permits and rigorously controls inexact objective function value and derivative (when available) computations, enabling the use of inexpensive approximations such as adaptive discretizations. We discuss convergence of the algorithm under mild assumptions and demonstrate its efficiency on various examples from PDE-constrained optimization.

Talk 2: Variational problems with gradient constraints: A priori and a posteriori error identities
Speaker: Rohit Khandelwal
Abstract: Nonsmooth variational problems are ubiquitous in science and engineering, e.g., fracture modeling and contact mechanics. This talk presents a generic primal-dual framework to tackle these types of nonsmooth problems. Special attention is given to variational problems with gradient constraints. The key challenge here is how to project onto the constraint set both at the continuous and discrete levels. In fact, both a priori and a posteriori error analysis for such nonsmooth problems has remained open. In this talk, on the basis of a (Fenchel) duality theory at the continuous level, an a posteriori error identity for arbitrary conforming approximations of primal-dual formulations is derived. In addition, on the basis of a (Fenchel) duality theory at the discrete level, an a priori error identity for primal (Crouzeix–Raviart) and dual (Raviart–Thomas) formulations is established. The talk concludes by deriving the optimal a priori error decay rates.

Talk 3: Optimal Insulation: Numerical Analysis and Spacecraft
Speaker: Keegan Kirk
Abstract: Given a fixed amount of insulating material, how should one coat a heat-conducting body to optimize its insulating properties? A rigorous asymptotic analysis reveals this problem can be cast as a convex variational problem with a non-smooth boundary term. As this boundary term is difficult to treat numerically, we consider an equivalent (Fenchel) dual variational formulation more amenable to discretization. We propose a numerical scheme to solve this dual formulation on the basis of a discrete duality theory inherited by the Raviart-Thomas and Crouzeix-Raviart finite elements and show that the solution of the original primal problem can be reconstructed locally from the discrete dual solution. We discuss the a posteriori and a priori error analysis of our scheme, derive a posteriori estimators based on convex optimality conditions, and present numerical examples to verify theory. As an application, we consider the design of an optimally insulated spacecraft heat shield.

Speakers
Robert Baraldi

Drew Kouri

Rohit Khandelwal

Keegan Kirk
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 118 3501 Trousdale Pkwy, 118, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11L: Recent Advances in Shape and Topology Optimization
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Recent Advances in Shape and Topology Optimization
Chair: Volker Schulz
Cluster: PDE-constrained Optimization

Talk 1: Approaches to Boundary-Effect-Dominated Topology Optimization
Speaker: Eddie Wadbro
Abstract: In classical design optimization using the material distribution method (density-based topology optimization), a material indicator function represents the presence or absence of material within the domain. To use this approach for boundary-effect-dominated problems, it is necessary to identify the boundary of the design at each iteration; this talk explores two methods to achieve this. The first involves using a boundary strip indicator function defined on the elements of the computational mesh. The second involves using a boundary indicator function defined on the mesh faces (edges in 2D and facets in 3D). The first method is applied to model a coated structure in a minimum heat compliance problem. The second method optimizes a heat sink, modeled by the Poisson equation with a Newtonian cooling boundary condition. The talk will cover the main ideas behind both approaches and showcase results from both model problems.

Talk 2: Topology optimization of the first truly wave focusing sonic black hole
Speaker: Martin Berggren
Abstract: We apply density-based topology optimization to design an acoustic waveguide possessing broadband wave focusing properties. Although effective as absorbers, previous suggestions of such devices — labeled sonic or acoustic black holes — have been unable to create the focusing effect. The optimization objective is a broadband maximization of the dissipated energy in a small absorbing area towards the end of the waveguide. A challenge with this application is that it is necessary to accurately model viscothermal boundary-layer losses associated with material boundaries. Here we rely on recent progress in the use of density-based topology optimization approaches for boundary-effect-dominated problems. Using such a procedure, we have been able to successfully design what seems to be the first truly broadband wave-focusing sonic black hole.

Talk 3: Shape Optimization Using the Total Generalized Variation of the Normal as Prior
Speaker: Stephan Schmidt
Abstract: The talk discusses how to use a total variation semi-norm on shapes as a prior. The idea is to concentrate curvature changes to edges, which is of great help when non-smooth shapes are to be reconstructed in inverse problems. Unsurprisingly, classical total variation retains its typical downsides, such as stair-casing, when applied to manifolds. To this end, the concept of total generalized variation (TGV) by Bredies/Kunisch/Pock is extended to shapes. To implement TGV for shapes, the required separation of the linear component of a geometric variable necessitates non-standard finite elements on the tangent space of manifolds. The methodology is exemplified with applications stemming from geo-electric impedance tomography and mesh inpainting as well as texture denoising.

Speakers
Eddie Wadbro

Martin Berggren

Professor, Umeå University
My main area of interest is Computational Design Optimization, in which numerical optimization algorithms are employed in the engineering design process…

Stephan Schmidt
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 119 3501 Trousdale Pkwy, 119, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11M: Derivative-free optimization for large scale problems
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Derivative-free optimization for large scale problems
Chair: Zaikun Zhang
Cluster: Derivative-free Optimization

Talk 1: Scalable derivative-free optimization algorithms with low-dimensional subspace techniques
Speaker: Zaikun Zhang
Abstract: We re-introduce a derivative-free subspace optimization framework originating from Chapter 5 of the thesis [Z. Zhang, On Derivative-Free Optimization Methods, PhD thesis, Chinese Academy of Sciences, Beijing, 2012] of the author under the supervision of Ya-xiang Yuan. At each iteration, the framework defines a (low-dimensional) subspace based on an approximate gradient, and then solves a subproblem in this subspace to generate a new iterate. We sketch the global convergence and worst-case complexity analysis of the framework, elaborate on its implementation, and present some numerical results on solving problems with dimension as high as 10,000. The same framework was presented during ICCOPT 2013 in Lisbon under the title "A Derivative-Free Optimization Algorithm with Low-Dimensional Subspace Techniques for Large-Scale Problems", although it remained nearly unknown to the community until very recently. An algorithm following this framework, named NEWUOAs, was implemented by Zhang in MATLAB in 2011 (https://github.com/newuoas/newuoas), ported to Module-3 by Nystroem (Intel) in 2017, and included in cm3 in 2019 (https://github.com/modula3/cm3/blob/master/caltech-other/newuoa/src/NewUOAs.m3).

Talk 2: High-dimensional DFO: Stochastic subspace descent and improvements
Speaker: Stephen Becker
Abstract: We describe and analyze a family of algorithms, which we call "stochastic subspace descent", that use projections of the gradient onto random subspaces, in a spirit similar to well-known work by Nesterov and Spokoiny. We explain the benefits of subspace projection compared to Gaussian directional derivatives. We present a complete convergence analysis, and show that the method is well suited for high-dimensional problems. We also focus on our very recent work on cheap and automatic stepsize selection, as well as some preliminary results on biased sampling, which requires leaving the "projected" paradigm.
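A minimal sketch of the "project the gradient onto a random subspace" idea, using forward finite differences so that only function values are needed; the test function, subspace dimension, and stepsize are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def stochastic_subspace_descent(f, x0, subspace_dim=5, eta=0.2,
                                iters=300, fd_eps=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    x, n = x0.copy(), x0.size
    for _ in range(iters):
        # random orthonormal basis of a low-dimensional subspace
        Q, _ = np.linalg.qr(rng.standard_normal((n, subspace_dim)))
        # directional derivatives along the subspace, via finite differences
        d = np.array([(f(x + fd_eps * Q[:, j]) - f(x)) / fd_eps
                      for j in range(subspace_dim)])
        x -= eta * Q @ d        # step along the projected gradient estimate
    return x

f = lambda x: np.sum((x - 1.0) ** 2)       # toy smooth objective
x = stochastic_subspace_descent(f, np.zeros(50))
print(f(x))                                 # close to the minimum value 0
```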

Talk 3: Blockwise direct search methods
Speaker: Haitian Li
Abstract: Direct search methods are a class of derivative-free optimization algorithms that rely only on comparisons of objective function values. We introduce a new direct search framework called blockwise direct search (BDS), which divides the set of search directions into blocks; at every iteration, each block has its own step size. We develop this framework into an open-source, easy-to-use solver. In numerical experiments, we observe that BDS is competitive with model-based methods in some cases. In addition, our solver is efficient and stable under noise without requiring specialized techniques. BDS can also be used to tune its own hyperparameters to improve performance.
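A minimal sketch of the blockwise idea: split the coordinates into blocks, keep one step size per block, poll the plus/minus coordinate directions within each block, and shrink only that block's step size when no decrease is found. This is an illustration of the framework, not the open-source BDS solver itself.

```python
import numpy as np

def blockwise_direct_search(f, x0, num_blocks=2, step=1.0,
                            shrink=0.5, tol=1e-6, max_iter=500):
    x = x0.copy()
    blocks = np.array_split(np.arange(x.size), num_blocks)
    steps = [step] * num_blocks            # one step size per block
    for _ in range(max_iter):
        if max(steps) < tol:
            break
        for b, block in enumerate(blocks):
            improved = False
            for i in block:                # poll +/- coordinate directions
                for s in (+steps[b], -steps[b]):
                    y = x.copy()
                    y[i] += s
                    if f(y) < f(x):        # accept only on decrease
                        x, improved = y, True
            if not improved:
                steps[b] *= shrink         # shrink only this block's step
    return x

f = lambda x: np.sum((x - 0.3) ** 2)       # toy objective, no derivatives used
print(blockwise_direct_search(f, np.zeros(6)))   # approaches 0.3 in each entry
```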

Speakers
Zaikun Zhang

Haitian Li
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 157 3518 Trousdale Pkwy, 157, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11N: Riemannian geometry, optimization, and applications
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Riemannian geometry, optimization, and applications
Chair: Wen Huang
Cluster: Optimization on Manifolds

Talk 1: A Riemannian Proximal Newton-CG Method
Speaker: Wen Huang
Abstract: The proximal gradient method and its variants have been generalized to Riemannian manifolds for solving optimization problems in the form of $f + g$, where $f$ is continuously differentiable and $g$ may be nonsmooth. However, most of them do not have local superlinear convergence. Recently, a Riemannian proximal Newton method has been developed for optimizing problems in this form with $\mathcal{M}$ being a compact embedded submanifold and $g(x)= \lambda \|x\|_1$. Although this method converges superlinearly locally, global convergence is not guaranteed. The existing remedy relies on a hybrid approach: running a Riemannian proximal gradient method until the iterate is sufficiently accurate and switching to the Riemannian proximal Newton method. This existing approach is sensitive to the switching parameter. In this talk, we propose a Riemannian proximal Newton-CG method that merges the truncated conjugate gradient method with the Riemannian proximal Newton method. The global convergence and local superlinear convergence are proven. Numerical experiments show that the proposed method outperforms other state-of-the-art methods.

Talk 2: Manifold Identification and Second-Order Algorithms for l1-Regularization on the Stiefel Manifold
Speaker: Shixiang Chen
Abstract: In this talk, we will discuss manifold identification for the l1-regularization problem on the Stiefel manifold. First, we will demonstrate that the intersection of the identified manifold with the Stiefel manifold forms a submanifold. Building on this, we will propose a novel second-order retraction-based algorithm specifically designed for the intersected submanifold. Numerical experiments confirm that the new algorithm exhibits superlinear convergence.

Talk 3: A Riemannian Accelerated Proximal Gradient Method
Speaker: Shuailing Feng
Abstract: Riemannian accelerated gradient methods have been widely studied for smooth problems, but whether accelerated proximal gradient methods for nonsmooth composite problems on Riemannian manifolds can achieve theoretical acceleration remains unclear. Moreover, existing Riemannian accelerated gradient methods address the geodesically convex and geodesically strongly convex cases separately. In this work, we introduce a unified Riemannian accelerated proximal gradient method with a rigorous convergence rate analysis for optimization problems of the form $F(x) = f(x) + h(x)$ on manifolds, where $f$ is either geodesically convex or geodesically strongly convex, and $h$ is weakly retraction-convex. Our analysis shows that the proposed method achieves acceleration under appropriate conditions. Additionally, we introduce a safeguard mechanism to ensure the convergence of the Riemannian accelerated proximal gradient method in non-convex settings. Numerical experiments demonstrate the effectiveness and theoretical acceleration of our algorithms.

Speakers
Wen Huang

Shixiang Chen

Assistant Professor, University of Science and Technology of China
I work on nonconvex optimization.

Shuailing Feng
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 256 3518 Trousdale Pkwy, 256, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11O: Riemannian geometry, optimization, and applications
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Riemannian geometry, optimization, and applications
Chair: Wen Huang
Cluster: Optimization on Manifolds

Talk 1: Strong Convexity of Sets in Riemannian Manifolds
Speaker: Damien Scieur
Abstract: Convex curvature properties are important in designing and analyzing convex optimization algorithms in the Hilbertian or Riemannian settings. In the case of the Hilbertian setting, strongly convex sets are well studied. Herein, we propose various definitions of strong convexity for uniquely geodesic sets in a Riemannian manifold. We study their relationship, propose tools to determine the geodesic strongly convex nature of sets, and analyze the convergence of optimization algorithms over those sets. In particular, we demonstrate that the Riemannian Frank-Wolfe algorithm enjoys a global linear convergence rate when the Riemannian scaling inequalities hold.

Talk 2: Velocity-Based Karcher Mean on the Special Orthogonal Group: A Generalized Karcher Mean for Riemannian Manifolds
Speaker: Zhifeng Deng
Abstract: The special orthogonal group SO(n) consists of orthogonal matrices with determinant 1 that realize rotations on n variables. It arises in many important applications, especially in computer vision, where rotations of a coordinate frame are used to describe the spatial motion of an object. Averaging a given data set is one of the most important computational tasks, and the Karcher mean formulation in a metric space, together with its associated algorithms, is naturally adapted to the averaging task on SO(n) because the Riemannian structure equips SO(n) with a metric. Unfortunately, the distance induced by the Riemannian metric is not differentiable at cut loci, which makes the Karcher mean formulation on SO(n), and on general manifolds, a non-convex optimization problem. In this presentation, we investigate the consequences of this non-convexity, with the circle SO(2) as an example, and address the key issue of the discontinuous Riemannian logarithm that retrieves velocities in the tangent space from points in the manifold. We propose a generalized Karcher mean formulation for Riemannian manifolds, namely the velocity-based Karcher mean, to overcome the discontinuity of retrieving velocities. This generalized Karcher mean subsumes the original Karcher mean and admits extra points as velocity-based Karcher means if they emanate (non-minimal) geodesics to the data set with velocities summing to zero. We then develop the algorithmic primitives, including a nearby matrix logarithm, to solve the velocity-based Karcher mean on SO(n) in a smooth manner, so that the computed mean depends smoothly on the data set and convergence towards a computed mean can be controlled by the velocities emanating from the initial guess. Finally, numerical experiments demonstrate the advantages of the velocity-based Karcher mean on SO(n) in applications with smooth constraints.

Talk 3: Riemannian Federated Learning via Averaging Gradient Stream
Speaker: Zhenwei Huang
Abstract: In recent years, federated learning has garnered significant attention as an efficient and privacy-preserving distributed learning paradigm. In the Euclidean setting, federated averaging (FedAvg) and its variants are a class of efficient algorithms for expected (empirical) risk minimization. In this talk, we introduce and analyze a Riemannian federated averaging gradient stream (RFedAGS) algorithm, a generalization of FedAvg to problems defined on a Riemannian manifold. Under standard assumptions, the convergence rate of RFedAGS with fixed step sizes is proven to be sublinear for an approximate stationary solution. If decaying step sizes are used, global convergence is established. Furthermore, assuming that the objective obeys the Riemannian Polyak-Łojasiewicz property, the optimality gaps generated by RFedAGS with fixed step sizes decrease linearly up to a small upper bound; if decaying step sizes are used, the gaps vanish sublinearly. Unlike existing Riemannian federated learning approaches, whose convergence analyses allow only one agent or a single local update step, the proposed RFedAGS allows multiple agents and multiple-step local updates, and thus theoretically guarantees a reduction in outer iterations and therefore in communication costs. Numerical simulations conducted on synthetic and real-world data demonstrate the performance of the proposed RFedAGS.

Speakers
WH

Wen Huang

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
DS

Damien Scieur

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
ZD

Zhifeng Deng

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
ZH

Zhenwei Huang

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 258 3518 Trousdale Pkwy, 258, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11P: Advanced Computational Solver 2
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Advanced Computational Solver 2
Chair: Jingyi Wang
Cluster: Computational Software

Talk 1: Implicit Scenario Reduction for Superquantile-Constrained Optimization: Applications to Energy Systems
Speaker: Jake Roth
Abstract: We discuss an efficient and scalable second-order solver for solving large-scale optimization problems with superquantile (aka conditional value at risk) constraints based on the augmented Lagrangian and semismooth Newton method framework. Unlike empirical risk models, superquantile models have non-separable constraints that make typical first-order algorithms difficult to scale. In contrast, our computational approach scales well in terms of the number of training data due to an implicit first- and second-order sparsity associated with the superquantile operator. In particular, only a fraction of the set of scenarios contributes second-order information, resulting in computations that can be performed in a reduced space. When evaluating the risk of each scenario is expensive, the relative cost of second-order information is diminished. Our solver is expected to help control the risk of adverse events for safety-critical applications in the power grid.
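Editor's illustration: the snippet below computes an empirical superquantile (CVaR) via the Rockafellar-Uryasev formula and returns the indices of the tail scenarios, i.e., the small fraction of scenarios that would carry the second-order information mentioned above. It is a generic sketch, not the authors' solver, and all names are hypothetical.

import numpy as np

def empirical_superquantile(losses, alpha):
    """CVaR_alpha of scenario losses via t* + E[(L - t*)_+] / (1 - alpha),
    with t* the empirical alpha-quantile; also returns the 'active' tail scenarios."""
    losses = np.asarray(losses, dtype=float)
    t = np.quantile(losses, alpha)
    cvar = t + np.maximum(losses - t, 0.0).mean() / (1.0 - alpha)
    active = np.flatnonzero(losses >= t)       # only these scenarios shape the risk term
    return cvar, active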

Talk 2: Exponential cone optimization with COPT
Speaker: Joachim Dahl
Abstract: Recently there has been increasing interest in conic optimization over non-symmetric cones. One important example is the exponential cone, which has recently been implemented in leading commercial solvers including COPT. In this talk we give an overview of the algorithm implemented in COPT and present numerical results showing the advantage and feasibility of embedding conic representable sets involving exponentials into a conic algorithm.

Talk 3: Randomized Algorithms for Bayesian Inversion and Data Acquisition in Predictive Digital Twins
Speaker: Vishwas Rao
Abstract: A digital twin couples computational models with a physical counterpart to create a system that is dynamically updated through bidirectional data flows as conditions change. Data Assimilation and Optimal Experimental Design (OED) provide a systematic means of updating the computational model and acquiring information as the physical system evolves. This talk will describe scalable preconditioners and solvers for Bayesian inversion using different randomization techniques. The proposed techniques are amenable to parallelization and drastically reduce the required number of model evaluations. We also develop theoretical guarantees on the condition number. Additionally, the talk will describe connections between OED for Bayesian linear inverse problems and the column subset selection problem (CSSP) in matrix approximation and derive bounds, both lower and upper, for the D-optimality criterion via CSSP for the independent and colored noise cases. We demonstrate the performance and effectiveness of our methodology on a variety of test problems such as the Burgers and quasi-geostrophic equations.

Speakers
JR

Jake Roth

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
JD

Joachim Dahl

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
VR

Vishwas Rao

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 106 3501 Trousdale Pkwy, 106, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11Q: Online Learning and Optimization over Symmetric Cones
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Online Learning and Optimization over Symmetric Cones
Chair: Antonios Varvitsiotis
Cluster: Conic and Semidefinite Optimization

Talk 1: Semidefinite Network Games: Multiplayer Minimax and Complementarity Problems
Speaker: Constantin Ickstadt
Abstract: Network games provide a powerful framework for modeling agent interactions in networked systems, where players are represented by nodes in a graph, and their payoffs depend on the actions taken by their neighbors. Extending the framework of network games, in this work we introduce and study semidefinite network games. In this model, each player selects a positive semidefinite matrix with trace equal to one, known as a density matrix, to engage in a two-player game with every neighboring node. The player’s payoff is the cumulative payoff acquired from these edge games. Semidefinite network games are of interest because they provide a simplified framework for representing quantum strategic interactions. Initially, we focus on the zero-sum setting, where the sum of all players’ payoffs is equal to zero. We establish that in this class of games, Nash equilibria can be characterized as the projection of a spectrahedron. Furthermore, we demonstrate that determining whether a game is a semidefinite network game is equivalent to deciding if the value of a semidefinite program is zero. Beyond the zero-sum case, we characterize Nash equilibria as the solutions of a semidefinite linear complementarity problem.

Talk 2: Symmetric Cone Eigenvalue Optimization: Expressivity and Algorithms through Equilibrium Computation
Speaker: Jiaqi Zheng
Abstract: We investigate eigenvalue optimization problems over symmetric cones, a broad generalization of the matrix eigenvalue optimization problems. We show that symmetric cone eigenvalue optimization problems are highly expressive, capable of modeling a wide range of problems including nearest point problems, the fastest mixing Markov processes, robust regression, and computing the diameter of a convex set. From an algorithmic perspective, we show that these problems can be reformulated as two-player zero-sum or common-interest games over symmetric cones, enabling us to design algorithms for solving them through the lens of equilibrium computation. We implement these algorithms and assess their effectiveness in the aforementioned applications.

Talk 3: Optimistic Online Learning for Symmetric Cone Games
Speaker: Anas Bakarat
Abstract: Optimistic online learning algorithms have led to significant advances in equilibrium computation, particularly for two-player zero-sum games, achieving an iteration complexity of O(1/ϵ) to reach an ϵ-saddle point. These advances have been established in normal-form games, where strategies are simplex vectors, and quantum games, where strategies are trace-one positive semidefinite matrices. We extend optimistic learning to symmetric cone games (SCGs), a class of two-player zero-sum games where strategy spaces are generalized simplices—trace-one slices of symmetric cones. A symmetric cone is the cone of squares of a Euclidean Jordan Algebra; canonical examples include the nonnegative orthant, the second-order cone, the cone of positive semidefinite matrices, and their direct sums—all fundamental to convex optimization. SCGs unify normal-form and quantum games and, as we show, offer significantly greater modeling flexibility, allowing us to model applications such as distance metric learning problems and the Fermat-Weber problem. To compute approximate saddle points in SCGs, we introduce the Optimistic Symmetric Cone Multiplicative Weights Update algorithm and establish an iteration complexity of O(1/ϵ) to reach an ϵ-saddle point. Our analysis builds on the Optimistic Follow-the-Regularized-Leader framework, with a key technical contribution being a new proof of the strong convexity of the symmetric cone negative entropy with respect to the trace-one norm—a result that may be of independent interest.

Speakers
AV

Antonios Varvitsiotis

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 214 3501 Trousdale Pkwy, 214, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11R: Communication-efficient distributed optimization
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Communication-efficient distributed optimization
Chair: Laurent Condat
Cluster: Nonlinear Optimization

Talk 1: Local Exact Diffusion for Decentralized Stochastic Optimization
Speaker: Sulaiman Alghunaim
Abstract: Distributed optimization methods featuring local updates have gained significant attention for their potential to cut down the communication costs in distributed systems. In these algorithms, nodes perform multiple local updates based on their data before exchanging estimation information. Although many studies have explored distributed local methods with centralized network connections, fewer have focused on decentralized networks. In this study, we introduce and examine a locally updated decentralized method named Local Exact-Diffusion (LED). This method allows each node to perform either fixed or random local updates, with communication occurring randomly at each iteration. We establish the convergence of LED in both convex and nonconvex settings for the stochastic online environment. Our convergence rate surpasses those of existing decentralized methods. Specializing the network to a centralized setup, we achieve the state-of-the-art bound for centralized methods. Additionally, we connect LED to various other independently studied distributed methods, including Scaffnew, FedGate, and VRL-SGD. Our analysis extends to these methods, and notably, we demonstrate that Scaffnew achieves a linear speedup property, previously unattained, where the communication complexity is inversely proportional to the number of nodes. This finding shows that Scaffnew can achieve linear speedup and potentially reduce communication overhead. In the strongly convex setting, we further prove that Scaffnew can achieve linear speedup with network-independent step sizes. Lastly, we numerically explore the advantages of local updates in decentralized networks and validate the effectiveness of our proposed method.
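An editor's sketch of the shared pattern (multiple local updates interleaved with randomly triggered communication through a mixing matrix W); it is a simplified illustration, not the exact LED recursion, and all function names are hypothetical.

import numpy as np

def local_steps_with_random_gossip(grads, x0, W, step, p_comm, rounds, seed=0):
    """Each node takes a local gradient step per iteration; with probability
    p_comm all nodes communicate once through the doubly stochastic matrix W."""
    rng = np.random.default_rng(seed)
    X = np.tile(np.asarray(x0, dtype=float), (len(grads), 1))   # one row per node
    for _ in range(rounds):
        for i, grad_i in enumerate(grads):
            X[i] -= step * grad_i(X[i])          # local update on node i's data
        if rng.random() < p_comm:
            X = W @ X                            # random communication (mixing) step
    return X.mean(axis=0)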

Talk 2: In-depth Analysis of Low-rank Matrix Factorisation in a Federated Setting
Speaker: Constantin Philippenko
Abstract: We analyze a distributed algorithm to compute a low-rank matrix factorization on several clients, each holding a local dataset. Considering a power initialization, we rewrite the original smooth non-convex problem into a smooth strongly-convex problem that we solve using a parallel Nesterov gradient descent potentially requiring a single step of communication at the initialization step. We obtain a global-local matrix factorization: one part contains information on features shared by all clients (item embedding), while the other part captures the unique characteristics of each client (user embeddings). We provide a linear rate of convergence of the excess loss, which depends on the truncated condition number; this result improves the rates of convergence given in the literature, which depend on the condition number. We provide an upper bound on the Frobenius-norm error of reconstruction under the power initialization strategy. We complete our analysis with experiments on both synthetic and real data.

Talk 3: LoCoDL: Communication-Efficient Distributed Optimization with Local Training and Compression
Speaker: Laurent Condat
Abstract: In distributed optimization, and even more in federated learning, communication is the main bottleneck. We introduce LoCoDL, a communication-efficient algorithm that leverages the two techniques of Local training, which reduces the communication frequency, and Compression with a large class of unbiased compressors that includes sparsification and quantization strategies. LoCoDL provably benefits from the two mechanisms and enjoys a doubly-accelerated communication complexity, with respect to the condition number of the functions and the model dimension, in the general heterogeneous regime with strongly convex functions.
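To make the phrase "unbiased compressor" concrete, here is the standard rand-k sparsifier, which satisfies E[C(x)] = x; it is generic background for the compressor class mentioned above and not necessarily the choice used in LoCoDL.

import numpy as np

def rand_k(x, k, seed=None):
    """Unbiased random sparsification: keep k random coordinates, scale by d/k."""
    rng = np.random.default_rng(seed)
    d = x.size
    out = np.zeros_like(x, dtype=float)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = (d / k) * x[idx]
    return out                                   # E[rand_k(x, k)] = x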

Speakers
SA

Sulaiman Alghunaim

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
CP

Constantin Philippenko

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
avatar for Laurent Condat

Laurent Condat

Senior Research Scientist, King Abdullah University of Science and Technology (KAUST)
Laurent Condat received a PhD in applied mathematics in 2006 from Grenoble Institute of Technology, Grenoble, France. After a postdoc in the Helmholtz Zentrum Muenchen, Munich, Germany, he was hired in 2008 as a permanent researcher by the French National Center for Scientific Research... Read More →
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 154 3518 Trousdale Pkwy, 154, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11S: Interplay between Online Convex Optimization and Statistics
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Interplay between Online Convex Optimization and Statistics
Chair: Jun-Kun Wang
Cluster: Optimization For Data Science

Talk 1: On the near-optimality of betting confidence sets for bounded means
Speaker: Shubhanshu Shekhar
Abstract: Constructing nonasymptotic confidence intervals (CIs) for the mean of a univariate distribution from independent and identically distributed (i.i.d.) observations is a fundamental task in statistics. For bounded observations, a classical nonparametric approach proceeds by inverting standard concentration bounds, such as Hoeffding's or Bernstein's inequalities. Recently, an alternative betting-based approach for defining CIs and their time-uniform variants, called confidence sequences (CSs), has been shown to be empirically superior to the classical methods. In this paper, we provide theoretical justification for this improved empirical performance of betting CIs and CSs. Our main contributions are as follows: (i) We first compare CIs using the values of their first-order asymptotic widths (scaled by ), and show that the betting CI of Waudby-Smith and Ramdas (2023) has a smaller limiting width than existing empirical Bernstein (EB)-CIs. (ii) Next, we establish two lower bounds that characterize the minimum width achievable by any method for constructing CIs/CSs in terms of certain inverse information projections. (iii) Finally, we show that the betting CI and CS match the fundamental limits, modulo an additive logarithmic term and a multiplicative constant. Overall these results imply that the betting CI (and CS) admit stronger theoretical guarantees than the existing state-of-the-art EB-CI (and CS), both in the asymptotic and finite-sample regimes.
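An editor's toy illustration of the betting construction for [0,1]-valued data: for each candidate mean m, grow a capital process as a product of fair bets 1 + lambda*(X_t - m), and keep in the confidence set those m whose capital never reaches 1/alpha (by Ville's inequality). The constant betting fraction c below is a crude placeholder, not the tuned strategies analyzed in the talk; all names are hypothetical.

import numpy as np

def betting_confidence_set(samples, alpha=0.05, grid=None, c=0.5):
    """Toy betting CI for the mean of [0,1]-valued data: a candidate mean m is
    rejected once its hedged capital process exceeds 1/alpha."""
    x = np.asarray(samples, dtype=float)
    if grid is None:
        grid = np.linspace(0.001, 0.999, 999)
    keep = []
    for m in grid:
        lam_plus = c / m                    # bets that the true mean exceeds m
        lam_minus = c / (1.0 - m)           # bets that the true mean is below m
        k_plus = np.cumprod(1.0 + lam_plus * (x - m))
        k_minus = np.cumprod(1.0 - lam_minus * (x - m))
        capital = 0.5 * k_plus + 0.5 * k_minus      # hedged capital, nonnegative
        if capital.max() < 1.0 / alpha:             # never rejected along the way
            keep.append(m)
    return (min(keep), max(keep)) if keep else (None, None)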

Talk 2: Confidence sequences via online learning
Speaker: Kwang-Sung Jun
Abstract: Confidence sequence provides ways to characterize uncertainty in stochastic environments, which is a widely-used tool for interactive machine learning algorithms and statistical problems including A/B testing, Bayesian optimization, reinforcement learning, and offline evaluation/learning. In these problems, constructing confidence sequences that are tight without losing correctness is crucial since it has a dramatic impact on the performance of downstream tasks. In this talk, I will present how to leverage results from online learning to derive confidence sequences that are provably and numerically tight. First, I will present an implicitly-defined confidence sequence for bounded random variables, which induces an empirical Bernstein-style confidence bound (thus adapts to the variance) and is provably better than the KL divergence-based confidence bound simultaneously, unlike the standard empirical Bernstein bound. Our confidence bound is never vacuous, can be efficiently computed, and provides state-of-the-art numerical performance. Second, I will turn to generalized linear models and how leveraging online learning helps develop (i) improved confidence sets for logistic linear models and (ii) noise-adaptive confidence sets for linear models with sub-Gaussian and bounded noise respectively. These lead to improved regret bounds in (generalized) linear bandit problems. I will conclude with open problems in confidence sequences and the role that online learning plays for them.

Talk 3: Sequential Hypothesis Testing via Online Learning and Optimization
Speaker: Jun-Kun Wang
Abstract: Online convex optimization (a.k.a. no-regret learning) concerns a scenario where an online learner commits to a point at each round before receiving the loss function. The learner's goal is to minimize the regret, defined as the gap between the cumulative losses and that of a clairvoyant who knows the sequence of the loss functions in advance. In this talk, I will first review a very neat known result in the literature that casts non-parametric sequential hypothesis testing as an online convex optimization problem, where an online learner tries to bet whether the null hypothesis is true or false, and a tighter regret bound suggests a faster stopping time to reject the null when the alternative is true. Then, I will show the relevant techniques can be used to design algorithms with strong statistical guarantees with applications such as online detecting LLM-generated texts, auditing fairness, and detecting distribution shifts. After that, I will introduce a new algorithm that overcomes the limitations of the existing methods and potentially leads to a faster rejection time under the alternative while controlling the false positive rate.

Speakers
SS

Shubhanshu Shekhar

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
KJ

Kwang-Sung Jun

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
JW

Jun-Kun Wang

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 155 3518 Trousdale Pkwy, 155, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11T: Matrix optimization and geometry of curved spaces
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Matrix optimization and geometry of curved spaces
Chair: Florentin Goyens
Cluster: Optimization on Manifolds

Talk 1: Wrapped Gaussian distributions on SPD matrices and their estimation
Speaker: Florian Yger
Abstract: A common assumption for Euclidean data is that the underlying distribution is Gaussian, as the Gaussian distribution is both well studied and enables efficient parameter estimation from samples. In this work, we extend the concept of Gaussian distributions to the Riemannian manifold of Symmetric Positive Definite (SPD) matrices, with a focus on generalizing non-isotropic Gaussian distributions for complex statistical modeling in this space. We propose a wrapped Gaussian model, constructed by mapping a Euclidean Gaussian in a tangent space onto the SPD manifold via the exponential map. After defining this wrapped Gaussian distribution, we address the issue of non-identifiability of our model by establishing an equivalence relation among parameter sets that yield the same distribution. We then show that the parameters of a wrapped Gaussian can be estimated from sample data using a maximum likelihood estimator, optimized on a product manifold. Additionally, we reinterpret existing classifiers on the SPD manifold through a probabilistic framework and introduce new probabilistic classifiers based on wrapped Gaussian models. Finally, we present experimental results, both synthetic and real, to evaluate the parameter estimation accuracy and classification performance of our wrapped classifiers.
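A minimal sketch of the "wrapping" construction, assuming the base point is the identity, where the exponential map of the affine-invariant metric reduces to the matrix exponential; sampling at a general base point and the estimation procedure of the talk are not shown. Function names and parameters are illustrative only.

import numpy as np
from scipy.linalg import expm

def sample_wrapped_gaussian_spd(n, sigma=0.3, num=1, seed=None):
    """Sample SPD matrices by pushing symmetric Gaussian tangent vectors at the
    identity through the matrix exponential. Returns an array of shape (num, n, n)."""
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(num):
        A = sigma * rng.standard_normal((n, n))
        V = 0.5 * (A + A.T)          # symmetric tangent vector at the identity
        samples.append(expm(V))      # exp_I(V) = expm(V) is symmetric positive definite
    return np.stack(samples)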

Talk 2: The ultimate upper bound on the injectivity radius of the Stiefel manifold
Speaker: Simon Mataigne
Abstract: We exhibit conjugate points on the Stiefel manifold endowed with any member of the family of Riemannian metrics introduced by Hüper et al. (2021). This family contains the well-known canonical and Euclidean metrics. An upper bound on the injectivity radius of the Stiefel manifold in the considered metric is then obtained as the minimum between the length of the geodesic along which the points are conjugate and the length of certain geodesic loops. Numerical experiments support the conjecture that the obtained upper bound is in fact equal to the injectivity radius. Authors: Pierre-Antoine Absil, Simon Mataigne

Talk 3: Deterministic and Randomized Direct Search on Riemannian Manifolds with Complexity Guarantees
Speaker: Florentin Goyens
Abstract: In this work, we investigate the problem of minimizing a nonconvex objective function defined on a Riemannian manifold, where derivative information is unavailable or impractical to compute. To address this, we consider the direct search methodology—a class of derivative-free optimization algorithms—in the Riemannian setting. We present a Riemannian adaptation of the deterministic direct search method and analyze its performance, deriving a global complexity bound on the number of function evaluations required to reach an approximate critical point. Building on this, we introduce a randomized variant of the Riemannian direct search algorithm, which operates by generating search directions in a random lower-dimensional subspace of the tangent space. Finally, we present numerical experiments to illustrate the effectiveness and computational advantages of both the deterministic and randomized Riemannian direct search methods. Co-authors: Bastien Cavarretta, Clément Royer and Florian Yger.
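An editor's sketch of derivative-free direct search on the unit sphere, using random tangent poll directions and the projection retraction; it mirrors the poll/accept/shrink structure described above but is not the authors' implementation, and the sufficient-decrease constant is arbitrary.

import numpy as np

def direct_search_sphere(f, x0, step=0.5, shrink=0.5, tol=1e-6, iters=200, seed=0):
    """Derivative-free direct search on the unit sphere S^{n-1}."""
    rng = np.random.default_rng(seed)
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        if step < tol:
            break
        improved = False
        for _ in range(2 * x.size):                      # random poll set
            d = rng.standard_normal(x.size)
            d -= (d @ x) * x                             # project onto the tangent space
            d /= np.linalg.norm(d)
            y = x + step * d
            y /= np.linalg.norm(y)                       # projection retraction
            if f(y) < f(x) - 1e-4 * step**2:             # sufficient decrease test
                x, improved = y, True
                break
        if not improved:
            step *= shrink                               # unsuccessful poll: shrink step
    return x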

Speakers
FY

Florian Yger

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
SM

Simon Mataigne

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
avatar for Florentin Goyens

Florentin Goyens

Postdoc, UCLouvain
Researcher in numerical optimization at UCLouvain in Belgium
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 158 3518 Trousdale Pkwy, 158, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11U: Grid Optimization (GO) Competition
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Grid Optimization (GO) Competition
Chair: Jesse Holzer
Cluster: Computational Software

Talk 1: Multi-Period Reserve-Constrained AC Optimal Power Flow on High Performance Computing
Speaker: Yong Fu
Abstract: The power grid is becoming more diverse and integrated with high-level distributed energy resources, creating new grid management challenges: large-scale, nonlinear, and non-convex problem modeling, and complex, time-consuming computation. This work focuses on solutions for integrating operating reserves into system operation scheduling to co-optimize power generation, delivery, and reserve allocation over multiple time periods. The presented multi-period reserve-constrained AC optimal power flow (M-ROPF) problem is challenged by the expanded model scale with linked decision variables across time periods, additional and comprehensive reserve requirements, and the inherent nonlinearity of power systems. We propose a parallel method for a fast, efficient, and reliable solution to M-ROPF. To decompose the problem, we reformulate the original problem by introducing auxiliary bounding variables for both producing and consuming devices, converting the hard, direct ramping-up and ramping-down constraints into soft, indirect real power dispatching boundaries. This reformulation yields a decomposable problem structure that can be solved by augmented Lagrangian relaxation-based decomposition methodologies. We use the alternating direction method of multipliers to decompose the problem into two modules: the single-period reserve-constrained AC optimal power flow (S-RCOPF) module, and the single-device ramping limit (S-DRL) module. The S-RCOPF module co-optimizes power dispatch and reserve allocation of devices by period, while satisfying AC network constraints, system reserve requirements, as well as the auxiliary real power dispatching boundaries for devices. It can be efficiently solved by an accelerated primal-dual interior point method that we developed. The S-DRL module determines the optimal auxiliary real power dispatching boundaries for a device that meets its intertemporal coupling constraints. It can be rapidly solved by quadratic programming. These solver modules can be processed in parallel, ensuring scalability across time periods. The solver guarantees feasibility throughout the iterative process and achieves optimality in a limited time. The proposed parallel method is implemented and verified on an HPC platform. We will discuss multiple technical issues that enhance computational efficiency on multicore resources, such as the task mapping strategy, communication and synchronization of tasks, and scaling with the number of computing processors. The effectiveness of the proposed solution is shown on datasets from the DOE ARPA-E Grid Optimization (GO) Competition Challenge 3, ranging from a small 73-bus system to a large-scale 23,643-bus system, and with different dispatch horizons including the real-time market with 8-hour look-ahead, the day-ahead market with 48-hour look-ahead, and the week-ahead advisory with 7-day look-ahead. The results show a potential to meet the growing and diverse demands of the future electricity grid.

Talk 2: A Parallelized, Adam-Based Solver for Reserve and Security Constrained AC Unit Commitment
Speaker: Samuel Chevalier
Abstract: Power system optimization problems which include the nonlinear AC power flow equations require powerful and robust numerical solution algorithms. Within this sub-field of nonlinear optimization, interior point methods have come to dominate the solver landscape. Over the last decade, however, a number of efficient numerical optimizers have emerged from the field of Machine Learning (ML). One algorithm in particular, Adam, has become the optimizer-of-choice for a massive percentage of ML training problems (including, e.g., the training of GPT-3), solving some of the largest unconstrained optimization problems ever conceived of. Inspired by such progress, this talk presents a parallelized Adam-based numerical solver to overcome one of the most challenging power system optimization problems: security and reserve constrained AC Unit Commitment. The resulting solver, termed QuasiGrad, recently competed in the third ARPA-E Grid Optimization (GO3) competition. In the day-ahead market clearing category (with systems ranging from 3 to 23,643 buses over 48 time periods), QuasiGrad's aggregated market surplus scores were within 5% of the winningest market surplus scores. The QuasiGrad solver is now released as an open-source Julia package, and the internal gradient-based solver (Adam) can easily be substituted for other ML-inspired solvers (e.g., AdaGrad, AdaDelta, RMSProp, etc.).
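For reference, the core Adam recursion that such ML-inspired solvers build on is sketched below in generic form (editor's illustration); the constraint-handling and projection machinery that QuasiGrad wraps around it is omitted, and the parameter values are the usual defaults rather than competition settings.

import numpy as np

def adam(grad, x0, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, iters=1000):
    """Plain Adam iteration for an unconstrained problem with gradient oracle grad."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for t in range(1, iters + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1**t)               # bias correction
        v_hat = v / (1 - beta2**t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x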

Talk 3: Grid Optimization Competition and AC Unit Commitment
Speaker: Jesse Holzer
Abstract: The Grid Optimization (GO) Competition has posed challenge problems combining AC optimal power flow (ACOPF) and unit commitment (UC) into a single problem. UC typically is posed as a mixed integer linear programming (MILP) problem to determine the startup and shut down schedules along with real power output profiles of a set of generators in a power system so as to meet a given load over a planning horizon partitioned into multiple intervals. The physics of power flow through the network is represented by a linear model. In ACOPF, the network physics is modeled with a more accurate but nonlinear AC formulation, permitting the resolution of voltage and reactive power, but the discrete variables of generator commitment are either fixed to prescribed values or relaxed to continuous variables. Thus ACOPF is typically a nonlinear programming (NLP) problem in continuous variables. A day ahead UC with AC power flow modeling uses a more accurate power flow physics model to get a better day ahead commitment schedule. A real time ACOPF with unit commitment for fast start generators uses a more accurate representation of the actual capabilities and costs of the generators at the five to fifteen minute time scale. We describe two classes of solver approaches for a combined UC and ACOPF problem in the illustrative example of a single period UC model with AC power flow and balance constraints, voltage bounds, and limits on apparent power flow over transmission lines. In an NLP-based approach, we model the AC physics through nonlinear equations and use quadratic constraints to force the commitment variables to take integer values. In an MILP-based approach, we model the commitment decisions with binary variables and iteratively apply linear constraints derived from AC power flow subproblems to enforce the AC physics aspects of the model. We show computational results from these two approaches and discuss their advantages and disadvantages.

Speakers
YF

Yong Fu

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
SC

Samuel Chevalier

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
JH

Jesse Holzer

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 108 3501 Trousdale Pkwy, 108, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11V: Nonsmooth Dynamical Systems
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Nonsmooth Dynamical Systems
Chair: Emilio Vilches
Cluster: Nonsmooth Optimization

Talk 1: Understanding accelerated gradient methods through high-order ODEs
Speaker: Samir Adly
Abstract: We study convex optimization problems with smooth objectives by looking at how damped inertial dynamics, driven by the gradient, can be discretized in time. This leads to three well-known accelerated algorithms: Nesterov’s method (NAG), Ravine Accelerated Gradient (RAG), and the IGAHD method introduced by Attouch, Chbani, Fadili, and Riahi. The IGAHD method uses Hessian-driven damping to reduce oscillations that often appear in inertial methods. By analyzing continuous-time models (ODEs) of these algorithms at different levels of resolution (orders $p=0,1,2$), we gain a better understanding of their behavior. All three methods share the same low-resolution model (order 0), which corresponds to the Su–Boyd–Candès ODE for NAG. However, when we go to higher resolution (order 2), we show that NAG and RAG follow different dynamics. This is a new result and shows that NAG and RAG should not be considered equivalent. We also present numerical experiments. In terms of number of iterations, IGAHD performs best. RAG is slightly better than NAG on average. When looking at CPU-time, NAG and RAG are faster than IGAHD. All three methods show similar results when comparing gradient norms.
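For concreteness, the discrete algorithm behind the low-resolution (Su-Boyd-Candès) ODE is Nesterov's accelerated gradient, sketched below for a smooth convex objective with gradient Lipschitz constant L; the Hessian-driven correction term that distinguishes IGAHD is not included. This is an editor's sketch, not code from the talk.

import numpy as np

def nesterov_accelerated_gradient(grad_f, x0, lipschitz, iters=500):
    """NAG for smooth convex minimization: the discrete counterpart of the
    damped inertial gradient dynamics discussed above."""
    step = 1.0 / lipschitz
    x = np.asarray(x0, dtype=float).copy()
    y = x.copy()
    t = 1.0
    for _ in range(iters):
        x_new = y - step * grad_f(y)                         # gradient step at the extrapolated point
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)        # momentum (inertial) extrapolation
        x, t = x_new, t_new
    return x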

Talk 2: A Newton-Like Dynamical System for Nonsmooth and Nonconvex Optimization
Speaker: Juan Guillermo Garrido
Abstract: This work investigates a dynamical system functioning as a nonsmooth adaptation of the continuous Newton method, aimed at minimizing the sum of a primal lower-regular and a locally Lipschitz function, both potentially nonsmooth. The classical Newton method’s second-order information is extended by incorporating the graphical derivative of a locally Lipschitz mapping. Specifically, we analyze the existence and uniqueness of solutions, along with the asymptotic behavior of the system's trajectories. Conditions for convergence and respective convergence rates are established under two distinct scenarios: strong metric subregularity and satisfaction of the Kurdyka-Łojasiewicz inequality.

Talk 3: (Cancelled)
Speaker: Emilio Vilches
Abstract: Cancelled

Speakers
JG

Juan Guillermo Garrido

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
EV

Emilio Vilches

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 110 3501 Trousdale Pkwy, 110, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11W: Distributionally Robust Optimization (DRO) I
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Distributionally Robust Optimization (DRO) I
Chair: Man Yiu Tsang

Talk 1: On the trade-off between distributional belief and ambiguity: Conservatism, finite-sample guarantees, and asymptotic properties
Speaker: Man Yiu Tsang
Abstract: We present a new data-driven trade-off (TRO) approach for modeling uncertainty that serves as a middle ground between the optimistic approach, which adopts a distributional belief, and the pessimistic distributionally robust optimization approach, which hedges against distributional ambiguity. We equip the TRO model with a TRO ambiguity set characterized by a size parameter controlling the level of optimism and a shape parameter representing distributional ambiguity. We first show that constructing the TRO ambiguity set using a general star-shaped shape parameter with the empirical distribution as its star center is necessary and sufficient to guarantee the hierarchical structure of the sequence of TRO ambiguity sets. Then, we analyze the properties of the TRO model, including quantifying conservatism, quantifying bias and generalization error, and establishing asymptotic properties. Specifically, we show that the TRO model could generate a spectrum of decisions, ranging from optimistic to conservative decisions. Additionally, we show that it could produce an unbiased estimator of the true optimal value. Furthermore, we establish the almost-sure convergence of the optimal value and the set of optimal solutions of the TRO model to their true counterparts. We exemplify our theoretical results using stylized optimization problems.

Talk 2: Generalization Bound Analysis of Nonconvex Minimax Optimization and Beyond
Speaker: Siqi Zhang
Abstract: In this work, we systematically investigate the generalization bounds of algorithms that solve nonconvex–(strongly)–concave (NC-SC/NC-C) stochastic minimax optimization, measured by the stationarity of primal functions. We first establish algorithm-agnostic generalization bounds via uniform convergence between the empirical and population minimax problems, thereby deriving sample complexities for achieving generalization. We then explore algorithm-dependent generalization bounds using algorithmic stability arguments. In particular, we introduce a novel stability notion for minimax problems and build its connection to generalization bounds. Consequently, we establish algorithm-dependent generalization bounds for stochastic gradient descent ascent (SGDA) and more general sampling-based algorithms. We will also discuss some extensions of these results to more general settings.

Talk 3: Optimized Dimensionality Reduction for Moment-based Distributionally Robust Optimization
Speaker: Kai Pan
Abstract: Moment-based distributionally robust optimization (DRO) provides an optimization framework to integrate statistical information with traditional optimization approaches. Under this framework, one assumes that the underlying joint distribution of random parameters runs in a distributional ambiguity set constructed by moment information and makes decisions against the worst-case distribution within the set. Although most moment-based DRO problems can be reformulated as semidefinite programming (SDP) problems that can be solved in polynomial time, solving high-dimensional SDPs is still time-consuming. Unlike existing approximation approaches that first reduce the dimensionality of random parameters and then solve the approximated SDPs, we propose an optimized dimensionality reduction (ODR) approach by integrating the dimensionality reduction of random parameters with the subsequent optimization problems. Such integration enables two outer and one inner approximations of the original problem, all of which are low-dimensional SDPs that can be solved efficiently, providing two lower bounds and one upper bound correspondingly. More importantly, these approximations can theoretically achieve the optimal value of the original high-dimensional SDPs. As these approximations are nonconvex SDPs, we develop modified Alternating Direction Method of Multipliers (ADMM) algorithms to solve them efficiently. We demonstrate the effectiveness of our proposed ODR approach and algorithm in solving multiproduct newsvendor and production-transportation problems. Numerical results show significant advantages of our approach regarding computational time and solution quality over the three best possible benchmark approaches. Our approach can obtain an optimal or near-optimal (mostly within 0.1%) solution and reduce the computational time by up to three orders of magnitude. Paper reference: Jiang, S., Cheng, J., Pan, K., & Shen, Z. J. M. (2023). Optimized dimensionality reduction for moment-based distributionally robust optimization. arXiv preprint arXiv:2305.03996.

Speakers
MY

Man Yiu Tsang

Name: Man Yiu (Tim) Tsang Title: Ph.D. Candidate Affiliation: Lehigh University Bio: Man Yiu (Tim) Tsang is a Ph.D. candidate in Industrial and Systems Engineering at Lehigh University under the supervision of Prof. Karmel S. Shehadeh. He obtained his BSc and MPhil in Risk Management... Read More →
SZ

Siqi Zhang

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
avatar for Kai Pan

Kai Pan

Associate Professor, The Hong Kong Polytechnic University
Kai Pan is currently an Associate Professor in Operations Management at the Faculty of Business of The Hong Kong Polytechnic University (PolyU) and the Director of the MSc Program in Operations Management (MScOM). He serves as a Secretary/Treasurer for the INFORMS Computing Socie... Read More →
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 112 3501 Trousdale Pkwy, 112, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11X: Nonsmooth Optimization - Algorithms
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Session: Nonsmooth Optimization - Algorithms
Chair: Dânâ Davar

Talk 1: TRFD: A derivative-free trust-region method based on finite differences for composite nonsmooth optimization
Speaker: Dânâ Davar
Abstract: In this work we present TRFD, a derivative-free trust-region method based on finite differences for minimizing composite functions of the form $f(x)=h(F(x))$, where $F$ is a black-box function assumed to have a Lipschitz continuous Jacobian, and $h$ is a known convex Lipschitz function, possibly nonsmooth. The method approximates the Jacobian of $F$ via forward finite differences. We establish an upper bound for the number of evaluations of $F$ that TRFD requires to find an $\epsilon$-approximate stationary point. For L1 and Minimax problems, we show that our complexity bound reduces to $\mathcal{O}(n\epsilon^{-2})$ for specific instances of TRFD, where $n$ is the number of variables of the problem. Assuming that $h$ is monotone and that the components of $F$ are convex, we also establish a worst-case complexity bound, which reduces to $\mathcal{O}(n\epsilon^{-1})$ for Minimax problems. Numerical results are provided to illustrate the relative efficiency of TRFD in comparison with existing derivative-free solvers for composite nonsmooth optimization. This is a joint work with Geovani Grapiglia. The arXiv manuscript can be found at https://arxiv.org/abs/2410.09165
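The basic building block named above, a forward finite-difference approximation of the Jacobian of the black-box map F, looks roughly as follows (editor's sketch; the step size h and its scaling are illustrative, not the choices analyzed in the paper).

import numpy as np

def forward_difference_jacobian(F, x, h=1e-6):
    """Approximate the Jacobian of F: R^n -> R^m by forward differences,
    using n + 1 evaluations of F."""
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(F(x), dtype=float)
    J = np.empty((f0.size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (np.asarray(F(x + e), dtype=float) - f0) / h
    return J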

Talk 2: An Augmented Lagrangian Primal-Dual Semismooth Newton Method for Multi-Block Composite Optimization
Speaker: Zhanwang Deng
Abstract: In this talk, we develop a novel primal-dual semismooth Newton method for solving linearly constrained multi-block convex composite optimization problems, which include commonly used models in image processing and conic programming. First, a differentiable augmented Lagrangian (AL) function is constructed by utilizing the Moreau envelopes of the nonsmooth functions. It enables us to derive an equivalent saddle point problem and establish the strong AL duality under Slater’s condition. Consequently, a semismooth system of nonlinear equations is formulated to characterize the optimality of the original problem instead of the inclusion-form KKT conditions. We then develop a semismooth Newton method, called ALPDSN, which uses purely second-order steps and a nonmonotone line-search-based globalization strategy. Through a connection to the inexact first-order steps when the regularization parameter is sufficiently large, the global convergence of ALPDSN is established. Under the regularity conditions (partial smoothness, the local error bound, and strict complementarity), we show that both the primal and the dual iteration sequences possess a superlinear convergence rate and provide concrete examples where these regularity conditions are met. Numerical results on image restoration with two regularization terms, the corrected tensor nuclear norm problem, and semidefinite programming on the Mittelmann benchmark are presented to demonstrate the high efficiency and robustness of our ALPDSN. Furthermore, we plan to develop software for solving a class of convex composite optimization problems.

Talk 3: Anderson acceleration in nonsmooth problems: local convergence via active manifold identification
Speaker: Hao Wang
Abstract: Anderson acceleration is an effective technique for enhancing the efficiency of fixed-point iterations; however, analyzing its convergence in nonsmooth settings presents significant challenges. In this paper, we investigate a class of nonsmooth optimization algorithms characterized by the active manifold identification property. This class includes a diverse array of methods such as the proximal point method, proximal gradient method, proximal linear method, proximal coordinate descent method, Douglas-Rachford splitting (or the alternating direction method of multipliers), and the iteratively reweighted L1 method, among others. Under the assumption that the optimization problem possesses an active manifold at a stationary point, we establish a local R-linear convergence rate for the Anderson-accelerated algorithm. Our extensive numerical experiments further highlight the robust performance of the proposed Anderson-accelerated methods.
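A bare-bones Anderson acceleration loop for a fixed-point map g, with finite memory and least-squares mixing of recent residuals, is sketched below as an editor's illustration; the safeguarding and regularization needed in practice, and the manifold-identification analysis of the talk, are omitted.

import numpy as np

def anderson_accelerate(g, x0, memory=5, iters=100, tol=1e-10):
    """Anderson acceleration of the fixed-point iteration x <- g(x)."""
    x = np.asarray(x0, dtype=float)
    X_hist, G_hist = [], []                     # recent iterates and their images under g
    for _ in range(iters):
        gx = g(x)
        X_hist.append(x.copy())
        G_hist.append(gx.copy())
        if len(X_hist) > memory + 1:            # keep a sliding window
            X_hist.pop(0)
            G_hist.pop(0)
        F_hist = [gi - xi for gi, xi in zip(G_hist, X_hist)]   # residuals g(x) - x
        if np.linalg.norm(F_hist[-1]) < tol:
            return gx
        if len(F_hist) == 1:
            x = gx                              # plain fixed-point step on the first pass
            continue
        dF = np.column_stack([F_hist[i + 1] - F_hist[i] for i in range(len(F_hist) - 1)])
        dG = np.column_stack([G_hist[i + 1] - G_hist[i] for i in range(len(G_hist) - 1)])
        gamma, *_ = np.linalg.lstsq(dF, F_hist[-1], rcond=None)  # least-squares mixing weights
        x = gx - dG @ gamma                     # Anderson-mixed update
    return x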

Speakers
ZD

Zhanwang Deng

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
HW

Hao Wang

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Taper Hall (THH) 215 3501 Trousdale Pkwy, 215, Los Angeles, CA 90089

1:15pm PDT

Parallel Sessions 11Y
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Thursday July 24, 2025 1:15pm - 2:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 200 3518 Trousdale Pkwy, 200, Los Angeles, CA 90089

2:30pm PDT

Coffee & Snack Break (Provided)
Thursday July 24, 2025 2:30pm - 3:00pm PDT
Thursday July 24, 2025 2:30pm - 3:00pm PDT
TBA

3:00pm PDT

Parallel Semi-Plenary Talk 4A
Thursday July 24, 2025 3:00pm - 4:00pm PDT
Speakers
TH

Tim Hoheisel

Tim Hoheisel received his doctorate of mathematics from Julius-Maximilians University (Würzburg) under the supervision of Christian Kanzow in 2009. He was a postdoctoral researcher there until 2016. During this time, he was a visiting professor at Heinrich-Heine University (Düsseldorf... Read More →
Thursday July 24, 2025 3:00pm - 4:00pm PDT
Taper Hall (THH) 101 3501 Trousdale Pkwy, 101, Los Angeles, CA 90089

3:00pm PDT

Parallel Semi-Plenary Talk 4B
Thursday July 24, 2025 3:00pm - 4:00pm PDT
Speakers
MH

Mingyi Hong

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 3:00pm - 4:00pm PDT
Taper Hall (THH) 201 3501 Trousdale Pkwy, 201, Los Angeles, CA 90089

4:00pm PDT

Break
Thursday July 24, 2025 4:00pm - 4:15pm PDT
Thursday July 24, 2025 4:00pm - 4:15pm PDT
TBA

4:15pm PDT

Parallel Sessions 12A: Distributionally Robust Optimization (DRO) II
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Distributionally Robust Optimization (DRO) II
Chair: Yiling Zhang

Talk 1: Distributionally robust standard quadratic optimization with Wasserstein ambiguity
Speaker: Daniel de Vicente
Abstract: The standard quadratic optimization problem (StQP) consists of minimizing a quadratic form over the standard simplex. If the quadratic form is neither convex nor concave, the StQP is NP-hard. This problem has many interesting applications ranging from portfolio optimization to machine learning. Sometimes, the data matrix is uncertain but some information about its distribution can be inferred, e.g. the first two moments or else a reference distribution (typically, the empirical distribution after sampling). In distributionally robust optimization, the goal is to optimize against the worst-case distribution in an ambiguity set defined by the above-mentioned characteristics. We will explore two versions: the distributionally robust stochastic StQP, focusing on expectations, and the distributionally robust chance-constrained StQP, both with an ambiguity set based upon a maximal Wasserstein distance to the sampled distribution.
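For the nominal (non-robust) StQP, min x^T Q x over the standard simplex, a basic projected-gradient loop with Euclidean simplex projection is sketched below as an editor's illustration; it only finds stationary points when Q is indefinite and does not address the distributionally robust variants discussed in the talk.

import numpy as np

def project_simplex(v):
    """Euclidean projection onto the standard simplex {x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]                                   # sort descending
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(v.size) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1.0)
    return np.maximum(v + theta, 0.0)

def stqp_projected_gradient(Q, x0=None, step=None, iters=1000):
    """Projected gradient for min x^T Q x over the simplex (stationary point only)."""
    n = Q.shape[0]
    x = np.full(n, 1.0 / n) if x0 is None else np.asarray(x0, dtype=float)
    if step is None:
        step = 1.0 / (2.0 * np.linalg.norm(Q, 2) + 1e-12)  # 1 / Lipschitz constant of the gradient
    for _ in range(iters):
        x = project_simplex(x - step * (2.0 * Q @ x))
    return x, x @ Q @ x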

Talk 2: A Primal Perspective on Distributionally Robust Two-Stage Problems with Integer Recourse
Speaker: Yiling Zhang
Abstract: In this talk, we introduce and study distributionally robust two-stage linear problems with integer recourse, where the objective coefficients are random. The random parameters follow the worst-case distribution belonging to a second-order conic representable ambiguity set of probability distributions. We show that the worst-case recourse objective, under various risk measures, can be formulated as a conic program from a primal perspective. This method also provides additional information on the probability of attaining an integer recourse solution, extending the concept of persistency studied in Bertsimas et al. (2006). Unlike the marginal moment sets used in Bertsimas et al. (2006), the second-order conic representable ambiguity sets in our method offer greater flexibility by incorporating more distributional information. Furthermore, this method enables column-and-constraint generation methods for solving two-stage problems with integer recourse.

Talk 3: Distributionally Robust Nonlinear Optimization
Speaker: Judith Brugman
Abstract: Distributionally robust optimization (DRO) provides a powerful framework for handling uncertainty when only partial information, such as mean, variance and support, is available. Instead of assuming full knowledge of the probability distribution, DRO seeks solutions that perform well under the worst-case distribution within an ambiguity set. While DRO problems can be reformulated as robust optimization (RO) problems, making them more tractable while maintaining theoretical guarantees, solving the resulting RO problem remains challenging. Wiesemann et al. (2014) address this problem for a very rich class of ambiguity sets, but their approach relies on a max-of-linear-functions assumption on the cost function, which limits its applicability. In our work, we extend this approach to a much broader class of cost functions, including all convex and twice continuously differentiable functions. By leveraging the Reformulation-Perspectification Technique with Branch and Bound (RPT-BB) for RO, which combines relaxation-based partitioning with branch-and-bound techniques, we show that DRO problems can be efficiently solved even for highly nonlinear functions. To demonstrate the practical relevance of this approach, I will focus on the appointment scheduling problem, where our method not only generalizes existing results but also improves computational efficiency. I will conclude with a discussion on broader applications of our framework in other domains.

Speakers
DD

Daniel de Vicente

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
YZ

Yiling Zhang

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
JB

Judith Brugman

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 101 3501 Trousdale Pkwy, 101, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12B: Optimization on Manifolds and Geometric Approaches
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Optimization on Manifolds and Geometric Approaches
Chair: Ian McPherson
Cluster: nan

Talk 1: Convergence Rates for Riemannian Proximal Bundle Methods
Speaker: Ian McPherson
Abstract: We propose a novel class of Riemannian proximal bundle methods for optimization on Hadamard manifolds, providing convergence rates for this new approach. Our assumptions are weak and we are able to relax the reliance on exponential maps and parallel transports, requiring only first-order retractions and vector transports. To our knowledge, these are the first non-asymptotic convergence rates for Riemannian proximal bundle methods, extending arguments that achieve optimal rates in the Euclidean case. Moreover, we show that, given exponential maps and parallel transports, we recover exactly the same rates as in the Euclidean setting.

Talk 2: Incremental minimization in nonpositively curved geodesic spaces
Speaker: Ariel Goodwin
Abstract: Subgradient methods for minimizing geodesically convex functions on Hadamard manifolds have gained interest in recent years, but fall short in two respects: their complexity relies unavoidably on a lower curvature bound for the space, and they do not generalize well to metric spaces in the absence of local linearity. Complete geodesic metric spaces of nonpositive curvature, called Hadamard spaces, prove useful in modelling many applications and have a rich geometric structure enabling theoretical and computational aspects of convex optimization. It has recently been shown that a restricted class of functions on Hadamard spaces can be effectively minimized using an iteration resembling a subgradient method, with the same complexity result as the classical Euclidean subgradient method. In this work we propose a related class of functions which we call Busemann convex, admitting a notion of subgradient that is attuned to the geometry of the space. Many functions defined in terms of basic metric quantities are Busemann convex, and their subgradients are readily computed in terms of geodesics. We address the minimization of sums of Busemann convex functions with an incremental subgradient-style method and associated complexity result. To illustrate the algorithm, we numerically compute medians of trees in the BHV phylogenetic tree space. This is joint work with Adrian Lewis, Genaro López-Acedo, and Adriana Nicolae. A preprint is available at https://arxiv.org/abs/2412.06730.
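As a plain Euclidean analogue of the incremental subgradient idea described above, and not the Hadamard-space or BHV tree-space method of the talk, the sketch below minimizes a sum of distance functions (the Euclidean geometric median) by cycling through one summand at a time with a diminishing step size. The function name and step rule are illustrative assumptions.

```python
import numpy as np

def geometric_median_incremental(points, epochs=200):
    """Incremental subgradient method for min_x sum_i ||x - p_i|| in R^d:
    each inner step uses only the subgradient of a single summand."""
    x = points.mean(axis=0)
    k = 1
    for _ in range(epochs):
        for p in points:
            g = x - p
            n = np.linalg.norm(g)
            if n > 1e-12:
                x = x - (1.0 / np.sqrt(k)) * g / n   # unit subgradient, step 1/sqrt(k)
            k += 1
    return x

pts = np.random.default_rng(1).normal(size=(50, 3))
print(geometric_median_incremental(pts))
```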

Talk 3: Interior Riemannian subgradient flow over manifold with boundary
Speaker: Kuangyu Ding
Abstract: We study a nonsmooth nonconvex optimization problem defined over a manifold with boundary, where the feasible set is given by the intersection of the closure of an open set and a smooth manifold. By endowing the open set with a Riemannian metric induced by a barrier function, we obtain a Riemannian subgradient flow—formulated as a differential inclusion—that remains strictly within the interior of the feasible set. This continuous dynamical system unifies two classes of iterative optimization methods, namely the Hessian barrier method and Bregman-type methods, by revealing that these methods can be interpreted as discrete approximations of the continuous flow. We analyze the long-term behavior of the trajectories generated by this dynamical system and show that many properties of the Hessian barrier and Bregman-type methods can be more insightfully understood through those of the continuous trajectory. For instance, the notorious spurious stationary points observed in Bregman-type methods are interpreted as stable equilibria of the dynamical system that do not correspond to true stationary points of the original problem. We prove that these spurious stationary points can be avoided if the strict complementarity condition holds. In the absence of this regularity condition, we propose a random perturbation strategy that ensures the trajectory converges (subsequentially) to an approximate stationary point. Building on these insights, we introduce an iterative Riemannian subgradient method—a form of interior point approach—that generalizes the existing Hessian barrier method for solving nonsmooth nonconvex optimization problems.

Speakers
IM

Ian McPherson

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
AG

Ariel Goodwin

Cornell University Center for Applied Mathematics, Ph.D. Student.
KD

Kuangyu Ding

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 201 3501 Trousdale Pkwy, 201, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12C: Convex Relaxations for Discrete & Combinatorial Optimization
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Convex Relaxations for Discrete & Combinatorial Optimization
Chair: Soobin Choi
Cluster: nan

Talk 1: Rank-one Convexification for Convex Quadratic Optimization with Sign Indicators
Speaker: Soobin Choi
Abstract: We investigate convexification in convex quadratic optimization with sign indicators, where the sign of each continuous variable in the quadratic objective is controlled by a binary variable. First, we derive the convex hull for the epigraph of a quadratic function defined by a rank-one matrix. Using this rank-one convexification, we develop copositive and semi-definite relaxations for general convex quadratic functions. Leveraging these findings, we reformulate the support vector machine problem with 0--1 loss and conduct numerical experiments on synthetic and real instances.

Talk 2: Tightening Quadratic Convex Relaxations for the AC Optimal Transmission Switching Problem
Speaker: Cheng Guo
Abstract: The Alternating Current Optimal Transmission Switching (ACOTS) problem incorporates line switching decisions into the AC Optimal Power Flow framework, offering benefits in reducing costs and enhancing reliability. ACOTS optimization models contain discrete variables and nonlinear, non-convex constraints, which make the problem difficult to solve. We develop strengthened quadratic convex (QC) relaxations for ACOTS, where we tighten the relaxation with several new valid inequalities, including a novel class of on/off cycle-based polynomial constraints that take advantage of the network structure. We demonstrate theoretical tightness, and efficiently incorporate on/off cycle-based polynomial constraints through disjunctive programming-based cutting planes. Our method results in the tightest QC-based ACOTS relaxation to date. We additionally propose a novel maximum spanning tree-based heuristic to improve the computational performance by fixing certain lines to be switched on. Tests on large-scale instances with up to 2,312 buses demonstrate substantial performance gains. This work is under minor revision at INFORMS Journal on Computing, and is available on arXiv: https://arxiv.org/abs/2212.12097

Talk 3: Solving Large Flag Algebra Problems
Speaker: Bernard Lidický
Abstract: The Flag Algebra Method is a tool in extremal combinatorics, which has been very useful for tackling open problems for graphs, hypergraphs, permutations, point sets and others. The method establishes combinatorial inequalities via sums of squares, which are usually obtained by a semidefinite relaxation of the original combinatorial problem. A typical application of the method in graph theory starts with fixing a parameter $n$ followed by constructing a semidefinite program based on the (subgraph) structure of all $n$-vertex graphs. In particular, the number of constraints in the resulting program is at least the number of all $n$-vertex graphs up to isomorphism. These numbers grow very quickly; for example, they are 12346, 274668 and 12005168 for $n$ being 8, 9 and 10, respectively. While solving the resulting program for $n=8$ is an easy task on any modern computer, the program for $n \geq 10$ seems out of reach, mainly because the SDP solvers need too much RAM. Yet solving such programs for as large $n$ as possible is desirable because they typically yield better bounds for the original combinatorial problems. We bypass the aforementioned memory requirements by linear programming with cutting-planes. Although solving semidefinite programs using this approach has no optimality guarantee, it turned out that in many flag algebra instances much less memory is needed to significantly improve bounds on numerous well-known hypergraph Turán problems. This is a joint work with Jan Volec.

Speakers
SC

Soobin Choi

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
CG

Cheng Guo

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
BL

Bernard Lidický

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 208 3501 Trousdale Pkwy, 208, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12D: Sparsity Optimization
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Sparsity Optimization
Chair: Jiachang Liu
Cluster: nan

Talk 1: (Canceled) A PATH-BASED APPROACH TO CONSTRAINED SPARSE OPTIMIZATION
Speaker: Nadav Hallak
Abstract: This talk presents a path-based approach for the minimization of a continuously differentiable function over sparse symmetric sets, which is a hard problem that exhibits a restrictiveness-hierarchy of necessary optimality conditions. To achieve the more restrictive conditions in the hierarchy, contemporary methods require a support optimization oracle that must exactly solve the problem in smaller dimensions. The presented path-based approach achieves a path-based optimality condition, which is well placed in the restrictiveness-hierarchy, and a method to achieve it that does not require a support optimization oracle and, moreover, is projection-free. New results for the regularized linear minimization problem over sparse symmetric sets and their implications are also presented.

Talk 2: Scalable First-order Method for Certifying Optimal k-Sparse GLMs
Speaker: Jiachang Liu
Abstract: This paper investigates the problem of certifying optimality for sparse generalized linear models (GLMs), where sparsity is enforced through an l0 cardinality constraint. While branch-and-bound (BnB) frameworks can certify optimality by pruning nodes using dual bounds, existing methods for computing these bounds are either computationally intensive or exhibit slow convergence, limiting their scalability to large-scale problems. To address this challenge, we propose a first-order proximal gradient algorithm designed to solve the perspective relaxation of the problem within a BnB framework. Specifically, we formulate the relaxed problem as a composite optimization problem and demonstrate that the proximal operator of the non-smooth component can be computed exactly in log-linear time complexity, eliminating the need to solve a computationally expensive second-order cone program. Furthermore, we introduce a simple restart strategy that enhances convergence speed while maintaining low per-iteration complexity. Extensive experiments on synthetic and real-world datasets show that our approach significantly accelerates dual bound computations and is highly effective in providing optimality certificates for large-scale problems. This is joint work with Andrea Lodi and Soroosh Shafiee. A preprint of this work can be found at https://arxiv.org/abs/2502.09502.
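The talk's subproblem solver is specific to the perspective relaxation, but its basic ingredients, a proximal gradient (FISTA-type) loop plus a simple restart rule, can be illustrated on a generic l1-regularized least-squares problem. This is only an assumed stand-in: soft-thresholding replaces the paper's perspective-relaxation proximal operator, and the function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_restart(A, b, lam, iters=500):
    """FISTA with a function-value restart for min 0.5||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1]); y = x.copy(); t = 1.0
    f_old = np.inf
    for _ in range(iters):
        x_new = soft_threshold(y - (A.T @ (A @ y - b)) / L, lam / L)
        f = 0.5 * np.linalg.norm(A @ x_new - b) ** 2 + lam * np.abs(x_new).sum()
        if f > f_old:                          # restart momentum when progress stalls
            t, y = 1.0, x.copy()
            continue
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)
        x, t, f_old = x_new, t_new, f
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 200))
b = A @ (rng.standard_normal(200) * (rng.random(200) < 0.05))
print(np.count_nonzero(fista_restart(A, b, lam=0.1)))
```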

Talk 3: Sparse quadratic optimization via cardinality constraints
Speaker: Ademir Ribeiro
Abstract: We study the sparse regularized quadratic optimization problem where the sparsity level is controlled by means of cardinality constraints. However, due to this kind of constraint (which is neither continuous nor convex), it is very difficult to solve the problem directly. Recently, several researchers have addressed this type of problem using a common strategy: a continuous relaxation reformulation of the problem. In this work we propose a different approach in which the feasible set of the problem is decomposed into simpler sets, all of them meeting the cardinality constraint. Then, in order to overcome the combinatorial difficulty induced by this decomposition, we make use of an auxiliary and simple problem of maximizing a concave piecewise quadratic function. The solution of this problem, obtained by a subgradient method, is then used to find the solution of the original problem. Numerical experiments show that this strategy may be successful for a significant number of problems. Throughout this work we establish nonlinear optimization results in order to provide a rigorous mathematical analysis of the ideas involving the reformulation of the problem, the proposed method, and its convergence properties. References: Beck A, Eldar YC (2013) Sparsity constrained nonlinear optimization: Optimality conditions and algorithms. SIAM J. Optim. 23(3):1480–1509. Nesterov Y (2004) Introductory Lectures on Convex Optimization – Basic Course (Kluwer Academic Publishers). Vreugdenhil R, Nguyen VA, Eftekhari A, Esfahani PM (2021) Principal component hierarchy for sparse quadratic programs. Meila M, Zhang T, eds., Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, 10607–10616 (PMLR).
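For context, a standard baseline for cardinality-constrained quadratic problems, in the spirit of the Beck and Eldar reference cited above and not the decomposition approach proposed in the talk, is projected gradient with hard thresholding. The sketch below is an assumed illustration with ad hoc names and a convex quadratic objective.

```python
import numpy as np

def hard_threshold(v, s):
    """Projection onto {x : ||x||_0 <= s}: keep the s largest-magnitude entries."""
    x = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-s:]
    x[idx] = v[idx]
    return x

def iht_quadratic(Q, c, s, step=None, iters=1000):
    """Iterative hard thresholding for min 0.5 x^T Q x + c^T x s.t. ||x||_0 <= s,
    with Q symmetric positive semidefinite."""
    if step is None:
        step = 1.0 / np.linalg.norm(Q, 2)      # 1 / Lipschitz constant of the gradient
    x = np.zeros_like(c)
    for _ in range(iters):
        x = hard_threshold(x - step * (Q @ x + c), s)
    return x

rng = np.random.default_rng(0)
B = rng.standard_normal((30, 20)); Q = B.T @ B; c = rng.standard_normal(20)
print(iht_quadratic(Q, c, s=3))
```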

Speakers
NH

Nadav Hallak

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
avatar for Jiachang Liu

Jiachang Liu

Assistant Research Professor, Cornell University
I am an assistant research professor (postdoc) at the Center for Data Science for Enterprise and Society (CDSES) at Cornell University. My hosts are Professor Andrea Lodi and Professor Soroosh Shafiee. Prior to joining Cornell, I completed my Ph.D. in Electrical and Computer Engineering... Read More →
AR

Ademir Ribeiro

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 210 3501 Trousdale Pkwy, 210, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12E: Sparse and Low-rank Optimization
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Sparse and Low-rank Optimization
Chair: Ishy Zagdoun
Cluster: nan

Talk 1: On solving a rank regularized minimization problem via equivalent factorized column-sparse regularized models
Speaker: Wenjing Li
Abstract: The rank regularized minimization problem is an ideal model for the low-rank matrix completion/recovery problem. The matrix factorization approach can transform the high-dimensional rank regularized problem into a low-dimensional factorized column-sparse regularized problem. The latter can greatly facilitate fast computations in applicable algorithms, but needs to overcome the simultaneous non-convexity of the loss and regularization functions. In this talk, we consider the factorized column-sparse regularized model. Firstly, we optimize this model with bound constraints, and establish a certain equivalence between the optimized factorization problem and the rank regularized problem. Further, we strengthen the optimality condition for stationary points of the factorization problem and define the notion of a strong stationary point. Moreover, we establish the equivalence between the factorization problem and a nonconvex relaxation of it in the sense of global minimizers and strong stationary points. To solve the factorization problem, we design two types of algorithms and give an adaptive method to reduce their computational cost. The first algorithm is designed from the relaxation point of view, and after finitely many iterations its iterates inherit some properties of global minimizers of the factorization problem. We analyze the convergence of its iterates to a strong stationary point. The second algorithm is designed for directly solving the factorization problem. We improve the PALM algorithm introduced by Bolte et al. (Math Program Ser A 146:459-494, 2014) for the factorization problem and give improved convergence results. Finally, we conduct numerical experiments to show the promising performance of the proposed model and algorithms for low-rank matrix completion. The corresponding publication is Math Program Ser A, 2024, doi: 10.1007/s10107-024-02103-1.

Talk 2: Convergence of accelerated singular value shrinkage algorithm for nonconvex low-rank matrix regularization problem
Speaker: LIU Yanyan
Abstract: This paper develops a new framework to design and analyse the singular value shrinkage algorithm and its variants, built on the notion of the singular value shrinkage operator, for the nonconvex low-rank regularization problem. The truncation technique, Nesterov's acceleration, and the heavy-ball method are chosen to accelerate the traditional singular value shrinkage algorithm. We establish their convergence to an approximate true low-rank solution of the nonconvex regularization problem under a restricted isometry condition and some mild parameter assumptions. Numerical results based on synthetic data and real data show that the proposed algorithms are competitive with state-of-the-art algorithms in terms of efficiency and accuracy.
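The basic (convex, unaccelerated) singular value shrinkage operator on which such variants build is standard; a minimal sketch for nuclear-norm-regularized matrix completion is given below. This is only an assumed baseline: the talk's nonconvex regularizers, truncation, and acceleration are not reproduced, and the parameter values are arbitrary.

```python
import numpy as np

def singular_value_shrinkage(X, tau):
    """Soft-threshold the singular values: prox of tau*||.||_* at X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def svt_completion(M, mask, tau=5.0, step=1.0, iters=300):
    """Plain singular value shrinkage iteration for
    min tau*||X||_* + 0.5*||P_Omega(X - M)||_F^2."""
    X = np.zeros_like(M)
    for _ in range(iters):
        grad = mask * (X - M)                            # gradient of the data-fit term
        X = singular_value_shrinkage(X - step * grad, step * tau)
    return X

rng = np.random.default_rng(0)
M = rng.standard_normal((30, 5)) @ rng.standard_normal((5, 30))   # rank-5 ground truth
mask = rng.random(M.shape) < 0.5
print(np.linalg.norm(mask * (svt_completion(M, mask) - M)))
```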

Talk 3: Projecting onto a Capped Rotated Second-Order Cone for Sparse Optimization
Speaker: Ishy Zagdoun
Abstract: This paper presents a closed-form expression for the projection onto a capped rotated second-order cone, a convex set that emerges as part of the Boolean relaxation in sparse regression problems and, more broadly, in the perspective relaxation of mixed-integer nonlinear programs (MINLP) with binary indicator variables. The closed-form expression is divided into three cases, one of which simplifies to the projection onto a standard second-order cone (a widely used projection with a well-known solution involving three additional cases). The nontrivial solutions for the other two cases include the necessary and sufficient conditions for when the projection lies along the intersection of the rotated cone and a facet of a box. The ability to rapidly compute the projection onto a capped rotated second-order cone facilitates the development of efficient methods for solving the continuous relaxation of sparse optimization problems. These problems typically involve a Cartesian product of many such sets. Important machine learning tasks that benefit from this closed form expression include solving the continuous relaxation of a sparse regression using projected gradient methods and the accelerated variant of this method (FISTA). Since the closed-form expression is derived in a general manner, it can be seamlessly extended to regression problems with group sparsity constraints, involving cones of a dimension beyond the three-dimensional cone used in standard sparsity penalties and constraints.
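For reference, the projection onto the standard (uncapped, unrotated) second-order cone mentioned in the abstract has the well-known three-case form sketched below; the paper's closed-form projection onto the capped rotated cone is not reproduced here, and the function name is an illustrative assumption.

```python
import numpy as np

def project_soc(v, s):
    """Projection of (v, s) onto the second-order cone {(x, t) : ||x||_2 <= t}."""
    nv = np.linalg.norm(v)
    if nv <= s:                  # case 1: already inside the cone
        return v.copy(), s
    if nv <= -s:                 # case 2: inside the polar cone, project to the origin
        return np.zeros_like(v), 0.0
    alpha = (nv + s) / 2.0       # case 3: project onto the boundary of the cone
    return alpha * v / nv, alpha

print(project_soc(np.array([3.0, 4.0]), 1.0))   # lands on the cone boundary
```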

Speakers
avatar for Wenjing Li

Wenjing Li

Associate Researcher, Harbin Institute of Technology
Name: Dr. Wenjing Li Title: Associate Researcher Affiliation: Harbin Institute of Technology Bio: Dr. Wenjing Li is an associate researcher at Harbin Institute of Technology. Under the guidance of Prof. Wei Bian, she obtained her Master's and PhD degrees from the School of Mathematics... Read More →
LY

LIU Yanyan

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 212 3501 Trousdale Pkwy, 212, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12F: PDE-Constrained Optimization and Optimal Control
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: PDE-Constrained Optimization and Optimal Control
Chair: Henrik Wyschka
Cluster: nan

Talk 1: Numerical Solution of p-Laplace Problems for Shape Optimization
Speaker: Henrik Wyschka
Abstract: Shape optimization constrained by partial differential equations is a vibrant field of research with high relevance for industrial-grade applications. Recent developments suggest that using a p-harmonic approach to determine descent directions is superior to classical Hilbert space methods. This applies in particular to the representation of kinks and corners in the occurring shapes. However, the approach requires the efficient solution of a p-Laplace problem in each descent step. Therefore, we extend an algorithm based on an interior-point method that does not resort to homotopy techniques for high orders p. Further, we discuss modifications for the limit setting. A key challenge in this regard is that the Lipschitz deformations obtained as solutions in the limit setting are in general non-unique. Thus, we focus on solutions which are in a sense limits of solutions for finite p and aim to preserve mesh quality throughout the optimization. Building upon this work, we also aim to reduce the number of outer iterations, and thus calls of the algorithm, by proposing a trust-region method. Due to the structure of the algorithm for finite p, we are able to naturally introduce a constraint on the gradient of the solution. Consequently, the obtained deformation fields also satisfy a trust radius in terms of the Lipschitz topology.

Talk 2: A Variational and Adjoint Calculus for Optimal Control of the Generalized Riemann Problem for Hyperbolic Systems of Conservation Laws
Speaker: Jannik Breitkopf
Abstract: In this talk, we analyze optimal control problems for quasilinear strictly hyperbolic systems of conservation laws where the control is the initial state of the system. The problem is interesting, for example, in the context of fluid mechanics or traffic flow modelling. Similar problems for scalar conservation laws have already been studied. However, the case of hyperbolic systems is more involved due to the coupling of the characteristic fields. We begin our analysis by considering the Generalized Riemann Problem, which has a piecewise smooth initial state with exactly one discontinuity. This is a natural choice since it is well known that solutions to hyperbolic conservation laws generally develop discontinuities even for smooth data. For piecewise $C^1$ initial data we obtain the existence, uniqueness and stability of an entropy solution by a careful fixed point argument built on the associated Riemann Problem with piecewise constant initial states. The construction yields insights into the structure and regularity of the solution and provides a foundation to derive differentiability results of the control-to-state mapping. The entropy solution is piecewise $C^1$. Its smooth parts are separated by $C^2$ curves which are either shock curves or boundaries of rarefaction waves. In a subsequent step, we show that these curves depend differentiably on the initial state. This allows the transformation to a fixed space-time domain as a reference space. In this reference space, we can show that the transformed solution depends differentiably on the initial state in the topology of continuous functions. For this, a detailed knowledge of the structure of the solution and the behaviour of the shock curves is crucial. As an immediate consequence, the differentiability of tracking type functionals for the optimal control problem follows. Finally, we investigate the adjoint problem as an efficient way to compute the gradient of the objective functional. The adjoint problem is a linear system of transport equations with a discontinuous coefficient and possibly discontinuous terminal data. In general, problems of this kind do not admit unique solutions. We derive interior boundary conditions to characterize the correct reversible solution. This is a joint work with Stefan Ulbrich.

Talk 3: A Variational Calculus for Optimal Control of Networks of Scalar Conservation or Balance Laws
Speaker: Marcel Steinhardt
Abstract: Networks of scalar conservation or balance laws provide models for vehicular traffic flow, supply chains or transmission of data. These networks usually consist of initial boundary value problems (IBVPs) of scalar conservation or balance laws on every edge coupled by node conditions. For the optimal control of solutions a variational calculus is desirable that implies differentiability of objective functionals w.r.t. controls. In the last decade research on IBVPs successfully introduced a variational calculus which implies differentiability of objective functionals of tracking type and also yields an adjoint based gradient representation for the functional. This talk presents recent progress in an extension of these results to networks of scalar conservation or balance laws. Regarding node conditions we introduce a framework for their representation compatible with the known approach on single edges which allows us to extend the results such as continuous Fréchet differentiability for functionals of tracking-type and an adjoint based gradient representation on the network. Joint work with Stefan Ulbrich

Speakers
avatar for Henrik Wyschka

Henrik Wyschka

University of Hamburg
JB

Jannik Breitkopf

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
MS

Marcel Steinhardt

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 156 3518 Trousdale Pkwy, 156, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12G: PDE-Constrained Optimization
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: PDE-Constrained Optimization
Chair: Zhexian Li
Cluster: nan

Talk 1: Robust Topology Optimisation of Electric Machines using Topological Derivatives
Speaker: Theodor Komann
Abstract: Designing high-performance electrical machines that remain efficient and reliable under uncertain material and operating conditions is important for industrial applications. In this paper, we present a robust topology optimization framework with PDE constraints to address this challenge. We formulate the robust optimisation problem as a min-max problem, where the inner maximisation considers the worst case under predefined uncertainties, and the outer minimisation searches for a design that remains robust to these variations using the topological derivative. A level set function represents the shape of the domain, allowing for arbitrary perturbations. We use a theorem of Clarke to compute subgradients of the worst-case function, thereby ensuring efficient solution of the min-max problem. Finally, numerical results on a two-material permanent magnet synchronous machine illustrate both the effectiveness of the method and the improved performance of designs that account for uncertainty. Theodor Komann, TU Darmstadt. Joint work with Peter Gangl and Nepomuk Krenn, RICAM Linz, and Stefan Ulbrich, TU Darmstadt.

Talk 2: Generalized Nash equilibrium problems for linear hyperbolic PDE-constrained games
Speaker: Marcelo Bongarti
Abstract: The concept of Nash equilibrium is fundamental to a wide range of applications, spanning fields from particle mechanics to micro and macroeconomics. However, much of the existing literature focuses on finite-dimensional settings. In this talk, we draw on energy markets coupled with transport dynamics to motivate the study of multi-objective optimization problems with linear hyperbolic PDE constraints. We present recent results on the existence and characterization of Nash equilibria, emphasizing optimality conditions as a framework for understanding such solutions.

Talk 3: Bilinear optimal control of advection-diffusion-reaction equations
Speaker: Zhexian Li
Abstract: We consider bilinear optimal control of advection-diffusion-reaction equations, where control arises as the velocity in the advection term. Existing works approached the problem by discretizing the ADR equation in space and time and solving the resulting finite-dimensional optimization. In this study, we treat the advection term as a forcing term and the equation becomes a reaction-diffusion equation. Then, we apply the Fourier transform to reformulate the PDE as infinitely many decoupled ODEs for which optimal control can be derived. The resulting optimal control is in an integral representation in the complex plane and can be numerically solved efficiently. Numerical experiments on canonical examples show that the optimality gap under our approach is smaller than existing approaches that discretize the equation.

Speakers
TK

Theodor Komann

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
MB

Marcelo Bongarti

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
ZL

Zhexian Li

Graduate Research Assistant, University of Southern California
Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 114 3501 Trousdale Pkwy, 114, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12H: Neural Networks and Optimization
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Neural Networks and Optimization
Chair: Yiping Lu
Cluster: nan

Talk 1: Towards Quantifying the Hessian structure of Neural Networks
Speaker: Yushun Zhang
Abstract: Empirical studies reported that the Hessian of neural networks exhibits a near-block-diagonal structure, yet its theoretical foundation remains unclear. In this work, we provide a rigorous theoretical analysis of the Hessian structure of NNs at random initialization. We study linear models and 1-hidden-layer networks with the mean-square (MSE) loss and the Cross-Entropy (CE) loss for classification tasks. By leveraging random matrix theory, we compare the limit distributions of the diagonal and off-diagonal Hessian blocks and find that the block-diagonal structure arises as $C \rightarrow \infty$, where $C$ denotes the number of classes. Our findings reveal that $C$ is a primary driver of the block-diagonal structure. These results may shed new light on the Hessian structure of large language models (LLMs), which typically operate with a large $C$, often exceeding $10^4$ or $10^5$.
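A crude way to probe such structure numerically is to assemble the Hessian of a very small network by finite differences and compare block norms. The sketch below is only an assumed illustration: all names (unpack, loss, numerical_hessian) are ad hoc, the partition is by layer rather than the class-indexed blocks analysed in the talk, and the dimensions are far too small to exhibit the large-C asymptotics.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, C, n = 4, 3, 5, 64                    # input dim, hidden width, outputs, samples
X = rng.standard_normal((n, d))
Y = rng.standard_normal((n, C))             # random regression targets for an MSE loss

def unpack(theta):
    return theta[:d * h].reshape(d, h), theta[d * h:].reshape(h, C)

def loss(theta):
    W1, W2 = unpack(theta)
    return 0.5 * np.mean((np.tanh(X @ W1) @ W2 - Y) ** 2)

def numerical_hessian(f, theta, eps=1e-4):
    p = theta.size
    H = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            ei, ej = np.eye(p)[i] * eps, np.eye(p)[j] * eps
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * eps * eps)
    return H

theta0 = rng.standard_normal(d * h + h * C) * 0.5
H = numerical_hessian(loss, theta0)
k = d * h                                   # boundary between the two parameter blocks
print("diag block norms:", np.linalg.norm(H[:k, :k]), np.linalg.norm(H[k:, k:]))
print("off-diag block norm:", np.linalg.norm(H[:k, k:]))
```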

Talk 2: Multiscale Behavior of Gradient Descent at the Edge of Stability: Central Flow as a Boundary Layer
Speaker: Yiping Lu
Abstract: Understanding optimization in deep learning is challenging due to complex oscillatory dynamics known as the “edge of stability.” In this regime, gradient flow no longer serves as an accurate proxy for gradient descent. In this talk, we adopt a fast-slow differential equation approach to characterize both the oscillatory dynamics and the self-stabilizing behavior of gradient descent when operating at a large learning rate. Using singular perturbation theory, we describe the behavior near stationary manifolds as a boundary layer—analogous to the thin layer of fluid flowing immediately adjacent to a bounding surface. This boundary layer approximation captures the essential dynamics of gradient descent at the edge of stability.

Talk 3: Regularized Adaptive Momentum Dual Averaging with an Efficient Inexact Subproblem Solver for Structured Neural Network
Speaker: Ching-pei Lee
Abstract: We propose a Regularized Adaptive Momentum Dual Averaging (RAMDA) algorithm for training neural networks with a regularization term for promoting desired structures. Similar to existing regularized adaptive methods that adopt coordinate-wise scaling, the subproblem for computing the update direction of RAMDA involves a nonsmooth regularizer and a diagonal preconditioner, and therefore does not possess a closed-form solution in general. We thus also carefully devise an implementable inexactness condition that retains convergence guarantees similar to the exact versions, and show that this proposed condition can be quickly satisfied by applying standard proximal gradient to the subproblem of RAMDA. We show asymptotic variance reduction for RAMDA, and further leverage the theory of manifold identification to prove that, even in the presence of such inexactness, after a finite number of steps, the iterates of RAMDA attain the ideal structure induced by the partly smooth regularizer at the stationary point of asymptotic convergence. This structure is locally optimal near the point of convergence, making RAMDA the first regularized adaptive method outputting models that are locally optimally structured. Extensive numerical experiments in training state-of-the-art modern neural network models in computer vision, language modeling, and speech tasks show that the proposed RAMDA is efficient and consistently outperforms the state of the art for training structured neural networks in terms of both the structuredness and the predictive power of the model. This is a joint work with Zih-Syuan Huang.

Speakers
YZ

Yushun Zhang

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
YL

Yiping Lu

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
CL

Ching-pei Lee

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 116 3501 Trousdale Pkwy, 116, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12I: Machine Learning - Data Handling and Task Learning
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Machine Learning - Data Handling and Task Learning
Chair: Zeman Li
Cluster: nan

Talk 1: PiKE: Adaptive Data Mixing for Multi-Task Learning Under Low Gradient Conflicts
Speaker: Zeman Li
Abstract: Modern machine learning models are trained on diverse datasets and tasks to improve generalization. A key challenge in multitask learning is determining the optimal data mixing and sampling strategy across different data sources. Prior research in this multi-task learning setting has primarily focused on mitigating gradient conflicts between tasks. However, we observe that many real-world multitask learning scenarios-such as multilingual training and multi-domain learning in large foundation models-exhibit predominantly positive task interactions with minimal or no gradient conflict. Building on this insight, we introduce PiKE (Positive gradient interaction-based K-task weights Estimator), an adaptive data mixing algorithm that dynamically adjusts task contributions throughout training. PiKE optimizes task sampling to minimize overall loss, effectively leveraging positive gradient interactions with almost no additional computational overhead. We establish theoretical convergence guarantees for PiKE and demonstrate its superiority over static and non-adaptive mixing strategies. Additionally, we extend PiKE to promote fair learning across tasks, ensuring balanced progress and preventing task underrepresentation. Empirical evaluations on large-scale language model pretraining show that PiKE consistently outperforms existing heuristic and static mixing strategies, leading to faster convergence and improved downstream task performance. Li, Z., Deng, Y., Zhong, P., Razaviyayn, M., & Mirrokni, V. (2025). PiKE: Adaptive Data Mixing for Multi-Task Learning Under Low Gradient Conflicts. arXiv preprint arXiv:2502.06244.

Talk 2: Sample Reweighting for Large Models by Leveraging Weights and Losses of Smaller Models
Speaker: Mahdi Salmani
Abstract: Sample reweighting is an effective approach for mitigating the impact of noisy data by adjusting the importance of individual samples, enabling the model to focus on informative examples to improve performance. This is especially valuable for Large Language Models (LLMs), which leverage vast datasets to capture complex language patterns and drive advancements in AI applications. However, finding optimal sample weights may not be feasible due to the high cost of using conventional methods, such as those in [1], during the pre-training of larger models. One potential solution is to use weights obtained by a smaller model directly as weights for data in a larger model. However, as we will see in this talk, this may not be effective, as the optimal weight distribution for the smaller model can be too distant from that of the larger model, leading to suboptimal results. There are also papers [2] that use losses from a small model to prioritize training data. In this talk, we explore using both the weights and losses of the smaller model together as an alternative for training the larger model. References [1] Ren, M., et al. (2018). Learning to Reweight Examples for Robust Deep Learning. Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4342-4350. [2] Mindermann, S., et al. (2022). Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt. Proceedings of the 39th International Conference on Machine Learning, PMLR 162:15520-15542.

Talk 3: Learning Optimal Robust Policies under Observational Data with Causal Transport
Speaker: Ruijia Zhang
Abstract: We propose a causal distributionally robust learning framework that accounts for potential distributional shifts in observational data. To hedge against uncertainty, we introduce a novel ambiguity set based on a two-stage nested transport distance, which characterizes the similarity between the empirical distribution of observational data and the true distribution of potential treatment outcomes. It penalizes deviations in covariates and treatment-specific conditional distributions while preserving the underlying causal structure. We derive a dual reformulation and establish conditions under which the robust optimization problem admits a linear programming representation, ensuring computational tractability.

Speakers
ZL

Zeman Li

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
MS

Mahdi Salmani

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
RZ

Ruijia Zhang

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 100 3518 Trousdale Pkwy, 100, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12J: Optimization in Power Systems and Energy
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Optimization in Power Systems and Energy
Chair: Paprapee Buason
Cluster: nan

Talk 1: Beyond Traditional Linearizations: Enhancing Power Flow Approximations with Second-Order Sensitivities
Speaker: Paprapee Buason
Abstract: The power flow equations are fundamental to power system planning, analysis, and control, but their nonlinear and nonconvex nature presents significant challenges for optimization and decision-making. While linear approximations offer computational efficiency, they often rely on broad assumptions that may not hold across diverse operating conditions, leading to inaccuracies. In particular, these methods often fail to consistently over- or under-estimate critical quantities, such as voltage magnitudes, which can result in constraint violations when applied to optimization problems. Furthermore, they do not account for the varying curvature of the power flow equations, limiting their ability to provide reliable approximations. In this talk, we introduce advanced techniques to enhance the accuracy and reliability of power flow linearizations. Specifically, we focus on conservative linear approximations (CLAs), which systematically over- or under-estimate key quantities to ensure constraint satisfaction and deliver safer, more reliable solutions. Our approach leverages a sample-based framework combined with constrained linear regression while maintaining tractability and parallelizability. Additionally, we employ second-order sensitivity analysis to assess curvature and guide the selection of high-accuracy approximations. Building on these insights, we develop conservative piecewise linear approximations (CPLAs), which selectively apply piecewise linear functions in directions exhibiting significant nonlinearities, further improving accuracy beyond what standard linear approximations can achieve. Through extensive evaluations, we demonstrate that these methods enhance both accuracy and reliability, broadening their applicability to power system optimization. References: P. Buason, S. Misra, J.P. Watson, and D.K. Molzahn, "Adaptive Power Flow Approximations with Second-Order Sensitivity Insights," to appear in IEEE Transactions on Power Systems. P. Buason, S. Misra, and D.K. Molzahn, "Sample-Based Piecewise Linear Power Flow Approximations Using Second-Order Sensitivities," submitted.
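The core idea of a sample-based conservative linear approximation, fitting a linear model that is constrained to over-estimate the sampled quantity, can be written as a small linear program. The sketch below is an assumed illustration only: the toy nonlinear function stands in for a power flow quantity, scipy.optimize.linprog is used as the LP solver, and the second-order sensitivity guidance and piecewise extensions from the talk are omitted.

```python
import numpy as np
from scipy.optimize import linprog

# Fit a linear over-estimator a^T x + b of samples (x_i, y_i):
# minimize the total over-estimation subject to a^T x_i + b >= y_i for all i.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(2 * X[:, 0]) + X[:, 1] ** 2            # toy stand-in for a nonlinear quantity

m, d = X.shape
A = np.hstack([X, np.ones((m, 1))])               # decision variables are (a, b)
c = A.sum(axis=0)                                  # sum_i (a^T x_i + b), constants dropped
res = linprog(c, A_ub=-A, b_ub=-y, bounds=[(None, None)] * (d + 1))
a, b = res.x[:d], res.x[d]
print("max violation (should be <= ~0):", np.max(y - (X @ a + b)))
```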

Talk 2: Learning to Optimize: An Accelerated Deep Learning Framework for AC Optimal Power Flow Problem
Speaker: Yu Zhang
Abstract: The Alternating Current Optimal Power Flow (AC-OPF) problem plays a critical role in ensuring efficient and reliable power grid operations, especially with the increasing penetration of renewable energy. However, traditional solvers struggle to meet the real-time requirements due to their computational complexity. This work presents a novel semi-supervised learning framework that leverages physics-informed gradient estimation techniques to accelerate deep learning-based AC-OPF solutions. By integrating data augmentation and developing batch-mean gradient estimators with a reduced branch set, we achieve significant improvements in both feasibility and optimality. Numerical simulations on benchmark systems demonstrate that the proposed method consistently delivers near-optimal solutions with minimal constraint violations while achieving substantial speed-ups compared to conventional nonlinear programming solvers. These results highlight the potential of deep learning to transform real-time energy market operations and support the growing demand for renewable integration. Reference: K. Chen, S. Bose, Y. Zhang, "Physics-Informed Gradient Estimation for Accelerating Deep Learning based AC-OPF," IEEE Transactions on Industrial Informatics, Feb. 2025 (accepted)

Talk 3: Stationary battery energy management problem: a comparative study of several optimization models
Speaker: Daniel Mimouni
Abstract: Energy Management Systems (EMS) are crucial in optimizing energy production and minimizing costs in modern power networks. The inherent complexity of EMS problems arises from their multiperiod nature, where decisions at each stage are intertwined with the outcomes of a random vector representing fluctuations in both production and consumption over time. In this paper, we focus on the EMS of a stationary battery, using ground truth measurements of electricity consumption and production from a predominantly commercial building in France. We compare several optimization models tailored to the problem for this particular EMS. Classical approaches such as MPC and risk-free multi-stage stochastic programming with recourse rely on specific assumptions (e.g. knowing the probability distribution). Therefore, they often lack robustness to distributional shifts. To enhance robustness, we explore other models. By introducing a policy variance penalty into the multi-stage stochastic model, inspired by regularization techniques in machine learning, we mitigate sensitivity to distributional shifts. Furthermore, we consider a distributionally robust optimization that offers a middle ground between robust and risk-neutral models, improving robustness by optimizing over an ambiguity set. Reinforcement learning, in contrast, offers a data-driven approach that bypasses explicit scenario generation but introduces challenges related to stability and convergence. Through numerical experiments, we evaluate these models in terms of cost efficiency, computational scalability, and out-of-sample robustness, offering a comprehensive comparison and insights into their practical interest for real-world EMS problems.

Speakers
avatar for Paprapee Buason

Paprapee Buason

Postdoctoral research associate, Los Alamos National Laboratory
Name: Dr. Paprapee Buason Title: Beyond Traditional Linearizations: Enhancing Power Flow Approximations with Second-Order Sensitivities Affiliation: Los Alamos National Laboratory Bio:
avatar for Yu Zhang

Yu Zhang

Assistant Professor, UC SANTA CRUZ
Name: Dr. Yu Zhang Title: Assistant Professor Affiliation: University of California, Santa Cruz Bio: Yu Zhang is an Assistant Professor in the ECE Department at the University of California, Santa Cruz. He earned his Ph.D. in Electrical Engineering from the University of Minnesota and pursued... Read More →
DM

Daniel Mimouni

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 102 3501 Trousdale Pkwy, 102, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12K: Decomposition Methods
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Decomposition Methods
Chair: Matthew Viens
Cluster: nan

Talk 1: A Dantzig-Wolfe Decomposition Method for Quasi-Variational Inequalities
Speaker: Manoel Jardim
Abstract: We propose an algorithm to solve quasi-variational inequality problems, based on the Dantzig-Wolfe decomposition paradigm. Our approach solves variational inequalities, a simpler class of problems, in the subproblems, while the quasi-variational structure is confined to the master problems, which are generally (much) smaller in size when the original problem is large-scale. We prove global convergence of our algorithm, assuming that the mapping of the quasi-variational inequality is either single-valued and continuous or set-valued and maximally monotone. Quasi-variational inequalities serve as a framework for several equilibrium problems, and we illustrate our algorithm on an important example in the field of economics, namely the Walrasian equilibrium problem formulated as a generalized Nash equilibrium problem. We show that the proposed method performs well in large-scale cases. Our numerical section tackles large problems from the theory of abstract economies, as well as some academic examples that have been previously employed in the literature.

Talk 2: Treating Uncertainty in Modeling with Multiple Solutions
Speaker: Matthew Viens
Abstract: Optimization problems can have multiple solutions that achieve an optimal or near-optimal objective value. We provide a theoretical foundation for characterizing multiple solutions combining sublevel set, epigraph, and KKT representations. We discuss how this theory enables generation of multiple solutions in two different problems: quadratic programming and Benders decomposition. We demonstrate how multiple solutions can provide additional insights into solution structure and tradeoffs for both problems. Further, we show how this additional insight is of especial value in models with data uncertainty and held-out objectives.

Talk 3: A general-purpose approach to multi-agent Bayesian optimization across decomposition methods
Speaker: Dinesh Krishnamoorthy
Abstract: Multi-agent decision-making problems, formulated as optimization problems, arise in a wide range of applications where multiple local decision-making agents collaborate to achieve a common goal. Examples include sensor and communication networks, where nodes coordinate to optimize performance and resource allocation. Distributed optimization techniques play a crucial role in these settings, enabling each agent to solve its local optimization problem while coordinating with others to achieve a system-wide optimum. Such coordination can be either decentralized (peer-to-peer) or facilitated by a central coordinator. However, traditional approaches typically require explicit analytical models linking local decision variables to objectives, which are often difficult or impractical to obtain in engineering applications. This necessitates black-box optimization methods, such as Bayesian optimization (BO), which can optimize unknown objective functions based on sampled observations. In multi-agent systems with interdependencies through shared variables or coupling constraints, standard BO methods fail to account for subsystem interactions effectively. Moreover, local objective function observations are often inaccessible to other agents, limiting the information available for updating probabilistic models and acquisition functions. Consequently, while BO has proven effective for single-agent optimization with unknown objectives, its extension to multi-agent settings remains underdeveloped. This talk will address this research gap by presenting a general-purpose multi-agent Bayesian optimization (MABO) framework that is compatible with a wide array of decomposition methods, with both centralized and peer-to-peer coordination, whereby we augment traditional BO acquisition functions with suitably derived coordinating terms to facilitate coordination among subsystems without sharing local data. Regret analysis reveals that the cumulative regret of MABO is the sum of individual regrets and remains unaffected by the coordinating terms, thus bridging advancements in distributed optimization and Bayesian optimization methodologies. Numerical experiments on vehicle platooning and offshore oil production optimization examples validate the effectiveness of the proposed MABO framework for different classes of decomposition methods. This talk is based on the paper published in Optimization and Engineering. Krishnamoorthy, D. A general-purpose approach to multi-agent Bayesian optimization across decomposition methods. Optim Eng (2025). https://doi.org/10.1007/s11081-024-09953-w

Speakers
MJ

Manoel Jardim

Name: Dr. Slothington "Slow Convergence" McNapface Title: Distinguished Professor of Continuous Optimization & Energy Minimization Affiliation: The Lush Canopy Institute of Sluggish Algorithms Bio: Dr. Slothington McNapface is a leading expert in continuous optimization, specializing... Read More →
MV

Matthew Viens

PhD Student/PhD Student Intern, UW-Madison Department of Computer Sciences/Sandia National Labs
Name: Matthew Viens Title: PhD Student/PhD Student Intern Affiliation: University of Wisconsin-Madison/Sandia National Labs Bio: PhD Student at University of Wisconsin-Madison in Computer Sciences advised by Michael Ferris. Also a PhD Intern for the Discrete Math & Optimization team at... Read More →
avatar for Dinesh Krishnamoorthy

Dinesh Krishnamoorthy

Associate Professor, Norwegian University of Science and Technology
Name: Dr. Dinesh Krishnamoorthy Title: Associate Professor Affiliation: Norwegian University of Science and Technology, Trondheim Bio: Dinesh Krishnamoorthy is currently an Associate Professor at the Department of Engineering Cybernetics, Norwegian University of Science and Technology... Read More →
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 118 3501 Trousdale Pkwy, 118, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12L: Polynomial Optimization & Tensor Methods
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Polynomial Optimization & Tensor Methods
Chair: Yang Liu
Cluster: nan

Talk 1: On the convergence of critical points on real varieties and applications to polynomial optimization
Speaker: Ali Mohammad Nezhad
Abstract: Let $F \in \mathrm{R}[X_1,\ldots,X_n]$ and the zero set $V=\mathrm{zero}(\{P_1,\ldots,P_s\},\mathrm{R}^n)$ be given with the canonical Whitney stratification, where $\{P_1,\ldots,P_s\} \subset \mathrm{R}[X_1,\ldots,X_n]$ and $\mathrm{R}$ is a real closed field. We explore isolated trajectories that result from critical points of $F$ on $V_{\xi}=\mathrm{zero}(\{P_1-\xi_1,\ldots,P_s-\xi_s\},\mathrm{R}^n)$ when $\xi \downarrow 0$, in the sense of stratified Morse theory. Our main motivation is the limiting behavior of log-barrier functions in polynomial optimization which leads to a central path, an underlying notion behind the theory of interior point methods. We prove conditions for the existence, convergence, and smoothness of a central path. We also consider the cases where $F$ and $P_i$ are definable functions in a (polynomially bounded) o-minimal expansion of $\mathbb{R}^n$. Joint work with Saugata Basu, Purdue University

Talk 2: Approximation of a Moment Sequence by Moment-S.o.S Hierarchy
Speaker: Hoang Anh Tran
Abstract: The moment-S.o.S hierarchy is a widely applicable framework to address polynomial optimization problems over basic semi-algebraic sets based on positivity certificates of polynomials. Recent works show that the convergence rate of this hierarchy over certain simple sets, namely, the unit ball, hypercube, and standard simplex, is of the order $\mathrm{O}(1/r^2)$, where $r$ denotes the level of the moment-S.o.S hierarchy. This paper aims to provide a comprehensive understanding of the convergence rate of the moment-S.o.S hierarchy by estimating the Hausdorff distance between the set of pseudo truncated moment sequences and the set of truncated moment sequences specified by Tchakaloff’s theorem. Our results provide a connection between the convergence rate of the moment-S.o.S hierarchy and the \L{}ojasiewicz exponent of the domain under the compactness assumption. Consequently, we obtain the convergence rate of $\mathrm{O}(1/r)$ for polytopes, $\mathrm{O}(1/\sqrt{r})$ for domains that either satisfy the Polyak-Łojasiewicz condition or are defined by locally strongly convex polynomials, and extend the convergence rate of $\mathrm{O}(1/r^2)$ to general polynomials over the sphere.

Talk 3: Efficient Adaptive Regularized Tensor Methods
Speaker: Yang Liu
Abstract: High-order tensor methods employing local Taylor approximations have attracted considerable attention for convex and nonconvex optimization. The pth-order adaptive regularization (ARp) approach builds a local model comprising a pth-order Taylor expansion and a (p+1)th-order regularization term, delivering optimal worst-case global and local convergence rates. However, for p≥2, subproblem minimization can yield multiple local minima, and while a global minimizer is recommended for p=2, effectively identifying a suitable local minimum for p≥3 remains elusive. This work extends interpolation-based updating strategies, originally proposed for p=2, to cases where p≥3, allowing the regularization parameter to adapt in response to interpolation models. Additionally, it introduces a new prerejection mechanism to discard unfavorable subproblem minimizers before function evaluations, thus reducing computational costs for p≥3. Numerical experiments, particularly on Chebyshev-Rosenbrock problems with p=3, indicate that the proper use of different minimizers can significantly improve practical performance, offering a promising direction for designing more efficient high-order methods.
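As background for Talk 3, a standard sketch of the AR$p$ local model (the exact scaling of the regularization term varies across the literature): at iterate $x_k$, the trial step $s_k$ approximately minimizes $m_p(x_k,s) = f(x_k) + \sum_{i=1}^{p} \frac{1}{i!}\nabla^i f(x_k)[s]^i + \frac{\sigma_k}{(p+1)!}\|s\|^{p+1}$, where $\nabla^i f(x_k)[s]^i$ denotes the $i$th-order Taylor term and $\sigma_k>0$ is the regularization parameter that the interpolation-based strategies mentioned above adapt between iterations.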

Speakers
AM

Ali Mohammad Nezhad

HA

Hoang Anh Tran

YL

Yang Liu

Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 119 3501 Trousdale Pkwy, 119, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12M: Games and Variational Inequalities
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Games and Variational Inequalities
Chair: Chuangyin Dang

Talk 1: A Characterization of Nash Equilibrium in Behavioral Strategies through $(\gamma,\varepsilon)$-Nash Equilibrium with Local Sequential Rationality under Kreps and Wilson's Consistency and Its Computation
Speaker: Chuangyin Dang
Abstract: This paper develops a characterization of Nash equilibrium in behavioral strategies (NashEBS) through the introduction of $(\gamma,\varepsilon)$-Nash equilibrium with local sequential rationality under Kreps and Wilson's consistency, capitalizing on an extra behavioral strategy profile. For any given $\gamma>0$, we generate a perfect $\gamma$-Nash equilibrium as a limit point of a sequence of $(\gamma,\varepsilon_k)$-Nash equilibria with $\varepsilon_k\to 0$. We acquire a NashEBS from a limit point of a sequence of perfect $\gamma_q$-Nash equilibria with $\gamma_q\to 0$. This characterization allows one to analytically find all NashEBSs for small extensive-form games. An application of the characterization yields a polynomial system as a necessary and sufficient condition for determining whether a totally mixed assessment is a $(\gamma,\varepsilon)$-Nash equilibrium. Exploiting the system, we devise differentiable path-following methods to compute a NashEBS by establishing the existence of smooth paths, which are secured from the equilibrium systems of logarithmic-barrier, entropy-barrier, square-root-barrier, and convex-quadratic-penalty extensive-form games. Comprehensive numerical results further confirm the efficiency of the methods.

Talk 2: A Bandit Learning Approach to Continuous Multi-Agent Games
Speaker: Jianghai Hu
Abstract: A variety of practical problems can be modeled by multi-player games in which self-interested players aim to optimize their own local objectives, which depend on the (continuous) actions taken by other players. In many cases, the local gradient information of each player's objective, critical for many existing algorithms, is not available. In this talk, we will focus on designing solution algorithms for multi-player games using bandit feedback, i.e., the only feedback available to each player is the realized value of its objective. Our proposed solution algorithms are based on the residual feedback scheme and extra-gradient methods, and require only one query of the objective function per iteration for each player. We show that the actual sequence of play generated by the algorithms converges to the game's critical points and characterize the convergence rates for certain classes of games. We also discuss extensions when there are delays in receiving feedback and when a multi-point estimation scheme is used. Numerical examples will be given to illustrate the effectiveness of the proposed algorithms.
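For intuition, here is a minimal single-player sketch of a residual-feedback zeroth-order update, which uses one new objective query per iteration and reuses the previous one; the per-player extra-gradient structure of the talk's algorithms is omitted, and all names are hypothetical.

    # Sketch of a residual-feedback zeroth-order method: one new (bandit)
    # evaluation per iteration, the previous evaluation is reused.
    import numpy as np

    def residual_feedback_descent(f, x0, steps=200, delta=0.05, eta=0.05, seed=0):
        rng = np.random.default_rng(seed)
        x = np.array(x0, dtype=float)
        u_prev = rng.standard_normal(x.size)
        f_prev = f(x + delta * u_prev)          # the only extra evaluation
        for _ in range(steps):
            u = rng.standard_normal(x.size)
            f_curr = f(x + delta * u)           # one objective query this iteration
            g = (f_curr - f_prev) / delta * u   # residual-feedback gradient estimate
            x = x - eta * g                     # plain descent step (no extra-gradient here)
            u_prev, f_prev = u, f_curr
        return x

    # Toy usage: a single player minimizing a quadratic from bandit feedback.
    print(residual_feedback_descent(lambda z: np.sum((z - 1.0) ** 2), x0=[0.0, 0.0]))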

Talk 3: PEPit-Assisted Primal-Dual Coordinate Descent Method for Weak-MVI Problems
Speaker: Iyad WALWIL
Abstract: We introduce two novel primal-dual algorithms for tackling non-convex, non-concave, and non-smooth saddle point problems characterized by the weak Minty variational inequality (MVI). The first algorithm generalizes the well-known Primal-Dual Hybrid Gradient (PDHG) method to address this challenging problem class. The second algorithm introduces a randomly extrapolated primal-dual coordinate descent approach, extending the Stochastic Primal-Dual Hybrid Gradient (SPDHG) algorithm. Designing a coordinated algorithm to solve non-convex, non-concave saddle point problems is unprecedented, and proving its convergence posed significant difficulties. This challenge motivated us to utilize PEPit, a Python-based tool for computer-assisted worst-case analysis of first-order optimization methods. By integrating PEPit with automated Lyapunov function techniques, we successfully derived our second novel algorithm. Both methods are effective under a mild condition on the weak-MVI parameter, achieving linear convergence with constant step sizes that adapt to the problem’s structure. Numerical experiments on sigmoid and perceptron-regression problems validate our theoretical findings.

Speakers
CD

Chuangyin Dang

JH

Jianghai Hu

IW

Iyad WALWIL

Ph.D student, Télécom Paris
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 157 3518 Trousdale Pkwy, 157, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12N: Convergence Analysis and Rates
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Convergence Analysis and Rates
Chair: Tianyu Wang

Talk 1: Breaking a logarithmic barrier in the stopping time convergence rate of stochastic first-order methods.
Speaker: Tianyu Wang
Abstract: We focus on the convergence rate for stochastic optimization in terms of stopping times. Such problems are of great importance, since in most real-world cases, the time to stop the stochastic algorithm is a stopping time. To this end, we improve the state-of-the-art stopping-time convergence rate by removing a logarithmic factor. Our analysis leverages a novel framework building on a refined Lyapunov analysis and a new Gronwall-type argument.

Talk 2: Uniformly Optimal and Parameter-free First-order Methods for Convex and Function-constrained Optimization
Speaker: Zhenwei Lin
Abstract: This paper presents new first-order methods for achieving optimal oracle complexities in convex optimization with convex function constraints. Oracle complexities are measured by the number of function and gradient evaluations. To achieve this, we develop first-order methods that can utilize computational oracles for solving diagonal quadratic programs in subproblems. For problems where the optimal value $f^*$ is known, such as those in overparameterized models and feasibility problems, we propose an accelerated first-order method that incorporates a modified Polyak step size and Nesterov's momentum. Notably, our method does not require knowledge of smoothness levels, H\"{o}lder continuity parameter of the gradient, or additional line search, yet achieves the optimal oracle complexity bound of $\mathcal{O}(\varepsilon^{-2/(1+3\rho)})$ under H\"{o}lder smoothness conditions. When $f^*$ is unknown, we reformulate the problem as finding the root of the optimal value function and develop inexact fixed-point iteration and secant method to compute $f^*$. These root-finding subproblems are solved inexactly using first-order methods to a specified relative accuracy. We employ the accelerated prox-level (APL) method, which is proven to be uniformly optimal for convex optimization with simple constraints. Our analysis demonstrates that APL-based root-finding also achieves the optimal oracle complexity of $\mathcal{O}(\varepsilon^{-2/(1+3\rho)})$ for convex function-constrained optimization, without requiring knowledge of any problem-specific structures. Through experiments on various tasks, we demonstrate the advantages of our methods over existing approaches in function-constrained optimization.
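As a point of reference for the first part of the abstract, here is a minimal sketch of the classical Polyak step size for the case where $f^*$ is known; the momentum term, constraint handling, and Hölder-smoothness adaptivity of the talk's method are omitted, and all names are hypothetical.

    # Subgradient method with the classical Polyak step size (f_star known).
    import numpy as np

    def polyak_subgradient(f, grad, x0, f_star, iters=500):
        x = np.array(x0, dtype=float)
        best = f(x)
        for _ in range(iters):
            g = grad(x)
            gap = f(x) - f_star
            if gap <= 1e-12 or np.allclose(g, 0.0):
                break
            x = x - (gap / np.dot(g, g)) * g   # Polyak step: (f(x) - f*) / ||g||^2
            best = min(best, f(x))
        return x, best

    # Toy usage: f(x) = ||x||_1 with f* = 0.
    print(polyak_subgradient(lambda x: np.sum(np.abs(x)), np.sign, [3.0, -2.0], 0.0))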

Talk 3: Stochastic Halpern iteration in normed spaces and applications to reinforcement learning
Speaker: Mario Bravo
Abstract: We analyze the oracle complexity of the stochastic Halpern iteration with minibatching, where we aim to approximate fixed points of nonexpansive operators in a normed finite-dimensional space. We show that if the underlying stochastic oracle has uniformly bounded variance, our method exhibits an overall oracle complexity of $\tilde{O}(\varepsilon^{-5})$ to reduce the expected error residual below $\varepsilon$. We also establish a lower bound of $\Omega(\varepsilon^{-3})$, applicable to a broad class of algorithms, including all averaged iterations, even with minibatching. As an application, we introduce a new model-free algorithm for weakly communicating average-reward MDPs, requiring no prior knowledge, and analyze its sample complexity. This is joint work with Juan Pablo Contreras (UDP, Chile).
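For intuition, a minimal sketch of a Halpern (anchored) iteration with minibatched evaluations of a noisy nonexpansive map; the anchoring weights and batch schedule below are illustrative choices, not the ones analyzed in the talk, and all names are hypothetical.

    # Stochastic Halpern iteration sketch: x_{k+1} = beta_k * x0 + (1 - beta_k) * T(x_k),
    # with T evaluated through a noisy oracle and averaged over a growing minibatch.
    import numpy as np

    def stochastic_halpern(T_noisy, x0, iters=200, batch0=4, seed=0):
        rng = np.random.default_rng(seed)
        x0 = np.array(x0, dtype=float)
        x = x0.copy()
        for k in range(iters):
            beta = 1.0 / (k + 2)                 # classical anchoring weight
            batch = batch0 * (k + 1)             # growing minibatch (illustrative)
            Tx = np.mean([T_noisy(x, rng) for _ in range(batch)], axis=0)
            x = beta * x0 + (1.0 - beta) * Tx    # anchored (Halpern) update
        return x

    # Toy usage: a rotation (nonexpansive, fixed point 0) observed with additive noise.
    A = np.array([[0.0, 1.0], [-1.0, 0.0]])
    print(stochastic_halpern(lambda x, rng: A @ x + 0.1 * rng.standard_normal(2), [1.0, 1.0]))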

Speakers
TW

Tianyu Wang

ZL

Zhenwei Lin

PhD candidate, Shanghai University of Finance and Economics
Name: Zhenwei Lin. Title: Accelerated Bundle Level Method for Convex Optimization. Affiliation: Shanghai University of Finance and Economics and Purdue University. Bio: This paper presents new first-order methods for achieving optimal oracle complexities in convex optimization with convex... Read More →
MB

Mario Bravo

Thursday July 24, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 256 3518 Trousdale Pkwy, 256, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12O: Numerical Methods
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Numerical Methods
Chair: Masaru Ito

Talk 1: Proximal gradient-type method with generalized distances for nonconvex composite optimization
Speaker: Masaru Ito
Abstract: In this talk, we consider a composite optimization problem minimizing the sum of two functions f and g. Typical proximal gradient methods rely on the descent lemma and the convexity of g, for which the choice of distance-like functions defining the proximal subproblems is constrained by the structure of both f and g. We propose a proximal gradient-type method for the case where f has a locally Lipschitz gradient and g is nonconvex. We discuss conditions on the distance-like functions that allow a broader choice while still ensuring convergence results.
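For reference, one standard template for such steps with a generalized (Bregman) distance, written as a sketch rather than the talk's exact scheme: given a kernel $h$, define $D_h(x,y) = h(x) - h(y) - \langle \nabla h(y), x - y\rangle$ and iterate $x_{k+1} \in \arg\min_x \{\, g(x) + \langle \nabla f(x_k), x - x_k\rangle + \tfrac{1}{t_k} D_h(x,x_k) \,\}$; the choice $h = \tfrac{1}{2}\|\cdot\|^2$ recovers the usual proximal gradient update, and the talk concerns conditions under which much broader choices of the distance-like function still yield convergence.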

Talk 2: A fixed-point algorithm with matrix splitting for nonlinear second-order cone complementarity problems
Speaker: Shunsuke Hayashi
Abstract: The Second-Order Cone Complementarity Problem (SOCCP) is a wide class of problems containing the nonlinear complementarity problem (NCP) and the second-order cone programming problem (SOCP). Recently, Li et al. reformulated the linear SOCCP into a fixed-point problem by using matrix splitting, and constructed the Anderson-accelerated preconditioned modulus approach. In this study, we extend their approach to nonlinear SOCCPs. To solve such problems, we combine a matrix splitting with a fixed-point algorithm. We also present an approach with Anderson acceleration to enhance the convergence performance. We further show the convergence property under appropriate assumptions. Finally, we report some numerical results to evaluate the effectiveness of the algorithm.

Talk 3: A Simple yet Highly Accurate Prediction-Correction Algorithm for Time-Varying Optimization
Speaker: Tomoya Kamijima
Abstract: Time-varying optimization problems arise in various applications such as robotics, signal processing, and electronics. We propose SHARP, a simple yet highly accurate prediction-correction algorithm for unconstrained time-varying problems. The prediction step is based on Lagrange interpolation of past solutions, allowing for low computational cost without requiring Hessian matrices or gradients. To enhance stability, especially in non-convex settings, an acceptance condition is introduced to reject excessively large updates. We provide theoretical guarantees for a small tracking error and demonstrate the superior performance of SHARP through numerical experiments.
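As a concrete instance of such a predictor (an illustrative choice, not necessarily the one used in SHARP): with equally spaced sampling times, quadratic Lagrange extrapolation of the three most recent solutions gives the prediction $\hat{x}_{k+1} = 3x_k - 3x_{k-1} + x_{k-2}$, which the correction step then refines on the newly revealed objective, subject to the acceptance condition mentioned above.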

Speakers
MI

Masaru Ito

Associate Professor, Nihon University
SH

Shunsuke Hayashi

avatar for Tomoya Kamijima

Tomoya Kamijima

Doctoral student, The University of Tokyo
Name: Tomoya Kamijima. Affiliation: The University of Tokyo. Bio: I completed a master's degree at the University of Tokyo in 2025. Now I am pursuing a Ph.D. I am interested in continuous optimization, especially time-varying optimization, online optimization, and the ODE approach.
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 258 3518 Trousdale Pkwy, 258, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12P: Learning and Optimization Interaction
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Learning and Optimization Interaction
Chair: Christopher Yeh

Talk 1: Learning Decision-Focused Uncertainty Representations for Robust and Risk-Constrained Optimization
Speaker: Christopher Yeh
Abstract: Machine learning can significantly improve performance for decision-making under uncertainty in a wide range of domains. However, ensuring robustness guarantees and satisfaction of risk constraints requires well-calibrated uncertainty estimates, which can be difficult to achieve with neural networks. Moreover, in high-dimensional settings, there may be many valid uncertainty estimates, each with their own performance profile—i.e., not all uncertainty is equally valuable for downstream decision-making. To address this problem, we developed an end-to-end framework to learn uncertainty representations for robust and risk-constrained optimization in a way that is informed by the downstream decision-making loss, with robustness guarantees and risk constraint satisfaction provided by conformal methods. In addition, we propose to represent arbitrary convex uncertainty sets with partially input-convex neural networks, which are learned as part of our framework. Our approach consistently improves upon two-stage estimate-then-optimize baselines on concrete applications in energy storage arbitrage and portfolio optimization.

Talk 2: Learning Algorithm Hyperparameters for Fast Parametric Convex Optimization
Speaker: Rajiv Sambharya
Abstract: We introduce a machine-learning framework to learn the hyperparameter sequence of first-order methods (e.g., the step sizes in gradient descent) to quickly solve parametric convex optimization problems. Our computational architecture amounts to running fixed-point iterations where the hyperparameters are the same across all parametric instances and consists of two phases. In the first step-varying phase the hyperparameters vary across iterations, while in the second steady-state phase the hyperparameters are constant across iterations. Our learned optimizer is flexible in that it can be evaluated on any number of iterations and is guaranteed to converge to an optimal solution. To train, we minimize the mean square error to a ground truth solution. In the case of gradient descent, the one-step optimal step size is the solution to a least squares problem, and in the case of unconstrained quadratic minimization, we can compute the two and three-step optimal solutions in closed-form. In other cases, we backpropagate through the algorithm steps to minimize the training objective after a given number of steps. We show how to learn hyperparameters for several popular algorithms: gradient descent, proximal gradient descent, and two ADMM-based solvers: OSQP and SCS. We use a sample convergence bound to obtain generalization guarantees for the performance of our learned algorithm for unseen data, providing both lower and upper bounds. We showcase the effectiveness of our method with many examples, including ones from control, signal processing, and machine learning. Remarkably, our approach is highly data-efficient in that we only use 10 problem instances to train the hyperparameters in all of our examples. [https://arxiv.org/pdf/2411.15717]
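As an illustration of the one-step gradient-descent case mentioned in the abstract, the sketch below fits a single shared step size to a family of parametric instances by least squares against ground-truth solutions; it is a toy reconstruction under stated assumptions, not the paper's code, and all names are hypothetical.

    # One-step case: choose a single gradient-descent step size shared across
    # parametric instances by least squares against ground-truth solutions.
    import numpy as np

    def one_step_optimal_stepsize(xs, grads, x_stars):
        # argmin_alpha sum_i ||(x_i - alpha * g_i) - x_i*||^2  (scalar least squares)
        num = sum(np.dot(g, x - xs_) for x, g, xs_ in zip(xs, grads, x_stars))
        den = sum(np.dot(g, g) for g in grads)
        return num / den

    # Toy parametric family: f_theta(x) = 0.5 * ||x - theta||^2, common start x = 0.
    rng = np.random.default_rng(0)
    thetas = [rng.standard_normal(3) for _ in range(10)]
    xs = [np.zeros(3) for _ in thetas]
    grads = [x - th for x, th in zip(xs, thetas)]   # gradient of f_theta at x
    print("learned one-step size:", one_step_optimal_stepsize(xs, grads, thetas))  # ~1.0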

Talk 3: An Operator Learning Approach to Nonsmooth Optimal Control of Nonlinear Partial Differential Equations
Speaker: Tianyou Zeng
Abstract: Optimal control problems with nonsmooth objectives and nonlinear PDE constraints pose significant numerical challenges to traditional numerical methods, due to their nonconvexity, nonsmoothness, and expensive computational cost of iteratively solving high-dimensional and ill-conditioned systems introduced by mesh-based discretization. We present an operator learning approach for these problems. We implement a primal-dual idea in the optimization context and solve the resulting PDEs with pre-trained neural solution operators. Compared with traditional algorithms and existing deep learning methods, our approach avoids re-solving linear systems or retraining networks across iterations. Additionally, the pre-trained neural networks can be readily applied without retraining even for different problem parameters, like the desired state or the PDE source term. We demonstrate the effectiveness and efficiency of the proposed method through validation on some benchmark nonsmooth optimal control problems with nonlinear PDE constraints.

Speakers
avatar for Christopher Yeh

Christopher Yeh

PhD candidate (on the faculty job market), California Institute of Technology
Chris Yeh is a 5th-year PhD candidate in the Computing and Mathematical Science Department at Caltech, co-advised by Professors Yisong Yue and Adam Wierman. His research focus is on the problem of co-designing uncertainty quantification methods with decision-making algorithms, especially... Read More →
RS

Rajiv Sambharya

Postdoc, University of Pennsylvania
Name: Rajiv Sambharya. Title: Postdoc. Affiliation: University of Pennsylvania. Bio: I am a Postdoctoral Researcher in the Electrical and Systems Engineering department at Penn hosted by George Pappas. I obtained my PhD from the Operations Research and Financial Engineering department a... Read More →
TZ

Tianyou Zeng

PhD Candidate, The University of Hong Kong
Name: Tianyou Zeng. Title: PhD Candidate. Affiliation: The University of Hong Kong.
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 106 3501 Trousdale Pkwy, 106, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12Q: Cloud Computing, Transport, and AI
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Cloud Computing, Transport, and AI
Chair: Julius Lohmann

Talk 1: An ϵ outer linear approximation optimization model for geo-distributed cloud applications
Speaker: Julio C Góez
Abstract: Cloud computing has become key for software providers aiming to serve a geographically distributed user base. As application demand changes and users become more latency sensitive, software providers need to adapt their deployments to reach latency-based service-level objectives while keeping costs low. In this paper, we propose a method to provision Cloud resources considering latency-based service-level constraints while minimizing operating costs. Our method is based on a latency model to capture the load balancing among resources. To solve the resulting mixed integer non-linear optimization model, we propose an outer linear approximation that is able to find feasible solutions faster than solving the non-linear problem. Experiments based on realistic deployment data reveal how the proposed method is able to deliver timely solutions to the provisioning problem. Further, the solutions adapt to key elements of the problem, such as the service-level objective defined and the characteristics of the software application deployed.

Talk 2: Integral representation of the h-mass
Speaker: Julius Lohmann
Abstract: The multi-material transport problem [1, 2] is a convex optimization problem on normal 1-currents in ℝⁿ with coefficients in ℝᵐ. The prescribed boundary can be written μ⃗₋−μ⃗₊, where μ⃗₊ and μ⃗₋ are compactly supported ℝᵐ-valued Radon measures whose components μ⃗±ʲ are nonnegative and satisfy μ⃗₊ʲ(ℝⁿ) = μ⃗₋ʲ(ℝⁿ). For each j the measure μ⃗₊ʲ can be interpreted as the source distribution of some material j which has to be transported to the sink μ⃗₋ʲ. The objective in the above Plateau problem is the so-called h-mass ℳₕ with norm h on the coefficient group ℝᵐ. The value h(θ⃗) represents the transportation cost to move a material bundle θ⃗ per unit distance. The triangle inequality h(θ⃗₁+θ⃗₂) ≤ h(θ⃗₁)+h(θ⃗₂) implies that the joint transport of different materials may be more efficient. The h-mass ℳₕ(F) then indicates the total transportation cost of an admissible candidate F, that is ∂F = μ⃗₋−μ⃗₊, with respect to h (also called mass flux in this context). A candidate mass flux F for the Plateau problem inf ℳₕ(G), subject to ∂G=μ⃗₋−μ⃗₊, can alternatively be interpreted as a finite mass flat 1-chain with coefficients in ℝᵐ and boundary μ⃗₋−μ⃗₊ or as a matrix-valued Radon measure (row j being a mass flux describing the transport of material j) whose distributional divergence equals μ⃗₊−μ⃗₋. Each interpretation of F, as a current, flat chain, or measure, comes with its own natural notion of an h-mass ℳₕ, 𝕄ₕ, or |·|ₕ. In [3] it is shown that all these notions are equivalent. In particular, this result yields an integral representation of the h-mass which, so far, was only known for rectifiable currents [2, 4]. In addition, the proof motivates a new definition of so-called calibrations for multi-material transport. Calibrations are a classical tool in the study of Plateau problems. They can be used to certify optimality of candidate minimizers. The new definition of a calibration in [3] provides a sufficient and necessary criterion for optimality (as opposed to just sufficient as in the classical case). Since, in addition, this notion ‘includes’ the classical definition, a sharp characterization of the regularity of calibrations for multi-material transport is obtained. In my talk, I will motivate the study of multi-material transport, explain the different h-masses, sketch the argument yielding their equality, and give examples of calibrations. [1] A. Marchese, A. Massaccesi, R. Tione. A multi-material transport problem and its convex relaxation via rectifiable G-currents. SIAM J. Math. Anal., 51(3):1965–1998, 2019. [2] A. Marchese, A. Massaccesi, S. Stuvard, R. Tione. A multi-material transport problem with arbitrary marginals. Calc. Var. Partial Differential Equations, 60(3):Paper No. 88, 2021. [3] J. Lohmann, B. Schmitzer, B. Wirth. Formulas for the h-mass on 1-currents with coefficients in Rm. Preprint: arXiv:2407.10158 [math.OC], 2024. [4] B. White. Rectifiability of flat chains. Ann. of Math., 150(1):165–184, 1999.

Talk 3: Detecting AI Generated Images through Texture and Frequency Analysis of Patches
Speaker: Maryam Yashtini
Abstract: The significant improvement in AI image generation in recent years poses serious threats to societal security, as AI-generated misinformation may infringe upon political stability, personal privacy, and the digital copyrights of artists. Building an AI-generated image detector that accurately identifies generated images is crucial to maintaining societal security and the property rights of artists. This paper introduces a preprocessing pipeline that uses positionally encoded azimuthal integrals over image patches to create fingerprints that encapsulate distinguishing features. We then train a multi-head attention model that achieves 97.5% accuracy in classifying the fingerprints. The model also achieves 80% accuracy on images generated by AI models not present in the training dataset, demonstrating the robustness of our pipeline and the potential for broader application of our model.
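For readers unfamiliar with the spectral fingerprints mentioned above, here is a minimal sketch of an azimuthally averaged Fourier magnitude profile of an image patch, a common ingredient in detectors of generated images; the positional encoding and the attention classifier from the paper are omitted, and all names are hypothetical.

    # Azimuthally averaged (radially binned) Fourier magnitude spectrum of a patch,
    # returned as a 1-D "fingerprint" vector.
    import numpy as np

    def azimuthal_average_spectrum(patch, n_bins=32):
        h, w = patch.shape
        spec = np.abs(np.fft.fftshift(np.fft.fft2(patch)))   # centered magnitude spectrum
        yy, xx = np.indices((h, w))
        r = np.hypot(yy - h / 2.0, xx - w / 2.0)              # radial frequency of each pixel
        edges = np.linspace(0, r.max() + 1e-9, n_bins + 1)
        idx = np.digitize(r.ravel(), edges) - 1
        sums = np.bincount(idx, weights=spec.ravel(), minlength=n_bins)
        counts = np.bincount(idx, minlength=n_bins)
        return sums / np.maximum(counts, 1)                   # mean magnitude per radial bin

    # Toy usage on a random 64x64 "patch".
    rng = np.random.default_rng(0)
    print(azimuthal_average_spectrum(rng.standard_normal((64, 64))).shape)  # (32,)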

Speakers
JC

Julio C Góez

JL

Julius Lohmann

MY

Maryam Yashtini

Tenure Track Professor, Georgetown University
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 214 3501 Trousdale Pkwy, 214, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12R: Envelopes in Optimization
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Envelopes in Optimization
Chair: Yankun Huang

Talk 1: Majorization Minorization, Moreau Envelopes, and Deweighting Weighted Least Squares
Speaker: Qiang Heng
Abstract: This paper deals with tactics for fast computation in least squares regression in high dimensions. These tactics include: (a) the majorization-minimization (MM) principle, (b) smoothing by Moreau envelopes, and (c) the proximal distance principle for constrained estimation. In iteratively reweighted least squares, the MM principle can create a surrogate function that trades case weights for adjusted responses. Reduction to ordinary least squares then permits the reuse of the Gram matrix and its Cholesky decomposition across iterations. This tactic is pertinent to estimation in L2E regression and generalized linear models. For problems such as quantile regression, non-smooth terms of an objective function can be replaced by their Moreau envelope approximations and majorized by spherical quadratics. Finally, penalized regression with distance-to-set penalties also benefits from this perspective. Our numerical experiments validate the speed and utility of deweighting and Moreau envelope approximations. Julia software implementing these experiments is available on our web page.
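A minimal sketch of the deweighting idea described above, assuming case weights scaled into (0, 1]: each MM iteration solves an ordinary least squares problem with adjusted responses, so the Cholesky factorization of the Gram matrix is computed once and reused. The authors' software is in Julia; this toy sketch uses Python/SciPy, and all names are hypothetical.

    # Weighted least squares solved by MM "deweighting": adjusted responses
    # z = w*y + (1-w)*(X @ beta) turn each iteration into plain OLS, so the
    # Cholesky factor of X'X is computed once and reused.
    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def deweighted_wls(X, y, w, iters=500):
        w = w / w.max()                          # scale weights into (0, 1]
        c = cho_factor(X.T @ X)                  # factor the Gram matrix once
        beta = cho_solve(c, X.T @ y)             # OLS warm start
        for _ in range(iters):
            z = w * y + (1.0 - w) * (X @ beta)   # adjusted responses (MM surrogate)
            beta = cho_solve(c, X.T @ z)         # reuse the cached factorization
        return beta

    # Toy check against the direct weighted least squares solution.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5)); y = rng.standard_normal(200)
    w = rng.uniform(0.2, 1.0, 200)
    direct = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    print(np.allclose(deweighted_wls(X, y, w), direct, atol=1e-6))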

Talk 2: Computing the convex envelope of bivariate piecewise linear-quadratic (PLQ) functions
Speaker: Tanmaya Karmarkar
Abstract: We introduce a linear-time algorithm for computing the convex envelope of bivariate piecewise linear-quadratic (PLQ) functions and establish that the biconjugate is piecewise rational defined over a polyhedral subdivision. Our approach consists of the following steps: (1) compute the convex envelope of each quadratic piece obtaining piecewise rational functions (quadratic divided by linear function) defined over a polyhedral subdivision; (2) compute the conjugate of each resulting piece to obtain piecewise quadratic functions defined over a parabolic subdivision; (3) compute the maximum of all those functions to obtain the conjugate of the original PLQ function as a piecewise quadratic function defined on a parabolic subdivision; (4) compute the conjugate of each resulting piece; and finally (5) compute the maximum over all those functions to obtain the biconjugate as rational functions (quadratic divided by linear function) defined over a polyhedral subdivision.

Talk 3: Inexact Moreau Envelope Lagrangian Method for Non-Convex Constrained Optimization under Local Error Bound Conditions on Constraint Functions
Speaker: Yankun Huang
Abstract: In this paper, we study the inexact Moreau envelope Lagrangian (iMELa) method for solving smooth non-convex optimization problems over a simple polytope with additional convex inequality constraints. By incorporating a proximal term into the traditional Lagrangian function, the iMELa method approximately solves a convex optimization subproblem over the polyhedral set at each main iteration. Under the assumption of a local error bound condition for subsets of the feasible set defined by subsets of the constraints, we establish that the iMELa method can find an $\epsilon$-Karush-Kuhn-Tucker point with $\tilde O(\epsilon^{-2})$ gradient oracle complexity. Paper preprint link: https://arxiv.org/abs/2502.19764.

Speakers
QH

Qiang Heng

TK

Tanmaya Karmarkar

avatar for Yankun Huang

Yankun Huang

Postdoctoral Scholar, Arizona State University
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 154 3518 Trousdale Pkwy, 154, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12S: Zeroth-Order and Derivative-Free Optimization
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Zeroth-Order and Derivative-Free Optimization
Chair: Anjie Ding

Talk 1: Sequentially testing a decrease condition in stochastic derivative-free optimization
Speaker: Anjie Ding
Abstract: In derivative-free optimization algorithms, the progress made by each trial step is measured by the potential decrease attained in the objective function, and many algorithms require a decrease that is sufficiently large when compared to a forcing function of the size of the step, typically a multiple of the square of the step size. When the function is stochastic, estimating the decrease needed to ensure the desired rate of convergence of the algorithms requires a sample size of the order of the fourth power of the inverse of the step size. In this talk, we introduce a new way of testing the satisfaction of a sufficient decrease condition in stochastic derivative-free optimization by framing it as a hypothesis testing problem and solving it by means of a sequential hypothesis test. The test makes a decision between two hypotheses, essentially corresponding to accepting or rejecting the sufficient decrease, and outputs the probabilities of making these decisions incorrectly. For the purpose of establishing a standard (non-convex) convergence rate, such probabilities need to satisfy certain upper bounds to ensure enough correctness as the step size approaches zero. When the function noise is Gaussian, we will show that the sample size required to estimate the decrease drops to the order of the square of the inverse of the step size when the decrease is far from the forcing function and the step size is not yet close to zero. In this case, the test sequentially estimates the potential decrease by observing the function (at the current and trial points) until their sum crosses either a lower or an upper bound selectively chosen as a function of the variation and the step size.

Talk 2: Which is faster: Linear minimization or Projection?
Speaker: Zev Woodstock
Abstract: It is already known that, for several important sets arising in applications, performing linear minimization can be faster than projection. However, if we consider the class of all nonempty compact convex sets, can we directly compare the computational complexity of linear minimization to that of projection? This talk provides two modest results in this direction: (1) high-precision linear minimization is no slower than projection; and (2) Exact linear minimization is no slower than projection under the additional assumption of polyhedrality.

Speakers
AD

Anjie Ding

PhD Student, Lehigh University
ZW

Zev Woodstock

Thursday July 24, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 155 3518 Trousdale Pkwy, 155, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12T: Distributional Shift and Systemic Risk
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Session: Distributional Shift and Systemic Risk
Chair: Lai Tian

Talk 1: Sample-average approximations for systemic risk measures
Speaker: Çağın Ararat
Abstract: We investigate the convergence properties of sample-average approximations (SAA) for set-valued systemic risk measures. We assume that the systemic risk measure is defined using a general aggregation function with some continuity properties and value-at-risk applied as a monetary risk measure. Our focus is on the theoretical convergence of its SAA under Wijsman and Hausdorff topologies for closed sets. After building the general theory, we provide an in-depth study of an important special case where the aggregation function is defined based on the Eisenberg-Noe network model. In this case, we provide mixed-integer programming formulations for calculating the SAA sets via their weighted-sum and norm-minimizing scalarizations. When value-at-risk is replaced with expectation, we also provide lower bounds on the required sample size for a sufficiently close SAA with high probability. Based on joint works with Wissam AlAli and Nurtai Meimanjan.

Talk 2: Mixed-feature Logistic Regression Robust to Distribution Shifts
Speaker: Qingshi Sun
Abstract: Logistic regression models are widely used in the social and behavioral sciences and in high-stakes domains, due to their simplicity and interpretability properties. At the same time, such domains are permeated by distribution shifts, where the distribution generating the data changes between training and deployment. In this paper, we study a distributionally robust logistic regression problem that seeks the model that will perform best against adversarial realizations of the data distribution drawn from a suitably constructed Wasserstein ambiguity set. Our model and solution approach differ from prior work in that we can capture settings where the likelihood of distribution shifts can vary across features, significantly broadening the applicability of our model relative to the state-of-the-art. We propose a graph-based solution approach that can be integrated into off-the-shelf optimization solvers. We evaluate the performance of our model and algorithms on numerous publicly available datasets. Our solution achieves a 408x speed-up relative to the state-of-the-art. Additionally, compared to the state-of-the-art, our model reduces average calibration error by up to 36.19% and worst-case calibration error by up to 41.70%, while increasing the average area under the ROC curve (AUC) by up to 18.02% and worst-case AUC by up to 48.37%.

Talk 3: Stabilizing Stochastic Programs with Rockafellian Relaxation: Theoretical Results and Chance-Constrained Applications
Speaker: Lai Tian
Abstract: Solutions of a stochastic optimization problem tend to change disproportionally under small changes to its probability distribution. This sensitivity is particularly concerning, as it is virtually impossible to identify the “correct” distribution in real-world applications. In this talk, we demonstrate how Rockafellian relaxations provide a principled and effective approach to improving the stability of solutions under distributional changes. Unlike previous efforts that primarily focus on finite or discrete distributions, our framework accommodates general Borel probability measures and handles discontinuous integrands, such as those arising in chance-constrained formulations. These advances broaden the scope of robust solution techniques in modern stochastic optimization.

Speakers
avatar for Çağın Ararat

Çağın Ararat

Name: Çağın Ararat. Affiliation: Bilkent University, Ankara, Turkey. Bio: Çağın Ararat is an Associate Professor in the Department of Industrial Engineering at Bilkent University. He received his BSc degree in 2010 from the same department, followed by a PhD degree in 2015 from the... Read More →
QS

Qingshi Sun

Thursday July 24, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 158 3518 Trousdale Pkwy, 158, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12V
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 110 3501 Trousdale Pkwy, 110, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12W
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 112 3501 Trousdale Pkwy, 112, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12X
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Taper Hall (THH) 215 3501 Trousdale Pkwy, 215, Los Angeles, CA 90089

4:15pm PDT

Parallel Sessions 12Y
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Thursday July 24, 2025 4:15pm - 5:30pm PDT
Joseph Medicine Crow Center for International and Public Affairs (DMC) 200 3518 Trousdale Pkwy, 200, Los Angeles, CA 90089

5:30pm PDT

End of Conference
Thursday July 24, 2025 5:30pm - 5:30pm PDT
Thursday July 24, 2025 5:30pm - 5:30pm PDT
TBA
 