Session: Bilevel Optimization - Algorithms
Chair: Yuchen Lou
Talk 1: A Fully First-Order Method for Stochastic Bilevel Optimization
Speaker: Dohyun Kwon
Abstract: The problem of stochastic bilevel optimization has been the focus of extensive study in recent years. Although many optimization methods have been proposed to address bilevel problems, existing approaches often require potentially expensive calculations involving the Hessians of lower-level objectives. The primary technical challenge lies in tracking the lower-level solutions as upper-level variables change. We introduce a Fully First-order Stochastic Approximation (F2SA) method [1, 2], which relies only on first-order gradient oracles. Additionally, we analyze the complexity of first-order methods under minimal assumptions and provide matching lower bounds [3]. This talk is based on joint work with Jeongyeol Kwon, Hanbaek Lyu, Stephen Wright, and Robert Nowak (UW-Madison).
References:
[1] Kwon, J., Kwon, D., Wright, S., & Nowak, R. D. (2023). A fully first-order method for stochastic bilevel optimization. In Proceedings of the 40th International Conference on Machine Learning (pp. 18083-18113). PMLR. (ICML 2023, Oral)
[2] Kwon, J., Kwon, D., Wright, S., & Nowak, R. D. (2024). On penalty methods for nonconvex bilevel optimization and first-order stochastic approximation. In The Twelfth International Conference on Learning Representations. (ICLR 2024, Spotlight)
[3] Kwon, J., Kwon, D., & Lyu, H. (2024). On the complexity of first-order methods in stochastic bilevel optimization. In Proceedings of the 41st International Conference on Machine Learning (pp. 25784-25811). PMLR. (ICML 2024)
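As a concrete illustration of the Hessian-free idea behind F2SA, here is a minimal Python sketch of the penalty reformulation described in [1], applied to a toy one-dimensional problem. The toy objectives, step sizes, and penalty schedule are illustrative assumptions, not the settings analyzed in the papers.

```python
# A hedged sketch of the F2SA idea: a single-loop, fully first-order scheme
# for min_x f(x, y*(x)) with y*(x) = argmin_y g(x, y). The toy problem,
# step sizes, and penalty schedule below are assumptions for illustration.
import numpy as np

def f(x, y):   # upper-level objective (toy quadratic)
    return 0.5 * (x - 1.0) ** 2 + 0.5 * (y - 2.0) ** 2

def g(x, y):   # lower-level objective; its minimizer is y*(x) = x
    return 0.5 * (y - x) ** 2

# First-order oracles only (written analytically for the toy problem).
grad_f_x = lambda x, y: x - 1.0
grad_f_y = lambda x, y: y - 2.0
grad_g_x = lambda x, y: -(y - x)
grad_g_y = lambda x, y: y - x

x, y, z = 0.0, 0.0, 0.0   # y tracks argmin_y g(x,.), z tracks argmin_y f + lam*g
alpha, beta = 0.1, 0.05   # inner/outer step sizes (illustrative)
for k in range(2000):
    lam = 1.0 + 0.1 * np.sqrt(k)   # slowly increasing penalty parameter
    # Inner first-order updates: no Hessians or Hessian-vector products.
    y -= alpha * grad_g_y(x, y)
    z -= alpha * (grad_f_y(x, z) + lam * grad_g_y(x, z))
    # Outer update uses the gradient of the penalty function
    #   L_lam(x) = min_y [f(x,y) + lam*g(x,y)] - lam * min_y g(x,y).
    hypergrad = grad_f_x(x, z) + lam * (grad_g_x(x, z) - grad_g_x(x, y))
    x -= (beta / lam) * hypergrad

print(x, y)   # approaches the bilevel solution x* = y* = 1.5 as lam grows
```

Note that the x-update combines only gradients of f and g evaluated at the two inner iterates, which is the sense in which the method is "fully first-order."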
Talk 2: A Decomposition Framework for Nonlinear Nonconvex Two-Stage Optimization
Speaker: Yuchen Lou
Abstract: We propose a new decomposition framework for continuous nonlinear constrained two-stage optimization in which both the first- and second-stage problems may be nonconvex. A smoothing technique based on an interior-point formulation renders the optimal solution of the second-stage problem differentiable with respect to the first-stage parameters. As a consequence, efficient off-the-shelf optimization packages can be utilized. We show that the solution of the nonconvex second-stage problem behaves locally like a differentiable function, so that existing proofs can be applied to establish global convergence of the first-stage algorithm. We also prove fast local convergence of the algorithm as the barrier parameter is driven to zero. Numerical experiments on large-scale instances demonstrate the computational advantages of the decomposition framework.
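To make the smoothing idea tangible, here is a hypothetical Python toy (using NumPy/SciPy): a one-dimensional second stage whose value function has a kink is replaced by a log-barrier version whose value is differentiable in the first-stage variable, with the gradient available via the envelope theorem. The specific problem, barrier parameter, and solver choice are assumptions for illustration only, not the framework from the talk.

```python
# A hedged toy of interior-point smoothing: the barrier makes the
# second-stage optimal value differentiable in the first-stage variable x.
import numpy as np
from scipy.optimize import minimize_scalar

MU = 0.05  # barrier parameter; the actual algorithm drives this to zero

def second_stage_value(x, mu=MU):
    # Smoothed second stage: min_{-1 < y < 1}  x*y - mu*[log(1-y) + log(1+y)].
    # Without the barrier the value is -|x|, which is nonsmooth at x = 0.
    obj = lambda y: x * y - mu * (np.log(1.0 - y) + np.log(1.0 + y))
    res = minimize_scalar(obj, bounds=(-0.999999, 0.999999),
                          method="bounded", options={"xatol": 1e-12})
    return res.fun, res.x

def value_gradient(x, mu=MU):
    # Envelope theorem: since the constraints do not involve x, the derivative
    # of the smoothed value function is d/dx of the objective at y_mu(x),
    # which here is simply y_mu(x).
    _, y_mu = second_stage_value(x, mu)
    return y_mu

# The smoothed value function can now be handed to an off-the-shelf
# first-stage solver; here we just check its gradient by finite differences.
for x in (-0.5, 0.0, 0.5):
    h = 1e-5
    fd = (second_stage_value(x + h)[0] - second_stage_value(x - h)[0]) / (2 * h)
    print(f"x={x:+.1f}  envelope grad={value_gradient(x):+.4f}  finite diff={fd:+.4f}")
```

Because the smoothed second-stage solution (and hence the value function) is differentiable in x, a standard gradient-based first-stage solver can be applied off the shelf, which is the decomposition the abstract describes.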
Talk 3: A Gradient-Based Method for Bilevel Optimization with Convex Lower-Level Problems
Speaker: Mattia Solla Saez
Abstract: This talk presents an algorithm for computing approximate solutions to a class of bilevel optimization problems. Recent advances in variational analysis have shown that the solution mapping of certain generalized equations is continuously differentiable at nondegenerate solutions. We use this result to construct a sequence of approximate solutions to an appropriate regularization of the problem that converges to a feasible point of the original problem. Under an additional second-order assumption, the sequence is shown to converge to a stationary point of the original problem. Our framework allows for a wide class of nonsmooth lower-level problems, including (convex) nonlinear programming problems and (convex) minimax problems. We show that, in these cases, the nondegeneracy assumption reduces to known conditions such as strict complementarity, and we discuss how to handle the absence of this assumption algorithmically. This is joint work with Dr. Johannes Royset (royset@usc.edu).
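The following hypothetical Python sketch illustrates the general regularize-then-differentiate strategy the abstract alludes to: a small strongly convex term makes the convex lower-level solution map single-valued and differentiable, and an implicit-function-theorem gradient step is taken on the upper level while the regularization is driven to zero. The quadratic data, schedule, and step size are assumptions, not the construction from the talk.

```python
# A hedged sketch: regularize the convex lower level so its solution map is
# differentiable, then take implicit-gradient steps on the upper level.
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 1.0]])   # lower-level curvature (convex here)
B = np.array([[1.0, 0.0], [0.0, 1.0]])   # coupling between x and y

def lower_solution(x, eps):
    # y_eps(x) = argmin_y 0.5*y^T A y - (B x)^T y + (eps/2)*||y||^2
    return np.linalg.solve(A + eps * np.eye(2), B @ x)

def hypergradient(x, eps, y_target):
    # Upper level: f(x, y) = 0.5*||y - y_target||^2. By the implicit function
    # theorem applied to the regularized optimality condition,
    #   dy/dx = (A + eps*I)^{-1} B,  so  grad F(x) = (dy/dx)^T (y - y_target).
    y = lower_solution(x, eps)
    dydx = np.linalg.solve(A + eps * np.eye(2), B)
    return dydx.T @ (y - y_target)

x = np.zeros(2)
y_target = np.array([1.0, -1.0])
for k in range(500):
    eps = 1.0 / (1 + k)                      # drive the regularization to zero
    x -= 0.5 * hypergradient(x, eps, y_target)

print(x, lower_solution(x, 0.0))   # lower-level solution approaches y_target
```

In this unconstrained quadratic toy the regularized solution map is differentiable everywhere; the role of the nondegeneracy assumption in the talk is precisely to guarantee the analogous differentiability for constrained and minimax lower-level problems.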