Session: Recent Advances in Stochastic Optimization: Complexity, Adaptivity, and Nonsmooth Extensions (II)
Chair: Sen Na & Zhaosong Lu
Cluster: Nonlinear Optimization
Talk 1: On the convergence of policy gradient methods for stochastic nonlinear dynamical systems
Speaker: Sungho Shin
Abstract: We analyze the local convergence of policy gradient methods for stochastic nonlinear dynamical systems. Under suitable technical assumptions, we show that the policy gradient algorithm converges locally to the optimal policy.
Talk 2: Improved complexity of proximal bundle methods and new insights on bundle management
Speaker: Andy Sun
Abstract: The proximal bundle algorithm (PBA) is a fundamental and computationally effective method for solving optimization problems with nonsmooth components. In this talk, we first investigate a composite setting in which one function is smooth and the other is piecewise linear. We present a novel complexity analysis of PBA and derive an O(\epsilon^{-0.8}) iteration complexity, improving upon the known O(\epsilon^{-2}) guarantee. Our analysis also sheds new light on bundle management strategies. Computational experiments on two-stage robust optimization and support vector machines demonstrate the effectiveness of the new insights.
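For readers unfamiliar with the technique discussed in this talk, the following is a minimal one-dimensional sketch of a generic proximal bundle method on a piecewise-linear objective. It is a textbook-style illustration, not the speaker's algorithm or analysis: the objective f(x) = |x - 3|, the stepsize t, and the descent parameter m are all illustrative choices, and the proximal subproblem is solved by enumerating candidate points, which only works in one dimension.

```python
def f(x):
    # Illustrative piecewise-linear objective (assumption: f(x) = |x - 3|)
    return abs(x - 3.0)

def subgrad(x):
    # A subgradient of f at x
    return 1.0 if x >= 3.0 else -1.0

def prox_bundle(x0, t=1.0, m=0.1, tol=1e-8, max_iter=200):
    """Generic proximal bundle method sketch with serious/null steps."""
    xc = x0
    # Each cut (a, g) encodes the linearization f(y) + g*(x - y) = a + g*x
    cuts = [(f(x0) - subgrad(x0) * x0, subgrad(x0))]
    for _ in range(max_iter):
        def model(x):
            # Cutting-plane (bundle) model: max over stored linearizations
            return max(a + g * x for a, g in cuts)

        def total(x):
            # Proximal subproblem objective: model plus quadratic prox term
            return model(x) + (x - xc) ** 2 / (2.0 * t)

        # In 1D, the subproblem minimizer lies at a per-cut stationary
        # point or at an intersection (kink) of two cuts.
        cands = [xc - t * g for _, g in cuts]
        for i in range(len(cuts)):
            for j in range(i + 1, len(cuts)):
                ai, gi = cuts[i]
                aj, gj = cuts[j]
                if abs(gi - gj) > 1e-12:
                    cands.append((aj - ai) / (gi - gj))
        x_new = min(cands, key=total)

        pred = f(xc) - model(x_new)  # predicted decrease
        if pred <= tol:
            return xc
        if f(x_new) <= f(xc) - m * pred:
            xc = x_new  # serious step: accept the candidate
        # Null step: keep xc, but enrich the bundle with a new cut
        cuts.append((f(x_new) - subgrad(x_new) * x_new, subgrad(x_new)))
    return xc
```

The serious/null-step test and the growth of the bundle are exactly the "bundle management" decisions whose strategies the abstract refers to; practical implementations also prune or aggregate cuts to keep the subproblem small.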
Talk 3: An Optimal Single-Loop Algorithm for Convex Finite-Sum Coupled Compositional Stochastic Optimization
Speaker: Tianbao Yang
Abstract: We will talk about a class of convex Finite-Sum Coupled Compositional Stochastic Optimization (cFCCO) problems with many applications, including group distributionally robust optimization (GDRO), learning with imbalanced data, reinforcement learning, and learning to rank. We will present an efficient single-loop primal-dual block-coordinate proximal algorithm, dubbed ALEXR. This algorithm leverages block-coordinate stochastic mirror ascent updates for the dual variable and stochastic proximal gradient descent updates for the primal variable. We establish the convergence rates of ALEXR in both the convex and strongly convex cases, under both smoothness and non-smoothness conditions on the involved functions. These rates not only improve upon the best known rates from previous work on smooth cFCCO problems but also expand the realm of cFCCO to more challenging non-smooth problems such as the dual form of GDRO. Finally, we present lower complexity bounds demonstrating that the convergence rates of ALEXR are optimal among first-order block-coordinate stochastic algorithms for the considered class of cFCCO problems.