Session: Derivative-free stochastic optimization methods II
Chair: Matt Menickelly
Cluster: Derivative-free Optimization
Talk 1: Augmenting subspace optimization methods with linear bandits
Speaker: Matt Menickelly
Abstract: In recent years, there has been a growing interest in developing iterative optimization methods that focus on finding descent restricted to affine subspaces containing an incumbent. Each iteration typically consists of choosing a subspace according to some possibly randomized technique, and then building and minimizing a local model employing derivatives of the function projected onto the chosen subspace. In model-based derivative-free optimization, where gradient approximations essentially require a finite difference (i.e., a number of function evaluations linear in problem dimension), these methods suggest serious opportunities for practical gains in efficiency, since the number of function evaluations necessary to obtain a projected gradient approximation instead scales with the chosen subspace dimension. Motivated by practicality, we consider a simple augmentation of such a generic subspace method. In particular, we consider a sequential optimization framework where actions consist of one-dimensional linear subspaces and rewards consist of projected gradient measurements made along corresponding affine subspaces. This sequential optimization problem can be analyzed through the lens of dynamic regret. We modify an existing upper confidence bound (UCB) linear bandit method to achieve sublinear dynamic regret. We demonstrate the efficacy of employing this UCB method alongside a sketched version of the derivative-free optimization method, POUNDers.
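As a rough, hypothetical sketch of the kind of mechanism the abstract describes (not the talk's method, and not POUNDers), the Python fragment below pairs a central-difference estimate of a projected gradient with a standard UCB score over a fixed set of candidate one-dimensional subspaces. The reward choice (the magnitude of the projected gradient), the step size, and all constants are assumptions for illustration only.

    import numpy as np

    def directional_grad(f, x, d, h=1e-6):
        # Central-difference estimate of the gradient of f projected onto direction d.
        d = d / np.linalg.norm(d)
        return (f(x + h * d) - f(x - h * d)) / (2.0 * h)

    def ucb_subspace_step(f, x, directions, counts, means, t, step=0.1, c=1.0):
        # Score each candidate one-dimensional subspace (arm) with a UCB rule on the
        # running mean of past |projected gradient| rewards, pull the best arm,
        # and take a short descent step along that direction.
        scores = [means[i] + c * np.sqrt(np.log(t + 2) / (counts[i] + 1e-12))
                  for i in range(len(directions))]
        i = int(np.argmax(scores))
        d = directions[i] / np.linalg.norm(directions[i])
        g = directional_grad(f, x, d)
        counts[i] += 1
        means[i] += (abs(g) - means[i]) / counts[i]   # incremental mean update
        return x - step * g * d, i

For example, with f(z) = sum((z - 1)**2), x = np.zeros(5), coordinate directions as the arms, and counts and means initialized to zero arrays, repeatedly calling ucb_subspace_step concentrates evaluations on the most rewarding directions while each step costs only two function evaluations.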
Talk 2: Adaptive sampling trust region optimization with randomized subspaces
Speaker: Sara Shashaani
Abstract: The globally convergent ASTRO-DF algorithm was developed to make stochastic derivative-free optimization efficient through a careful sampling strategy that generates only as many random replications as needed at every function evaluation. Despite its theoretical guarantees on sample complexity, the algorithm still suffers from the curse of dimensionality. The new variant presented here combines randomized subspaces with adaptive sampling to address this shortcoming, allowing it to be run on high-dimensional derivative-free problems while maintaining optimal sample efficiency.
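A minimal illustration of the adaptive-sampling idea (not ASTRO-DF itself): the sketch below keeps drawing noisy replications of the objective at a point until the estimated standard error of the sample mean falls below a threshold tied to the trust-region radius. The specific stopping rule (standard error below delta**2 / kappa), the simulator interface sim(x, rng), and all constants are assumptions.

    import numpy as np

    def adaptive_sample_mean(sim, x, delta, kappa=1.0, n0=8, n_max=10_000, rng=None):
        # Draw stochastic replications of the objective at x until the estimated
        # standard error of the sample mean drops below delta**2 / kappa, a rule in
        # the spirit of adaptive sampling inside trust-region methods with radius delta.
        rng = np.random.default_rng() if rng is None else rng
        samples = [sim(x, rng) for _ in range(n0)]
        while len(samples) < n_max:
            se = np.std(samples, ddof=1) / np.sqrt(len(samples))
            if se <= delta ** 2 / kappa:
                break
            samples.append(sim(x, rng))
        return float(np.mean(samples)), len(samples)

    # Example: noisy quadratic; smaller trust-region radii trigger larger sample sizes.
    noisy_f = lambda x, rng: float(np.sum(x ** 2) + rng.normal(scale=0.1))
    fbar, n = adaptive_sample_mean(noisy_f, np.ones(3), delta=0.2)

The design point is that the sample size adapts to both the noise level and the current trust-region radius, so effort is spent only where the model needs to be accurate.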
Talk 3: Zeroth-order Riemannian averaging stochastic approximation algorithms
Speaker: Krishnakumar Balasubramanian
Abstract: We present Zeroth-order Riemannian Averaging Stochastic Approximation (\texttt{Zo-RASA}) algorithms for stochastic optimization on Riemannian manifolds. We show that \texttt{Zo-RASA} achieves optimal sample complexities for generating $\epsilon$-approximate first-order stationary solutions using only one-sample or constant-order batches in each iteration. Our approach employs Riemannian moving-average stochastic gradient estimators and a novel Riemannian-Lyapunov technique for the convergence analysis. We improve the algorithm's practicality by using retractions and vector transports instead of exponential mappings and parallel transports, thereby reducing per-iteration complexity. Additionally, we introduce a novel geometric condition, satisfied by manifolds with bounded second fundamental form, which enables new error bounds for approximating parallel transport with vector transport.
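A toy sketch of the ingredients named above, specialized to the unit sphere as a concrete manifold (this is not the paper's algorithm and carries none of its guarantees): a two-point zeroth-order estimate of the Riemannian gradient built from a random tangent direction and a retraction, combined with a moving-average update. The function names, constants, and the reprojection used as a crude stand-in for vector transport are all assumptions.

    import numpy as np

    def retract_sphere(x, v):
        # Retraction on the unit sphere: step in the tangent direction, then renormalize.
        y = x + v
        return y / np.linalg.norm(y)

    def zo_riemannian_grad(f, x, mu=1e-4, rng=None):
        # Two-point zeroth-order estimate of the Riemannian gradient of f at x on the
        # unit sphere: sample a random tangent direction u and difference f along a
        # retracted perturbation of size mu.
        rng = np.random.default_rng() if rng is None else rng
        u = rng.standard_normal(x.shape)
        u -= np.dot(u, x) * x              # project onto the tangent space at x
        u /= np.linalg.norm(u)
        return (f(retract_sphere(x, mu * u)) - f(x)) / mu * u

    def zo_rasa_like_step(f, x, g_avg, eta=0.1, beta=0.5, rng=None):
        # Moving-average update of the gradient estimator followed by a retraction;
        # the averaged estimator is reprojected onto the new tangent space as a
        # rough substitute for vector transport.
        g = zo_riemannian_grad(f, x, rng=rng)
        g_avg = (1.0 - beta) * g_avg + beta * g
        x_new = retract_sphere(x, -eta * g_avg)
        g_avg = g_avg - np.dot(g_avg, x_new) * x_new
        return x_new, g_avg

For instance, with f(x) = x @ A @ x for a symmetric matrix A, iterating zo_rasa_like_step from a random unit vector drives the iterate toward an eigenvector associated with the smallest eigenvalue of A, using only function evaluations.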