Session: Randomized optimization algorithms 2/2
Chair: Laurent Condat
Cluster: Nonlinear Optimization
Talk 1: Block Coordinate DC Programming
Speaker: Alp Yurtsever
Abstract: We introduce an extension of the Difference of Convex Algorithm (DCA) in the form of a randomized block coordinate approach for problems with a separable structure. For n coordinate blocks and k iterations, our main result proves a non-asymptotic convergence rate of O(n/k) for the proposed method. Furthermore, leveraging the connection between DCA and Expectation Maximization (EM), we propose a block coordinate EM algorithm.
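A minimal illustrative sketch (not the speakers' code or analysis) of the randomized block coordinate DCA idea: at each iteration one block is drawn at random, the concave part is linearized at the current point, and the convex block subproblem is solved. The toy DC objective, the block partition, and the closed-form block update below are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

# Toy separable DC problem f(x) = g(x) - h(x) with
#   g(x) = 0.5 * ||x - c||^2   (convex, block separable)
#   h(x) = lam * ||x||_1       (convex),
# so f is a difference of convex functions. Illustrative data only.
rng = np.random.default_rng(0)
d, n_blocks = 12, 4
blocks = np.array_split(np.arange(d), n_blocks)   # 12 variables in 4 blocks
c = rng.normal(size=d)
lam = 0.5

def f(x):
    return 0.5 * np.sum((x - c) ** 2) - lam * np.sum(np.abs(x))

x = np.zeros(d)
for k in range(200):
    i = rng.integers(n_blocks)              # pick one block uniformly at random
    idx = blocks[i]
    # Linearize h at the current point: a subgradient of lam*||.||_1 is lam*sign(x)
    y = lam * np.sign(x[idx])
    # Convex block subproblem: min_z 0.5*||z - c_idx||^2 - <y, z>  =>  z = c_idx + y
    x[idx] = c[idx] + y

print("final objective:", f(x))
```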
Talk 2: Variable metric proximal stochastic gradient methods with additional sampling
Speaker: Ilaria Trombini
Abstract: Many optimization problems in machine learning can be framed as minimizing the sum of two functions: the first typically represents the expected risk, which is often substituted by the empirical risk in practice, while the second encodes prior information about the solution. Given that the first term is generally differentiable and the second is convex, proximal gradient methods are particularly well-suited for addressing such optimization challenges. This talk presents a new class of proximal stochastic gradient methods, which are characterized by three fundamental components: a variable metric guiding the iterations, a stochastic line search governing the decrease properties, and an incremental mini-batch size technique that utilizes additional sampling. The convergence of the proposed algorithms is established under various assumptions about the function being minimized. Notably, no assumption is made about the Lipschitz continuity of the gradient for the differentiable part of the objective function. The talk also explores possible strategies for automatically choosing the parameters of the proposed method. Numerical tests on binary classification tasks demonstrate the effectiveness of this approach when compared to other state-of-the-art proximal stochastic gradient methods.
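A hedged sketch of how the three ingredients named in the abstract might fit together, applied to L1-regularized logistic regression: a diagonal variable metric, an Armijo-type line search evaluated on the sampled mini-batch, and a mini-batch size that grows over the iterations. The AdaGrad-style metric, the specific sufficient-decrease test, and the growth factor are illustrative assumptions, not the method presented in the talk.

```python
import numpy as np

# Illustrative proximal stochastic gradient step with variable metric,
# stochastic line search, and incremental mini-batch size.
rng = np.random.default_rng(0)
N, d = 500, 20
A = rng.normal(size=(N, d))
y = np.sign(A @ rng.normal(size=d) + 0.1 * rng.normal(size=N))
lam = 0.01

def batch_loss(x, idx):
    z = -y[idx] * (A[idx] @ x)
    return np.mean(np.logaddexp(0.0, z))          # logistic loss on the batch

def batch_grad(x, idx):
    z = -y[idx] * (A[idx] @ x)
    s = np.exp(-np.logaddexp(0.0, -z))            # numerically stable sigmoid(z)
    return A[idx].T @ (-y[idx] * s) / len(idx)

def prox_l1_metric(v, step, D):
    # prox of lam*||.||_1 in the diagonal metric D: per-coordinate soft threshold
    t = lam * step / D
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x = np.zeros(d)
G = np.ones(d)                                    # accumulator for the diagonal metric
batch_size = 8
for k in range(100):
    idx = rng.choice(N, size=min(batch_size, N), replace=False)
    g = batch_grad(x, idx)
    G += g ** 2
    D = np.sqrt(G)                                # AdaGrad-style diagonal metric (illustrative)
    step, loss_old = 1.0, batch_loss(x, idx)
    while True:                                   # Armijo-type backtracking on the same batch
        x_new = prox_l1_metric(x - step * g / D, step, D)
        model = loss_old + g @ (x_new - x) + 0.5 / step * np.sum(D * (x_new - x) ** 2)
        if batch_loss(x_new, idx) <= model or step < 1e-8:
            break
        step *= 0.5
    x = x_new
    batch_size = int(np.ceil(1.05 * batch_size))  # incremental mini-batch size

print("full-data L1-regularized objective:",
      batch_loss(x, np.arange(N)) + lam * np.sum(np.abs(x)))
```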
Talk 3: A Framework for Stochastic Quasi-Newton Type Methods
Speaker: Titus Pinta
Abstract: We develop a stochastic framework to analyze Quasi-Newton type methods in which the Hessian approximation is an instance of a random variable. We will show that, under a weak assumption on the regularity of these Hessian approximations, the sequence produced by the algorithm has a sufficient decrease property with high probability. This intermediate result will help us characterize the probability distribution of the random variable that counts the number of iterations until the algorithm produces a point with gradient norm less than a given epsilon. As a special case, we will see how random additive Gaussian noise impacts the performance of a Quasi-Newton type method. Finally, we provide some numerical results, based on random subspace embeddings, showcasing our theory.
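A toy sketch (not the speaker's framework) of a quasi-Newton type iteration whose Hessian approximation is a random variable: here the true Hessian of a strongly convex quadratic perturbed by symmetric additive Gaussian noise, echoing the special case mentioned in the abstract, with the number of iterations counted until the gradient norm falls below a given epsilon. The quadratic test problem, the eigenvalue shift keeping the approximation positive definite, and the backtracking line search are illustrative choices.

```python
import numpy as np

# Quasi-Newton type iteration with a randomly perturbed Hessian approximation.
rng = np.random.default_rng(0)
d = 10
Q = rng.normal(size=(d, d))
H = Q @ Q.T + np.eye(d)                       # f(x) = 0.5 x^T H x - b^T x
b = rng.normal(size=d)

fval = lambda x: 0.5 * x @ H @ x - b @ x
grad = lambda x: H @ x - b

def noisy_hessian(sigma):
    E = rng.normal(scale=sigma, size=(d, d))
    B = H + 0.5 * (E + E.T)                   # symmetric additive Gaussian noise
    lam_min = np.linalg.eigvalsh(B)[0]
    if lam_min < 1e-3:                        # keep the approximation positive definite
        B += (1e-3 - lam_min) * np.eye(d)
    return B

eps, sigma = 1e-6, 0.5
x = np.zeros(d)
iters = 0
while np.linalg.norm(grad(x)) > eps:
    B = noisy_hessian(sigma)
    p = np.linalg.solve(B, -grad(x))          # quasi-Newton direction with random B
    t = 1.0
    while fval(x + t * p) > fval(x) + 1e-4 * t * (grad(x) @ p):
        t *= 0.5                              # backtracking for sufficient decrease
    x = x + t * p
    iters += 1

print("iterations until ||grad f(x)|| <= eps:", iters)
```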