Session: Riemannian geometry, optimization, and applications
Chair: Wen Huang
Cluster: Optimization on Manifolds
Talk 1: Strong Convexity of Sets in Riemannian Manifolds
Speaker: Damien Scieur
Abstract: Convex curvature properties are important in designing and analyzing convex optimization algorithms in the Hilbertian or Riemannian settings. In the Hilbertian setting, strongly convex sets are well studied. Here, we propose several definitions of strong convexity for uniquely geodesic sets in a Riemannian manifold. We study their relationships, provide tools to determine whether a set is geodesically strongly convex, and analyze the convergence of optimization algorithms over such sets. In particular, we demonstrate that the Riemannian Frank-Wolfe algorithm enjoys a global linear convergence rate when the Riemannian scaling inequalities hold.
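As a point of reference, here is a minimal Euclidean sketch of the Frank-Wolfe template the talk generalizes: the feasible set is a Euclidean ball, a canonical strongly convex set, over which the linear minimization oracle has a closed form. The function name `frank_wolfe_ball`, the quadratic objective, and the step-size rule are illustrative assumptions, not the talk's Riemannian algorithm.

```python
import numpy as np

# Euclidean Frank-Wolfe over a ball (a canonical strongly convex set).
# Over such sets, Frank-Wolfe with line search is known to converge
# linearly when the gradient is bounded away from zero at the solution;
# the talk establishes a Riemannian analogue under scaling inequalities.

def frank_wolfe_ball(grad, x0, center, radius, steps=50, tol=1e-12):
    x = x0.copy()
    for _ in range(steps):
        g = grad(x)
        # Linear minimization oracle: argmin over the ball of <g, s>.
        s = center - radius * g / np.linalg.norm(g)
        d = s - x
        if np.linalg.norm(d) < tol:
            break
        # Exact line search for the quadratic f(x) = 0.5 * ||x - b||^2,
        # clipped to the feasible step range [0, 1].
        gamma = min(1.0, -(g @ d) / (d @ d))
        x = x + gamma * d
    return x

# Example: minimize f(x) = 0.5 * ||x - b||^2 over the unit ball; the
# solution is the radial projection b / ||b||.
b = np.array([2.0, 1.0])
x_star = frank_wolfe_ball(lambda x: x - b, np.zeros(2), np.zeros(2), 1.0)
print(x_star, b / np.linalg.norm(b))
```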
Talk 2: Velocity-Based Karcher Mean on the Special Orthogonal Group: A Generalized Karcher Mean for Riemannian Manifolds
Speaker: Zhifeng Deng
Abstract: The special orthogonal group SO(n) consists of the orthogonal matrices with determinant 1 that realize rotations in n variables. It arises in many important applications, especially in computer vision, where rotations of a coordinate frame describe the spatial motion of an object. Averaging a given data set is one of the most important computational tasks, and the Karcher mean formulation in a metric space, together with its associated algorithms, adapts naturally to the averaging task on SO(n), since the Riemannian structure makes SO(n) a metric space. Unfortunately, the distance induced by the Riemannian metric is not differentiable at cut loci, which makes the Karcher mean formulation on SO(n), and on a general manifold, a non-convex optimization problem. In this presentation, we investigate the consequences of this non-convexity, using the circle SO(2) as an example, and trace the key issue to the discontinuity of the Riemannian logarithm, which retrieves velocities in the tangent space from points on the manifold. We propose a generalized Karcher mean formulation for Riemannian manifolds, the velocity-based Karcher mean, to overcome this discontinuity. The generalized formulation subsumes the original Karcher mean and admits additional points as velocity-based Karcher means if they emanate (possibly non-minimal) geodesics to the data set whose velocities sum to zero. We then develop the algorithmic primitives, including a nearby matrix logarithm, to compute the velocity-based Karcher mean on SO(n) in a smooth manner, so that the computed mean depends smoothly on the data set and the convergence towards a computed mean can be controlled by the velocities emanating from the initial guess. Finally, numerical experiments demonstrate the advantages of the velocity-based Karcher mean on SO(n) in applications with smoothness constraints.
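For orientation, below is a minimal sketch of the classical Karcher-mean fixed-point iteration on SO(n) that the talk generalizes: it averages the principal-branch Riemannian logarithms of the data and moves along the resulting geodesic. The talk's velocity-based formulation and its nearby matrix logarithm, which selects logarithm branches continuously rather than taking the principal one, are not reproduced here.

```python
import numpy as np
from scipy.linalg import expm, logm

# Classical Karcher-mean iteration on SO(n). The Riemannian logarithm at X
# of a rotation A corresponds to the skew-symmetric matrix logm(X.T @ A);
# a Karcher mean is a zero of the summed tangent velocities.

def karcher_mean_son(rotations, X0, steps=100, tol=1e-10):
    X = X0.copy()
    for _ in range(steps):
        # Average tangent velocity at the current iterate (principal log).
        V = np.real(sum(logm(X.T @ A) for A in rotations)) / len(rotations)
        if np.linalg.norm(V) < tol:
            break
        X = X @ expm(V)  # move along the geodesic with velocity V
    return X

# Example: average three rotations in SO(2); the mean is rot(0.3).
def rot(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

data = [rot(0.1), rot(0.3), rot(0.5)]
print(karcher_mean_son(data, np.eye(2)))
```

The discontinuity the talk addresses enters through `logm`: the principal matrix logarithm jumps as the data cross the cut locus of the current iterate, which is what the proposed nearby logarithm is designed to avoid.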
Talk 3: Riemannian Federated Learning via Averaging Gradient Stream
Speaker: Zhenwei Huang
Abstract: In recent years, federated learning has garnered significant attention as an efficient and privacy-preserving distributed learning paradigm. In the Euclidean setting, federated averaging (FedAvg) and its variants form a class of efficient algorithms for expected (empirical) risk minimization. This talk introduces and analyzes a Riemannian federated averaging gradient stream (RFedAGS) algorithm, a generalization of FedAvg to problems defined on a Riemannian manifold. Under standard assumptions, RFedAGS with fixed step sizes is proven to converge sublinearly to an approximate stationary solution; with decaying step sizes, global convergence is established. Furthermore, if the objective satisfies the Riemannian Polyak-Łojasiewicz property, the optimality gaps generated by RFedAGS with fixed step sizes decrease linearly up to a small upper bound, while with decaying step sizes the gaps vanish sublinearly. Unlike existing Riemannian federated learning methods, whose convergence analyses allow only a single agent or a single step per local update, the proposed RFedAGS allows multiple agents and multi-step local updates; it therefore theoretically guarantees a reduction in outer iterations and hence in communication costs. Numerical simulations conducted on synthetic and real-world data demonstrate the performance of the proposed RFedAGS.
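A toy sketch in the spirit of the algorithm, on the unit sphere: each agent performs several local Riemannian gradient steps, and the server averages the agents' directions as tangent vectors at its iterate and retracts. The objective, step sizes, projection-based retraction, and helper names below are illustrative assumptions, not the paper's exact RFedAGS update.

```python
import numpy as np

# Federated averaging on the unit sphere (a simple Riemannian manifold).
# Agents run multi-step local updates; the server averages tangent
# directions at its current iterate and takes a retraction step.

def proj_tangent(x, v):
    return v - (x @ v) * x           # tangent-space projection on the sphere

def retract(x, v):
    y = x + v                        # projection retraction
    return y / np.linalg.norm(y)

def local_steps(x, b, lr=0.1, k=5):
    # Agent objective: f_i(x) = 0.5 * ||x - b_i||^2 restricted to the sphere.
    y = x.copy()
    for _ in range(k):
        g = proj_tangent(y, y - b)   # Riemannian gradient of f_i at y
        y = retract(y, -lr * g)
    return y

def federated_rounds(x, data, rounds=30):
    for _ in range(rounds):
        # Each agent reports the tangent direction toward its local iterate.
        dirs = [proj_tangent(x, local_steps(x, b) - x) for b in data]
        x = retract(x, np.mean(dirs, axis=0))
    return x

rng = np.random.default_rng(0)
agents = [v / np.linalg.norm(v) for v in rng.normal(size=(4, 3))]
x0 = np.array([0.0, 0.0, 1.0])
print(federated_rounds(x0, agents))  # consensus near the normalized mean
```

Multi-step local updates are the point emphasized in the abstract: each round communicates one tangent vector per agent while performing `k` gradient steps locally, which is how the outer-iteration (communication) count is reduced.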