Session: Sparsity Optimization
Chair: Jiachang Liu
Talk 1: Solving Sparse & High-Dimensional-Output Regression via Compression
Speaker: Guanyi Wang
Abstract: Multi-Output Regression (MOR) has been widely used in scientific data analysis for decision-making. Unlike traditional regression models, MOR aims to simultaneously predict multiple real-valued outputs given an input. However, the increasing dimensionality of the outputs poses significant challenges regarding interpretability and computational scalability for modern MOR applications. As a first step toward addressing these challenges, this paper proposes a Sparse & High-dimensional-Output REgression (SHORE) model that incorporates additional sparsity requirements to address output interpretability, and then designs a computationally efficient two-stage framework that handles SHORE with provable accuracy by compressing the outputs. Theoretically, we show that the proposed framework is computationally scalable while maintaining the same order of training loss and prediction loss before and after compression, under arbitrary or relatively weak sample-set conditions. Empirically, numerical results validate the theoretical findings, showcasing the efficiency and accuracy of the proposed framework.
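For intuition, the following is a minimal sketch of a compress-then-fit pipeline in the spirit of the two-stage framework described above: the high-dimensional outputs are sketched with a random Gaussian matrix and a row-sparse model is fit on the compressed targets by iterative hard thresholding. The dimensions, the choice of projection, and the thresholding step are assumptions made for this illustration, not the paper's exact SHORE algorithm.

```python
# Illustrative sketch (not the authors' exact SHORE method): compress the
# outputs, then fit a row-sparse regression model on the compressed targets.
import numpy as np

rng = np.random.default_rng(0)
n, p, d, m, k = 200, 50, 1000, 60, 5   # samples, features, output dim, compressed dim, sparsity

X = rng.normal(size=(n, p))
W_true = np.zeros((p, d))
W_true[rng.choice(p, k, replace=False), :] = rng.normal(size=(k, d))
Y = X @ W_true + 0.01 * rng.normal(size=(n, d))

# Stage 1: compress the high-dimensional outputs with a random Gaussian sketch.
G = rng.normal(size=(d, m)) / np.sqrt(m)
Y_c = Y @ G                               # n x m compressed targets

# Stage 2: fit a row-sparse model on the compressed targets via iterative
# hard thresholding (keep only the k rows of W with the largest norms).
W = np.zeros((p, m))
step = 1.0 / np.linalg.norm(X, 2) ** 2
for _ in range(500):
    W -= step * X.T @ (X @ W - Y_c)       # gradient step on the least-squares loss
    row_norms = np.linalg.norm(W, axis=1)
    W[np.argsort(row_norms)[:-k], :] = 0  # hard-threshold to k nonzero rows

# Prediction happens in the compressed space; decoding back to the original
# d-dimensional outputs is the part handled by the paper's framework.
Y_c_hat = X @ W
print("relative compressed-space error:",
      np.linalg.norm(Y_c_hat - Y_c) / np.linalg.norm(Y_c))
```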
Talk 2: Scalable First-order Method for Certifying Optimal k-Sparse GLMs
Speaker: Jiachang Liu
Abstract: This paper investigates the problem of certifying optimality for sparse generalized linear models (GLMs), where sparsity is enforced through an l0 cardinality constraint. While branch-and-bound (BnB) frameworks can certify optimality by pruning nodes using dual bounds, existing methods for computing these bounds are either computationally intensive or exhibit slow convergence, limiting their scalability to large-scale problems. To address this challenge, we propose a first-order proximal gradient algorithm designed to solve the perspective relaxation of the problem within a BnB framework. Specifically, we formulate the relaxed problem as a composite optimization problem and demonstrate that the proximal operator of the non-smooth component can be computed exactly in log-linear time complexity, eliminating the need to solve a computationally expensive second-order cone program. Furthermore, we introduce a simple restart strategy that enhances convergence speed while maintaining low per-iteration complexity. Extensive experiments on synthetic and real-world datasets show that our approach significantly accelerates dual bound computations and is highly effective in providing optimality certificates for large-scale problems. This is joint work with Andrea Lodi and Soroosh Shafiee. A preprint of this work can be found at https://arxiv.org/abs/2502.09502.
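As a rough illustration of the restart idea, the sketch below runs an accelerated proximal gradient method with a function-value restart on a generic composite problem. The exact log-linear-time proximal operator for the perspective relaxation is the paper's contribution and is not reproduced here; an l1 soft-thresholding prox is used as a stand-in purely to show the restart mechanics.

```python
# Minimal sketch: accelerated proximal gradient with a function-value restart
# for min_x f(x) + g(x), f smooth and g with an easy proximal operator.
import numpy as np

def prox_l1(v, t):
    """Placeholder prox (soft-thresholding); stands in for the perspective-relaxation prox."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def apg_with_restart(A, b, lam, iters=500):
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth gradient
    x = z = np.zeros(A.shape[1])
    t = 1.0
    obj = lambda w: 0.5 * np.sum((A @ w - b) ** 2) + lam * np.sum(np.abs(w))
    prev = obj(x)
    for _ in range(iters):
        grad = A.T @ (A @ z - b)
        x_new = prox_l1(z - grad / L, lam / L)
        cur = obj(x_new)
        if cur > prev:                     # restart: drop momentum when the objective increases
            t, z = 1.0, x
            continue
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)
        x, t, prev = x_new, t_new, cur
    return x

rng = np.random.default_rng(0)
A, b = rng.normal(size=(100, 300)), rng.normal(size=100)
x_hat = apg_with_restart(A, b, lam=0.1)
print("nonzeros in solution:", np.count_nonzero(np.round(x_hat, 6)))
```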
Talk 3: Sparse quadratic optimization via cardinality constraints
Speaker: Ademir Ribeiro
Abstract: We study the sparse regularized quadratic optimization problem in which the sparsity level is controlled by a cardinality constraint. Because this kind of constraint is neither continuous nor convex, the problem is very difficult to solve directly. Recently, several researchers have addressed problems of this type using a common strategy: a continuous relaxation reformulation of the problem. In this work we propose a different approach, in which the feasible set of the problem is decomposed into simpler sets, each of which satisfies the cardinality constraint. To overcome the combinatorial difficulty induced by this decomposition, we make use of a simple auxiliary problem of maximizing a concave piecewise quadratic function. The solution of this auxiliary problem, obtained by a subgradient method, is then used to find a solution of the original problem. Numerical experiments show that this strategy can be successful for a significant number of problems. Throughout this work we establish nonlinear optimization results that provide a rigorous mathematical analysis of the reformulation of the problem, the proposed method, and its convergence properties.
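As a small illustration of the auxiliary step mentioned above, the sketch below maximizes a concave piecewise quadratic function (a pointwise minimum of concave quadratics) with a projected supergradient method and a diminishing step size. The particular quadratics, the box projection, and the step-size rule are assumptions for this example rather than the paper's construction.

```python
# Illustrative sketch: projected supergradient ascent on a concave piecewise
# quadratic function f(x) = min_j q_j(x), with each q_j a concave quadratic.
import numpy as np

rng = np.random.default_rng(0)
n, pieces = 20, 5

# Each piece q_j(x) = -0.5 x'Q_j x + c_j'x with Q_j positive definite (so q_j is concave).
Qs = []
cs = rng.normal(size=(pieces, n))
for _ in range(pieces):
    M = rng.normal(size=(n, n))
    Qs.append(M @ M.T + np.eye(n))

def pieces_at(x):
    return np.array([-0.5 * x @ Q @ x + c @ x for Q, c in zip(Qs, cs)])

x = np.zeros(n)
best_val = pieces_at(x).min()
for it in range(1, 2001):
    vals = pieces_at(x)
    j = int(np.argmin(vals))               # active piece at the current point
    g = -Qs[j] @ x + cs[j]                 # a supergradient of f at x
    x = np.clip(x + g / it, -10.0, 10.0)   # ascent step with 1/it step size, box projection
    best_val = max(best_val, pieces_at(x).min())

print("best value of the piecewise quadratic:", best_val)
```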