Session: Learning and Optimization Interaction
Chair: Christopher Yeh
Talk 1: Learning Decision-Focused Uncertainty Representations for Robust and Risk-Constrained Optimization
Speaker: Christopher Yeh
Abstract: Machine learning can significantly improve performance for decision-making under uncertainty in a wide range of domains. However, ensuring robustness guarantees and satisfaction of risk constraints requires well-calibrated uncertainty estimates, which can be difficult to achieve with neural networks. Moreover, in high-dimensional settings, there may be many valid uncertainty estimates, each with its own performance profile; that is, not all uncertainty is equally valuable for downstream decision-making. To address this problem, we develop an end-to-end framework to learn uncertainty representations for robust and risk-constrained optimization in a way that is informed by the downstream decision-making loss, with robustness guarantees and risk constraint satisfaction provided by conformal methods. In addition, we propose to represent arbitrary convex uncertainty sets with partially input-convex neural networks, which are learned as part of our framework. Our approach consistently improves upon two-stage estimate-then-optimize baselines on concrete applications in energy storage arbitrage and portfolio optimization.
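The conformal calibration step that supplies the coverage guarantee can be sketched in a few lines. This is a minimal illustration of split conformal prediction on a scalar target, not the talk's actual framework: the placeholder zero-prediction model, the absolute-residual score, and the variable names are all assumptions for the sketch.

```python
import numpy as np

def conformal_radius(residuals, alpha=0.1):
    """Split-conformal quantile of nonconformity scores.

    Under exchangeability, a fresh residual falls within this radius
    with probability >= 1 - alpha, which is what gives the downstream
    robust/risk-constrained problem its guarantee.
    """
    n = len(residuals)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.sort(residuals)[min(k, n) - 1]

rng = np.random.default_rng(0)
y_cal = rng.normal(size=500)        # held-out calibration targets
y_hat = np.zeros(500)               # placeholder point predictions
r = conformal_radius(np.abs(y_cal - y_hat), alpha=0.1)

# A robust decision would then hedge against the worst case inside
# the calibrated uncertainty set [y_hat - r, y_hat + r].
y_test_hat = 0.0
worst_case_low, worst_case_high = y_test_hat - r, y_test_hat + r
```

In the talk's setting the interval above is replaced by a learned convex uncertainty set (parameterized by a partially input-convex network), but the calibration logic plays the same role.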
Talk 2: Learning Algorithm Hyperparameters for Fast Parametric Convex Optimization
Speaker: Rajiv Sambharya
Abstract: We introduce a machine-learning framework to learn the hyperparameter sequence of first-order methods (e.g., the step sizes in gradient descent) to quickly solve parametric convex optimization problems. Our computational architecture amounts to running fixed-point iterations where the hyperparameters are the same across all parametric instances, and it consists of two phases. In the first, step-varying phase, the hyperparameters vary across iterations; in the second, steady-state phase, the hyperparameters are constant across iterations. Our learned optimizer is flexible in that it can be evaluated for any number of iterations and is guaranteed to converge to an optimal solution. To train, we minimize the mean squared error to a ground-truth solution. In the case of gradient descent, the one-step optimal step size is the solution to a least-squares problem, and in the case of unconstrained quadratic minimization, we can compute the two- and three-step optimal solutions in closed form. In other cases, we backpropagate through the algorithm steps to minimize the training objective after a given number of steps. We show how to learn hyperparameters for several popular algorithms: gradient descent, proximal gradient descent, and two ADMM-based solvers, OSQP and SCS. We use a sample convergence bound to obtain generalization guarantees for the performance of our learned algorithm on unseen data, providing both lower and upper bounds. We showcase the effectiveness of our method with many examples, including ones from control, signal processing, and machine learning. Remarkably, our approach is highly data-efficient: we use only 10 problem instances to train the hyperparameters in all of our examples. [https://arxiv.org/pdf/2411.15717]
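The "one-step optimal step size as a least-squares problem" idea can be illustrated on a toy parametric quadratic family. The sketch below greedily picks each shared step size to minimize the mean squared error to the ground-truth solutions after that step; the shared-matrix setup, greedy (rather than jointly backpropagated) training, and all names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_steps = 5, 10, 20

# Parametric family: min_x 0.5 x^T Q x - b^T x, shared Q, varying b.
M = rng.normal(size=(d, d))
Q = M @ M.T + 0.1 * np.eye(d)               # symmetric positive definite
bs = rng.normal(size=(n_train, d))
x_stars = np.linalg.solve(Q, bs.T).T        # ground-truth solutions

def greedy_step_sizes(Q, x_stars, n_steps):
    """Pick each step size t_k shared across all instances by minimizing
    sum_i ||e_i - t * g_i||^2, a 1-D least-squares problem in t,
    where e_i = x_i - x_i^* and g_i = Q e_i is the gradient."""
    errs = -x_stars.copy()                  # x_0 = 0 for every instance
    ts = []
    for _ in range(n_steps):
        g = errs @ Q                        # gradients Q e_i (Q symmetric)
        t = np.sum(g * errs) / np.sum(g * g)
        errs = errs - t * g                 # shared-step gradient descent
        ts.append(t)
    return np.array(ts), errs

ts, errs = greedy_step_sizes(Q, x_stars, n_steps)
final_mse = np.mean(np.sum(errs**2, axis=1))
init_mse = np.mean(np.sum(x_stars**2, axis=1))
```

Because each t is chosen by exact 1-D minimization (t = 0 is always feasible), the training MSE is non-increasing across iterations; the paper's framework generalizes this to other algorithms by backpropagating through the unrolled steps.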
Talk 3: An Operator Learning Approach to Nonsmooth Optimal Control of Nonlinear Partial Differential Equations
Speaker: Tianyou Zeng
Abstract: Optimal control problems with nonsmooth objectives and nonlinear PDE constraints pose significant challenges to traditional numerical methods, due to their nonconvexity, nonsmoothness, and the high computational cost of iteratively solving the high-dimensional, ill-conditioned systems introduced by mesh-based discretization. We present an operator learning approach for these problems. We implement a primal-dual idea in the optimization context and solve the resulting PDEs with pre-trained neural solution operators. Compared with traditional algorithms and existing deep learning methods, our approach avoids re-solving linear systems or retraining networks across iterations. Additionally, the pre-trained neural networks can be readily applied without retraining even for different problem parameters, such as the desired state or the PDE source term. We demonstrate the effectiveness and efficiency of the proposed method through validation on benchmark nonsmooth optimal control problems with nonlinear PDE constraints.
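The primal-dual structure can be sketched with a primal-dual hybrid gradient (PDHG) loop in which the PDE solve is replaced by a cheap surrogate. Here a fixed random matrix stands in for the pre-trained neural solution operator, and the L1-regularized tracking objective, soft-thresholding prox, and step sizes are all illustrative assumptions rather than the talk's actual problem.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, lam = 30, 20, 0.1
K = rng.normal(size=(m, n)) / np.sqrt(m)   # stand-in for the solution operator
f = rng.normal(size=m)                     # desired state (illustrative)

# Toy nonsmooth problem: min_u 0.5 ||K u - f||^2 + lam ||u||_1
soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

L = np.linalg.norm(K, 2)                   # operator norm, for step sizes
tau = sigma = 0.9 / L                      # ensures tau * sigma * L^2 < 1
u = np.zeros(n); u_bar = u.copy(); p = np.zeros(m)
for _ in range(500):
    # Dual ascent: prox of the conjugate of the quadratic tracking term.
    p = (p + sigma * (K @ u_bar - f)) / (1 + sigma)
    # Primal descent: prox of the nonsmooth L1 term (soft-thresholding).
    u_new = soft(u - tau * (K.T @ p), tau * lam)
    u_bar = 2 * u_new - u                  # extrapolation step
    u = u_new
obj = 0.5 * np.linalg.norm(K @ u - f) ** 2 + lam * np.abs(u).sum()
```

The point of the talk's construction is that every application of K (and its adjoint) above would otherwise be a full PDE solve; a pre-trained neural operator makes those applications cheap and reusable across iterations and across problem parameters.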