Session: Robustness in learning from data
Chair: Jun-ya Gotoh
Cluster: Optimization For Data Science
Talk 1: The exploration-exploitation-robustness tradeoff for multi-period data-driven problems with learning
Speaker: Andrew Lim
Abstract: We study the tradeoff between exploration, exploitation, and robustness in the setting of a robust optimal stopping problem with learning. We show that a decision maker (DM) concerned about model uncertainty explores less, even though additional data reduces model uncertainty, because the "learning shock" when new data is collected increases the sensitivity of the expected reward to worst-case deviations from the nominal model. We also show that this conservatism can be corrected by introducing hedging instruments that offset the learning shocks. (With Thaisiri Watewai (Chulalongkorn University) and Anas Abdelhakmi (National University of Singapore)).
Talk 2: Adaptive smoothing and importance sampling for stochastic optimization of the conditional value-at-risk
Speaker: Anton Malandii
Abstract: We present a novel method for solving conditional value-at-risk (CVaR) optimization problems based on the dual representation of CVaR, defined as a supremum of an expectation over a risk envelope. The algorithm is based on a Bregman proximal point method and consists of alternating stochastic primal and dual stages. Every (inner) primal stage involves a subproblem solved by sampling from a probability distribution updated at each dual stage (outer iteration). The likelihood ratio of the dual probability distributions relative to the distribution underlying the original problem converges to the risk identifier of the solution's CVaR. Thus, the dual probabilities provide the algorithm with a built-in importance sampling mechanism that draws from the tail of the underlying distribution. Because only samples in the tail influence the CVaR, and samples outside the tail are drawn with decreasing probability, the algorithm outperforms other stochastic approximation methods. We prove convergence of the algorithm for convex objective functions and present numerical evidence for non-convex problems. Our numerical experiments target representative problems in machine learning and engineering design, focusing on support-vector classification, support-vector regression, and risk-averse large-scale topology optimization.
Talk 3: Convex vs. Nonconvex Regularization Terms---A Comparative Study of Regression B-Splines with Knot and Spline Selection
Speaker: Jun-ya Gotoh
Abstract: Robustness is important in learning from data. From the perspective of mathematical optimization, there are two contrasting approaches: robust optimization, which emphasizes worst-case samples, and robust regression, which reduces or ignores the contribution of unfavorable samples. The former tends to be realized by convex regularization, and the latter by non-convex regularization. On the other hand, l1 regularization, which is popular because it often leads to sparsity of the solution or associated quantities, lies somewhere in between, but is closer to robust optimization in that it preserves convexity. In this presentation, we compare convex and non-convex regularizations using knot selection and spline selection in multivariate B-spline regression as an example, and discuss the choice between the two regularization methods from a practical perspective.
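The contrast between the two regularization families can be seen already in their scalar proximal operators. The sketch below is a generic illustration (not the talk's B-spline knot-selection method): the convex l1 penalty yields soft-thresholding, which shrinks every coefficient toward zero, while a non-convex l0-type penalty yields hard-thresholding, which zeroes small coefficients but leaves large ones unbiased:

```python
import numpy as np

def soft_threshold(b, lam):
    # prox of lam * |.| (convex l1 penalty): shrink all coefficients by lam
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

def hard_threshold(b, lam):
    # prox of (lam**2 / 2) * ||.||_0 (non-convex penalty):
    # zero out small coefficients, keep large ones exactly as they are
    return np.where(np.abs(b) > lam, b, 0.0)

coefs = np.array([-3.0, -0.4, 0.2, 2.5])
lam = 0.5
print(soft_threshold(coefs, lam))  # [-2.5  0.   0.   2. ]  all shrunk
print(hard_threshold(coefs, lam))  # [-3.   0.   0.   2.5]  large ones intact
```

In a knot- or spline-selection context, the same dichotomy appears groupwise: the convex penalty biases retained basis functions, whereas the non-convex penalty selects without shrinking, at the cost of losing convexity.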