TB10 - Complexity and Practice in Continuous Optimization
May 12 2026 15:30 – 17:10
Location: Procter & Gamble (green)
Chaired by Rahbarnia Farhad
4 Presentations
Nonsmooth Exact Penalty Methods
Penalty methods are a well-known class of algorithms for constrained
optimization. They transform a constrained problem into a sequence of
unconstrained penalized problems in the hope that approximate solutions of
the latter converge to a solution of the former. If Lagrange multipliers
exist, exact penalty methods ensure that the penalty parameter needs to
increase only a finite number of times, but they are typically avoided in
smooth optimization because the penalized problems are not smooth. This has
led researchers to regard exact penalty methods as inconvenient to implement. Recent
advances in proximal methods have led to increasingly efficient solvers for
nonsmooth optimization. We study a general exact penalty algorithm and use it
to show that the exact L2-penalty method for equality-constrained
optimization can, in fact, be implemented efficiently by solving the
penalized problem using a proximal-type algorithm.
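As a rough illustration of the idea only (not the general algorithm studied in this talk), the sketch below runs an exact-penalty loop on an assumed toy problem with an affine equality constraint, for which the proximal operator of the penalty term has a closed form, so each nonsmooth penalized problem can be solved by a proximal-gradient iteration. The problem data, step size, tolerances, and update rule are all illustrative assumptions.

# Sketch of an exact L2 (unsquared, hence nonsmooth) penalty loop for
#   min f(x)  subject to  c(x) = 0,   with c(x) = a'x - b affine here,
# where each penalized problem  f(x) + sigma*|c(x)|  is minimized by a
# proximal-gradient iteration.  All data below are illustrative.
import numpy as np

a, b = np.array([1.0, 1.0]), 1.0             # constraint c(x) = a'x - b = 0

def grad_f(x):                                # f(x) = (x0 - 1)^2 + (x1 + 0.5)^2
    return 2.0 * (x - np.array([1.0, -0.5]))

def prox_abs_affine(v, tau):
    # prox of tau*|a'x - b| at v: shrink the residual a'v - b toward zero
    u = a @ v - b
    w = np.sign(u) * min(abs(u), tau * (a @ a))
    return v - a * (w / (a @ a))

x, sigma, t = np.zeros(2), 0.1, 0.5           # t = 1/L with L = 2 for this f
for _ in range(10):                           # outer loop: penalty updates
    for _ in range(100):                      # inner loop: proximal gradient
        x = prox_abs_affine(x - t * grad_f(x), t * sigma)
    if abs(a @ x - b) <= 1e-8:                # feasible: the penalty is exact, stop
        break
    sigma *= 10.0                             # infeasible: increase the penalty
print(x, abs(a @ x - b), sigma)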
Trust-region methods under the (L_0,L_1)-smoothness condition
Generalized smoothness assumptions have attracted growing attention in recent years, motivated in part by machine learning problems in which the objective is not \(L\)-smooth.
One of the most prominent assumptions of this type is \((L_0,L_1)\)-smoothness.
We study a trust-region algorithm for minimizing nonconvex deterministic \((L_0,L_1)\)-smooth objectives.
Within a simple framework, we recover the existing complexity bound \(O\big((f(x_0) - f_{\mathrm{low}})(L_0\epsilon^{-2} + L_1\epsilon^{-1})\big)\) \emph{without requiring estimates of} \(L_0\) \emph{and} \(L_1\).
To derive this result, we only impose a mild assumption on the model Hessian, even though the objective function need not be twice differentiable.
We also provide complexity bounds for (strongly) convex objectives under the same assumptions.
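For context, a frequently used form of this assumption (stated here for twice-differentiable functions, although the talk does not require twice differentiability) bounds the Hessian norm by an affine function of the gradient norm:
\[
\|\nabla^2 f(x)\| \le L_0 + L_1\,\|\nabla f(x)\| \qquad \text{for all } x,
\]
so that standard \(L\)-smoothness is recovered when \(L_1 = 0\).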
Regularized Quasi-Newton Methods for Unconstrained Optimization: Negative Curvature, Complexity and Fast Local Convergence
We present R2N, a globally convergent regularized quasi-Newton framework for large-scale unconstrained smooth optimization. At each iteration, R2N computes a step that approximately minimizes a quadratic model of the objective augmented by an adaptive quadratic regularization term, while guaranteeing Cauchy decrease. Although the objective is assumed only to be $C^1$, exact second derivatives may be used when available. R2N accommodates a broad range of step-computation strategies, from sparse factorization to matrix-free Krylov methods. When present, directions of nonpositive curvature are exploited.
Our analysis relaxes standard assumptions by allowing the model Hessians to be unbounded, provided they satisfy a mild growth condition. We derive a worst-case evaluation complexity bound that separates the progress made along positive- and nonpositive-curvature directions. In addition, by forcing the regularization parameter to vanish near isolated local minimizers, we obtain superlinear local convergence. R2N specializes naturally to nonlinear least-squares problems.
We provide a full Julia implementation in the JSOSolvers ecosystem. Numerical experiments on the CUTEst test set demonstrate its robustness and efficiency.
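As a schematic of the kind of iteration described above (not the R2N implementation in JSOSolvers, which is written in Julia), the following sketch applies an adaptively regularized Newton step to a standard test function; the objective, regularization update constants, and stopping rule are illustrative assumptions.

# Minimal regularized Newton sketch: at each iteration, minimize the quadratic
# model augmented by an adaptive quadratic regularization term sigma/2*||s||^2,
# i.e. solve (B + sigma*I) s = -g, then accept or reject the step.
import numpy as np

def f(x):    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
def grad(x): return np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
                              200 * (x[1] - x[0] ** 2)])
def hess(x): return np.array([[2 - 400 * (x[1] - 3 * x[0] ** 2), -400 * x[0]],
                              [-400 * x[0], 200.0]])

x, sigma = np.array([-1.2, 1.0]), 1.0
for _ in range(500):
    g, B = grad(x), hess(x)                       # exact Hessian as model Hessian
    if np.linalg.norm(g) <= 1e-8:
        break
    s = np.linalg.solve(B + sigma * np.eye(2), -g)    # regularized model step
    pred = -(g @ s + 0.5 * s @ B @ s)             # predicted decrease of the model
    if pred > 0 and f(x) - f(x + s) >= 1e-4 * pred:   # sufficient actual decrease
        x = x + s
        sigma = max(1e-8, sigma / 3)              # success: relax regularization
    else:
        sigma *= 10.0                             # failure: regularize more
print(x, f(x), np.linalg.norm(grad(x)))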
Quasi-Newton and Krylov Methods for the Solution of Trust-Region Subproblems
We study the computation of trust-region steps for unconstrained optimization by comparing the conjugate-gradient method (CG), quasi-Newton methods from the Broyden class, and other Krylov methods. For strictly convex quadratic functions, we derive a new relationship between CG and methods of the Broyden class that refines existing results and clarifies when these methods generate the same iterates and enjoy quadratic termination. We further extend this perspective to limited-memory BFGS (L-BFGS) and Krylov methods based on the Arnoldi process, namely DIOM, highlighting the role of memory in practical performance. To assess these theoretical results, we embed the solvers in a trust-region framework and report numerical experiments on positive-definite linear systems and unconstrained optimization problems. The results show that memory is a key algorithmic lever: L-BFGS and DIOM methods are consistently more robust than CG and often achieve comparable accuracy with fewer Hessian–vector products.
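For reference, here is an assumed minimal sketch of the truncated-CG (Steihaug-Toint) approach to the trust-region subproblem, one of the Krylov solvers involved in this comparison; the example data and tolerances are illustrative, and no limited-memory or DIOM variant is shown.

# Truncated CG for  min_s  g's + 0.5 s'Hs  subject to  ||s|| <= Delta.
import numpy as np

def to_boundary(s, d, Delta):
    # Positive root tau of ||s + tau*d|| = Delta (s is strictly inside).
    a, b, c = d @ d, 2 * (s @ d), s @ s - Delta ** 2
    return (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)

def steihaug_cg(H, g, Delta, tol=1e-10, max_iter=100):
    s = np.zeros_like(g)
    r, d = g.copy(), -g.copy()
    if np.linalg.norm(r) <= tol:
        return s
    for _ in range(max_iter):
        Hd = H @ d
        dHd = d @ Hd
        if dHd <= 0:                              # nonpositive curvature: hit the boundary
            return s + to_boundary(s, d, Delta) * d
        alpha = (r @ r) / dHd
        if np.linalg.norm(s + alpha * d) >= Delta:    # step leaves the region
            return s + to_boundary(s, d, Delta) * d
        s = s + alpha * d
        r_new = r + alpha * Hd
        if np.linalg.norm(r_new) <= tol:
            return s
        beta = (r_new @ r_new) / (r @ r)
        d, r = -r_new + beta * d, r_new
    return s

H = np.array([[4.0, 1.0], [1.0, 3.0]])            # illustrative model Hessian
g = np.array([1.0, 2.0])                          # illustrative gradient
print(steihaug_cg(H, g, Delta=0.5))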
