Optimization Days 2026

HEC Montréal, Québec, Canada

May 11–13, 2026

TA7 - Machine Learning and Optimization 2

May 12, 2026, 10:30 – 12:10

Location: Budapest (green)

Chaired by Mahmoud Abdelkader

4 Presentations

10:30 - 10:55

Boosting Revisited: Benchmarking and Advancing LP-Based Ensemble Methods

  • Fabian Akkerman, University of Twente
  • Julien Ferry, speaker, CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains, Polytechnique Montréal
  • Christian Artigues, LAAS-CNRS, Université de Toulouse
  • Emmanuel Hébrard, LAAS-CNRS, Université de Toulouse
  • Thibaut Vidal, CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains, Polytechnique Montréal

Despite their theoretical appeal, totally corrective boosting methods based on linear programming have received limited empirical attention. In this work, we conduct the first large-scale experimental study of six LP-based boosting formulations, including two novel methods, NM-Boost and QRLP-Boost, across 20 diverse datasets. We evaluate the use of both heuristic and optimal base learners within these formulations, and analyze not only accuracy, but also ensemble sparsity, margin distribution, anytime performance, and hyperparameter sensitivity.

We show that totally corrective methods can outperform or match state-of-the-art heuristics like XGBoost and LightGBM when using shallow trees, while producing significantly sparser ensembles. We further show that these methods can thin pre-trained ensembles without sacrificing performance, and we highlight both the strengths and limitations of using optimal decision trees in this context.
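The abstract does not reproduce the formulations themselves. As a point of reference, the classical soft-margin LPBoost program, one member of the totally corrective family such studies benchmark, can be solved directly once a pool of base learners is fixed. The sketch below is illustrative only (function name, toy data, and penalty value are not from the paper):

```python
import numpy as np
from scipy.optimize import linprog

def lpboost_weights(H, y, D=1.0):
    """Solve a soft-margin LPBoost LP for a fixed pool of base learners.

    H : (n, T) matrix of base-learner predictions in {-1, +1}
    y : (n,) labels in {-1, +1}
    D : slack penalty

    Variables are [a_1..a_T, xi_1..xi_n, rho]:
        min  -rho + D * sum(xi)
        s.t. y_i * (H[i] @ a) >= rho - xi_i   for all i
             sum(a) = 1,  a >= 0,  xi >= 0,  rho free
    """
    n, T = H.shape
    c = np.concatenate([np.zeros(T), D * np.ones(n), [-1.0]])
    # Margin constraints rewritten as: -y_i * H[i] @ a - xi_i + rho <= 0
    A_ub = np.hstack([-(y[:, None] * H), -np.eye(n), np.ones((n, 1))])
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.ones(T), np.zeros(n), [0.0]])[None, :]
    b_eq = [1.0]
    bounds = [(0, None)] * (T + n) + [(None, None)]  # rho is unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:T]  # ensemble weights over the base learners
```

Because the LP's objective rewards the minimum margin, weight tends to concentrate on few learners, which is consistent with the sparser ensembles reported above.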

10:55 - 11:20

Counterfactual Maps: What They Are and How to Find Them

  • Awa Khouna, speaker, Polytechnique Montréal
  • Julien Ferry, CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains, Polytechnique Montréal
  • Thibaut Vidal, CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains, Polytechnique Montréal

Counterfactual explanations are a central tool in interpretable machine learning, yet computing them exactly for complex models remains challenging. For tree ensembles, predictions are piecewise constant over a large collection of axis-aligned hyperrectangles, implying that an optimal counterfactual for a point corresponds to its projection onto the nearest rectangle with an alternative label under a chosen metric. Existing methods largely overlook this geometric structure, relying either on heuristics with no optimality guarantees or on mixed-integer programming formulations that do not scale to interactive use.

In this work, we revisit counterfactual generation through the lens of nearest-region search and introduce counterfactual maps, a global representation of recourse for tree ensembles. Leveraging the fact that any tree ensemble can be compressed into an equivalent partition of labeled hyperrectangles, we cast counterfactual search as the problem of identifying the generalized Voronoi cell associated with the nearest rectangle of an alternative label. This leads to an exact, amortized algorithm based on volumetric k-dimensional (KD) trees, which performs branch-and-bound nearest-region queries with explicit optimality certificates and sublinear average query time after a one-time preprocessing phase. Our experimental analyses on several real datasets drawn from high-stakes application domains show that this approach delivers globally optimal counterfactual explanations with millisecond-level latency, achieving query times that are orders of magnitude faster than existing exact, cold-start optimization methods.
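The abstract's key geometric fact, that an optimal counterfactual is the query point's projection onto the nearest hyperrectangle carrying the target label, admits a simple brute-force reference. The sketch below is illustrative (linear scan, L2 metric, invented names); the talk's contribution is accelerating exactly this query with volumetric KD-trees:

```python
import numpy as np

def nearest_counterfactual(x, rects, labels, target):
    """Brute-force nearest-rectangle counterfactual search.

    x      : query point, shape (d,)
    rects  : list of (lo, hi) array pairs defining axis-aligned boxes
    labels : predicted label of each box
    target : desired alternative label

    Returns the L2 projection of x onto the closest box whose label
    equals `target`, together with its distance.
    """
    best, best_d = None, np.inf
    for (lo, hi), lab in zip(rects, labels):
        if lab != target:
            continue
        proj = np.clip(x, lo, hi)        # closest point of the box to x
        d = np.linalg.norm(proj - x)
        if d < best_d:
            best, best_d = proj, d
    return best, best_d
```

The linear scan visits every box, so it is exact but slow; a branch-and-bound traversal of a spatial index prunes boxes whose bounding distance already exceeds the incumbent, which is what enables the millisecond latencies claimed above.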

11:20 - 11:45

Fairness in Hybrid Interpretable Models: A Thorough Characterization using Combinatorial Optimization and Rashomon Sets

  • Ziba Jabbar Zare, speaker, CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains, Concordia University
  • Ulrich Aivodji, Software and Information Technology Engineering, École de technologie supérieure
  • Julien Ferry, CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains, Polytechnique Montréal
  • Thibaut Vidal, CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains, Polytechnique Montréal

Hybrid interpretable models combine an interpretable component with a black box, enabling a direct trade-off between accuracy and transparency. However, they also raise specific fairness issues, as some users are classified by an interpretable model while others are deferred to a black box. We coin this phenomenon interpretability coverage fairness. To thoroughly characterize it, we adapt tools from predictive multiplicity and develop combinatorial optimization methods to construct Rashomon sets of near-optimal hybrid interpretable models. Our results reveal substantial disparities across protected groups.
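As a plain illustration of the coverage notion (not the talk's formal fairness definition), the per-group interpretability coverage of a fitted hybrid model, i.e. the fraction of each protected group handled by the interpretable component rather than deferred to the black box, can be measured as follows; the function name and inputs are hypothetical:

```python
import numpy as np

def coverage_by_group(covered, group):
    """Fraction of each protected group handled by the interpretable part.

    covered : boolean per-sample indicator (True = classified by the
              interpretable component, False = deferred to the black box)
    group   : per-sample protected-group membership
    """
    covered = np.asarray(covered, bool)
    group = np.asarray(group)
    return {g: covered[group == g].mean() for g in np.unique(group)}
```

A large gap between the resulting per-group fractions is the kind of disparity the abstract refers to.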

11:45 - 12:10

Distributionally Robust Predict-then-Optimize via Wasserstein Ambiguity Sets

  • Mahmoud Khaled Abdelkader, speaker, Concordia University
  • Chelvy Moe-Mackosso, PhD Student, Concordia University
  • Yassine Yaakoubi, Concordia University

We introduce Distributionally Robust SPO+ (DR-SPO+) to address distribution shifts in contextual optimization. While standard SPO+ assumes that training and test data follow the same distribution, our approach minimizes the worst-case expected loss over a Wasserstein ambiguity set. By leveraging the piecewise linear structure of SPO+, we derive a tractable convex reformulation for polyhedral decision spaces. We prove the Fisher consistency of this robust surrogate and provide finite-sample regret bounds. Numerical experiments on shortest-path and knapsack problems demonstrate that DR-SPO+ significantly enhances out-of-distribution performance and training stability compared to standard methods, ensuring reliable decision-making under evolving operational conditions.

Keywords: Contextual optimization, Distributional robustness, Decision-focused learning.
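For reference, the nominal (non-robust) SPO+ surrogate that DR-SPO+ robustifies has a closed form when the polyhedral decision space is represented by its vertices. The sketch below evaluates only that nominal loss; the Wasserstein worst-case layer from the talk is not shown, and the names are illustrative:

```python
import numpy as np

def spo_plus_loss(c_hat, c, W):
    """SPO+ surrogate loss for min_w c^T w over a finite decision set.

    c_hat : predicted cost vector
    c     : true cost vector
    W     : (m, d) array whose rows are the feasible decisions
            (e.g. the vertices of the polytope)

    l(c_hat; c) = max_w (c - 2 c_hat)^T w + 2 c_hat^T w*(c) - z*(c),
    where w*(c) minimizes c^T w and z*(c) is the optimal value.
    """
    W = np.asarray(W, float)
    true_vals = W @ c
    z_star = true_vals.min()            # optimal value under true costs
    w_star = W[true_vals.argmin()]      # optimal decision under true costs
    return (W @ (c - 2 * c_hat)).max() + 2 * c_hat @ w_star - z_star
```

The maximum over decisions is what makes the loss piecewise linear in c_hat, the structure the abstract exploits to reformulate the Wasserstein worst case as a tractable convex program. A correct prediction incurs zero loss: spo_plus_loss(c, c, W) == 0.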