Optimization Days 2024

HEC Montréal, Québec, Canada, 6 — 8 May 2024


WB9 - Artificial Intelligence

May 8, 2024 03:30 PM – 05:10 PM

Location: Vilnius (green)

Chaired by Youssouf Emine

4 Presentations

  • 03:30 PM - 03:55 PM

    Counterfactual Explanations for Decision-Focused Pipelines

    • Alexandre Forel, presenter, Polytechnique Montreal
    • Germain Vivier-Ardisson, Ecole Polytechnique
    • Axel Parmentier, CERMICS, École des Ponts
    • Thibaut Vidal, CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains, Polytechnique Montréal

    Combining deep neural networks with combinatorial optimization models has become increasingly popular in structured learning and contextual optimization, improving the state of the art on a variety of applications. Yet, these pipelines lack interpretability since they are made of two opaque layers: a highly non-linear prediction model and an optimization layer solved using a complex black-box solver. Our goal is to improve the transparency of such methods by providing counterfactual explanations. We show that, in high-dimensional feature spaces, the use of generative models such as variational autoencoders is necessary to avoid producing adversarial explanations. We obtain close and plausible explanations by tailoring the autoencoder to the task of explaining structured learning pipelines: in particular, we introduce a variant of the classic reconstruction loss during training and a plausibility regularization loss in the explanation problem. These provide the foundations of CF-OPT, a first-order optimization algorithm that can find counterfactual explanations for a broad class of structured learning architectures. Our numerical results show that both close and plausible explanations can be obtained for problems from the recent literature.
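    The core idea of first-order counterfactual search can be illustrated with a toy sketch. This is not CF-OPT itself: the linear "pipeline", loss shape, and step sizes below are assumptions chosen for a minimal, self-contained example of gradient-based search for a nearby input that flips a model's decision.

```python
import numpy as np

def counterfactual_search(x0, w, target_sign, lam=0.1, lr=0.05, steps=500):
    """Gradient descent toward a counterfactual: flip the sign of w @ x
    (hinge-style decision loss) while staying close to the original x0
    (quadratic closeness penalty, a stand-in for a plausibility term)."""
    x = x0.copy()
    for _ in range(steps):
        score = w @ x
        # decision gradient is active until the target side is reached with margin 1
        grad_decision = -target_sign * w if target_sign * score < 1.0 else 0.0
        grad_close = 2 * lam * (x - x0)  # pull back toward the factual input
        x -= lr * (grad_decision + grad_close)
    return x

rng = np.random.default_rng(0)
w = rng.normal(size=5)          # toy linear decision model (assumption)
x0 = rng.normal(size=5)         # factual input
x_cf = counterfactual_search(x0, w, target_sign=-np.sign(w @ x0))
```

    In CF-OPT the search additionally runs through a tailored variational autoencoder so that the explanation stays on the data manifold; the quadratic penalty here is only a crude stand-in for that plausibility regularization.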

  • 03:55 PM - 04:20 PM

    Stealing Decision Trees by Exploiting Optimal Counterfactual Explanations

    • Awa Khouna, presenter
    • Thibaut Vidal, Massachusetts Institute of Technology, USA

    The rise of Machine Learning as a Service (MLaaS) has made advanced machine learning tools widely accessible, challenging the balance between model explainability and security. As MLaaS simplifies the use of complex models, the need for explainability—to make model decisions transparent and trustworthy—intensifies the risk of model extraction attacks. This work addresses the pressing issue of such attacks, specifically targeting decision tree models through counterfactual explanations. Model extraction attacks pose a significant threat to the security and proprietary nature of machine learning models by enabling unauthorized parties to replicate a model with high fidelity without access to the original training data. Focusing on decision trees, a model type widely used for its simplicity and interpretability, we introduce two extraction attack algorithms: the Balls Attack (BA) and the Tree Reconstruction Attack (TRA). Importantly, the BA is model-agnostic and can be applied to any machine learning model, not just decision trees. These strategies exploit decision trees' vulnerabilities by leveraging counterfactual explanations to deduce the model's structure and parameters. Notably, the TRA achieves model replication with 100% fidelity while requiring significantly fewer queries than existing benchmarks. This efficiency not only exposes a significant vulnerability but also emphasizes the potential for rapid dissemination of proprietary model details. Our approach demonstrates the ability to replicate decision tree models accurately and efficiently, thereby underlining a critical security flaw in the deployment of these models.
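    The underlying leak can be illustrated in miniature: query access alone is enough to recover a decision boundary. The sketch below recovers a one-dimensional stump's threshold by binary search on black-box label queries; it is a toy illustration of model extraction, not the BA or TRA algorithms, which exploit counterfactual explanations to recover full trees with far fewer queries.

```python
def extract_threshold(predict, lo=0.0, hi=1.0, tol=1e-6):
    """Binary-search the split point of a 1-D decision stump using only
    black-box label queries (the endpoints must receive different labels)."""
    assert predict(lo) != predict(hi)
    left_label = predict(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if predict(mid) == left_label:
            lo = mid  # boundary is to the right of mid
        else:
            hi = mid  # boundary is at or left of mid
    return (lo + hi) / 2

secret_t = 0.37                       # hidden model parameter (illustrative)
stump = lambda x: int(x >= secret_t)  # the "deployed" black-box model
t_hat = extract_threshold(stump)      # recovered to within tol
```

    A counterfactual explanation short-circuits this search: instead of ~20 bisection queries per split, a single optimal counterfactual places the attacker directly at the decision boundary, which is one source of TRA's query efficiency.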

  • 04:20 PM - 04:45 PM

    Lossless Compression for Ensemble Models

    • Youssouf Emine, presenter, GERAD
    • Thibaut Vidal, Massachusetts Institute of Technology, USA

    Ensemble models are a cornerstone of machine learning, known for their enhanced accuracy and robustness through the aggregation of multiple base learners. However, their large memory and computational demands limit their deployment, especially in environments with limited resources. We introduce a novel approach to compress ensemble models, with the objective of minimizing their size while preserving or improving their predictive performance. We propose a series of compression schemes, each addressed by a distinct mixed integer optimization model. The first scheme aims to compress the ensemble without altering its predictions on a given dataset. The second scheme seeks to retain or improve the accuracy of the original ensemble on a specific dataset. Finally, for tree-based ensembles, we present a specialized algorithm that compresses the model across the entire feature space.
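    The objective of the first compression scheme can be sketched concretely: keep the smallest subset of base learners whose vote reproduces the full ensemble's predictions on a given dataset. The talk formulates this as a mixed-integer optimization model; the exhaustive search below is only an assumption-laden toy that makes the objective explicit for tiny ensembles.

```python
from itertools import combinations
import numpy as np

def compress(preds):
    """preds: (n_learners, n_samples) array of 0/1 base-learner predictions.
    Returns the smallest subset of learners whose majority vote matches the
    full ensemble's majority vote on every sample (brute force, illustrative)."""
    n = preds.shape[0]
    full_vote = (preds.mean(axis=0) >= 0.5).astype(int)
    for k in range(1, n + 1):            # smallest subsets first
        for subset in combinations(range(n), k):
            sub_vote = (preds[list(subset)].mean(axis=0) >= 0.5).astype(int)
            if np.array_equal(sub_vote, full_vote):
                return list(subset)
    return list(range(n))

# Toy ensemble of 5 learners evaluated on 4 samples (illustrative data)
preds = np.array([[1, 0, 1, 1],
                  [1, 0, 1, 0],
                  [1, 0, 1, 1],
                  [0, 1, 0, 1],
                  [1, 0, 1, 1]])
kept = compress(preds)  # learners whose vote already matches the ensemble
```

    A MIP formulation replaces the exponential enumeration with binary selection variables and per-sample agreement constraints, which is what makes the approach scale; the third, tree-specific scheme goes further and certifies agreement over the entire feature space rather than a finite dataset.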

  • 04:45 PM - 05:10 PM

    Decision Tree-based Fuzzy Rule Induction using Truth Table: A Multi-Class Failure Diagnosis Case Study

    • Abdelouadoud Kerarmi, presenter, Ai movement - International Artificial Intelligence Center of Morocco, Mohammed VI Polytechnic University, Rabat, Morocco.
    • Assia Kamal-Idrissi, Ai movement - International Artificial Intelligence Center of Morocco, Mohammed VI Polytechnic University, Rabat, Morocco.
    • Amal El Fallah Seghrouchni, Ai movement - International Artificial Intelligence Center of Morocco, Mohammed VI Polytechnic University, Rabat, Morocco.

    Fuzzy logic (FL) offers valuable advantages in multi-classification tasks, thanks to its capability to handle imprecise and uncertain data for nuanced multi-criteria decision aid. However, generating precise fuzzy sets demands significant effort and expertise. Moreover, the computational time of FL models escalates with the number of rules due to combinatorial complexity. Thus, good data description, knowledge extraction and representation, and rule induction are crucial for a robust FL model. We address these challenges by proposing an Integrated Truth Table in Decision Tree-based Fuzzy Logic (ITTDTFL) model that induces rules and membership functions using a machine learning decision tree, then reduces and refines them with a truth table, eliminating inclusions and redundancies, before they are used in the FL model. We compare ITTDTFL with state-of-the-art models, including FURIA, RIPPER, XGBoost, Fuzzy PROAFTN, and decision-tree-based FL. Experiments were conducted on real machine-failure datasets, evaluating performance on several factors, including the number of generated rules, accuracy, and computational time.
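    The rule-reduction step can be sketched in isolation: drop a rule when another rule with the same conclusion fires on a subset of its conditions (an "inclusion"), and drop exact duplicates. The rule encoding below is an assumption for illustration, not the ITTDTFL truth-table machinery itself.

```python
def reduce_rules(rules):
    """rules: list of (frozenset_of_conditions, conclusion) pairs.
    Removes rules subsumed by a more general rule with the same conclusion,
    and removes exact duplicates, preserving order."""
    kept = []
    for cond, concl in rules:
        subsumed = any(c2 <= cond and concl2 == concl
                       and (c2, concl2) != (cond, concl)
                       for c2, concl2 in rules)
        if subsumed or (cond, concl) in kept:
            continue
        kept.append((cond, concl))
    return kept

# Toy failure-diagnosis rules (illustrative condition/conclusion names)
r1 = (frozenset({"temp_high"}), "fail")
r2 = (frozenset({"temp_high", "vib_high"}), "fail")  # included in r1
r3 = (frozenset({"vib_low"}), "ok")
reduced = reduce_rules([r1, r2, r3])  # r2 is redundant given r1
```

    Shrinking the rule base this way directly targets the combinatorial blow-up mentioned above, since FL inference cost grows with the number of rules.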
