Journées de l'optimisation 2022

HEC Montréal, Québec, Canada, 16 — 18 mai 2022

Horaire Auteurs Mon horaire

TB3 - Constraint Programming and Machine Learning

17 mai 2022 15h30 – 17h10

Salle: Procter & Gamble (vert)

Présidée par Gilles Pesant

4 présentations

  • 15h30 - 15h55

    Learning Optimal Decision Trees using Constraint Programming

    • Hélène Verhaeghe, prés., Polytechnique Montréal
    • Siegfried Nijssen, UCLouvain
    • Gilles Pesant, Polytechnique Montréal
    • Claude-Guy Quimper, Université Laval
    • Pierre Schaus, UCLouvain, Belgique

    Decision trees are among the most popular classification models in machine learning. Traditionally, they are learned using greedy algorithms. However, such algorithms pose several disadvantages: it is difficult to limit the size of the decision trees while maintaining a good classification accuracy, and it is hard to impose additional constraints on the models that are learned. For these reasons, there has been a recent interest in exact and flexible algorithms for learning decision trees. In this paper, we introduce a new approach to learn decision trees using constraint programming. Compared to earlier approaches, we show that our approach obtains better performance, while still being sufficiently flexible to allow for the inclusion of constraints. Our approach builds on three key building blocks: (1) the use of AND/OR search, (2) the use of caching, (3) the use of the CoverSize global constraint proposed recently for the problem of itemset mining. This allows our constraint programming approach to deal in a much more efficient way with the decompositions in the learning problem.

  • 15h55 - 16h20

    Adding Long-Term Structure to Trained Sequence Generators using Constraint Programming

    • Virasone Manibod, prés., Polytechnique Montréal
    • Gilles Pesant, Polytechnique Montréal

    Sequence models in machine learning often struggle to exhibit long-term structure. We consider this
    problem at inference time in the context of enforcing on a sequence constraints that are not featured
    in the data set on which the model was trained. The difficulty lies in imposing previously-unseen
    structure while staying close to the data set the model was trained on. It is particularly hard for
    long-term structure, which requires balancing foresight over many yet-to-be generated tokens and
    the immediacy of next-token predictions from the sequence model. We address this problem by
    expressing such structure through a constraint programming model and by mixing the learned
    probabilities of the sequence model with the marginal probabilities of the constraint programming
    model before predicting the next token by sampling the resulting probability distribution. We
    experiment with our proposal in the context of computer-assisted music composition and show that
    we achieve the desired structure without straying too much from the learned style of the corpus.

  • 16h20 - 16h45

    Combining Reinforcement Learning and Constraint Programming for Sequence-Generation Tasks with Hard Constraints

    • Daphné Lafleur, prés., Polytechnique Montréal
    • Gilles Pesant, Polytechnique Montréal
    • Sarath Chandar, Université de Montréal

    While Machine Learning (ML) techniques are good at generating data similar to a dataset, they lack the capacity to enforce constraints. On the other hand, any solution to a Constraint Programming (CP) model satisfies its constraints but has no obligation to imitate a dataset. Yet, we sometimes need both. In this paper we borrow RL-Tuner, a Reinforcement Learning (RL) algorithm introduced to tune neural networks, as our enabling architecture to exploit the respective strengths of ML and CP. RL-Tuner maximizes the sum of a pretrained network's learned probabilities and of manually-tuned penalties for each violated constraint. We replace the latter with outputs of a CP model representing the marginal probabilities of each note and the number of constraint violations. As was the case for the original RL-Tuner, we apply our algorithm to music generation since it is a highly-constrained domain for which CP is especially suited. We show that combining ML and CP, as opposed to using them individually, allows the agent to reflect the pretrained network while taking into account constraints, leading to melodic lines that respect both the corpus' style and the music theory constraints.

  • 16h45 - 17h10

    Combining Constraint Programming and Deep Reinforcement Learning to Solve Classical AI Planning Tasks

    • Dana Chaillard, prés., Polytechnique Montréal
    • Gilles Pesant, Polytechnique Montréal

    While there has been significant progress in general AI planning, certain domains remain out of reach of current typical AI planning systems. Recently, the use of deep reinforcement learning has allowed to push that frontier further. We try to help further these RL agents with the use of constraint programming with belief propagation. Constraint programming is a very good method to uncover the structure of a problem and we use that information to guide a reinforcement learning agent on a classical AI planning task, Floortile. We experiment with three ways to provide guidance and report our initial findings.