Journées de l'optimisation 2022
HEC Montréal, Québec, Canada, May 16–18, 2022
TB3 - Constraint Programming and Machine Learning
May 17, 2022, 15:30 – 17:10
Room: Procter & Gamble (green)
Chaired by Gilles Pesant
15:30 – 15:55
Learning Optimal Decision Trees using Constraint Programming
Decision trees are among the most popular classification models in machine learning. Traditionally, they are learned using greedy algorithms. However, such algorithms have several drawbacks: it is difficult to limit the size of the decision trees while maintaining good classification accuracy, and it is hard to impose additional constraints on the models that are learned. For these reasons, there has been recent interest in exact and flexible algorithms for learning decision trees. In this paper, we introduce a new approach to learning decision trees using constraint programming. Compared to earlier approaches, we show that our approach obtains better performance while still being sufficiently flexible to allow for the inclusion of constraints. Our approach builds on three key building blocks: (1) AND/OR search, (2) caching, and (3) the CoverSize global constraint, recently proposed for the problem of itemset mining. This allows our constraint programming approach to handle the decompositions in the learning problem much more efficiently.
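The combination of AND/OR search and caching described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the toy dataset, the recursive formulation, and the use of plain memoization stand in for the paper's actual CP model with the CoverSize constraint.

```python
from functools import lru_cache

# Toy dataset of (binary feature tuple, class label) pairs -- illustrative only.
DATA = (
    ((1, 0, 1), 0), ((1, 1, 0), 1), ((0, 1, 1), 1),
    ((0, 0, 0), 0), ((1, 1, 1), 1), ((0, 0, 1), 0),
)
N_FEATURES = 3

def misclassified(examples):
    """Error of the best single-class leaf on this subset."""
    ones = sum(label for _, label in examples)
    return min(ones, len(examples) - ones)

@lru_cache(maxsize=None)
def best_error(examples, depth):
    """Minimum number of errors of any tree with at most `depth` levels of
    splits on `examples`. Caching on the example subset is what makes the
    AND/OR decomposition pay off: identical branches are solved only once."""
    if depth == 0 or len(examples) <= 1:
        return misclassified(examples)
    best = misclassified(examples)          # option: stop and place a leaf
    for f in range(N_FEATURES):             # OR node: choose a split feature
        left = tuple(e for e in examples if e[0][f] == 0)
        right = tuple(e for e in examples if e[0][f] == 1)
        if not left or not right:
            continue
        # AND node: both children must be solved; their errors add up
        best = min(best,
                   best_error(left, depth - 1) + best_error(right, depth - 1))
    return best

print(best_error(DATA, 2))  # → 0 (this toy dataset is separable at depth 1)
```

The OR nodes (feature choices) and AND nodes (the two child subproblems) form the AND/OR search space; the cache collapses repeated subproblems, which is the source of the efficiency gains the abstract refers to.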
15:55 – 16:20
Adding Long-Term Structure to Trained Sequence Generators using Constraint Programming
Sequence models in machine learning often struggle to exhibit long-term structure. We consider this problem at inference time, in the context of enforcing constraints on a sequence that are not featured in the data set on which the model was trained. The difficulty lies in imposing previously unseen structure while staying close to the data set the model was trained on. It is particularly hard for long-term structure, which requires balancing foresight over many yet-to-be-generated tokens against the immediacy of next-token predictions from the sequence model. We address this problem by expressing such structure through a constraint programming model and by mixing the learned probabilities of the sequence model with the marginal probabilities of the constraint programming model before predicting the next token by sampling the resulting probability distribution. We experiment with our proposal in the context of computer-assisted music composition and show that we achieve the desired structure without straying too much from the learned style of the corpus.
16:20 – 16:45
Combining Reinforcement Learning and Constraint Programming for Sequence-Generation Tasks with Hard Constraints
While Machine Learning (ML) techniques are good at generating data similar to a dataset, they lack the capacity to enforce constraints. On the other hand, any solution to a Constraint Programming (CP) model satisfies its constraints but has no obligation to imitate a dataset. Yet, we sometimes need both. In this paper, we borrow RL-Tuner, a Reinforcement Learning (RL) algorithm introduced to tune neural networks, as our enabling architecture to exploit the respective strengths of ML and CP. RL-Tuner maximizes the sum of a pretrained network's learned probabilities and of manually tuned penalties for each violated constraint. We replace the latter with outputs of a CP model representing the marginal probabilities of each note and the number of constraint violations. As was the case for the original RL-Tuner, we apply our algorithm to music generation, since it is a highly constrained domain for which CP is especially well suited. We show that combining ML and CP, as opposed to using them individually, allows the agent to reflect the pretrained network while taking constraints into account, leading to melodic lines that respect both the corpus's style and the music theory constraints.
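The modified reward can be sketched as below. This is a hedged illustration: the log-probability form, the weights, and the toy distributions are all assumptions for exposition, not the paper's exact formulation.

```python
import math

def reward(token, network_probs, cp_marginals, n_violations,
           style_weight=1.0, cp_weight=1.0, violation_penalty=10.0):
    """RL-Tuner-style reward for emitting `token`: imitate the corpus
    (pretrained-network term) while following the CP model (marginal term
    and a penalty per constraint violation). Weights are illustrative."""
    style_term = math.log(network_probs[token] + 1e-12)
    cp_term = math.log(cp_marginals[token] + 1e-12)
    return (style_weight * style_term
            + cp_weight * cp_term
            - violation_penalty * n_violations)

# Toy next-note distributions over 3 candidate notes.
network_probs = [0.6, 0.3, 0.1]   # what the pretrained network prefers
cp_marginals  = [0.1, 0.8, 0.1]   # what the CP model's marginals prefer
print(reward(1, network_probs, cp_marginals, n_violations=0))
```

Replacing hand-tuned penalties with CP marginals means the reward signal adapts to the current partial melody: the CP model recomputes which notes remain consistent with the music theory constraints at every step.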
16:45 – 17:10
Combining Constraint Programming and Deep Reinforcement Learning to Solve Classical AI Planning Tasks
While there has been significant progress in general AI planning, certain domains remain out of reach of current typical AI planning systems. Recently, the use of deep reinforcement learning has pushed that frontier further. We aim to assist these RL agents using constraint programming with belief propagation. Constraint programming is very good at uncovering the structure of a problem, and we use that information to guide a reinforcement learning agent on a classical AI planning task, Floortile. We experiment with three ways of providing guidance and report our initial findings.