15th EUROPT Workshop on Advances in Continuous Optimization

Montréal, Canada, July 12 — 14, 2017

Classification Problems

Jul 14, 2017 10:30 AM – 11:45 AM

Location: TD Assurance Meloche Monnex

Chaired by Giulia Zarpellon

3 Presentations

  • 10:30 AM - 10:55 AM

    Multimodel selection for classification problems

    • Alexander Aduenko, presenter, Moscow Institute of Physics and Technology
    • Vadim Strijov, Federal Research Center «Computer Science and Control» of Russian Academy of Sciences

    In this article we consider the problem of multimodel selection for classification problems. Multimodels are used when a sample cannot be described by a single model, which happens when the features' weights depend on the features' values. In such a case a single generalized linear model cannot describe the relation between the features and the target variable. Although a multimodel is an interpretable generalization of the single-model case, it can contain a large number of similar models, which leads to poor forecast quality and a lack of interpretability. Several multimodel pruning algorithms are constructed based on the suggested method for statistical model comparison. The method is based on a newly introduced similarity function for the posterior distributions of the models' parameters. Properties of this function are considered, and the asymptotic distribution of its values for coincident generalized linear models is obtained. The notion of an adequate multimodel is introduced, in which all the constituent models are pairwise statistically distinguishable. Upper and lower bounds on the maximum number of models in an adequate multimodel are obtained. A diagonal maximum-evidence estimate of the covariance matrix of the features' weights is used for feature selection. Asymptotic degeneracy of the non-diagonal estimate of this matrix is proved. A method is suggested to detect and handle multicollinear features. Several computational experiments show a significant improvement in classification quality on real datasets and a substantial reduction in multimodel size.

  • 10:55 AM - 11:20 AM

    A multi-objective approach for binary classification with imbalanced data

    • Alessandro Galligari, presenter, Università degli Studi di Firenze
    • Guido Cocchi, Università degli Studi di Firenze
    • Marco Sciandrone, Università degli Studi di Firenze

    The aim of this work is to address the problem of imbalanced data in binary classification by introducing a novel multi-objective approach. The examples of each class are grouped together, forming two different objectives. This approach is non-parametric, since the relative weight between the classes does not have to be validated, as is done in single-objective methods that use the overall sum of the example errors. Several formulations of the multi-objective problem have been defined in order to obtain a solution on the Pareto frontier that guarantees robustness and a suitable generalization capability from the machine learning point of view. The results of computational experiments will be presented and discussed.

  • 11:20 AM - 11:45 AM

    Learning a classification of Mixed-Integer Quadratic Programming problems

    • Giulia Zarpellon, presenter, Polytechnique Montreal
    • Andrea Lodi, Polytechnique Montreal
    • Pierre Bonami, CPLEX Optimization, IBM Spain

    Within state-of-the-art solvers such as IBM-CPLEX, the ability to solve both convex and nonconvex Mixed-Integer Quadratic Programming (MIQP) problems to proven optimality goes back only a few years, and still involves some unclear algorithmic decisions. Among them, we are interested in understanding whether, when solving an MIQP problem, it is favorable to linearize its quadratic part or not. Our approach employs Machine Learning techniques to learn a classifier that predicts, for a given MIQP instance, the most suitable resolution process within CPLEX’s framework. We also aim to gain theoretical insights into the instance features that drive this algorithmic discrimination.
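
As a minimal illustration of what "linearizing the quadratic part" of an MIQP can mean in the last abstract, the sketch below checks a standard textbook linearization of a product of two binary variables: the term y = x_i * x_j is replaced by the linear constraints y <= x_i, y <= x_j, y >= x_i + x_j - 1 and y >= 0. The abstract does not say which reformulation CPLEX applies, so this choice is an assumption made for illustration only, and the function names are hypothetical.

```python
from itertools import product


def linearized_bounds(xi, xj):
    """Bounds on y implied by the linear constraints (illustrative helper).

    Constraints assumed: y <= xi, y <= xj, y >= xi + xj - 1, y >= 0.
    """
    lower = max(0, xi + xj - 1)
    upper = min(xi, xj)
    return lower, upper


def check_linearization():
    """Verify that, for binary xi and xj, the constraints force y == xi * xj."""
    for xi, xj in product((0, 1), repeat=2):
        lower, upper = linearized_bounds(xi, xj)
        # The feasible interval for y must collapse to the single point xi * xj.
        if not (lower == upper == xi * xj):
            return False
    return True


if __name__ == "__main__":
    print(check_linearization())
```

For binary inputs the feasible interval for y collapses to the product itself, so the reformulation is exact. The price is one extra variable and three extra constraints per bilinear term, which is one reason the choice between linearizing and not linearizing is not obvious in advance.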