WB9 - Data mining
May 13 2026 11:05 – 12:45
Location: Luc-Poirier (green)
Chaired by Mael Charpentier
3 Presentations
Counterfactual explanations for clustering medoids
As the demand for interpretable machine learning grows, understanding the structure of clustering results remains a challenging task. We propose a novel post-clustering method that generates counterfactual explanations for cluster medoids by answering the question: What is the smallest change to a data point that would make it the representative of its cluster? We formulate this problem as a convex optimization model, specifically a second-order cone program, which guarantees globally optimal solutions and can be efficiently solved using off-the-shelf solvers. Beyond generating individual explanations, our approach provides new insights into cluster structure. By analyzing the minimal counterfactual adjustments, we identify which features constrain representativeness, rather than merely describing which features characterize a cluster. This perspective reveals how global distance relationships shape cluster prototypes and highlights differences in how representativeness is structured across datasets. Experiments on illustrative datasets show that the proposed method can capture structural properties that are not reflected by standard distance-based measures.
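The abstract does not state the optimization model, so the following is only a sketch of one way the question could be posed as a second-order cone program. It assumes Euclidean distance, a fixed cluster with index set C and points x_j, a medoid defined as the member with the smallest sum of distances to the other members, and a point x_p that is replaced by its counterfactual z. All symbols are illustrative assumptions, not the authors' notation.

```latex
% Hypothetical formulation: smallest change to x_p that makes it the medoid of C.
\begin{aligned}
\min_{z \in \mathbb{R}^d} \quad & \lVert z - x_p \rVert_2 \\
\text{s.t.} \quad & \sum_{j \in C \setminus \{p,\, i\}} \lVert z - x_j \rVert_2
  \;\le\; \sum_{j \in C \setminus \{p,\, i\}} \lVert x_i - x_j \rVert_2
  \qquad \forall\, i \in C \setminus \{p\}
\end{aligned}
```

Under these assumptions, the distance between z and x_i appears on both sides of the medoid condition for competitor i and cancels, so each right-hand side is a constant. Every constraint then bounds a sum of Euclidean norms by a constant, which is representable with second-order cones.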
K-means with Controlled Center Updates: Rethinking the Centroid Update Rule
The k-means algorithm is widely used for clustering but is sensitive to initialization and often converges to poor local minima. We propose a k-means variant with a novel center-update rule. Experiments on benchmark datasets show improved solution quality over standard k-means, with gains of up to 15% at equal computational cost.
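The abstract does not describe the update rule itself. As a purely illustrative sketch of what controlling the center update can mean, the code below runs Lloyd-style k-means in which each center moves only a fraction `step` of the way toward the mean of its assigned points. The function name `kmeans_damped` and the `step` parameter are hypothetical and are not taken from the talk; `step=1.0` recovers the standard centroid update.

```python
import numpy as np

def kmeans_damped(X, k, step=1.0, n_iter=100, seed=0):
    """Lloyd-style k-means with a damped center update (illustrative only).

    step=1.0 is the standard update; 0 < step < 1 moves each center only
    part of the way toward the mean of its assigned points.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers as k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()

    for _ in range(n_iter):
        # Assignment step: squared Euclidean distance to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        # Update step: move each center toward its cluster mean.
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave empty clusters where they are
                new_centers[j] = centers[j] + step * (members.mean(axis=0) - centers[j])

        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Final assignment and within-cluster sum of squares for the returned centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    inertia = d2[np.arange(len(X)), labels].sum()
    return centers, labels, inertia
```

Comparing the returned `inertia` for different `step` values at a fixed iteration budget is one simple way to explore how the update rule affects solution quality.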
Optimizing Sensor Deployment with Gaussian Process Regression
We present a framework for optimizing ecological sensor deployment that uses Gaussian process regression to quantify uncertainty from sparse data. Sensor placement is guided by information-theoretic metrics, while values at unmonitored locations are extrapolated using a streaming Gaussian process. We outline how this framework can be extended to adapt sensor placement for early change detection.
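The abstract does not name the kernel or the placement criterion, so the sketch below makes its own assumptions: a squared-exponential kernel with fixed hyperparameters, and a greedy rule that places each new sensor at the candidate location with the largest posterior variance (for a Gaussian, the highest-entropy location). It is a batch computation, not the streaming Gaussian process mentioned in the talk, and the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X_train, y_train, X_query, noise=1e-2, length_scale=1.0, variance=1.0):
    """Posterior mean and variance of a zero-mean GP at the query points."""
    K = rbf_kernel(X_train, X_train, length_scale, variance) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query, length_scale, variance)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = variance - (v ** 2).sum(axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

def greedy_placement(candidates, n_sensors, noise=1e-2, length_scale=1.0, variance=1.0):
    """Pick sensor sites one at a time at the point of maximum posterior variance."""
    chosen = []
    for _ in range(n_sensors):
        if chosen:
            # The posterior variance does not depend on observed values,
            # so placeholder zeros are sufficient for planning.
            _, var = gp_posterior(candidates[chosen], np.zeros(len(chosen)),
                                  candidates, noise, length_scale, variance)
            var[chosen] = -np.inf  # never pick the same site twice
        else:
            var = np.full(len(candidates), variance)  # prior variance everywhere
        chosen.append(int(var.argmax()))
    return chosen
```

Because the posterior variance of a Gaussian process depends only on where measurements are taken, this kind of placement can be planned before any data are collected. Adapting placement to detect change would require a criterion that also uses the observed values.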
