L. Amorosi, J. Puerto Albandoz, C. Valverde Martín
Hierarchical clustering is a statistical technique for studying groups (clusters) within a dataset by building a hierarchy of clusters, represented as a rooted tree (dendrogram). The leaves correspond to data points, and each internal node represents the cluster formed by its descendant leaves. Among hierarchical clustering methods, agglomerative ones rely on greedy procedures that return nested partitions: each level merges two clusters of the level below according to a local criterion. In this talk, we focus on a mathematical programming formulation that accommodates the single and complete linkage criteria. Through computational experiments, we compare the dendrograms obtained by solving the formulations exactly with those produced by the greedy approach. We also present a scalable matheuristic algorithm that generates higher-quality dendrograms than the greedy approach, even for large datasets.
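For readers unfamiliar with the greedy baseline mentioned in the abstract, the following is a minimal sketch of textbook agglomerative clustering with single or complete linkage. It is not the authors' mathematical programming formulation or their matheuristic; the function name `agglomerate`, the merge-list output format, and the Euclidean distance are assumptions made for illustration only.

```python
import math
from itertools import combinations


def agglomerate(points, linkage="single"):
    """Greedy agglomerative clustering (illustrative sketch, not the talk's method).

    points  : list of equal-length coordinate tuples
    linkage : "single" (closest pair) or "complete" (farthest pair)
    Returns a list of merges (id_a, id_b, height). Leaves have ids
    0..n-1; the cluster created by the k-th merge gets id n+k.
    """
    if linkage not in ("single", "complete"):
        raise ValueError("linkage must be 'single' or 'complete'")
    # Single linkage uses the minimum pairwise distance, complete the maximum.
    agg = min if linkage == "single" else max

    def cluster_dist(a, b):
        return agg(math.dist(points[i], points[j]) for i in a for j in b)

    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        # Local (greedy) criterion: merge the two currently closest clusters.
        a, b = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: cluster_dist(clusters[pair[0]], clusters[pair[1]]),
        )
        height = cluster_dist(clusters[a], clusters[b])
        merges.append((a, b, height))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


if __name__ == "__main__":
    data = [(0.0,), (1.0,), (3.0,), (7.0,)]
    for rule in ("single", "complete"):
        print(rule, agglomerate(data, rule))
```

Each merge is chosen by looking only at the current partition, which is why the resulting dendrogram may differ from one obtained by optimizing over the whole hierarchy at once. On the toy data above, both rules produce the same merge order and differ only in the merge heights.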
Keywords: Data science; Hierarchical clustering; Mathematical programming
Scheduled: June 11, 2025 10:30 AM
Location: Mr 2