L. Amorosi, J. Puerto Albandoz, C. Valverde Martín
Hierarchical clustering is a statistical technique for studying groups (clusters) within a dataset by building a hierarchy of clusters, represented as a rooted tree (dendrogram). The leaves correspond to data points, and each internal node represents the cluster formed by its descendant leaves. Among hierarchical clustering methods, agglomerative ones rely on greedy procedures that return nested partitions: each level merges two clusters of the partition below it according to a local criterion. In this talk, we focus on a mathematical programming formulation that incorporates the single and complete linkage procedures. Through computational experiments, we compare the dendrograms obtained by solving the formulations exactly with those produced by the greedy approach. Additionally, we present a scalable matheuristic algorithm that generates higher-quality dendrograms than the greedy approach, even for large datasets.
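For readers unfamiliar with the greedy baseline mentioned above, the following is a minimal sketch of agglomerative clustering with single and complete linkage. It illustrates only the greedy procedure, not the mathematical programming formulation or the matheuristic presented in the talk. The function name `agglomerate`, the Euclidean distance, and the sample points are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def agglomerate(points, linkage="single"):
    """Greedy agglomerative clustering (illustrative sketch).

    Returns the list of merges as (cluster_a, cluster_b, height), where
    clusters are frozensets of point indices and height is the linkage
    distance at which the two clusters were joined.
    """
    # Single linkage uses the closest pair of points between two clusters,
    # complete linkage uses the farthest pair.
    aggregate = {"single": min, "complete": max}[linkage]

    def cluster_distance(a, b):
        return aggregate(dist(points[i], points[j]) for i in a for j in b)

    # Start from the finest partition: one cluster per data point.
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Local (greedy) criterion: merge the two closest clusters.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Hypothetical sample data: four collinear points.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
single = agglomerate(points, "single")
complete = agglomerate(points, "complete")
```

With these four points the merge heights are 1, 2, 4 under single linkage and 1, 3, 7 under complete linkage. Each sequence of merges defines one dendrogram, which is the kind of greedy output the talk compares against exact solutions.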
Keywords: Data science; Hierarchical clustering; Mathematical programming
Scheduled
Location
June 11, 2025, 10:30
MR 2