M. Comas Cufí, P. De La Lama, J. Saperas Riera
Advances in compositional data (CoDa) analysis based on the Aitchison geometry make it possible to treat compositions as elements of a metric space. Compositions with D parts reside in the simplex, a (D-1)-dimensional space. This study compares several norms on the simplex, focusing on the L1-norm, and applies convex clustering algorithms adapted to CoDa. We evaluate different L1-based penalization terms: the L1-olr norms (derived from two bases, one built from Principal Components and one from a Sequential Binary Partition, the default in CoDaPack), the L1-clr norm, and the L1-CoDa norm, using a dataset on the milk composition of 24 mammals. Results show that the L1-olr norms lack subcompositional coherence and basis independence but are suitable for agglomerative clustering, whereas the L1-clr norm preserves those properties but is less effective for agglomerative clustering. The L1-CoDa norm maintains the compositional properties and supports agglomeration, which makes the resulting clusters easier to interpret. These findings highlight the importance of choosing norms tailored to CoDa when clustering.
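As an illustration of one of the penalization terms mentioned above, the following is a minimal sketch of the L1-clr norm, assuming the standard centered log-ratio (clr) definition clr(x) = log(x) - mean(log(x)). The function names and the example vectors are hypothetical and are not taken from the study or from CoDaPack; the L1-olr and L1-CoDa norms are not reproduced here.

```python
import numpy as np

def closure(x):
    """Rescale a vector of positive parts so that it sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def clr(x):
    """Centered log-ratio coordinates: log parts minus their mean."""
    logx = np.log(closure(x))
    return logx - logx.mean()

def l1_clr_norm(x):
    """L1 norm of the clr coordinates of a composition."""
    return np.abs(clr(x)).sum()

def l1_clr_distance(x, y):
    """L1 distance between two compositions in clr coordinates."""
    return np.abs(clr(x) - clr(y)).sum()

# Made-up 3-part compositions (illustration only, not the milk dataset).
a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]  # same composition as `a`, different total
c = [3.0, 2.0, 1.0]

print(l1_clr_norm(a), l1_clr_norm(b))  # equal: the norm ignores the total
print(l1_clr_distance(a, c))
```

Because the clr coordinates depend only on ratios between parts, rescaling a composition leaves the norm unchanged, which is the scale invariance expected of any norm on the simplex.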
Keywords: compositional data, L1-norm, clustering
Scheduled
Posters session I
June 12, 2025 7:00 PM
Foyer principal (coffee break)