J. Saperas Riera, G. Mateu Figueras, J. A. Martín Fernández
The study of norms is key in compositional data analysis (CoDA). We explore the advantages of the L1-CoDa norm, the L1 norm induced on the compositional quotient space, over the L1-clr norm, the L1 norm restricted to the clr subspace. Our objective is to show how the L1-CoDa norm better captures compositional geometry, yielding more interpretable results in cluster analysis and penalized regression.
In convex clustering (clusterpath), the L1-CoDa norm forms agglomerative clusters, while L1-clr does not. In LASSO regression, L1-clr introduces spurious correlations, distorting variable selection, whereas L1-CoDa avoids these artefacts by selecting balances.
These findings highlight the importance of choosing an appropriate norm in compositional data analysis. The L1-CoDa norm ensures a natural and robust approach within Aitchison geometry, enhancing interpretability in clustering, regression, and statistical modelling.
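As a rough illustration of the two norms compared above, the sketch below computes both for a single composition. It rests on an assumption not spelled out in this abstract: that the L1-CoDa norm is the usual quotient-space construction, i.e. the minimum of the L1 norm of log(x) - c over all constants c, which is attained when c is the median of log(x). The function names l1_clr_norm and l1_coda_norm are hypothetical and chosen only for this example.

```python
import numpy as np


def l1_clr_norm(x):
    """L1 norm of the clr coordinates: log(x) centred at its arithmetic mean."""
    logx = np.log(np.asarray(x, dtype=float))
    return np.abs(logx - logx.mean()).sum()


def l1_coda_norm(x):
    """Assumed quotient-space L1 norm: min over c of sum|log(x) - c|.

    The minimum of a sum of absolute deviations is attained at the median,
    so log(x) is centred at its median instead of its mean.
    """
    logx = np.log(np.asarray(x, dtype=float))
    return np.abs(logx - np.median(logx)).sum()


# A composition with one dominant part and three equal parts.
x = np.array([0.70, 0.10, 0.10, 0.10])

print("L1-clr norm: ", l1_clr_norm(x))
print("L1-CoDa norm:", l1_coda_norm(x))
```

Under this assumption both quantities are invariant to rescaling x, and the median-centred value can never exceed the mean-centred one, because the median minimizes the sum of absolute deviations.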
Keywords: compositional data, L1-CoDa norm, cluster analysis, penalized regression, LASSO, Aitchison geometry.
Multivariate Analysis
June 12, 2025 7:00 PM
MR 1