M. D. Jiménez Gamero, M. R. Sillero Denamiel
In some practical settings, the population is divided into a large number k of subpopulations (by country, city, age group, etc.). In such cases, one may wish to test the equality of the k subpopulations, a problem that is of interest in its own right and also as a preliminary step in some Machine Learning approaches, such as classification problems. To this end, an unbiased estimator of the Gini covariance is taken as the test statistic. The asymptotic distribution of the test statistic is established under the null hypothesis as well as under alternatives, assuming that k is large and the sample sizes are small to moderate. Specifically, it is shown that the test statistic is asymptotically distribution-free under the null hypothesis, which avoids the use of complicated resampling procedures. The finite-sample performance of the test based on the asymptotic null distribution is studied via simulation and compared with that of existing methods. An application to a real air-quality dataset is presented.
Keywords: k-sample problem, energy distance, Gini correlation, asymptotic power, consistency
Scheduled
AMC4 Prediction and Classification
June 11, 2025 10:30 AM
MR 1