S. Sabroso-Lasa, L. M. Esteban Escaño, N. Malats, J. T. Alcalá Nalvaiz
Advances in computational methods and data collection technologies have led to the generation of increasingly complex datasets. As a result, managing missing data has become a critical challenge.
Imputing missing values is crucial for ensuring the reliability of statistical inferences and predictive models. However, how the proportion and distribution of missing data influence model performance, and specifically the Youden Index (J), remains an issue that requires further investigation.
To address this gap, we conducted simulations under realistic conditions with various imputation methods to evaluate predictive ability across levels of missing data ranging from 5% to 75%.
Our findings indicate that most diagnostic metrics decrease by 20–30% compared to models with complete data. This underscores the significant impact of missing data on diagnostic metrics, which can lead to unreliable Youden Index values and cutoff points.
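As an illustration of the quantities involved, the following minimal Python sketch computes the Youden Index, J = sensitivity + specificity - 1, and its cutoff for a single simulated marker, then repeats the calculation after deleting and imputing values at several missingness rates. It is not the simulation design used in this work: the completely-at-random deletion, the mean imputation, the Gaussian marker, and the names `youden_index`, `make_mcar`, and `mean_impute` are all assumptions chosen for brevity.

```python
import random

def youden_index(scores, labels):
    """Return (J, cutoff) maximising sensitivity + specificity - 1.

    A case is called positive when score >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_j, best_cut = -1.0, None
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_j, best_cut = j, cut
    return best_j, best_cut

def make_mcar(values, rate, rng):
    """Assumed mechanism: delete each value independently with probability `rate`."""
    return [None if rng.random() < rate else v for v in values]

def mean_impute(values):
    """Assumed imputation: replace missing entries with the observed mean.

    Requires at least one observed value.
    """
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

# Hypothetical data: 200 cases and 200 controls with a shifted Gaussian marker.
rng = random.Random(0)
labels = [1] * 200 + [0] * 200
marker = ([rng.gauss(1.5, 1.0) for _ in range(200)]
          + [rng.gauss(0.0, 1.0) for _ in range(200)])

j_complete, cut_complete = youden_index(marker, labels)
print(f"complete data: J={j_complete:.3f}, cutoff={cut_complete:.3f}")

results = {}
for rate in (0.05, 0.25, 0.50, 0.75):
    imputed = mean_impute(make_mcar(marker, rate, rng))
    results[rate] = youden_index(imputed, labels)
    j, cut = results[rate]
    print(f"{rate:.0%} missing: J={j:.3f}, cutoff={cut:.3f}")
```

Comparing the printed J and cutoff at each rate against the complete-data values gives a rough picture of how imputation can shift both quantities. A realistic study would also vary the missingness mechanism and the imputation method.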
Keywords: Missing data, Imputation, Youden Index, Predictive Models
Scheduled
Software I
June 10, 2025 3:30 PM
Sala VIP Jaume Morera i Galícia