P. García Arce, R. Naveiro Flores, D. Ríos Insua
Machine learning systems are increasingly exposed to adversarial attacks. While much of the research in adversarial machine learning has focused on evasion attacks against classification models in classical setups, the susceptibility of Bayesian regression models to such attacks remains underexplored. This paper introduces a general methodology for designing optimal evasion attacks against such models. We investigate two adversarial objectives: perturbing specific point predictions and altering the entire posterior predictive distribution. For both scenarios, we propose gradient-based attacks that are applicable even when the posterior predictive distribution lacks a closed-form solution and is accessible only through Markov Chain Monte Carlo sampling.
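For intuition only, the following is a minimal sketch of the first objective (perturbing a point prediction) for a Bayesian linear regression whose posterior is represented by MCMC weight samples. It is not the paper's implementation: the names (posterior_predictive_mean, evasion_attack), the squared-error attack loss, the Adam optimizer, and the L-infinity budget epsilon are all illustrative assumptions.

```python
import torch

def posterior_predictive_mean(x, weight_samples, bias_samples):
    # Monte Carlo estimate of E[y | x]: average the model's prediction
    # over posterior samples of the weights and biases (from MCMC).
    preds = x @ weight_samples.T + bias_samples  # shape: (n_samples,)
    return preds.mean()

def evasion_attack(x, weight_samples, bias_samples, target,
                   epsilon=0.1, steps=100, lr=1e-2):
    # Gradient-based search for a perturbation delta, with
    # ||delta||_inf <= epsilon, that pushes the posterior predictive
    # mean at x + delta toward a chosen target value.
    delta = torch.zeros_like(x, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        pred = posterior_predictive_mean(x + delta,
                                         weight_samples, bias_samples)
        loss = (pred - target) ** 2  # drive the prediction to the target
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)  # project back into the budget
    return (x + delta).detach()
```

Because the posterior predictive mean is estimated by averaging over MCMC samples, its gradient with respect to the input is available via automatic differentiation even without a closed-form predictive distribution; the second objective (shifting the whole predictive distribution) would replace the squared-error loss with a discrepancy between sampled predictive distributions.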
Keywords: Adversarial Machine Learning, Bayesian Neural Networks
Interdisciplinary applications of Bayesian methods
June 10, 2025 3:30 PM
Sala de prensa (MR 13)