Quantifying the Importance of Latent Features in Neural Networks

Author: Alshareef, A., Berthier, N., Schewe, S., Huang, X.
Language: English
Year of publication: 2022
Source: CEUR Workshop Proceedings
Description: The susceptibility of deep learning models to adversarial examples raises serious concerns over their application in safety-critical contexts. In particular, the level of understanding of the underlying decision processes often lies far below what can reasonably be accepted for standard safety assurance. In this work, we provide insights into the high-level representations learned by neural network models. We specifically investigate how the distribution of features in their latent space changes in the presence of distortions. To achieve this, we first abstract a given neural network model into a Bayesian network, where each random variable represents the value of a hidden feature. We then estimate the importance of each feature by analysing the sensitivity of the abstraction to targeted perturbations. An importance value indicates the role of the corresponding feature in the underlying decision process. Our empirical results suggest that the obtained feature importance measures provide valuable insights for validating and explaining neural network decisions.
Database: OpenAIRE
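The general idea described in the abstract can be illustrated with a minimal sketch. All data and modelling choices below are assumptions for illustration only: synthetic activations stand in for a trained network's hidden layer, quantile binning stands in for the paper's Bayesian-network abstraction, and a naive-Bayes frequency model plus state shuffling stands in for its targeted-perturbation sensitivity analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fabricated stand-ins (assumptions): synthetic "latent" activations and
# predicted labels; in the paper these would come from a trained network.
N, D, BINS = 2000, 4, 3
latents = rng.normal(size=(N, D))
labels = (latents[:, 0] + 0.2 * latents[:, 1] > 0).astype(int)  # feature 0 dominates

def discretize(x, bins=BINS):
    """Quantile-bin each latent feature into discrete states: these states play
    the role of the random variables in the Bayesian-network abstraction."""
    out = np.empty_like(x, dtype=int)
    for j in range(x.shape[1]):
        edges = np.quantile(x[:, j], np.linspace(0, 1, bins + 1)[1:-1])
        out[:, j] = np.digitize(x[:, j], edges)
    return out

def abstraction_accuracy(states, labels, bins=BINS):
    """Agreement between the labels and a naive-Bayes model fitted on the
    discrete abstraction (counts with Laplace smoothing)."""
    log_posteriors = []
    for c in (0, 1):
        mask = labels == c
        ll = np.full(len(states), np.log(mask.mean()))
        for j in range(states.shape[1]):
            counts = np.bincount(states[mask, j], minlength=bins) + 1.0
            ll += np.log(counts / counts.sum())[states[:, j]]
        log_posteriors.append(ll)
    preds = np.argmax(np.stack(log_posteriors, axis=1), axis=1)
    return (preds == labels).mean()

states = discretize(latents)
baseline = abstraction_accuracy(states, labels)

# Importance of feature j = accuracy drop when its states are perturbed
# (here: randomly permuted across inputs, destroying its information).
importance = []
for j in range(D):
    perturbed = states.copy()
    perturbed[:, j] = rng.permutation(perturbed[:, j])
    importance.append(baseline - abstraction_accuracy(perturbed, labels))

print([round(v, 3) for v in importance])
```

Under this construction, the label depends mainly on feature 0, so shuffling its states should cause the largest accuracy drop, while the irrelevant features 2 and 3 receive importance scores near zero.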