Reliable Graph Neural Network Explanations Through Adversarial Training
Author: Loveland, Donald; Liu, Shusen; Kailkhura, Bhavya; Hiszpanski, Anna; Han, Yong
Publication Year: 2021
Document Type: Working Paper
Description: Graph neural network (GNN) explanations have largely been produced through post-hoc introspection. While this approach has seen some success, many post-hoc explanation methods have been shown to fail to capture a model's learned representation. Given this problem, it is worthwhile to consider how a model might be trained so that it is more amenable to post-hoc analysis. Motivated by the success of adversarial training in computer vision at producing models with more reliable representations, we propose a similar training paradigm for GNNs and analyze its impact on the resulting explanations. For settings without ground-truth explanation labels, we also introduce a new metric that measures how well an explanation method utilizes a model's learned representation, and we demonstrate that adversarial training can help extract more domain-relevant insights in chemistry. Comment: 4 pages, 3 figures, ICML Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI
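The adversarial training paradigm referenced in the abstract can be sketched as a min-max loop: an inner step perturbs node features to maximize the loss, and an outer step updates the model weights on the perturbed graph. The following is a minimal NumPy illustration with a one-layer GCN and an FGSM-style inner step; the graph, hyperparameters, and single-layer architecture are assumptions for illustration, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: 4 nodes, adjacency with self-loops, row-normalized.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
A_hat = A / A.sum(axis=1, keepdims=True)

X = rng.normal(size=(4, 3))              # node features
y = np.array([0, 0, 1, 1])               # binary node labels
W = rng.normal(scale=0.1, size=(3, 2))   # GCN weight matrix

def forward(X, W):
    """One-layer GCN for node classification: softmax(A_hat @ X @ W)."""
    logits = A_hat @ X @ W
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def loss_and_feature_grad(X, W):
    """Mean cross-entropy loss and its gradient w.r.t. node features."""
    P = forward(X, W)
    n = len(y)
    loss = -np.log(P[np.arange(n), y] + 1e-12).mean()
    dlogits = P.copy()
    dlogits[np.arange(n), y] -= 1.0
    dlogits /= n
    dX = A_hat.T @ dlogits @ W.T
    return loss, dX

eps, lr = 0.1, 0.5   # assumed perturbation budget and learning rate
for _ in range(200):
    # Inner maximization: FGSM-style perturbation of node features.
    _, dX = loss_and_feature_grad(X, W)
    X_adv = X + eps * np.sign(dX)
    # Outer minimization: gradient step on W using perturbed features.
    P = forward(X_adv, W)
    dlogits = P.copy()
    dlogits[np.arange(len(y)), y] -= 1.0
    dlogits /= len(y)
    dW = (A_hat @ X_adv).T @ dlogits
    W -= lr * dW
```

In practice this would be done with automatic differentiation (e.g. PyTorch) and a multi-step inner attack such as PGD; the NumPy version only makes the two nested optimization steps explicit.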
Database: arXiv