Reliable Graph Neural Network Explanations Through Adversarial Training

Authors: Loveland, Donald; Liu, Shusen; Kailkhura, Bhavya; Hiszpanski, Anna; Han, Yong
Publication year: 2021
Subject:
Document type: Working Paper
Description: Graph neural network (GNN) explanations have largely been facilitated through post-hoc introspection. While this has been deemed successful, many post-hoc explanation methods have been shown to fail to capture a model's learned representation. Because of this problem, it is worthwhile to consider how one might train a model so that it is more amenable to post-hoc analysis. Given the success of adversarial training in the computer vision domain at producing models with more reliable representations, we propose a similar training paradigm for GNNs and analyze its impact on a model's explanations. In instances without ground truth labels, we also introduce a new metric to determine how well an explanation method utilizes a model's learned representation, and demonstrate that adversarial training can help better extract domain-relevant insights in chemistry.
Comment: 4 pages, 3 figures, ICML Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI
Database: arXiv