MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation

Authors: Yang, Wenzhuo; Li, Jia; Xiong, Caiming; Hoi, Steven C. H.
Publication year: 2022
Subject:
Document type: Working Paper
Description: Counterfactual explanation is an important Explainable AI technique to explain machine learning predictions. Despite being studied actively, existing optimization-based methods often assume that the underlying machine-learning model is differentiable and treat categorical attributes as continuous ones, which restricts their real-world applications when categorical attributes have many different values or the model is non-differentiable. To make counterfactual explanation suitable for real-world applications, we propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE), which adopts a newly designed pipeline that can efficiently handle non-differentiable machine-learning models on a large number of feature values. In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate its effectiveness, showing better validity, sparsity, and proximity.
Comment: 9 pages, 2 figures
Database: arXiv