Counterfactual Explanation of AI Models Using an Adaptive Genetic Algorithm With Embedded Feature Weights

Author: Ebtisam AlJalaud, Manar Hosny
Language: English
Year of publication: 2024
Subject:
Source: IEEE Access, Vol 12, Pp 74993-75009 (2024)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3404043
Description: Explainable Artificial Intelligence (XAI) is a cutting-edge AI development motivated by the need for transparency of black-box models in AI systems. This transparency enhances user trust, facilitates accountability, and enables a better understanding of AI systems' decisions, especially in critical applications where insight into the decision process is essential. These benefits have increased research interest in XAI, which aims to provide techniques for interpreting and understanding the behavior of intelligent models. Counterfactual explanation is a popular model-interpretation technique that changes a few input features so that the outcome of an AI model changes. By analyzing these counterfactuals, users can gain insight into the critical features or factors influencing the AI system’s decision. However, most counterfactual techniques still fall short on qualities such as simplicity, robustness, and coherence. In this research, we propose a novel approach, Adaptive Feature Weight Genetic Explanation (AFWGE), for generating counterfactual explanations of AI models. It employs a custom genetic algorithm (GA) that incorporates adaptive feature weights to enhance the algorithm’s performance. Experimental results on four benchmark datasets show that AFWGE adapts the feature weights during the evolutionary process, producing more effective counterfactual explanations with superior proximity, sparsity, plausibility, and actionability. Furthermore, the approach emphasizes feature weights as reliable indicators of the significance of the model’s features, providing valuable insights for interpreting the model. AFWGE not only advances the field of counterfactual explanation generation but also establishes a robust framework for assessing feature importance in machine learning models.
Database: Directory of Open Access Journals
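
Illustrative sketch (not part of the catalogued record): the abstract describes counterfactual search as changing a few features until the model's outcome flips, driven by a genetic algorithm with per-feature weights. The Python below is a minimal toy version of that general idea, not the AFWGE method itself. Everything in it is an assumption made for illustration: the linear `score` model, the threshold, the fitness terms and their constants, and the operator settings. The feature weights here are fixed inputs; the abstract does not say how AFWGE adapts them, so that step is not reproduced.

```python
import random

THRESHOLD = 5.0  # hypothetical decision threshold of the toy model


def score(x):
    # Hypothetical stand-in for a black-box model that exposes a numeric score.
    return 0.5 * x[0] + 0.3 * x[1] + 0.2 * x[2]


def predict(x):
    # Class 1 when the score reaches the threshold, otherwise class 0.
    return 1 if score(x) >= THRESHOLD else 0


def fitness(cand, original, weights, target):
    # Lower is better. Three illustrative terms:
    #   proximity: weighted L1 distance to the original instance
    #   sparsity:  small charge per changed feature
    #   validity:  penalty that grows with the distance from the target class
    dist = sum(w * abs(c - o) for w, c, o in zip(weights, cand, original))
    changed = sum(1 for c, o in zip(cand, original) if c != o)
    gap = THRESHOLD - score(cand) if target == 1 else score(cand) - THRESHOLD
    penalty = 0.0 if predict(cand) == target else 1.0 + 10.0 * gap
    return dist + 0.1 * changed + penalty


def mutate(cand, rng, rate=0.3, step=1.0):
    # Perturb each feature with probability `rate`; others are copied unchanged.
    return [c + rng.uniform(-step, step) if rng.random() < rate else c
            for c in cand]


def crossover(a, b, rng):
    # Uniform crossover: each feature is taken from one parent at random.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def find_counterfactual(original, target, weights,
                        generations=200, pop_size=40, seed=0):
    rng = random.Random(seed)  # seeded so runs are repeatable
    pop = [mutate(original, rng, rate=0.5, step=3.0) for _ in range(pop_size)]

    def key(c):
        return fitness(c, original, weights, target)

    for _ in range(generations):
        pop.sort(key=key)
        elite = pop[: pop_size // 4]  # keep the best quarter unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = elite + children
    return min(pop, key=key)


original = [4.0, 3.0, 2.0]  # toy instance; the model assigns it class 0
counterfactual = find_counterfactual(original, target=1,
                                     weights=[1.0, 1.0, 1.0])
print(predict(original), predict(counterfactual))
print([round(c - o, 2) for c, o in zip(counterfactual, original)])
```

Raising one entry of `weights` makes changes to that feature more expensive, so the search tends to move other features instead. This is only one simple way a feature weight can steer a counterfactual search.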