Description: |
While machine learning (ML) models play an increasingly prevalent role in many software engineering tasks, their prediction accuracy is often problematic. When these models do mispredict, it can be very difficult to isolate the cause. In this paper, we propose a technique that aims to facilitate the debugging process of trained statistical models. Given an ML model and a labeled data set, our method produces an interpretable characterization of the data on which the model performs particularly poorly. The output of our technique can be useful for understanding limitations of the training data or of the model itself; it can also be useful for ensembling if there are multiple models with different strengths. We evaluate our approach through case studies and illustrate how it can be used to improve the accuracy of predictive models used for software engineering tasks within Facebook.