Interpreting deep learning models for entity resolution: an experience report using LIME

Authors:	Divesh Srivastava, Paolo Merialdo, Nick Koudas, Donatella Firmani, Vincenzo Di Cicco
Contributors:	Vincenzo Di Cicco, Donatella Firmani, Nick Koudas, Paolo Merialdo, Divesh Srivastava
Language:	English
Year of publication:	2019
Subject:	Matching (statistics); Computer science; Deep learning; 020207 software engineering; 02 engineering and technology; Space (commercial competition); Field (computer science); 030218 nuclear medicine & medical imaging; Task (project management); 03 medical and health sciences; 0302 clinical medicine; 0202 electrical engineering, electronic engineering, information engineering; Experience report; Artificial intelligence; Natural language processing
Source:	aiDM@SIGMOD
Description:	Entity Resolution (ER) seeks to understand which records refer to the same entity (e.g., matching products sold on multiple websites). The sheer number of ways humans represent and misrepresent information about real-world entities makes ER a challenging problem. Deep Learning (DL) has provided impressive results in the field of natural language processing, and recent work has therefore started exploring DL approaches to the ER problem, with encouraging results. However, we are still far from understanding why and when these approaches work in the ER setting. We are developing a methodology, Mojito, to produce explainable interpretations of the output of DL models for the ER task. Our methodology is based on LIME, a popular tool for producing prediction explanations for generic classification tasks. In this paper we report our first experiences in interpreting recent DL models for the ER task. Our results demonstrate the importance of explanations in the DL space, and suggest that, when assessing the performance of DL algorithms for ER, accuracy alone may not be sufficient to demonstrate generality and reproducibility in a production environment.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::433b3ebe8ba27f780bffdae5b2f07e01 https://hdl.handle.net/11590/364326