TopEx: Topic-based Explanations for Model Comparison

Authors: Havaldar, Shreya; Stein, Adam; Wong, Eric; Ungar, Lyle
Year of publication: 2023
Subject:
Document type: Working Paper
Description: Meaningfully comparing language models is challenging with current explanation methods. Current explanations are either overwhelming for humans due to large vocabularies or incomparable across models. We present TopEx, an explanation method that enables a level playing field for comparing language models via model-agnostic topics. We demonstrate how TopEx can identify similarities and differences between DistilRoBERTa and GPT-2 on a variety of NLP tasks.
Comment: Accepted to ICLR 2023, Tiny Papers Track
Database: arXiv