Text Characterization Toolkit

Authors: Simig, Daniel; Wang, Tianlu; Dankers, Verna; Henderson, Peter; Batsuren, Khuyagbaatar; Hupkes, Dieuwke; Diab, Mona
Publication Year: 2022
Subject:
Document Type: Working Paper
Description: In NLP, models are usually evaluated by reporting single-number performance scores on a number of readily available benchmarks, without much deeper analysis. Here, we argue that, especially given the well-known fact that benchmarks often contain biases, artefacts, and spurious correlations, deeper results analysis should become the de facto standard when presenting new models or benchmarks. We present a tool that researchers can use to study properties of a dataset and the influence of those properties on their models' behaviour. Our Text Characterization Toolkit includes both an easy-to-use annotation tool and off-the-shelf scripts that can be used for specific analyses. We also present use cases from three different domains: we use the tool to predict which examples are difficult for well-known trained models, and to identify (potentially harmful) biases and heuristics present in a dataset.
Database: arXiv