Language-Agnostic Bias Detection in Language Models

Author:	Köksal, Abdullatif; Yalcin, Omer Faruk; Akbiyik, Ahmet; Kilavuz, M. Tahir; Korhonen, Anna; Schütze, Hinrich
Publication year:	2023
Subject:	FOS: Computer and information sciences; Computer Science - Computation and Language; Computation and Language (cs.CL)
DOI:	10.48550/arxiv.2305.13302
Description:	Pretrained language models (PLMs) are key components in NLP, but they contain strong social biases. Quantifying these biases is challenging because current methods focusing on fill-the-mask objectives are sensitive to slight changes in input. To address this, we propose LABDet, a robust language-agnostic method for evaluating bias in PLMs. For nationality as a case study, we show that LABDet "surfaces" nationality bias by training a classifier on top of a frozen PLM on non-nationality sentiment detection. Collaborating with political scientists, we find consistent patterns of nationality bias across monolingual PLMs in six languages that align with historical and political context. We also show for English BERT that bias surfaced by LABDet correlates well with bias in the pretraining data; thus, our work is one of the few studies that directly links pretraining data to PLM behavior. Finally, we verify LABDet's reliability and applicability to different templates and languages through an extensive set of robustness checks.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ad7a02d610711b21f0a57a62cb35bbd1
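
The description above outlines a two-step idea: train a sentiment classifier on top of a frozen encoder using sentences that contain no nationality terms, then probe that classifier with sentences that do. The following is a minimal toy sketch of that general pattern, not the authors' implementation. Everything in it is an assumption made for illustration: the "frozen encoder" is a fixed pseudo-random embedding per token rather than a real PLM, the template, adjective lists, and neutral filler word are hypothetical, and the group names are placeholders, so the printed gaps carry no real-world meaning.

```python
import math
import random

DIM = 16  # size of the toy embedding space (assumption)


def token_vector(token):
    """Stand-in for a frozen PLM: one fixed pseudo-random vector per token."""
    rng = random.Random(token)  # seeding with a string is deterministic
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def encode(sentence):
    """Mean-pool token vectors. The encoder has no trainable parameters."""
    vectors = [token_vector(tok) for tok in sentence.lower().split()]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_positive(model, sentence):
    """Probability that the classifier assigns to the positive class."""
    weights, bias = model
    x = encode(sentence)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_classifier(examples, epochs=200, lr=0.5):
    """Logistic regression trained by SGD on top of the frozen features."""
    encoded = [(encode(sentence), label) for sentence, label in examples]
    weights, bias = [0.0] * DIM, 0.0
    for _ in range(epochs):
        for x, label in encoded:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical template and word lists; training sentences name no group.
TEMPLATE = "this {} person is {}"
NEUTRAL_FILLER = "some"
POSITIVE = ["kind", "honest", "friendly", "generous"]
NEGATIVE = ["cruel", "dishonest", "hostile", "greedy"]

train_set = [(TEMPLATE.format(NEUTRAL_FILLER, adj), 1) for adj in POSITIVE]
train_set += [(TEMPLATE.format(NEUTRAL_FILLER, adj), 0) for adj in NEGATIVE]
model = train_classifier(train_set)


def bias_gap(model, group, probe_word="neutral"):
    """Shift in P(positive) when a group term replaces the neutral filler."""
    with_group = predict_positive(model, TEMPLATE.format(group, probe_word))
    baseline = predict_positive(model, TEMPLATE.format(NEUTRAL_FILLER, probe_word))
    return with_group - baseline


GROUPS = ["groupa", "groupb", "groupc"]  # placeholder tokens, not real groups
gaps = {group: bias_gap(model, group) for group in GROUPS}
for group, gap in gaps.items():
    print(f"{group}: {gap:+.3f}")
```

In this sketch a positive gap means the classifier reads the probe sentence as more positive once the group term is inserted, and a negative gap means the opposite. Because the encoder is never updated, any such shift comes from the fixed representations rather than from the sentiment training data, which is the intuition behind probing a frozen model.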