PreCog: Exploring the Relation between Memorization and Performance in Pre-trained Language Models

Authors: Ranaldi, Leonardo; Ruzzetti, Elena Sofia; Zanzotto, Fabio Massimo
Publication Year: 2023
Source: 2023.ranlp-1.103
Document Type: Working Paper
DOI: 10.26615/978-954-452-092-2_103
Description: Pre-trained Language Models such as BERT are impressive machines with the ability to memorize, and possibly generalize, learning examples. We present here a small, focused contribution to the analysis of the interplay between memorization and performance of BERT in downstream tasks. We propose PreCog, a measure for evaluating memorization from pre-training, and we analyze its correlation with BERT's performance. Our experiments show that highly memorized examples are better classified, suggesting that memorization is an essential key to BERT's success.
Database: arXiv
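
This record does not include PreCog's definition, so the Python sketch below only illustrates the kind of analysis the abstract describes: correlating a per-example memorization score with downstream classification correctness. The synthetic scores, the point-biserial correlation test, and all variable names are assumptions made for illustration, not the authors' method.

```python
# Illustrative sketch (not the authors' PreCog implementation): given a
# hypothetical per-example memorization score and a binary correctness flag
# from a downstream classifier, check whether highly memorized examples are
# classified more accurately, as the abstract reports.
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(0)

# Hypothetical data: memorization scores in [0, 1] and correctness flags
# that are (by construction here) more likely 1 for high scores.
scores = rng.uniform(0.0, 1.0, size=1000)
correct = (rng.uniform(size=1000) < 0.5 + 0.4 * scores).astype(int)

# Point-biserial correlation between a binary outcome and a continuous score.
r, p = pointbiserialr(correct, scores)
print(f"correlation r={r:.3f}, p={p:.3g}")

# Bucket comparison: accuracy on the most- vs. least-memorized quartiles.
hi = correct[scores >= np.quantile(scores, 0.75)].mean()
lo = correct[scores <= np.quantile(scores, 0.25)].mean()
print(f"accuracy: top quartile {hi:.2%} vs bottom quartile {lo:.2%}")
```

Under these assumptions, a finding like the abstract's would show up as a positive correlation and a higher accuracy in the top quartile than in the bottom one.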