Grokking phase transitions in learning local rules with gradient descent

Authors: Žunkovič, Bojan; Ilievski, Enej
Publication year: 2022
Subject:
Document type: Working Paper
Description: We discuss two solvable grokking (generalisation beyond overfitting) models in a rule-learning scenario. We show that grokking is a phase transition and find exact analytic expressions for the critical exponents, grokking probability, and grokking time distribution. Further, we introduce a tensor-network map that connects the proposed grokking setup with the standard (perceptron) statistical learning theory and show that grokking is a consequence of the locality of the teacher model. As an example, we analyse the cellular automata learning task, numerically determine the critical exponent and the grokking time distributions, and compare them with the prediction of the proposed grokking model. Finally, we numerically analyse the connection between structure formation and grokking.
Comment: 31+10 pages, 22 figures
Database: arXiv