Showing 1 - 2 of 2 for search: '"Musat, Tiberiu"'
Author:
Musat, Tiberiu
In this paper, I introduce the retrieval problem, a simple reasoning task that can be solved only by transformers with a minimum number of layers. The task has an adjustable difficulty that can further increase the required number of layers to any ar
External link:
http://arxiv.org/abs/2411.12118
Author:
Musat, Tiberiu
Recent studies have revealed that neural networks learn interpretable algorithms for many simple problems. However, little is known about how these algorithms emerge during training. In this article, I study the training dynamics of a small neural ne
External link:
http://arxiv.org/abs/2408.09414