Showing 1 - 1 of 1 for search: '"Lanir, Ben"'
Training stability of large language models (LLMs) is an important research topic. Reproducing training instabilities can be costly, so we use a small language model with 830M parameters and experiment with higher learning rates to force models to div
External link:
http://arxiv.org/abs/2410.16682