Report

Methods of improving LLM training stability

Authors: Rybakov, Oleg; Chrzanowski, Mike; Dykas, Peter; Xue, Jinze; Lanir, Ben

Training stability of large language models (LLMs) is an important research topic. Reproducing training instabilities can be costly, so we use a small language model with 830M parameters and experiment with higher learning rates to force models to div

External link: http://arxiv.org/abs/2410.16682
