Showing 1 - 1 of 1 for search: '"Zmushko, Philip"'
As the number of parameters in large language models grows, pre-training and fine-tuning demand increasingly large amounts of GPU memory. A significant portion of this memory is typically consumed by the optimizer state.
External link:
http://arxiv.org/abs/2411.07837
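To make the abstract's claim concrete, here is a minimal back-of-envelope sketch (not from the paper) of how much extra GPU memory a standard Adam optimizer adds: Adam keeps two fp32 moment buffers per parameter, so its state alone is roughly 8 bytes per parameter, on top of the weights themselves. The helper name and the byte assumptions below are illustrative, not taken from the paper.

    def adam_state_memory_gib(num_params: int, bytes_per_value: int = 4) -> float:
        """Rough memory for Adam's two moment buffers (exp_avg, exp_avg_sq), in GiB.

        Assumes fp32 moment buffers (4 bytes each); mixed-precision or
        sharded-optimizer setups will differ.
        """
        moments = 2  # first- and second-moment estimates per parameter
        return num_params * moments * bytes_per_value / 1024**3

    if __name__ == "__main__":
        for n in (7_000_000_000, 13_000_000_000, 70_000_000_000):
            print(f"{n / 1e9:.0f}B params -> ~{adam_state_memory_gib(n):.0f} GiB of optimizer state")

For a 7B-parameter model this estimate comes out to roughly 52 GiB of optimizer state alone, which illustrates why the abstract singles out the optimizer state as a major consumer of GPU memory.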