Search results - "Alcázar, Cristóbal"

Report

Avoiding mode collapse in diffusion models fine-tuned with reinforcement learning

Authors: Barceló, Roberto; Alcázar, Cristóbal; Tobar, Felipe

Fine-tuning foundation models via reinforcement learning (RL) has proven promising for aligning to downstream objectives. In the case of diffusion models (DMs), though RL training improves alignment from early timesteps, critical issues such as train

External link: http://arxiv.org/abs/2410.08315
