Showing 1 - 1 of 1 for search: '"Kang, Sehyeok"'
We present Preference Flow Matching (PFM), a new framework for preference-based reinforcement learning (PbRL) that streamlines the integration of preferences into an arbitrary class of pre-trained models. Existing PbRL methods require fine-tuning pre
External link:
http://arxiv.org/abs/2405.19806