Mixing corrupted preferences for robust and feedback-efficient preference-based reinforcement learning

Authors: Heo, Jongkook; Lee, Young Jae; Kim, Jaehoon; Kwak, Min Gu; Park, Young Joon; Kim, Seoung Bum
Source: In Knowledge-Based Systems, 30 January 2025, 309
Database: ScienceDirect