Mixing corrupted preferences for robust and feedback-efficient preference-based reinforcement learning
Authors: | Heo, Jongkook; Lee, Young Jae; Kim, Jaehoon; Kwak, Min Gu; Park, Young Joon; Kim, Seoung Bum |
---|---|
Source: | In Knowledge-Based Systems, volume 309, 30 January 2025 |
Database: | ScienceDirect |