Showing 1 - 5 of 5 for search: '"Phatale, Samrat"'
Author:
Luo, Liangchen, Liu, Yinxiao, Liu, Rosanne, Phatale, Samrat, Lara, Harsh, Li, Yunxuan, Shu, Lei, Zhu, Yun, Meng, Lei, Sun, Jiao, Rastogi, Abhinav
Complex multi-step reasoning tasks, such as solving mathematical problems or generating code, remain a significant hurdle for even the most advanced large language models (LLMs). Verifying LLM outputs with an Outcome Reward Model (ORM) is a standard
External link:
http://arxiv.org/abs/2406.06592
Author:
Sidahmed, Hakim, Phatale, Samrat, Hutcheson, Alex, Lin, Zhuonan, Chen, Zhang, Yu, Zac, Jin, Jarvis, Chaudhary, Simral, Komarytsia, Roman, Ahlheim, Christiane, Zhu, Yonghao, Li, Bowen, Ganesh, Saravanan, Byrne, Bill, Hoffmann, Jessica, Mansoor, Hassan, Li, Wei, Rastogi, Abhinav, Dixon, Lucas
While Reinforcement Learning from Human Feedback (RLHF) effectively aligns pretrained Large Language and Vision-Language Models (LLMs and VLMs) with human preferences, its computational cost and complexity hamper its wider adoption. To alleviate som
External link:
http://arxiv.org/abs/2403.10704
Author:
Lee, Harrison, Phatale, Samrat, Mansoor, Hassan, Mesnard, Thomas, Ferret, Johan, Lu, Kellie, Bishop, Colton, Hall, Ethan, Carbune, Victor, Rastogi, Abhinav, Prakash, Sushant
Published in:
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:26874-26901, 2024
Reinforcement learning from human feedback (RLHF) has proven effective in aligning large language models (LLMs) with human preferences, but gathering high-quality preference labels is expensive. RL from AI Feedback (RLAIF), introduced in Bai et al.,
External link:
http://arxiv.org/abs/2309.00267
Author:
Gupta, Raghav, Aksitov, Renat, Phatale, Samrat, Chaudhary, Simral, Lee, Harrison, Rastogi, Abhinav
Conversational recommendation systems (CRS) aim to recommend suitable items to users through natural language conversation. However, most CRS approaches do not effectively utilize the signal provided by these conversations. They rely heavily on expli
External link:
http://arxiv.org/abs/2305.13725
Published in:
ICCV Workshop on Closing the Loop Between Vision and Language, 2019
Painting captions are often dry and simplistic, which motivates us to describe a painting creatively in the style of Shakespearean prose. This is a difficult problem, since there does not exist a large supervised dataset from paintings to Shakespearea
External link:
http://arxiv.org/abs/1910.03634