Showing 1 - 1 of 1 for search: '"Yu, Huimu"'
Large language models (LLMs) have made significant progress in natural language understanding and generation, driven by scalable pretraining and advanced finetuning. However, enhancing reasoning abilities in LLMs, particularly via reinforcement learn
External link:
http://arxiv.org/abs/2410.02229