Showing 1 - 3 of 3 for search: '"Mahan, Dakota"'
Authors:
Mahan, Dakota, Van Phung, Duy, Rafailov, Rafael, Blagden, Chase, Lile, Nathan, Castricato, Louis, Fränken, Jan-Philipp, Finn, Chelsea, Albalak, Alon
Reinforcement Learning from Human Feedback (RLHF) has greatly improved the performance of modern Large Language Models (LLMs). The RLHF process is resource-intensive and technically challenging, generally requiring a large collection of human prefere
External link:
http://arxiv.org/abs/2410.12832
Authors:
Pinnaparaju, Nikhil, Adithyan, Reshinth, Phung, Duy, Tow, Jonathan, Baicoianu, James, Datta, Ashish, Zhuravinskyi, Maksym, Mahan, Dakota, Bellagente, Marco, Riquelme, Carlos, Cooper, Nathan
We introduce Stable Code, the first in our new-generation of code language models series, which serves as a general-purpose base code language model targeting code completion, reasoning, math, and other software engineering-based tasks. Additionally,
External link:
http://arxiv.org/abs/2404.01226
Authors:
Bellagente, Marco, Tow, Jonathan, Mahan, Dakota, Phung, Duy, Zhuravinskyi, Maksym, Adithyan, Reshinth, Baicoianu, James, Brooks, Ben, Cooper, Nathan, Datta, Ashish, Lee, Meng, Mostaque, Emad, Pieler, Michael, Pinnaparju, Nikhil, Rocha, Paulo, Saini, Harry, Teufel, Hannah, Zanichelli, Niccolo, Riquelme, Carlos
We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weight
External link:
http://arxiv.org/abs/2402.17834