Search results - "Mahan, Dakota"

Report

Authors: Mahan, Dakota, Van Phung, Duy, Rafailov, Rafael, Blagden, Chase, Lile, Nathan, Castricato, Louis, Fränken, Jan-Philipp, Finn, Chelsea, Albalak, Alon

Reinforcement Learning from Human Feedback (RLHF) has greatly improved the performance of modern Large Language Models (LLMs). The RLHF process is resource-intensive and technically challenging, generally requiring a large collection of human prefere

External link: http://arxiv.org/abs/2410.12832

Report

Stable Code Technical Report

Authors: Pinnaparaju, Nikhil, Adithyan, Reshinth, Phung, Duy, Tow, Jonathan, Baicoianu, James, Datta, Ashish, Zhuravinskyi, Maksym, Mahan, Dakota, Bellagente, Marco, Riquelme, Carlos, Cooper, Nathan

We introduce Stable Code, the first in our new-generation of code language models series, which serves as a general-purpose base code language model targeting code completion, reasoning, math, and other software engineering-based tasks. Additionally,

External link: http://arxiv.org/abs/2404.01226

Report

Stable LM 2 1.6B Technical Report

Authors: Bellagente, Marco, Tow, Jonathan, Mahan, Dakota, Phung, Duy, Zhuravinskyi, Maksym, Adithyan, Reshinth, Baicoianu, James, Brooks, Ben, Cooper, Nathan, Datta, Ashish, Lee, Meng, Mostaque, Emad, Pieler, Michael, Pinnaparaju, Nikhil, Rocha, Paulo, Saini, Harry, Teufel, Hannah, Zanichelli, Niccolo, Riquelme, Carlos

We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weight

External link: http://arxiv.org/abs/2402.17834
