Showing 1 - 10 of 6,398 for search: '"Allal A"'
Author:
Penedo, Guilherme, Kydlíček, Hynek, Allal, Loubna Ben, Lozhkov, Anton, Mitchell, Margaret, Raffel, Colin, Von Werra, Leandro, Wolf, Thomas
The performance of a large language model (LLM) depends heavily on the quality and size of its pretraining dataset. However, the pretraining datasets for state-of-the-art open LLMs like Llama 3 and Mixtral are not publicly available and very little i
External link:
http://arxiv.org/abs/2406.17557
Author:
Hägele, Alexander, Bakouch, Elie, Kosson, Atli, Allal, Loubna Ben, Von Werra, Leandro, Jaggi, Martin
Scale has become a main ingredient in obtaining strong machine learning models. As a result, understanding a model's scaling properties is key to effectively designing both the right training setup as well as future generations of architectures. In t
External link:
http://arxiv.org/abs/2405.18392
Emerging wireless technologies with Gbps connectivity, such as the 5th generation (5G) and 6th generation (6G) of mobile networks, require improved and substantiating documentation for the wireless standards concerning the radio signals, systems, tra
External link:
http://arxiv.org/abs/2404.15798
In this paper, we introduce quadratic and cubic polynomial enrichments of the classical Crouzeix--Raviart finite element, with the aim of constructing accurate approximations in such enriched elements. To achieve this goal, we respectively add three
External link:
http://arxiv.org/abs/2403.05844
Author:
Lozhkov, Anton, Li, Raymond, Allal, Loubna Ben, Cassano, Federico, Lamy-Poirier, Joel, Tazi, Nouamane, Tang, Ao, Pykhtar, Dmytro, Liu, Jiawei, Wei, Yuxiang, Liu, Tianyang, Tian, Max, Kocetkov, Denis, Zucker, Arthur, Belkada, Younes, Wang, Zijian, Liu, Qian, Abulkhanov, Dmitry, Paul, Indraneil, Li, Zhuang, Li, Wen-Ding, Risdal, Megan, Li, Jia, Zhu, Jian, Zhuo, Terry Yue, Zheltonozhskii, Evgenii, Dade, Nii Osae Osae, Yu, Wenhao, Krauß, Lucas, Jain, Naman, Su, Yixuan, He, Xuanli, Dey, Manan, Abati, Edoardo, Chai, Yekun, Muennighoff, Niklas, Tang, Xiangru, Oblokulov, Muhtasham, Akiki, Christopher, Marone, Marc, Mou, Chenghao, Mishra, Mayank, Gu, Alex, Hui, Binyuan, Dao, Tri, Zebaze, Armel, Dehaene, Olivier, Patry, Nicolas, Xu, Canwen, McAuley, Julian, Hu, Han, Scholak, Torsten, Paquet, Sebastien, Robinson, Jennifer, Anderson, Carolyn Jane, Chapados, Nicolas, Patwary, Mostofa, Tajbakhsh, Nima, Jernite, Yacine, Ferrandis, Carlos Muñoz, Zhang, Lingming, Hughes, Sean, Wolf, Thomas, Guha, Arjun, von Werra, Leandro, de Vries, Harm
The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digita
External link:
http://arxiv.org/abs/2402.19173
Author:
BigCode collaboration, Hughes, Sean, de Vries, Harm, Robinson, Jennifer, Ferrandis, Carlos Muñoz, Allal, Loubna Ben, von Werra, Leandro, Ding, Jennifer, Paquet, Sebastien, Jernite, Yacine
This document serves as an overview of the different mechanisms and areas of governance in the BigCode project. It aims to support transparency by providing relevant information about choices that were made during the project to the broader public, a
External link:
http://arxiv.org/abs/2312.03872
Author:
Frédéric A. Miéville, Nicolas Pitteloud, Vérane Achard, Giorgio Lamanna, Olivier Pisaturo, Pierre-Alain Tercier, Abdelkarim S. Allal
Published in:
Zeitschrift für Medizinische Physik, Vol 34, Iss 4, Pp 542-554 (2024)
Purpose: To determine 10 MV IMRT and VMAT based protocols with a daily bolus targeting a skin dose of 45 Gy in order to replace the 6 MV tangential fields with a 5 mm thick bolus on alternate days method for post-mastectomy radiotherapy. Method: We m
External link:
https://doaj.org/article/f681e2d118f04d34826117412357ec3d
Author:
Li, Raymond, Allal, Loubna Ben, Zi, Yangtian, Muennighoff, Niklas, Kocetkov, Denis, Mou, Chenghao, Marone, Marc, Akiki, Christopher, Li, Jia, Chim, Jenny, Liu, Qian, Zheltonozhskii, Evgenii, Zhuo, Terry Yue, Wang, Thomas, Dehaene, Olivier, Davaadorj, Mishig, Lamy-Poirier, Joel, Monteiro, João, Shliazhko, Oleh, Gontier, Nicolas, Meade, Nicholas, Zebaze, Armel, Yee, Ming-Ho, Umapathi, Logesh Kumar, Zhu, Jian, Lipkin, Benjamin, Oblokulov, Muhtasham, Wang, Zhiruo, Murthy, Rudra, Stillerman, Jason, Patel, Siva Sankalp, Abulkhanov, Dmitry, Zocca, Marco, Dey, Manan, Zhang, Zhihan, Fahmy, Nour, Bhattacharyya, Urvashi, Yu, Wenhao, Singh, Swayam, Luccioni, Sasha, Villegas, Paulo, Kunakov, Maxim, Zhdanov, Fedor, Romero, Manuel, Lee, Tony, Timor, Nadav, Ding, Jennifer, Schlesinger, Claire, Schoelkopf, Hailey, Ebert, Jan, Dao, Tri, Mishra, Mayank, Gu, Alex, Robinson, Jennifer, Anderson, Carolyn Jane, Dolan-Gavitt, Brendan, Contractor, Danish, Reddy, Siva, Fried, Daniel, Bahdanau, Dzmitry, Jernite, Yacine, Ferrandis, Carlos Muñoz, Hughes, Sean, Wolf, Thomas, Guha, Arjun, von Werra, Leandro, de Vries, Harm
The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilitie
External link:
http://arxiv.org/abs/2305.06161
Author:
Mohamed Hafsa, Leila Allal Benfekih
Published in:
Egyptian Journal of Biological Pest Control, Vol 34, Iss 1, Pp 1-11 (2024)
Background: The employment of entomopathogenic microorganisms is a promising approach for ensuring optimal phytosanitary protection in the framework of biological management of insect crop pests. Among these microbes, entomopathogenic soil-bo
External link:
https://doaj.org/article/b5287cb1fa7a4f278e39ba502151fced
Author:
Laurençon, Hugo, Saulnier, Lucile, Wang, Thomas, Akiki, Christopher, del Moral, Albert Villanova, Scao, Teven Le, Von Werra, Leandro, Mou, Chenghao, Ponferrada, Eduardo González, Nguyen, Huu, Frohberg, Jörg, Šaško, Mario, Lhoest, Quentin, McMillan-Major, Angelina, Dupont, Gerard, Biderman, Stella, Rogers, Anna, Allal, Loubna Ben, De Toni, Francesco, Pistilli, Giada, Nguyen, Olivier, Nikpoor, Somaieh, Masoud, Maraim, Colombo, Pierre, de la Rosa, Javier, Villegas, Paulo, Thrush, Tristan, Longpre, Shayne, Nagel, Sebastian, Weber, Leon, Muñoz, Manuel, Zhu, Jian, Van Strien, Daniel, Alyafeai, Zaid, Almubarak, Khalid, Vu, Minh Chien, Gonzalez-Dios, Itziar, Soroa, Aitor, Lo, Kyle, Dey, Manan, Suarez, Pedro Ortiz, Gokaslan, Aaron, Bose, Shamik, Adelani, David, Phan, Long, Tran, Hieu, Yu, Ian, Pai, Suhas, Chim, Jenny, Lepercq, Violette, Ilic, Suzana, Mitchell, Margaret, Luccioni, Sasha Alexandra, Jernite, Yacine
As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The BigScience workshop, a 1-year international and multidisciplinary initiative, was formed w
External link:
http://arxiv.org/abs/2303.03915