Showing 1 - 10 of 13
for search: '"PRASHANTH P., SAI"'
Author:
Prashanth, USVSN Sai, Deng, Alvin, O'Brien, Kyle, S V, Jyothir, Khan, Mohammad Aflah, Borkar, Jaydeep, Choquette-Choo, Christopher A., Fuehne, Jacob Ray, Biderman, Stella, Ke, Tracy, Lee, Katherine, Saphra, Naomi
Memorization in language models is typically treated as a homogenous phenomenon, neglecting the specifics of the memorized data. We instead model memorization as the effect of a set of complex factors that describe each sample and relate it to the mo…
External link:
http://arxiv.org/abs/2406.17746
Author:
Biderman, Stella, Prashanth, USVSN Sai, Sutawika, Lintang, Schoelkopf, Hailey, Anthony, Quentin, Purohit, Shivanshu, Raff, Edward
Memorization, or the tendency of large language models (LLMs) to output entire sequences from their training data verbatim, is a key concern for safely deploying language models. In particular, it is vital to minimize a model's memorization of sensit…
External link:
http://arxiv.org/abs/2304.11158
Author:
Biderman, Stella, Schoelkopf, Hailey, Anthony, Quentin, Bradley, Herbie, O'Brien, Kyle, Hallahan, Eric, Khan, Mohammad Aflah, Purohit, Shivanshu, Prashanth, USVSN Sai, Raff, Edward, Skowron, Aviya, Sutawika, Lintang, van der Wal, Oskar
How do large language models (LLMs) develop and evolve over the course of training? How do these patterns change as models scale? To answer these questions, we introduce Pythia, a suite of 16 LLMs all trained on public data seen in the exact…
External link:
http://arxiv.org/abs/2304.01373
Author:
Black, Sid, Biderman, Stella, Hallahan, Eric, Anthony, Quentin, Gao, Leo, Golding, Laurence, He, Horace, Leahy, Connor, McDonell, Kyle, Phang, Jason, Pieler, Michael, Prashanth, USVSN Sai, Purohit, Shivanshu, Reynolds, Laria, Tow, Jonathan, Wang, Ben, Weinbach, Samuel
We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license. It is, to the best of our knowledge, the largest d…
External link:
http://arxiv.org/abs/2204.06745
Published in:
IEEE Communications Magazine; 2024, Vol. 62 Issue: 3 p9-12, 4p
Academic article
This result cannot be displayed for users who are not logged in; log in to view it.
Published in:
AIP Conference Proceedings Online; May 2023, Vol. 2477 Issue: 1 p030037-30042, 6p
Published in:
AIP Conference Proceedings Online; May 2023, Vol. 2477 Issue: 1 p030036-30040, 5p
Academic article
This result cannot be displayed for users who are not logged in; log in to view it.
Published in:
Procedia Computer Science; 2020, Vol. 167, p2445-2457, 13p