Showing 1 - 10 of 13
for search: '"PRASHANTH P., SAI"'
Author:
Prashanth, USVSN Sai, Deng, Alvin, O'Brien, Kyle, S V, Jyothir, Khan, Mohammad Aflah, Borkar, Jaydeep, Choquette-Choo, Christopher A., Fuehne, Jacob Ray, Biderman, Stella, Ke, Tracy, Lee, Katherine, Saphra, Naomi
Memorization in language models is typically treated as a homogenous phenomenon, neglecting the specifics of the memorized data. We instead model memorization as the effect of a set of complex factors that describe each sample and relate it to the mo…
External link:
http://arxiv.org/abs/2406.17746
Author:
Biderman, Stella, Prashanth, USVSN Sai, Sutawika, Lintang, Schoelkopf, Hailey, Anthony, Quentin, Purohit, Shivanshu, Raff, Edward
Memorization, or the tendency of large language models (LLMs) to output entire sequences from their training data verbatim, is a key concern for safely deploying language models. In particular, it is vital to minimize a model's memorization of sensit…
External link:
http://arxiv.org/abs/2304.11158
Author:
Biderman, Stella, Schoelkopf, Hailey, Anthony, Quentin, Bradley, Herbie, O'Brien, Kyle, Hallahan, Eric, Khan, Mohammad Aflah, Purohit, Shivanshu, Prashanth, USVSN Sai, Raff, Edward, Skowron, Aviya, Sutawika, Lintang, van der Wal, Oskar
How do large language models (LLMs) develop and evolve over the course of training? How do these patterns change as models scale? To answer these questions, we introduce Pythia, a suite of 16 LLMs all trained on public data seen in the exact…
External link:
http://arxiv.org/abs/2304.01373
Author:
Black, Sid, Biderman, Stella, Hallahan, Eric, Anthony, Quentin, Gao, Leo, Golding, Laurence, He, Horace, Leahy, Connor, McDonell, Kyle, Phang, Jason, Pieler, Michael, Prashanth, USVSN Sai, Purohit, Shivanshu, Reynolds, Laria, Tow, Jonathan, Wang, Ben, Weinbach, Samuel
We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license. It is, to the best of our knowledge, the largest d…
External link:
http://arxiv.org/abs/2204.06745
Published in:
IEEE Communications Magazine; 2024, Vol. 62 Issue: 3 p9-12, 4p
Academic article
This result cannot be displayed for users who are not logged in; log in to view it.
Published in:
AIP Conference Proceedings Online; May 2023, Vol. 2477 Issue: 1 p030037-30042, 6p
Published in:
AIP Conference Proceedings Online; May 2023, Vol. 2477 Issue: 1 p030036-30040, 5p
Academic article
This result cannot be displayed for users who are not logged in; log in to view it.
Published in:
Procedia Computer Science; 2020, Vol. 167, p2445-2457, 13p