Showing 1 - 10 of 80 for search: '"Beltagy, Iz"'
Author:
Khalifa, Muhammad, Wadden, David, Strubell, Emma, Lee, Honglak, Wang, Lu, Beltagy, Iz, Peng, Hao
Large language models (LLMs) learn a vast amount of knowledge during pretraining, but they are often oblivious to the source(s) of such knowledge. We investigate the problem of intrinsic source citation, where LLMs are required to cite the pretraining…
External link:
http://arxiv.org/abs/2404.01019
Author:
Groeneveld, Dirk, Beltagy, Iz, Walsh, Pete, Bhagia, Akshita, Kinney, Rodney, Tafjord, Oyvind, Jha, Ananya Harsh, Ivison, Hamish, Magnusson, Ian, Wang, Yizhong, Arora, Shane, Atkinson, David, Authur, Russell, Chandu, Khyathi Raghavi, Cohan, Arman, Dumas, Jennifer, Elazar, Yanai, Gu, Yuling, Hessel, Jack, Khot, Tushar, Merrill, William, Morrison, Jacob, Muennighoff, Niklas, Naik, Aakanksha, Nam, Crystal, Peters, Matthew E., Pyatkin, Valentina, Ravichander, Abhilasha, Schwenk, Dustin, Shah, Saurabh, Smith, Will, Strubell, Emma, Subramani, Nishant, Wortsman, Mitchell, Dasigi, Pradeep, Lambert, Nathan, Richardson, Kyle, Zettlemoyer, Luke, Dodge, Jesse, Lo, Kyle, Soldaini, Luca, Smith, Noah A., Hajishirzi, Hannaneh
Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details…
External link:
http://arxiv.org/abs/2402.00838
Author:
Soldaini, Luca, Kinney, Rodney, Bhagia, Akshita, Schwenk, Dustin, Atkinson, David, Authur, Russell, Bogin, Ben, Chandu, Khyathi, Dumas, Jennifer, Elazar, Yanai, Hofmann, Valentin, Jha, Ananya Harsh, Kumar, Sachin, Lucy, Li, Lyu, Xinxi, Lambert, Nathan, Magnusson, Ian, Morrison, Jacob, Muennighoff, Niklas, Naik, Aakanksha, Nam, Crystal, Peters, Matthew E., Ravichander, Abhilasha, Richardson, Kyle, Shen, Zejiang, Strubell, Emma, Subramani, Nishant, Tafjord, Oyvind, Walsh, Pete, Zettlemoyer, Luke, Smith, Noah A., Hajishirzi, Hannaneh, Beltagy, Iz, Groeneveld, Dirk, Dodge, Jesse, Lo, Kyle
Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or recipes to…
External link:
http://arxiv.org/abs/2402.00159
Author:
Magnusson, Ian, Bhagia, Akshita, Hofmann, Valentin, Soldaini, Luca, Jha, Ananya Harsh, Tafjord, Oyvind, Schwenk, Dustin, Walsh, Evan Pete, Elazar, Yanai, Lo, Kyle, Groeneveld, Dirk, Beltagy, Iz, Hajishirzi, Hannaneh, Smith, Noah A., Richardson, Kyle, Dodge, Jesse
Language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains–varying distributions of language. Rather than assuming perplexity on one distribution…
External link:
http://arxiv.org/abs/2312.10523
Author:
Groeneveld, Dirk, Awadalla, Anas, Beltagy, Iz, Bhagia, Akshita, Magnusson, Ian, Peng, Hao, Tafjord, Oyvind, Walsh, Pete, Richardson, Kyle, Dodge, Jesse
The success of large language models has shifted the evaluation paradigms in natural language processing (NLP). The community's interest has drifted towards comparing NLP models across many tasks, domains, and datasets, often at an extreme scale. This…
External link:
http://arxiv.org/abs/2312.10253
Author:
Ivison, Hamish, Wang, Yizhong, Pyatkin, Valentina, Lambert, Nathan, Peters, Matthew, Dasigi, Pradeep, Jang, Joel, Wadden, David, Smith, Noah A., Beltagy, Iz, Hajishirzi, Hannaneh
Since the release of TÜLU [Wang et al., 2023b], open resources for instruction tuning have developed quickly, from better base models to new finetuning techniques. We test and incorporate a number of these advances into TÜLU, resulting in TÜLU…
External link:
http://arxiv.org/abs/2311.10702
Author:
Peng, Hao, Cao, Qingqing, Dodge, Jesse, Peters, Matthew E., Fernandez, Jared, Sherborne, Tom, Lo, Kyle, Skjonsberg, Sam, Strubell, Emma, Plessas, Darrell, Beltagy, Iz, Walsh, Evan Pete, Smith, Noah A., Hajishirzi, Hannaneh
Rising computational demands of modern natural language processing (NLP) systems have increased the barrier to entry for cutting-edge research while posing serious environmental concerns. Yet, progress on model efficiency has been impeded by practical…
External link:
http://arxiv.org/abs/2307.09701
Author:
Wang, Yizhong, Ivison, Hamish, Dasigi, Pradeep, Hessel, Jack, Khot, Tushar, Chandu, Khyathi Raghavi, Wadden, David, MacMillan, Kelsey, Smith, Noah A., Beltagy, Iz, Hajishirzi, Hannaneh
In this work we explore recent advances in instruction-tuning language models on a range of open instruction-following datasets. Despite recent claims that open models can be on par with state-of-the-art proprietary models, these claims are often accompanied…
External link:
http://arxiv.org/abs/2306.04751
Author:
Jha, Ananya Harsh, Sherborne, Tom, Walsh, Evan Pete, Groeneveld, Dirk, Strubell, Emma, Beltagy, Iz
Large language models (LLMs) enable unparalleled few- and zero-shot reasoning capabilities but at a high computational footprint. A growing assortment of methods for compression promises to reduce the computational burden of LLMs in deployment, but s…
External link:
http://arxiv.org/abs/2305.14864
Author:
Mahabadi, Rabeeh Karimi, Ivison, Hamish, Tae, Jaesung, Henderson, James, Beltagy, Iz, Peters, Matthew E., Cohan, Arman
Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various continuous domains. However, applying continuous diffusion models to natural language remains challenging due to its discrete nature and the…
External link:
http://arxiv.org/abs/2305.08379