Showing 1 - 10 of 80 for search: '"Beltagy, Iz"'
Author:
Khalifa, Muhammad, Wadden, David, Strubell, Emma, Lee, Honglak, Wang, Lu, Beltagy, Iz, Peng, Hao
Large language models (LLMs) learn a vast amount of knowledge during pretraining, but they are often oblivious to the source(s) of such knowledge. We investigate the problem of intrinsic source citation, where LLMs are required to cite the pretraining…
External link:
http://arxiv.org/abs/2404.01019
Author:
Groeneveld, Dirk, Beltagy, Iz, Walsh, Pete, Bhagia, Akshita, Kinney, Rodney, Tafjord, Oyvind, Jha, Ananya Harsh, Ivison, Hamish, Magnusson, Ian, Wang, Yizhong, Arora, Shane, Atkinson, David, Authur, Russell, Chandu, Khyathi Raghavi, Cohan, Arman, Dumas, Jennifer, Elazar, Yanai, Gu, Yuling, Hessel, Jack, Khot, Tushar, Merrill, William, Morrison, Jacob, Muennighoff, Niklas, Naik, Aakanksha, Nam, Crystal, Peters, Matthew E., Pyatkin, Valentina, Ravichander, Abhilasha, Schwenk, Dustin, Shah, Saurabh, Smith, Will, Strubell, Emma, Subramani, Nishant, Wortsman, Mitchell, Dasigi, Pradeep, Lambert, Nathan, Richardson, Kyle, Zettlemoyer, Luke, Dodge, Jesse, Lo, Kyle, Soldaini, Luca, Smith, Noah A., Hajishirzi, Hannaneh
Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details…
External link:
http://arxiv.org/abs/2402.00838
Author:
Soldaini, Luca, Kinney, Rodney, Bhagia, Akshita, Schwenk, Dustin, Atkinson, David, Authur, Russell, Bogin, Ben, Chandu, Khyathi, Dumas, Jennifer, Elazar, Yanai, Hofmann, Valentin, Jha, Ananya Harsh, Kumar, Sachin, Lucy, Li, Lyu, Xinxi, Lambert, Nathan, Magnusson, Ian, Morrison, Jacob, Muennighoff, Niklas, Naik, Aakanksha, Nam, Crystal, Peters, Matthew E., Ravichander, Abhilasha, Richardson, Kyle, Shen, Zejiang, Strubell, Emma, Subramani, Nishant, Tafjord, Oyvind, Walsh, Pete, Zettlemoyer, Luke, Smith, Noah A., Hajishirzi, Hannaneh, Beltagy, Iz, Groeneveld, Dirk, Dodge, Jesse, Lo, Kyle
Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or recipes to…
External link:
http://arxiv.org/abs/2402.00159
Author:
Magnusson, Ian, Bhagia, Akshita, Hofmann, Valentin, Soldaini, Luca, Jha, Ananya Harsh, Tafjord, Oyvind, Schwenk, Dustin, Walsh, Evan Pete, Elazar, Yanai, Lo, Kyle, Groeneveld, Dirk, Beltagy, Iz, Hajishirzi, Hannaneh, Smith, Noah A., Richardson, Kyle, Dodge, Jesse
Language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains–varying distributions of language. Rather than assuming perplexity on one distribution…
External link:
http://arxiv.org/abs/2312.10523
Author:
Groeneveld, Dirk, Awadalla, Anas, Beltagy, Iz, Bhagia, Akshita, Magnusson, Ian, Peng, Hao, Tafjord, Oyvind, Walsh, Pete, Richardson, Kyle, Dodge, Jesse
The success of large language models has shifted the evaluation paradigms in natural language processing (NLP). The community's interest has drifted towards comparing NLP models across many tasks, domains, and datasets, often at an extreme scale. This…
External link:
http://arxiv.org/abs/2312.10253
Author:
Ivison, Hamish, Wang, Yizhong, Pyatkin, Valentina, Lambert, Nathan, Peters, Matthew, Dasigi, Pradeep, Jang, Joel, Wadden, David, Smith, Noah A., Beltagy, Iz, Hajishirzi, Hannaneh
Since the release of TÜLU [Wang et al., 2023b], open resources for instruction tuning have developed quickly, from better base models to new finetuning techniques. We test and incorporate a number of these advances into TÜLU, resulting in TÜLU…
External link:
http://arxiv.org/abs/2311.10702
Author:
Peng, Hao, Cao, Qingqing, Dodge, Jesse, Peters, Matthew E., Fernandez, Jared, Sherborne, Tom, Lo, Kyle, Skjonsberg, Sam, Strubell, Emma, Plessas, Darrell, Beltagy, Iz, Walsh, Evan Pete, Smith, Noah A., Hajishirzi, Hannaneh
Rising computational demands of modern natural language processing (NLP) systems have increased the barrier to entry for cutting-edge research while posing serious environmental concerns. Yet, progress on model efficiency has been impeded by practical…
External link:
http://arxiv.org/abs/2307.09701
Author:
Wang, Yizhong, Ivison, Hamish, Dasigi, Pradeep, Hessel, Jack, Khot, Tushar, Chandu, Khyathi Raghavi, Wadden, David, MacMillan, Kelsey, Smith, Noah A., Beltagy, Iz, Hajishirzi, Hannaneh
In this work we explore recent advances in instruction-tuning language models on a range of open instruction-following datasets. Despite recent claims that open models can be on par with state-of-the-art proprietary models, these claims are often accompanied…
External link:
http://arxiv.org/abs/2306.04751
Author:
Jha, Ananya Harsh, Sherborne, Tom, Walsh, Evan Pete, Groeneveld, Dirk, Strubell, Emma, Beltagy, Iz
Large language models (LLMs) enable unparalleled few- and zero-shot reasoning capabilities but at a high computational footprint. A growing assortment of methods for compression promises to reduce the computational burden of LLMs in deployment, but s…
External link:
http://arxiv.org/abs/2305.14864
Author:
Mahabadi, Rabeeh Karimi, Ivison, Hamish, Tae, Jaesung, Henderson, James, Beltagy, Iz, Peters, Matthew E., Cohan, Arman
Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various continuous domains. However, applying continuous diffusion models to natural language remains challenging due to its discrete nature and the…
External link:
http://arxiv.org/abs/2305.08379