Showing 1 - 10
of 12,894
for the search: '"Groeneveld ON"'
Author:
Groeneveld, C., van Weeren, R. J., Botteon, A., Cassano, R., de Gasperin, F., Osinga, E., Brunetti, G., Röttgering, H. J. A.
Some galaxy clusters contain non-thermal synchrotron emitting plasma permeating the intracluster medium (ICM). The spectral properties of this radio emission are not well characterized at decameter wavelengths (ν < 30 MHz), primarily due to the s…
External link:
http://arxiv.org/abs/2412.05360
Author:
Bhagia, Akshita, Liu, Jiacheng, Wettig, Alexander, Heineman, David, Tafjord, Oyvind, Jha, Ananya Harsh, Soldaini, Luca, Smith, Noah A., Groeneveld, Dirk, Koh, Pang Wei, Dodge, Jesse, Hajishirzi, Hannaneh
We develop task scaling laws and model ladders to predict the individual task performance of pretrained language models (LMs) in the overtrained setting. Standard power laws for language modeling loss cannot accurately model task performance. Therefo…
External link:
http://arxiv.org/abs/2412.04403
Author:
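For background: the "standard power laws" mentioned in the abstract above are fits of the form L(N) = a·N^(−b). A minimal, purely illustrative sketch of recovering such a law from synthetic loss-vs-parameter-count points follows; this is not the paper's model-ladder method, and all names and numbers here are hypothetical:

```python
import math

# Synthetic "loss vs. model size" points drawn from a known power law
# L(N) = a * N**(-b), so the fit should recover a_true and b_true exactly.
a_true, b_true = 20.0, 0.3
sizes = [1e7, 1e8, 1e9, 1e10]
losses = [a_true * n ** (-b_true) for n in sizes]

def fit_power_law(ns, ls):
    """Fit L = a * N**(-b) by least squares on log L = log a - b * log N."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(l) for l in ls]
    k = len(xs)
    mx = sum(xs) / k
    my = sum(ys) / k
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    # slope = -b; intercept = log a
    return math.exp(my - slope * mx), -slope

a_fit, b_fit = fit_power_law(sizes, losses)
```

Taking logarithms turns the power law into a straight line, so ordinary linear regression suffices; the abstract's point is that this simple form works for loss but not directly for downstream task accuracy.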
van Weeren, R. J., Timmerman, R., Vaidya, V., Gendron-Marsolais, M. -L., Botteon, A., Roberts, I. D., Hlavacek-Larrondo, J., Bonafede, A., Brüggen, M., Brunetti, G., Cassano, R., Cuciti, V., Edge, A. C., Gastaldello, F., Groeneveld, C., Shimwell, T. W.
Published in:
A&A 692, A12 (2024)
The Perseus cluster is the brightest X-ray cluster in the sky and is known as a cool-core galaxy cluster. Being a very nearby cluster, it has been extensively studied. This has provided a comprehensive view of the physical processes that operate in t…
External link:
http://arxiv.org/abs/2410.02863
Author:
Deitke, Matt, Clark, Christopher, Lee, Sangho, Tripathi, Rohun, Yang, Yue, Park, Jae Sung, Salehi, Mohammadreza, Muennighoff, Niklas, Lo, Kyle, Soldaini, Luca, Lu, Jiasen, Anderson, Taira, Bransom, Erin, Ehsani, Kiana, Ngo, Huong, Chen, YenSung, Patel, Ajay, Yatskar, Mark, Callison-Burch, Chris, Head, Andrew, Hendrix, Rose, Bastani, Favyen, VanderBilt, Eli, Lambert, Nathan, Chou, Yvonne, Chheda, Arnavi, Sparks, Jenna, Skjonsberg, Sam, Schmitz, Michael, Sarnat, Aaron, Bischoff, Byron, Walsh, Pete, Newell, Chris, Wolters, Piper, Gupta, Tanmay, Zeng, Kuo-Hao, Borchardt, Jon, Groeneveld, Dirk, Nam, Crystal, Lebrecht, Sophie, Wittlif, Caitlin, Schoenick, Carissa, Michel, Oscar, Krishna, Ranjay, Weihs, Luca, Smith, Noah A., Hajishirzi, Hannaneh, Girshick, Ross, Farhadi, Ali, Kembhavi, Aniruddha
Today's most advanced vision-language models (VLMs) remain proprietary. The strongest open-weight models rely heavily on synthetic data from proprietary VLMs to achieve good performance, effectively distilling these closed VLMs into open ones. As a r…
External link:
http://arxiv.org/abs/2409.17146
Author:
Muennighoff, Niklas, Soldaini, Luca, Groeneveld, Dirk, Lo, Kyle, Morrison, Jacob, Min, Sewon, Shi, Weijia, Walsh, Pete, Tafjord, Oyvind, Lambert, Nathan, Gu, Yuling, Arora, Shane, Bhagia, Akshita, Schwenk, Dustin, Wadden, David, Wettig, Alexander, Hui, Binyuan, Dettmers, Tim, Kiela, Douwe, Farhadi, Ali, Smith, Noah A., Koh, Pang Wei, Singh, Amanpreet, Hajishirzi, Hannaneh
We introduce OLMoE, a fully open, state-of-the-art language model leveraging sparse Mixture-of-Experts (MoE). OLMoE-1B-7B has 7 billion (B) parameters but uses only 1B per input token. We pretrain it on 5 trillion tokens and further adapt it to creat…
External link:
http://arxiv.org/abs/2409.02060
Author:
Li, Jeffrey, Fang, Alex, Smyrnis, Georgios, Ivgi, Maor, Jordan, Matt, Gadre, Samir, Bansal, Hritik, Guha, Etash, Keh, Sedrick, Arora, Kushal, Garg, Saurabh, Xin, Rui, Muennighoff, Niklas, Heckel, Reinhard, Mercat, Jean, Chen, Mayee, Gururangan, Suchin, Wortsman, Mitchell, Albalak, Alon, Bitton, Yonatan, Nezhurina, Marianna, Abbas, Amro, Hsieh, Cheng-Yu, Ghosh, Dhruba, Gardner, Josh, Kilian, Maciej, Zhang, Hanlin, Shao, Rulin, Pratt, Sarah, Sanyal, Sunny, Ilharco, Gabriel, Daras, Giannis, Marathe, Kalyani, Gokaslan, Aaron, Zhang, Jieyu, Chandu, Khyathi, Nguyen, Thao, Vasiljevic, Igor, Kakade, Sham, Song, Shuran, Sanghavi, Sujay, Faghri, Fartash, Oh, Sewoong, Zettlemoyer, Luke, Lo, Kyle, El-Nouby, Alaaeldin, Pouransari, Hadi, Toshev, Alexander, Wang, Stephanie, Groeneveld, Dirk, Soldaini, Luca, Koh, Pang Wei, Jitsev, Jenia, Kollar, Thomas, Dimakis, Alexandros G., Carmon, Yair, Dave, Achal, Schmidt, Ludwig, Shankar, Vaishaal
We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretrai…
External link:
http://arxiv.org/abs/2406.11794
Author:
Groeneveld, C., van Weeren, R. J., Osinga, E., Williams, W. L., Callingham, J. R., de Gasperin, F., Botteon, A., Shimwell, T., de Jong, J. M. G. H. J., Jansen, L. F., Miley, G. K., Brunetti, G., Brüggen, M., Röttgering, H. J. A.
Published in:
Nature Astronomy 8 (2024) 786-795
The largely unexplored decameter radio band (10-30 MHz) provides a unique window for studying a range of astronomical topics, such as auroral emission from exoplanets, inefficient cosmic ray acceleration mechanisms, fossil radio plasma, and free-free…
External link:
http://arxiv.org/abs/2405.05311
Author:
Groeneveld, Dirk, Beltagy, Iz, Walsh, Pete, Bhagia, Akshita, Kinney, Rodney, Tafjord, Oyvind, Jha, Ananya Harsh, Ivison, Hamish, Magnusson, Ian, Wang, Yizhong, Arora, Shane, Atkinson, David, Authur, Russell, Chandu, Khyathi Raghavi, Cohan, Arman, Dumas, Jennifer, Elazar, Yanai, Gu, Yuling, Hessel, Jack, Khot, Tushar, Merrill, William, Morrison, Jacob, Muennighoff, Niklas, Naik, Aakanksha, Nam, Crystal, Peters, Matthew E., Pyatkin, Valentina, Ravichander, Abhilasha, Schwenk, Dustin, Shah, Saurabh, Smith, Will, Strubell, Emma, Subramani, Nishant, Wortsman, Mitchell, Dasigi, Pradeep, Lambert, Nathan, Richardson, Kyle, Zettlemoyer, Luke, Dodge, Jesse, Lo, Kyle, Soldaini, Luca, Smith, Noah A., Hajishirzi, Hannaneh
Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important det…
External link:
http://arxiv.org/abs/2402.00838
Author:
Soldaini, Luca, Kinney, Rodney, Bhagia, Akshita, Schwenk, Dustin, Atkinson, David, Authur, Russell, Bogin, Ben, Chandu, Khyathi, Dumas, Jennifer, Elazar, Yanai, Hofmann, Valentin, Jha, Ananya Harsh, Kumar, Sachin, Lucy, Li, Lyu, Xinxi, Lambert, Nathan, Magnusson, Ian, Morrison, Jacob, Muennighoff, Niklas, Naik, Aakanksha, Nam, Crystal, Peters, Matthew E., Ravichander, Abhilasha, Richardson, Kyle, Shen, Zejiang, Strubell, Emma, Subramani, Nishant, Tafjord, Oyvind, Walsh, Pete, Zettlemoyer, Luke, Smith, Noah A., Hajishirzi, Hannaneh, Beltagy, Iz, Groeneveld, Dirk, Dodge, Jesse, Lo, Kyle
Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or recipes to…
External link:
http://arxiv.org/abs/2402.00159
Published in:
European Journal of Engineering Education, 19 December 2023, pages 1-18
Creativity is a critical skill that professional software engineers leverage to tackle difficult problems. In higher education, multiple efforts have been made to spark creative skills of engineering students. However, creativity is a vague concept t…
External link:
http://arxiv.org/abs/2312.12014