Showing 1 - 10 of 192 for search: '"Stewart Lawrence"'
Published in:
ENLSP-IV 2024 - 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, Dec 2024, Vancouver, Canada
Speculative decoding aims to speed up autoregressive generation of a language model by verifying in parallel the tokens generated by a smaller draft model. In this work, we explore the effectiveness of learning-free, negligible-cost draft strategies,
External link:
http://arxiv.org/abs/2411.03786
Author:
Agrawal, Pravesh; Antoniak, Szymon; Hanna, Emma Bou; Bout, Baptiste; Chaplot, Devendra; Chudnovsky, Jessica; Costa, Diogo; De Monicault, Baudouin; Garg, Saurabh; Gervet, Theophile; Ghosh, Soham; Héliou, Amélie; Jacob, Paul; Jiang, Albert Q.; Khandelwal, Kartik; Lacroix, Timothée; Lample, Guillaume; Casas, Diego Las; Lavril, Thibaut; Scao, Teven Le; Lo, Andy; Marshall, William; Martin, Louis; Mensch, Arthur; Muddireddy, Pavankumar; Nemychnikova, Valera; Pellat, Marie; Von Platen, Patrick; Raghuraman, Nikhil; Rozière, Baptiste; Sablayrolles, Alexandre; Saulnier, Lucile; Sauvestre, Romain; Shang, Wendy; Soletskyi, Roman; Stewart, Lawrence; Stock, Pierre; Studnia, Joachim; Subramanian, Sandeep; Vaze, Sagar; Wang, Thomas; Yang, Sophia
We introduce Pixtral-12B, a 12-billion-parameter multimodal language model. Pixtral-12B is trained to understand both natural images and documents, achieving leading performance on various multimodal benchmarks, surpassing a number of larger models.
External link:
http://arxiv.org/abs/2410.07073
Author:
Brooks, Alex; Marshall, Philip; Ozog, David; Rahman, Md. Wasi-ur; Stewart, Lawrence; Tom, Rithwik
Modern high-end systems are increasingly becoming heterogeneous, providing users options to use general-purpose Graphics Processing Units (GPUs) and other accelerators for additional performance. High Performance Computing (HPC) and Artificial Intelli
External link:
http://arxiv.org/abs/2409.20476
Published in:
37th Conference on Neural Information Processing Systems, Dec 2023, New Orleans, United States
We introduce a differentiable clustering method based on stochastic perturbations of minimum-weight spanning forests. This allows us to include clustering in end-to-end trainable pipelines, with efficient gradients. We show that our method performs w
External link:
http://arxiv.org/abs/2305.16358
Author:
Stewart Lawrence, Andy Wynne
Published in:
Australasian Accounting, Business and Finance Journal, Vol 3, Iss 2, Pp 1-25 (2009)
This paper examines the impact of globalised accounting and economic reforms on the public sectors of less developed countries. Our interest is in the international institutions that have been instrumental in introducing common, global remedies which a
External link:
https://doaj.org/article/6f3a068a5b97499e9393f5eb60d08b6f
Neural networks can be trained to solve regression problems by using gradient-based methods to minimize the square loss. However, practitioners often prefer to reformulate regression as a classification problem, observing that training on the cross e
External link:
http://arxiv.org/abs/2211.05641
Molecular Dynamics (MD) simulations play a central role in physics-driven drug discovery. MD applications often use the Particle Mesh Ewald (PME) algorithm to accelerate electrostatic force computations, but efficient parallelization has proven diffi
External link:
http://arxiv.org/abs/2009.12617