Showing 1 - 6 of 6 for search: '"Norman, Tamara"'
Author:
Alabed, Sami, Belov, Daniel, Chrzaszcz, Bart, Franco, Juliana, Grewe, Dominik, Maclaurin, Dougal, Molloy, James, Natan, Tom, Norman, Tamara, Pan, Xiaoyue, Paszke, Adam, Rink, Norman A., Schaarschmidt, Michael, Sitdikov, Timur, Swietlik, Agnieszka, Vytiniotis, Dimitrios, Wee, Joel
Training of modern large neural networks (NN) requires a combination of parallelization strategies encompassing data, model, or optimizer sharding. When strategies increase in complexity, it becomes necessary for partitioning tools to be 1) expressive …
External link:
http://arxiv.org/abs/2401.11202
Author:
Alabed, Sami, Grewe, Dominik, Franco, Juliana, Chrzaszcz, Bart, Natan, Tom, Norman, Tamara, Rink, Norman A., Vytiniotis, Dimitrios, Schaarschmidt, Michael
Large neural network models are commonly trained through a combination of advanced parallelism strategies in a single program, multiple data (SPMD) paradigm. For example, training large transformer models requires combining data, model, and pipeline parallelism …
External link:
http://arxiv.org/abs/2210.06352
Author:
Schaarschmidt, Michael, Grewe, Dominik, Vytiniotis, Dimitrios, Paszke, Adam, Schmid, Georg Stefan, Norman, Tamara, Molloy, James, Godwin, Jonathan, Rink, Norman Alexander, Nair, Vinod, Belov, Dan
The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism. Implementing these methods is increasingly supported through …
External link:
http://arxiv.org/abs/2112.02958
We present a novel characterization of the mapping of multiple parallelism forms (e.g. data and model parallelism) onto hierarchical accelerator systems that is hierarchy-aware and greatly reduces the space of software-to-hardware mapping. We experimentally …
External link:
http://arxiv.org/abs/2110.10548
Author:
Hoffman, Matthew W., Shahriari, Bobak, Aslanides, John, Barth-Maron, Gabriel, Momchev, Nikola, Sinopalnikov, Danila, Stańczyk, Piotr, Ramos, Sabela, Raichuk, Anton, Vincent, Damien, Hussenot, Léonard, Dadashi, Robert, Dulac-Arnold, Gabriel, Orsini, Manu, Jacq, Alexis, Ferret, Johan, Vieillard, Nino, Ghasemipour, Seyed Kamyar Seyed, Girgin, Sertan, Pietquin, Olivier, Behbahani, Feryal, Norman, Tamara, Abdolmaleki, Abbas, Cassirer, Albin, Yang, Fan, Baumli, Kate, Henderson, Sarah, Friesen, Abe, Haroun, Ruba, Novikov, Alex, Colmenarejo, Sergio Gómez, Cabi, Serkan, Gulcehre, Caglar, Paine, Tom Le, Srinivasan, Srivatsan, Cowie, Andrew, Wang, Ziyu, Piot, Bilal, de Freitas, Nando
Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms …
External link:
http://arxiv.org/abs/2006.00979
Author:
Norman, Tamara, Hochhauser, Mark
Published in:
Applied Clinical Trials, Aug 2003, Vol. 12, Issue 8, p. 14 (2 pp.)