Showing 1 - 10 of 216 for search: '"Gibson, Garth"'
Data parallel training is widely used for scaling distributed deep neural network (DNN) training. However, the performance benefits are often limited by the communication-heavy parameter synchronization step. In this paper, we take advantage of the…
External link:
http://arxiv.org/abs/1905.03960
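The bottleneck this abstract refers to is visible in the basic data-parallel loop: each worker computes gradients on its own data shard, and all workers must exchange and average those gradients before any of them can take the next step. A minimal NumPy sketch of that synchronization step (illustrative only; the shards and `local_gradient` helper are hypothetical, and this is not the paper's proposed technique):

```python
import numpy as np

def local_gradient(weights, shard):
    # Toy least-squares gradient computed on one worker's data shard.
    X, y = shard
    return 2 * X.T @ (X @ weights - y) / len(y)

def data_parallel_step(weights, shards, lr=0.01):
    # Each "worker" computes a gradient on its own shard...
    grads = [local_gradient(weights, s) for s in shards]
    # ...then all gradients are averaged. In a real cluster this
    # all-reduce is the communication-heavy synchronization step
    # the abstract describes.
    avg_grad = np.mean(grads, axis=0)
    return weights - lr * avg_grad

rng = np.random.default_rng(0)
true_w = rng.normal(size=3)
shards = []
for _ in range(4):  # four simulated workers
    X = rng.normal(size=(50, 3))
    shards.append((X, X @ true_w))

w = np.zeros(3)
for _ in range(200):
    w = data_parallel_step(w, shards)
print(np.round(w - true_w, 3))  # close to zero: workers agree on one model
```

Because every step ends with this collective exchange, the time spent communicating grows with model size, which is why overlapping or prioritizing the synchronization traffic is worth studying.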
Author:
Ratner, Alexander, Alistarh, Dan, Alonso, Gustavo, Andersen, David G., Bailis, Peter, Bird, Sarah, Carlini, Nicholas, Catanzaro, Bryan, Chayes, Jennifer, Chung, Eric, Dally, Bill, Dean, Jeff, Dhillon, Inderjit S., Dimakis, Alexandros, Dubey, Pradeep, Elkan, Charles, Fursin, Grigori, Ganger, Gregory R., Getoor, Lise, Gibbons, Phillip B., Gibson, Garth A., Gonzalez, Joseph E., Gottschlich, Justin, Han, Song, Hazelwood, Kim, Huang, Furong, Jaggi, Martin, Jamieson, Kevin, Jordan, Michael I., Joshi, Gauri, Khalaf, Rania, Knight, Jason, Konečný, Jakub, Kraska, Tim, Kumar, Arun, Kyrillidis, Anastasios, Lakshmiratan, Aparna, Li, Jing, Madden, Samuel, McMahan, H. Brendan, Meijer, Erik, Mitliagkas, Ioannis, Monga, Rajat, Murray, Derek, Olukotun, Kunle, Papailiopoulos, Dimitris, Pekhimenko, Gennady, Rekatsinas, Theodoros, Rostamizadeh, Afshin, Ré, Christopher, De Sa, Christopher, Sedghi, Hanie, Sen, Siddhartha, Smith, Virginia, Smola, Alex, Song, Dawn, Sparks, Evan, Stoica, Ion, Sze, Vivienne, Udell, Madeleine, Vanschoren, Joaquin, Venkataraman, Shivaram, Vinayak, Rashmi, Weimer, Markus, Wilson, Andrew Gordon, Xing, Eric, Zaharia, Matei, Zhang, Ce, Talwalkar, Ameet
Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different…
External link:
http://arxiv.org/abs/1904.03257
As Machine Learning (ML) applications increase in data size and model complexity, practitioners turn to distributed clusters to satisfy the increased computational and memory demands. Unfortunately, effective use of clusters for ML requires considerable…
External link:
http://arxiv.org/abs/1410.8043
When training large machine learning models with many variables or parameters, a single machine is often inadequate since the model may be too large to fit in memory, while training can take a long time even with stochastic updates. A natural recourse…
External link:
http://arxiv.org/abs/1406.4580
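The recourse this abstract points to is distributing the model itself: a common design in this literature is a parameter server, where parameters are sharded across server processes so no single machine must hold the whole model, and workers pull the values they need and push updates back. A minimal single-process sketch of that interface (hypothetical class and method names, assuming simple additive updates; real systems add replication, caching, and consistency control):

```python
class ShardedParameterServer:
    """Toy parameter server: keys are hashed across shards so that
    no single shard has to hold the entire model."""

    def __init__(self, num_shards=4):
        self.shards = [dict() for _ in range(num_shards)]

    def _shard(self, key):
        return self.shards[hash(key) % len(self.shards)]

    def pull(self, key, default=0.0):
        # A worker fetches the current value of one parameter.
        return self._shard(key).get(key, default)

    def push(self, key, delta):
        # A worker sends an additive update (e.g. a gradient step).
        shard = self._shard(key)
        shard[key] = shard.get(key, 0.0) + delta

ps = ShardedParameterServer()
ps.push("w[0]", -0.10)
ps.push("w[0]", -0.05)
print(ps.pull("w[0]"))  # -0.15: updates from different workers accumulate
```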
Training large machine learning (ML) models with many variables or parameters can take a long time if one employs sequential procedures even with stochastic updates. A natural solution is to turn to distributed computing on a cluster; however, naive…
External link:
http://arxiv.org/abs/1312.5766
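The warning about naive distribution is usually about consistency: fully synchronous execution stalls on stragglers, while fully asynchronous execution lets workers read arbitrarily stale parameters. A middle ground explored in this line of work is bounded staleness, where a worker at clock c may proceed only while the slowest worker has reached at least c - s. A sketch of just that admission check (hypothetical names; not any one paper's exact protocol):

```python
class StalenessGate:
    """Bounded-staleness clock in the 'stale synchronous parallel' style:
    a worker may proceed only while its_clock - min(all clocks) <= slack."""

    def __init__(self, num_workers, slack):
        self.clocks = [0] * num_workers
        self.slack = slack

    def can_advance(self, worker):
        # Fast workers must wait once they get too far ahead of the slowest.
        return self.clocks[worker] - min(self.clocks) <= self.slack

    def tick(self, worker):
        self.clocks[worker] += 1

gate = StalenessGate(num_workers=3, slack=2)
for _ in range(3):
    gate.tick(0)            # worker 0 races ahead to clock 3
print(gate.can_advance(0))  # False: 3 - 0 > 2, so worker 0 must wait
gate.tick(1); gate.tick(2)  # stragglers make progress
print(gate.can_advance(0))  # True: 3 - 1 <= 2
```

The slack parameter trades freshness for throughput: slack 0 recovers fully synchronous execution, while a large slack approaches fully asynchronous execution.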
Author:
Gibson, Garth A. (garth@panasas.com); Van Meter, Rodney (rdv@alumni.caltech.edu)
Published in:
Communications of the ACM, Nov. 2000, Vol. 43, Issue 11, pp. 37-45. 9 pages.
Two further results are academic articles available to signed-in users only.
Author:
Schroeder, Bianca, Gibson, Garth A.
Component failure in large-scale IT installations such as cluster supercomputers or internet service providers is becoming an ever larger problem as the number of processors, memory chips and disks in a single cluster approaches a million. In this paper…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1d0b60653bb520def07e41f2450e6f09
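The scale effect this abstract describes is easy to make concrete: if each of N independent components fails on average once every m hours, the system as a whole sees a failure roughly every m/N hours. A small sketch of that back-of-the-envelope calculation (illustrative assumptions only; the per-node figure is made up and is not the paper's dataset or model):

```python
def system_mtbf(component_mtbf_hours, num_components):
    # With independent, identically-behaving components, failures
    # arrive num_components times as fast as for a single component.
    return component_mtbf_hours / num_components

# Assume each node fails about once every 5 years (~43,800 hours).
node_mtbf = 5 * 365 * 24
for n in (100, 10_000, 1_000_000):
    print(f"{n:>9} components -> a failure roughly every "
          f"{system_mtbf(node_mtbf, n):8.2f} hours")
# At a million components that is a failure every few minutes,
# which is why failure characterization matters at this scale.
```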