Showing 1 - 10 of 258 for search: '"Nadathur Satish"'
Author:
Anderson, Michael, Chen, Benny, Chen, Stephen, Deng, Summer, Fix, Jordan, Gschwind, Michael, Kalaiah, Aravind, Kim, Changkyu, Lee, Jaewon, Liang, Jason, Liu, Haixin, Lu, Yinghai, Montgomery, Jack, Moorthy, Arun, Nadathur, Satish, Naghshineh, Sam, Nayak, Avinash, Park, Jongsoo, Petersen, Chris, Schatz, Martin, Sundaram, Narayanan, Tang, Bangsheng, Tang, Peter, Yang, Amy, Yu, Jiecao, Yuen, Hector, Zhang, Ying, Anbudurai, Aravind, Balan, Vandana, Bojja, Harsha, Boyd, Joe, Breitbach, Matthew, Caldato, Claudio, Calvo, Anna, Catron, Garret, Chandwani, Sneh, Christeas, Panos, Cottel, Brad, Coutinho, Brian, Dalli, Arun, Dhanotia, Abhishek, Duncan, Oniel, Dzhabarov, Roman, Elmir, Simon, Fu, Chunli, Fu, Wenyin, Fulthorp, Michael, Gangidi, Adi, Gibson, Nick, Gordon, Sean, Hernandez, Beatriz Padilla, Ho, Daniel, Huang, Yu-Cheng, Johansson, Olof, Juluri, Shishir, Kanaujia, Shobhit, Kesarkar, Manali, Killinger, Jonathan, Kim, Ben, Kulkarni, Rohan, Lele, Meghan, Li, Huayu, Li, Huamin, Li, Yueming, Liu, Cynthia, Liu, Jerry, Maher, Bert, Mallipedi, Chandra, Mangla, Seema, Matam, Kiran Kumar, Mehta, Jubin, Mehta, Shobhit, Mitchell, Christopher, Muthiah, Bharath, Nagarkatte, Nitin, Narasimha, Ashwin, Nguyen, Bernard, Ortiz, Thiara, Padmanabha, Soumya, Pan, Deng, Poojary, Ashwin, Ye, Qi, Raginel, Olivier, Rajagopal, Dwarak, Rice, Tristan, Ross, Craig, Rotem, Nadav, Russ, Scott, Shah, Kushal, Shan, Baohua, Shen, Hao, Shetty, Pavan, Skandakumaran, Krish, Srinivasan, Kutta, Sumbaly, Roshan, Tauberg, Michael, Tzur, Mor, Verma, Sidharth, Wang, Hao, Wang, Man, Wei, Ben, Xia, Alex, Xu, Chenyu, Yang, Martin, Zhang, Kai, Zhang, Ruoxi, Zhao, Ming, Zhao, Whitney, Zhu, Rui, Mathews, Ajit, Qiao, Lin, Smelyanskiy, Misha, Jia, Bill, Rao, Vijay
In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network band
External link:
http://arxiv.org/abs/2107.04140
Author:
Zhaoxia, Deng, Park, Jongsoo, Tang, Ping Tak Peter, Liu, Haixin, Jie, Yang, Yuen, Hector, Huang, Jianyu, Khudia, Daya, Wei, Xiaohan, Wen, Ellie, Choudhary, Dhruv, Krishnamoorthi, Raghuraman, Wu, Carole-Jean, Nadathur, Satish, Kim, Changkyu, Naumov, Maxim, Naghshineh, Sam, Smelyanskiy, Mikhail
Tremendous success of machine learning (ML) and the unabated growth in ML model complexity motivated many ML-specific designs in both CPU and accelerator architectures to speed up the model inference. While these architectures are diverse, highly opt
External link:
http://arxiv.org/abs/2105.12676
Author:
Park, Jongsoo, Naumov, Maxim, Basu, Protonu, Deng, Summer, Kalaiah, Aravind, Khudia, Daya, Law, James, Malani, Parth, Malevich, Andrey, Nadathur, Satish, Pino, Juan, Schatz, Martin, Sidorov, Alexander, Sivakumar, Viswanath, Tulloch, Andrew, Wang, Xiaodong, Wu, Yiming, Yuen, Hector, Diril, Utku, Dzhulgakov, Dmytro, Hazelwood, Kim, Jia, Bill, Jia, Yangqing, Qiao, Lin, Rao, Vijay, Rotem, Nadav, Yoo, Sungjoo, Smelyanskiy, Mikhail
The application of deep learning techniques resulted in remarkable improvement of machine learning models. This paper provides detailed characterizations of deep learning models used in many Facebook social network services. We present computation
External link:
http://arxiv.org/abs/1811.09886
Author:
Rotem, Nadav, Fix, Jordan, Abdulrasool, Saleem, Catron, Garret, Deng, Summer, Dzhabarov, Roman, Gibson, Nick, Hegeman, James, Lele, Meghan, Levenstein, Roman, Montgomery, Jack, Maher, Bert, Nadathur, Satish, Olesen, Jakob, Park, Jongsoo, Rakhov, Artem, Smelyanskiy, Misha, Wang, Man
This paper presents the design of Glow, a machine learning compiler for heterogeneous hardware. It is a pragmatic approach to compilation that enables the generation of highly optimized code for multiple targets. Glow lowers the traditional neural ne
External link:
http://arxiv.org/abs/1805.00907
Author:
Shaden Smith, Theodore L. Willke, Zheguang Zhao, Subramanya R. Dulloor, Narayanan Sundaram, Mihai Capota, Michael R. Anderson, Nadathur Satish
Published in:
Proceedings of the VLDB Endowment. 10:901-912
Apache Spark is a popular framework for data analytics with attractive features such as fault tolerance and interoperability with the Hadoop ecosystem. Unfortunately, many analytics operations in Spark are an order of magnitude or more slower compare
Author:
Fredrik Manne, Arif O. Khan, Mahantesh Halappanavar, Pradeep Dubey, Md. Mostofa Ali Patwary, Narayanan Sundaram, Nadathur Satish, Alex Pothen
Published in:
SIAM Journal on Scientific Computing. 38:S593-S619
We describe a half-approximation algorithm, $b$-Suitor, for computing a $b$-Matching of maximum weight in a graph with weights on the edges. $b$-Matching is a generalization of the well-known Matching problem in graphs, where the objective is to choo
Author:
Mikhail Smelyanskiy, Jatin Chhugani, Changkyu Kim, Nadathur Satish, Hideki Saito, Pradeep Dubey, Milind B. Girkar, Rakesh Krishnaiyer
Published in:
ISCA
Current processor trends of integrating more cores with wider SIMD units, along with a deeper and complex memory hierarchy, have made it increasingly more challenging to extract performance from applications. It is believed by some that traditional a
Published in:
MICRO
MIT web domain
Putting the DRAM on the same package with a processor enables several times higher memory bandwidth than conventional off-package DRAM. Yet, the latency of in-package DRAM is not appreciably lower than that of off-package DRAM. A promising use of in-
Author:
Mikhail E. Smorkalov, Md. Mostofa Ali Patwary, Evan Racah, Srinivas Sridharan, Prabhat, Wahid Bhimji, Thorsten Kurth, Nadathur Satish, Jian Zhang, Mikhail Shiryaev, Tareq M. Malas, Pradeep Dubey, Narayanan Sundaram, Jack Deslippe, Ioannis Mitliagkas
Published in:
SC
This paper presents the first, 15-PetaFLOP Deep Learning system for solving scientific pattern classification problems on contemporary HPC architectures. We develop supervised convolutional architectures for discriminating signals in high-energy phys
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3bf2d8cbbe6124c4a13adabe44da076a
http://arxiv.org/abs/1708.05256