Showing 1 - 10 of 10
for search: '"Rakesh Komuravelli"'
Author:
Mark Zhao, Niket Agarwal, Aarti Basant, Buğra Gedik, Satadru Pan, Mustafa Ozdal, Rakesh Komuravelli, Jerry Pan, Tianshu Bao, Haowei Lu, Sundaram Narayanan, Jack Langman, Kevin Wilfong, Harsha Rastogi, Carole-Jean Wu, Christos Kozyrakis, Parik Pol
Published in:
Proceedings of the 49th Annual International Symposium on Computer Architecture.
Author:
Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie (Amy) Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, Yinbin Ma, Junjie Yang, Ellie Wen, Hong Li, Lin Yang, Chonglin Sun, Whitney Zhao, Dimitry Melts, Krishna Dhulipala, KR Kishore, Tyler Graf, Assaf Eisenman, Kiran Kumar Matam, Adi Gangidi, Guoqiang Jerry Chen, Manoj Krishnan, Avinash Nayak, Krishnakumar Nair, Bharath Muthiah, Mahmoud Khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Ajit Mathews, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, Vijay Rao
Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data centers. In this paper, we discuss the SW/HW co-designed so…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3b1a6d07c5739fe931da1f91ac4f756a
http://arxiv.org/abs/2104.05158
Author:
Sarita V. Adve, Vikram Adve, Matthew D. Sinclair, Prakalp Srivastava, Maria Kotsifakou, Rakesh Komuravelli
Published in:
PPoPP
We propose a parallel program representation for heterogeneous systems, designed to enable performance portability across a wide range of popular parallel hardware, including GPUs, vector instruction sets, multicore CPUs, and potentially FPGAs. Our re…
Author:
Vikram Adve, Johnathan Alsop, Rakesh Komuravelli, Maria Kotsifakou, Matthew D. Sinclair, Prakalp Srivastava, Muhammad Huzaifa, Sarita V. Adve
Published in:
ISCA
Heterogeneous systems employ specialization for energy efficiency. Since data movement is expected to be a dominant consumer of energy, these systems employ specialized memories (e.g., scratchpads and FIFOs) for better efficiency for targeted data. T…
Published in:
IEEE Micro. 34:138-148
Recent research in disciplined shared-memory programming models presents a unique opportunity for rethinking the multicore memory hierarchy for better efficiency in terms of complexity, performance, and energy. The DeNovo hardware system showed that…
Published in:
ISPASS
In recent years, the power wall has prevented the continued scaling of single-core performance. This has led to the rise of dark silicon and motivated a move toward parallelism and specialization. As a result, energy-efficient high-throughput GPU cor…
Published in:
ISPASS
While many techniques have been shown to be successful at reducing the amount of on-chip network traffic, no studies have shown how close a combined approach would come to eliminating all unnecessary data traffic, nor have any studies provided insigh…
Published in:
ASPLOS
Recent work has shown that disciplined shared-memory programming models that provide deterministic-by-default semantics can simplify both parallel software and hardware. Specifically, the DeNovo hardware system has shown that the software guarantees…
Author:
Nicholas P. Carter, Hyojin Sung, Robert Smolinski, Ching-Tsun Chou, Rakesh Komuravelli, Byn Choi, Nima Honarmand, Sarita V. Adve, Vikram Adve
Published in:
PACT
For parallelism to become tractable for mass programmers, shared-memory languages and environments must evolve to enforce disciplined practices that ban "wild shared-memory behaviors"; e.g., unstructured parallelism, arbitrary data races, and ubiqui…
Author:
Danny Dig, Mohsen Vakilian, Stephen T. Heumann, Robert L. Bocchino, Vikram Adve, Jeffrey Overbey, Hyojin Sung, Rakesh Komuravelli, Patrick Simmons, Sarita V. Adve
Published in:
OOPSLA
Today's shared-memory parallel programming models are complex and error-prone. While many parallel programs are intended to be deterministic, unanticipated thread interleavings can lead to subtle bugs and nondeterministic semantics. In this paper, we…