Showing 1 - 10
of 48
for search: '"Abellan, Jose L."'
Author:
Shivdikar, Kaustubh, Agostini, Nicolas Bohm, Jayaweera, Malith, Jonatan, Gilbert, Abellan, Jose L., Joshi, Ajay, Kim, John, Kaeli, David
Graph Neural Networks (GNNs) are emerging as a formidable tool for processing non-Euclidean data across various domains, ranging from social network analysis to bioinformatics. Despite their effectiveness, their adoption has not been pervasive becaus
External link:
http://arxiv.org/abs/2404.15510
Author:
Agostini, Nicolas Bohm, Haris, Jude, Gibson, Perry, Jayaweera, Malith, Rubin, Norm, Tumeo, Antonino, Abellán, José L., Cano, José, Kaeli, David
This paper addresses the need for automatic and efficient generation of host driver code for arbitrary custom AXI-based accelerators targeting linear algebra algorithms, an important workload in various applications, including machine learning and sc
External link:
http://arxiv.org/abs/2312.14821
Author:
Shivdikar, Kaustubh, Bao, Yuhui, Agrawal, Rashmi, Shen, Michael, Jonatan, Gilbert, Mora, Evelio, Ingare, Alexander, Livesay, Neal, Abellán, José L., Kim, John, Joshi, Ajay, Kaeli, David
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without decrypting it. FHE has garnered significant attention over the past decade as it supports secure outsourcing of data processing to remote cloud services. Despite its
External link:
http://arxiv.org/abs/2309.11001
Author:
Muñoz-Martínez, Francisco, Garg, Raveesh, Abellán, José L., Pellauer, Michael, Acacio, Manuel E., Krishna, Tushar
Sparsity is a growing trend in modern DNN models. Existing Sparse-Sparse Matrix Multiplication (SpMSpM) accelerators are tailored to a particular SpMSpM dataflow (i.e., Inner Product, Outer Product or Gustavson's), that determines their overall effici
External link:
http://arxiv.org/abs/2301.10852
Over the years, processor throughput has steadily increased. However, the memory throughput has not increased at the same rate, which has led to the memory wall problem, in turn increasing the gap between effective and theoretical peak processor perfo
External link:
http://arxiv.org/abs/2201.12027
Author:
Garg, Raveesh, Qin, Eric, Muñoz-Martínez, Francisco, Guirado, Robert, Jain, Akshay, Abadal, Sergi, Abellán, José L., Acacio, Manuel E., Alarcón, Eduard, Rajamanickam, Sivasankaran, Krishna, Tushar
Graph Neural Networks (GNNs) have garnered a lot of recent interest because of their success in learning representations from graph-structured data across several critical applications in cloud and HPC. Owing to their unique compute and memory charac
External link:
http://arxiv.org/abs/2103.07977
Author:
Mojumder, Saiful A., Sun, Yifan, Delshadtehrani, Leila, Ma, Yenai, Baruah, Trinayan, Abellán, José L., Kim, John, Kaeli, David, Joshi, Ajay
The sizes of GPU applications are rapidly growing. They are exhausting the compute and memory resources of a single GPU, and are demanding the move to multiple GPUs. However, the performance of these applications scales sub-linearly with GPU count be
External link:
http://arxiv.org/abs/2008.02300
Author:
Mojumder, Saiful A., Sun, Yifan, Delshadtehrani, Leila, Ma, Yenai, Baruah, Trinayan, Abellán, José L., Kim, John, Kaeli, David, Joshi, Ajay
While multi-GPU (MGPU) systems are extremely popular for compute-intensive workloads, several inefficiencies in the memory hierarchy and data movement result in a waste of GPU resources and difficulties in programming MGPU systems. First, due to the
External link:
http://arxiv.org/abs/2007.04292
The design of specialized architectures for accelerating the inference procedure of Deep Neural Networks (DNNs) is a booming area of research nowadays. First-generation rigid proposals have been rapidly replaced by more advanced flexible accelerator
External link:
http://arxiv.org/abs/2006.07137
Author:
Sun, Yifan, Baruah, Trinayan, Mojumder, Saiful A., Dong, Shi, Ubal, Rafael, Gong, Xiang, Treadway, Shane, Bao, Yuhui, Zhao, Vincent, Abellán, José L., Kim, John, Joshi, Ajay, Kaeli, David
The rapidly growing popularity and scale of data-parallel workloads demand a corresponding increase in raw computational power of GPUs (Graphics Processing Units). As single-GPU systems struggle to satisfy the performance demands, multi-GPU systems h
External link:
http://arxiv.org/abs/1811.02884