Showing 1 - 10 of 44 for query: '"SINCLAIR, MATTHEW D."'
Modern accelerators like GPUs are increasingly executing independent operations concurrently to improve the device's compute utilization. However, effectively harnessing this concurrency on GPUs for important primitives such as general matrix multiplications (GEMMs) …
External link:
http://arxiv.org/abs/2409.02227
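The kind of concurrency this entry describes can be sketched with CUDA streams. The following minimal Python/PyTorch example (an illustration of the general technique, not the paper's method; all sizes and names are placeholders) enqueues two independent GEMMs on separate streams so the GPU may overlap them when neither saturates the device:

import torch  # sketch assumes PyTorch with a CUDA-capable GPU

a1 = torch.randn(1024, 1024, device="cuda")
b1 = torch.randn(1024, 1024, device="cuda")
a2 = torch.randn(1024, 1024, device="cuda")
b2 = torch.randn(1024, 1024, device="cuda")

s1, s2 = torch.cuda.Stream(), torch.cuda.Stream()

# Each GEMM is enqueued on its own stream; the hardware is free to run
# them concurrently if compute and memory resources allow.
with torch.cuda.stream(s1):
    c1 = a1 @ b1
with torch.cuda.stream(s2):
    c2 = a2 @ b2

torch.cuda.synchronize()  # wait for both streams to finish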
Large-scale computing systems are increasingly using accelerators such as GPUs to enable peta- and exa-scale levels of compute to meet the needs of Machine Learning (ML) and scientific computing applications. Given the widespread and growing use of ML …
External link:
http://arxiv.org/abs/2408.11919
Large Language Models increasingly rely on distributed techniques for their training and inference. These techniques require communication across devices, which can reduce scaling efficiency as the number of devices increases. While some distributed techniques …
External link:
http://arxiv.org/abs/2401.16677
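A standard way to mitigate this communication cost, shown below as a generic sketch (not the paper's technique; the function and tensor names are hypothetical, and process-group initialization is elided), is to overlap an asynchronous collective with independent computation:

import torch
import torch.distributed as dist

def reduce_with_overlap(grad_bucket: torch.Tensor, activations: torch.Tensor):
    # Assumes dist.init_process_group(...) has already been called.
    # Launch a non-blocking all-reduce of one gradient bucket.
    work = dist.all_reduce(grad_bucket, op=dist.ReduceOp.SUM, async_op=True)
    out = torch.relu(activations)  # independent work overlapped with comms
    work.wait()  # block only once the reduced gradients are needed
    return grad_bucket, out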
Author:
Upasani, Gaurang, Sinclair, Matthew D., Sampson, Adrian, Ranganathan, Parthasarathy, Patterson, David, Shah, Shaan, Parthasarathy, Nidhi, Jain, Rutwik
Computer Architecture, broadly, involves optimizing hardware and software for current and future processing systems. Although there are several other top venues for publishing Computer Architecture research, including ASPLOS, HPCA, and MICRO, ISCA (the International Symposium on Computer Architecture) …
External link:
http://arxiv.org/abs/2306.03964
Accel-Sim is a widely used computer architecture simulator that models the behavior of modern NVIDIA GPUs in great detail. However, although Accel-Sim and the underlying GPGPU-Sim model many of the features of real GPUs, thus far it has not been able to …
External link:
http://arxiv.org/abs/2304.11136
Scaling neural network models has delivered dramatic quality gains across ML problems. However, this scaling has increased the reliance on efficient distributed training techniques. Accordingly, as with other distributed computing scenarios, it is important …
External link:
http://arxiv.org/abs/2302.02825
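For intuition, scaling efficiency can be quantified with the usual strong-scaling definition (a generic formula, not necessarily the metric used in this paper):

def scaling_efficiency(time_1: float, time_n: float, n: int) -> float:
    # Achieved speedup divided by device count; 1.0 is perfect scaling.
    return (time_1 / time_n) / n

# Example: 100 s/step on 1 GPU vs. 16 s/step on 8 GPUs:
# speedup = 6.25, efficiency = 6.25 / 8 ~= 0.78
print(scaling_efficiency(100.0, 16.0, 8))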
Author:
Sinha, Prasoon, Guliani, Akhil, Jain, Rutwik, Tran, Brandon, Sinclair, Matthew D., Venkataraman, Shivaram
Scientists are increasingly exploring and utilizing the massive parallelism of general-purpose accelerators such as GPUs for scientific breakthroughs. As a result, datacenters, hyperscalers, national computing centers, and supercomputers have procured …
External link:
http://arxiv.org/abs/2208.11035
Hardware specialization is becoming a key enabler of energy-efficient performance. Future systems will be increasingly heterogeneous, integrating multiple specialized and programmable accelerators, each with different memory demands. Traditionally, co…
External link:
http://arxiv.org/abs/2104.11678
Transfer learning in natural language processing (NLP), as realized using models like BERT (Bidirectional Encoder Representations from Transformers), has significantly improved language representation with models that can tackle challenging language p…
External link:
http://arxiv.org/abs/2104.08335
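The transfer-learning workflow this entry refers to typically loads a pretrained encoder and fine-tunes a task head on top. A minimal sketch using the Hugging Face transformers library (an assumed toolchain; the paper's own setup may differ) looks like this:

import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # pretrained encoder + new task head

inputs = tokenizer("Transfer learning reuses pretrained weights.",
                   return_tensors="pt")
labels = torch.tensor([1])

outputs = model(**inputs, labels=labels)
outputs.loss.backward()  # gradients also flow into the pretrained encoder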
The ubiquity of deep neural networks (DNNs) continues to rise, making them a crucial application class for hardware optimizations. However, detailed profiling and characterization of DNN training remains difficult, as these applications often run for …
External link:
http://arxiv.org/abs/2007.10459
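One common workaround for this profiling difficulty, sketched below as a generic illustration (not the paper's methodology; the model and sizes are placeholders), is to profile only a few representative iterations with torch.profiler instead of the full run:

import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(512, 512).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.randn(64, 512, device="cuda")

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    for _ in range(5):  # a few iterations stand in for the full training run
        opt.zero_grad()
        loss = model(x).sum()
        loss.backward()
        opt.step()

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))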