Showing 1 - 10 of 12,658 results for search: '"Multi-GPU"'
Modern GPU systems are constantly evolving to meet the needs of computing-intensive applications in scientific and machine learning domains. However, there is typically a gap between the hardware capacity and the achievable application performance. T
External link:
http://arxiv.org/abs/2410.00801
This article presents an optimized algorithm and implementation for calculating resolution-of-the-identity Hartree-Fock (RI-HF) energies and analytic gradients using multiple Graphics Processing Units (GPUs). The algorithm is especially designed for
External link:
http://arxiv.org/abs/2407.19614
Authors:
Park, Jeongmin Brian, Wu, Kun, Mailthody, Vikram Sharma, Quresh, Zaid, Mahlke, Scott, Hwu, Wen-mei
Graph Neural Networks (GNNs) are widely used today in recommendation systems, fraud detection, and node/link classification tasks. Real world GNNs continue to scale in size and require a large memory footprint for storing graphs and embeddings that o
External link:
http://arxiv.org/abs/2407.15264
Characterizing and predicting the training performance of modern machine learning (ML) workloads on compute systems with compute and communication spread between CPUs, GPUs, and network devices is not only the key to optimization and planning but als
External link:
http://arxiv.org/abs/2404.12674
Authors:
Zhang, Yicheng, Nazaraliyev, Ravan, Dutta, Sankha Baran, Abu-Ghazaleh, Nael, Marquez, Andres, Barker, Kevin
High-speed interconnects, such as NVLink, are integral to modern multi-GPU systems, acting as a vital link between CPUs and GPUs. This study highlights the vulnerability of multi-GPU systems to covert and side channel attacks due to congestion on int
External link:
http://arxiv.org/abs/2404.03877
Traffic propagation simulation is crucial for urban planning, enabling congestion analysis, travel time estimation, and route optimization. Traditional micro-simulation frameworks are limited to main roads due to the complexity of urban mobility and
External link:
http://arxiv.org/abs/2406.08496
Achieving high-performance computation on quantum systems presents a formidable challenge that necessitates bridging the capabilities between quantum hardware and classical computing resources. This study introduces an innovative distribution-aware Q
External link:
http://arxiv.org/abs/2403.05828
A novel parallelization paradigm has been developed for multi-GPU architectures. Classical multi-GPU parallelization for SPH relies on domain decomposition. In our approach, each particle can be assigned to a GPU independently of its position in space.
External link:
http://arxiv.org/abs/2310.03596
The presence of many degenerate $d/f$ orbitals makes polynuclear transition metal compounds such as iron-sulfur clusters in nitrogenase challenging for state-of-the-art quantum chemistry methods. To address this challenge, we present the first distri
External link:
http://arxiv.org/abs/2311.02854