Search results

Report

Understanding Data Movement in AMD Multi-GPU Systems with Infinity Fabric

Authors: Schieffer, Gabin; Shi, Ruimin; Markidis, Stefano; Herten, Andreas; Faj, Jennifer; Peng, Ivy

Modern GPU systems are constantly evolving to meet the needs of computing-intensive applications in scientific and machine learning domains. However, there is typically a gap between the hardware capacity and the achievable application performance. T

External link: http://arxiv.org/abs/2410.00801

Report

Multi-GPU RI-HF Energies and Analytic Gradients $-$ Towards High Throughput Ab Initio Molecular Dynamics

Authors: Stocks, Ryan; Palethorpe, Elise; Barca, Giuseppe M. J.

This article presents an optimized algorithm and implementation for calculating resolution-of-the-identity Hartree-Fock (RI-HF) energies and analytic gradients using multiple Graphics Processing Units (GPUs). The algorithm is especially designed for

External link: http://arxiv.org/abs/2407.19614

Report

LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme

Authors: Park, Jeongmin Brian; Wu, Kun; Mailthody, Vikram Sharma; Quresh, Zaid; Mahlke, Scott; Hwu, Wen-mei

Graph Neural Networks (GNNs) are widely used today in recommendation systems, fraud detection, and node/link classification tasks. Real world GNNs continue to scale in size and require a large memory footprint for storing graphs and embeddings that o

External link: http://arxiv.org/abs/2407.15264

Report

Towards Universal Performance Modeling for Machine Learning Training on Multi-GPU Platforms

Authors: Lin, Zhongyi; Sun, Ning; Bhattacharya, Pallab; Feng, Xizhou; Feng, Louis; Owens, John D.

Characterizing and predicting the training performance of modern machine learning (ML) workloads on compute systems with compute and communication spread between CPUs, GPUs, and network devices is not only the key to optimization and planning but als

External link: http://arxiv.org/abs/2404.12674

Report

Beyond the Bridge: Contention-Based Covert and Side Channel Attacks on Multi-GPU Interconnect

Authors: Zhang, Yicheng; Nazaraliyev, Ravan; Dutta, Sankha Baran; Abu-Ghazaleh, Nael; Marquez, Andres; Barker, Kevin

High-speed interconnects, such as NVLink, are integral to modern multi-GPU systems, acting as a vital link between CPUs and GPUs. This study highlights the vulnerability of multi-GPU systems to covert and side channel attacks due to congestion on int

External link: http://arxiv.org/abs/2404.03877

Report

Large Scale Multi-GPU Based Parallel Traffic Simulation for Accelerated Traffic Assignment and Propagation

Authors: Jiang, Xuan; Sengupta, Raja; Demmel, James; Williams, Samuel

Traffic propagation simulation is crucial for urban planning, enabling congestion analysis, travel time estimation, and route optimization. Traditional micro-simulation frameworks are limited to main roads due to the complexity of urban mobility and

External link: http://arxiv.org/abs/2406.08496

Report

Multi-GPU-Enabled Hybrid Quantum-Classical Workflow in Quantum-HPC Middleware: Applications in Quantum Simulations

Authors: Chen, Kuan-Cheng; Li, Xiaoren; Xu, Xiaotian; Wang, Yun-Yuan; Liu, Chen-Yu

Achieving high-performance computation on quantum systems presents a formidable challenge that necessitates bridging the capabilities between quantum hardware and classical computing resources. This study introduces an innovative distribution-aware Q

External link: http://arxiv.org/abs/2403.05828

Report

A novel multi-GPU parallelization paradigm for SPH applied to solid mechanics in complex industrial applications

Authors: Unfer, Thomas; Collé, Anthony; Limido, Jérôme

A novel parallelization paradigm has been developed for multi-GPU architectures. Classical multi-GPU parallelization for SPH relies on domain decomposition. In our approach each particle can be assigned to a GPU independently of its position in space.

External link: http://arxiv.org/abs/2310.03596

Report

A distributed multi-GPU ab initio density matrix renormalization group algorithm with applications to the P-cluster of nitrogenase

Authors: Xiang, Chunyang; Jia, Weile; Fang, Wei-Hai; Li, Zhendong

The presence of many degenerate $d/f$ orbitals makes polynuclear transition metal compounds such as iron-sulfur clusters in nitrogenase challenging for state-of-the-art quantum chemistry methods. To address this challenge, we present the first distri

External link: http://arxiv.org/abs/2311.02854
