GPU-Accelerated Machine Learning Inference as a Service for Computing in Neutrino Experiments
Author: | Nhan Tran, Kevin Pedro, Jeffrey Krupa, Burt Holzman, K. Knoepfel, Benjamin Hawks, Philip Harris, Maria Acosta Flechas, Michael Wang, Tingjun Yang |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: |
FOS: Computer and information sciences
FOS: Computer and information sciences; FOS: Physical sciences; High Energy Physics - Experiment (hep-ex); Computational Physics (physics.comp-ph); Distributed, Parallel, and Cluster Computing (cs.DC); Data Analysis, Statistics and Probability (physics.data-an); machine learning; artificial intelligence; inference; big data; coprocessor; heterogeneous (CPU+GPU) computing; GPU (graphics processing unit); symmetric multiprocessor system; cloud computing (SaaS); web service; workflow; particle physics; information systems |
Source: | Frontiers in Big Data, Vol 3 (2021); DOE / OSTI |
ISSN: | 2624-909X |
Description: | Machine learning algorithms are becoming increasingly prevalent and performant in the reconstruction of events in accelerator-based neutrino experiments. These sophisticated algorithms can be computationally expensive. At the same time, the data volumes of such experiments are rapidly increasing. The demand to process billions of neutrino events with many machine learning algorithm inferences creates a computing challenge. We explore a computing model in which heterogeneous computing with GPU coprocessors is made available as a web service. The coprocessors can be efficiently and elastically deployed to provide the right amount of computing for a given processing task. With our approach, Services for Optimized Network Inference on Coprocessors (SONIC), we integrate GPU acceleration specifically for the ProtoDUNE-SP reconstruction chain without disrupting the native computing workflow. With our integrated framework, we accelerate the most time-consuming task, track and particle shower hit identification, by a factor of 17. This yields a factor of 2.7 reduction in total processing time compared with CPU-only production. For this particular task, only 1 GPU is required for every 68 CPU threads, providing a cost-effective solution. Comment: 15 pages, 7 figures, 2 tables |
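As a quick consistency check (an illustrative calculation, not taken from the paper itself), the quoted task-level speedup of 17 and overall speedup of 2.7 agree under Amdahl's law if the accelerated hit-identification task accounted for roughly two-thirds of the original CPU-only runtime; the function name below is my own:

```python
def serial_fraction(task_speedup: float, total_speedup: float) -> float:
    """Amdahl's law: fraction f of original runtime spent in the
    accelerated task, given 1/total = (1 - f) + f/task."""
    return (1 - 1 / total_speedup) / (1 - 1 / task_speedup)

# Numbers quoted in the abstract: task sped up 17x, total runtime cut 2.7x.
f = serial_fraction(task_speedup=17, total_speedup=2.7)
print(f"accelerated task was ~{f:.0%} of the CPU-only runtime")  # ~67%
```

Inverting the same relation, 1 / ((1 - f) + f / 17) recovers the overall 2.7x, so the two quoted figures are mutually consistent.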
Database: | OpenAIRE |
External link: |