Showing 1 - 10 of 2,846 results for search: '"P. Höfler"'
Author:
Okanovic, Patrik, Kirsch, Andreas, Kasper, Jannes, Hoefler, Torsten, Krause, Andreas, Gürel, Nezihe Merve
We introduce MODEL SELECTOR, a framework for label-efficient selection of pretrained classifiers. Given a pool of unlabeled target data, MODEL SELECTOR samples a small subset of highly informative examples for labeling, in order to efficiently identify... (abstract truncated; a hedged sketch of the selection idea follows this entry)
External link:
http://arxiv.org/abs/2410.13609
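The snippet above gives only the high-level idea, so here is a minimal Python sketch of what label-efficient selection of informative examples can look like: score each unlabeled example by how much the candidate pretrained classifiers disagree on it, then send the top-scoring examples for labeling. The disagreement heuristic, the sklearn-style predict interface, and all names below are illustrative assumptions, not the actual MODEL SELECTOR criterion from the paper.

# Illustrative only: disagreement-based selection of examples to label;
# not the actual MODEL SELECTOR criterion from arXiv:2410.13609.
import numpy as np

def select_examples_to_label(models, pool_X, budget):
    """Pick `budget` pool examples on which the pretrained classifiers disagree most."""
    preds = np.stack([m.predict(pool_X) for m in models])  # shape: (n_models, n_examples)
    scores = []
    for col in preds.T:  # one column of predictions per example
        _, counts = np.unique(col, return_counts=True)
        scores.append(1.0 - counts.max() / len(col))  # 0.0 means all models agree
    return np.argsort(-np.asarray(scores))[:budget]   # indices to send to the labeler

With two candidate models and a binary task, an example that one model labels 0 and the other labels 1 gets the maximal score of 0.5 and would be among the first sent for labeling.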
Author:
Chrapek, Marcin, Vahldiek-Oberwagner, Anjo, Spoczynski, Marcin, Constable, Scott, Vij, Mona, Hoefler, Torsten
Foundation Models (FMs) display exceptional performance in tasks such as natural language processing and are being applied across a growing range of disciplines. Although typically trained on large public datasets, FMs are often fine-tuned or integrated...
External link:
http://arxiv.org/abs/2410.05930
Author:
Schmid, Larissa, Copik, Marcin, Calotoiu, Alexandru, Brandner, Laurin, Koziolek, Anne, Hoefler, Torsten
Serverless computing has emerged as a prominent paradigm, with a significant adoption rate among cloud customers. While this model offers advantages such as abstraction from deployment and resource scheduling, it also poses limitations in handling...
External link:
http://arxiv.org/abs/2410.03480
Power grids serve as a vital component in numerous industries, seamlessly delivering electrical energy to industrial processes and technologies, making their safe and reliable operation indispensable. However, power lines can be hard to inspect due to...
External link:
http://arxiv.org/abs/2409.16821
Author:
De Sensi, Daniele, Pichetti, Lorenzo, Vella, Flavio, De Matteis, Tiziano, Ren, Zebin, Fusco, Luigi, Turisini, Matteo, Cesarini, Daniele, Lust, Kurt, Trivedi, Animesh, Roweth, Duncan, Spiga, Filippo, Di Girolamo, Salvatore, Hoefler, Torsten
Published in:
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '24), 2024
Multi-GPU nodes are increasingly common in the rapidly evolving landscape of exascale supercomputers. On these systems, GPUs on the same node are connected through dedicated networks, with bandwidths up to a few terabits per second. However, gauging... (a rough bandwidth-probe sketch follows this entry)
External link:
http://arxiv.org/abs/2408.14090
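To make "gauging" intra-node bandwidth concrete, here is a generic GPU-to-GPU copy microbenchmark in PyTorch. It is a rough sketch under the assumption of a node with at least two CUDA devices; it is not the benchmarking methodology of the paper above.

# Rough intra-node GPU-to-GPU copy bandwidth probe (PyTorch); illustrative
# only, not the methodology of arXiv:2408.14090.
import time
import torch

def p2p_bandwidth_gbs(src=0, dst=1, size_mib=256, iters=20):
    x = torch.empty(size_mib * 2**20, dtype=torch.uint8, device=f"cuda:{src}")
    y = torch.empty_like(x, device=f"cuda:{dst}")
    for _ in range(3):                 # warm-up copies
        y.copy_(x)
    torch.cuda.synchronize(src)
    torch.cuda.synchronize(dst)
    t0 = time.perf_counter()
    for _ in range(iters):
        y.copy_(x)
    torch.cuda.synchronize(dst)
    return iters * x.numel() / (time.perf_counter() - t0) / 1e9  # GB/s

if torch.cuda.device_count() >= 2:
    print(f"cuda:0 -> cuda:1: {p2p_bandwidth_gbs():.1f} GB/s")

Whether such a copy actually travels over the dedicated GPU-to-GPU links or is staged through host memory depends on the platform, which is part of what makes measuring these networks non-trivial.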
Author:
Khalilov, Mikhail, Di Girolamo, Salvatore, Chrapek, Marcin, Nudelman, Rami, Bloch, Gil, Hoefler, Torsten
In the Fully Sharded Data Parallel (FSDP) training pipeline, collective operations can be interleaved to maximize the communication/computation overlap. In this scenario, outstanding operations such as Allgather and Reduce-Scatter can compete for the... (a hedged overlap sketch follows this entry)
External link:
http://arxiv.org/abs/2408.13356
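As context for the interleaving mentioned above, the following sketch issues an Allgather and a Reduce-Scatter asynchronously with torch.distributed so both can be in flight at once. The tensor shapes, the use of the default process group, and the placement of the wait() calls are illustrative assumptions, not the scheduling mechanism studied in the paper.

# Minimal sketch of interleaved Allgather / Reduce-Scatter with async
# collectives; assumes torch.distributed.init_process_group() was already
# called. Illustrative only, not the mechanism of arXiv:2408.13356.
import torch
import torch.distributed as dist

def overlapped_step(param_shard, local_grad, world_size):
    full_param = torch.empty(world_size * param_shard.numel(),
                             dtype=param_shard.dtype, device=param_shard.device)
    grad_shard = torch.empty(local_grad.numel() // world_size,
                             dtype=local_grad.dtype, device=local_grad.device)
    h_ag = dist.all_gather_into_tensor(full_param, param_shard, async_op=True)
    h_rs = dist.reduce_scatter_tensor(grad_shard, local_grad, async_op=True)
    h_ag.wait()   # parameters are needed first, for the next layer's compute
    # ... layer computation using full_param could run here ...
    h_rs.wait()   # reduced gradients are needed before the optimizer step
    return full_param, grad_shard

Because both collectives are outstanding at the same time, they can contend for the same network resources, which is the competition the truncated abstract alludes to.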
Author:
Besta, Maciej, Gerstenberger, Robert, Iff, Patrick, Sonawane, Pournima, Luna, Juan Gómez, Kanakagiri, Raghavendra, Min, Rui, Mutlu, Onur, Hoefler, Torsten, Appuswamy, Raja, Mahony, Aidan O
Knowledge graphs (KGs) have achieved significant attention in recent years, particularly in the area of the Semantic Web, as well as gaining popularity in other application domains such as data mining and search engines. Simultaneously, there has been...
External link:
http://arxiv.org/abs/2408.12173
Author:
Fusco, Luigi, Khalilov, Mikhail, Chrapek, Marcin, Chukkapalli, Giridhar, Schulthess, Thomas, Hoefler, Torsten
Heterogeneous supercomputers have become the standard in HPC. GPUs in particular have dominated the accelerator landscape, offering unprecedented performance in parallel workloads and unlocking new possibilities in fields like AI and climate modeling...
External link:
http://arxiv.org/abs/2408.11556
Author:
Okanovic, Patrik, Kwasniewski, Grzegorz, Labini, Paolo Sylos, Besta, Maciej, Vella, Flavio, Hoefler, Torsten
High-performance sparse matrix-matrix (SpMM) multiplication is paramount for science and industry, as the ever-increasing sizes of data prohibit using dense data structures. Yet, existing hardware, such as Tensor Cores (TC), is ill-suited for SpMM... (a short baseline sketch follows this entry)
External link:
http://arxiv.org/abs/2408.11551
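For readers unfamiliar with the operation itself, this is a plain SpMM baseline in Python/SciPy (sparse operand times dense operand). It only makes the computation concrete and does not attempt the Tensor Core mapping the paper is about; the sizes and density below are arbitrary.

# Baseline SpMM (sparse x dense) with SciPy, just to make the operation
# concrete; arXiv:2408.11551 is about mapping this to Tensor Cores, which
# this sketch does not attempt.
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
A = sp.random(4096, 4096, density=0.01, format="csr", random_state=rng)  # sparse operand
B = rng.standard_normal((4096, 256))                                     # dense operand
C = A @ B            # CSR SpMM: only the ~1% nonzeros of A are touched
print(C.shape, "nnz(A) =", A.nnz)

The appeal of the sparse format is visible here: the CSR representation stores and traverses only the nonzeros, whereas a dense 4096 x 4096 float64 operand would occupy roughly 128 MiB regardless of sparsity.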
As inference on Large Language Models (LLMs) emerges as an important workload in machine learning applications, weight quantization has become a standard technique for efficient GPU deployment. Quantization not only reduces model size, but has also... (a minimal quantization sketch follows this entry)
External link:
http://arxiv.org/abs/2408.11743
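To ground the term, here is a minimal round-to-nearest int8 weight quantizer with one scale per output row. It is a generic textbook scheme chosen purely for illustration, not the particular quantization method or analysis of the paper above.

# Minimal round-to-nearest int8 weight quantization (one scale per row);
# a generic illustration, not the scheme analyzed in arXiv:2408.11743.
import numpy as np

def quantize_int8(W):
    scale = np.abs(W).max(axis=1, keepdims=True) / 127.0
    scale = np.maximum(scale, 1e-12)          # guard against all-zero rows
    q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

W = np.random.randn(8, 16).astype(np.float32)
q, s = quantize_int8(W)
print("max abs error:", float(np.abs(W - dequantize(q, s)).max()))

Storing q plus one float scale per row cuts the weight footprint roughly 4x relative to float32, which is the size reduction the truncated abstract refers to.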