Showing 1 - 10 of 607 for search: '"Hoefler, Torsten"'
Vector search systems are indispensable in large language model (LLM) serving, search engines, and recommender systems, where minimizing online search latency is essential. Among various algorithms, graph-based vector search (GVS) is particularly pop
External link:
http://arxiv.org/abs/2406.12385
Authors:
Besta, Maciej, Scheidl, Florian, Gianinazzi, Lukas, Klaiman, Shachar, Müller, Jürgen, Hoefler, Torsten
Higher-order graph neural networks (HOGNNs) are an important class of GNN models that harness polyadic relations between vertices beyond plain edges. They have been used to eliminate issues such as over-smoothing or over-squashing, to significantly e
External link:
http://arxiv.org/abs/2406.12841
Authors:
Besta, Maciej, Kubicek, Ales, Niggli, Roman, Gerstenberger, Robert, Weitzendorf, Lucas, Chi, Mingyuan, Iff, Patrick, Gajda, Joanna, Nyczyk, Piotr, Müller, Jürgen, Niewiadomski, Hubert, Chrapek, Marcin, Podstawski, Michał, Hoefler, Torsten
Retrieval Augmented Generation (RAG) enhances the abilities of Large Language Models (LLMs) by enabling the retrieval of documents into the LLM context to provide more accurate and relevant responses. Existing RAG solutions do not focus on queries th
External link:
http://arxiv.org/abs/2406.05085
Authors:
Besta, Maciej, Paleari, Lorenzo, Kubicek, Ales, Nyczyk, Piotr, Gerstenberger, Robert, Iff, Patrick, Lehmann, Tomasz, Niewiadomski, Hubert, Hoefler, Torsten
Large Language Models (LLMs) are revolutionizing various domains, yet verifying their answers remains a significant challenge, especially for intricate open-ended tasks such as consolidation, summarization, and extraction of knowledge. In this work,
External link:
http://arxiv.org/abs/2406.02524
In the era of post-Moore computing, network offload emerges as a solution to two challenges: the imperative for low-latency communication and the push towards hardware specialisation. Various methods have been employed to offload protocol- and data-p
External link:
http://arxiv.org/abs/2405.16378
Authors:
Hoefler, Torsten, Calotoiu, Alexandru, Dipankar, Anurag, Schulthess, Thomas, Lapillonne, Xavier, Fuhrer, Oliver
We discuss the computational challenges and requirements for high-resolution climate simulations using the Icosahedral Nonhydrostatic Weather and Climate Model (ICON). We define a detailed requirements model for ICON which emphasizes the need for spe
External link:
http://arxiv.org/abs/2405.13043
Authors:
Abubaker, Nabil, Hoefler, Torsten
Existing 3D algorithms for distributed-memory sparse kernels suffer from limited scalability due to reliance on bulk sparsity-agnostic communication. While easier to use, sparsity-agnostic communication leads to unnecessary bandwidth and memory consu
External link:
http://arxiv.org/abs/2404.19638
Authors:
Luczynski, Piotr, Gianinazzi, Lukas, Iff, Patrick, Wilson, Leighton, De Sensi, Daniele, Hoefler, Torsten
Efficient Reduce and AllReduce communication collectives are a critical cornerstone of high-performance computing (HPC) applications. We present the first systematic investigation of Reduce and AllReduce on the Cerebras Wafer-Scale Engine (WSE). This
External link:
http://arxiv.org/abs/2404.15888
Authors:
Shen, Siyuan, Huang, Langwen, Chrapek, Marcin, Schneider, Timo, Dayal, Jai, Gajbe, Manisha, Wisniewski, Robert, Hoefler, Torsten
The shift towards high-bandwidth networks driven by AI workloads in data centers and HPC clusters has unintentionally aggravated network latency, adversely affecting the performance of communication-intensive HPC applications. As large-scale MPI appl
External link:
http://arxiv.org/abs/2404.14193
Authors:
Baumann, Yves, Ben-Nun, Tal, Besta, Maciej, Gianinazzi, Lukas, Hoefler, Torsten, Luczynski, Piotr
Contemporary accelerator designs exhibit a high degree of spatial localization, wherein two-dimensional physical distance determines communication costs between processing elements. This situation presents considerable algorithmic challenges, particu
External link:
http://arxiv.org/abs/2404.12953