Showing 1 - 10 of 250 for search: '"Träff, Jesper Larsson"'
Author:
Träff, Jesper Larsson
The reduce-scatter collective operation in which $p$ processors in a network of processors collectively reduce $p$ input vectors into a result vector that is partitioned over the processors is important both in its own right and as building block for
External link:
http://arxiv.org/abs/2410.14234
Author:
Träff, Jesper Larsson
These lecture notes are designed to accompany an imaginary, virtual, undergraduate, one or two semester course on fundamentals of Parallel Computing as well as to serve as background and reference for graduate courses on High-Performance Computing, p
External link:
http://arxiv.org/abs/2407.18795
Author:
Träff, Jesper Larsson
We give optimally fast $O(\log p)$ time (per processor) algorithms for computing round-optimal broadcast schedules for message-passing parallel computing systems. This affirmatively answers difficult questions posed in a SPAA 2022 BA and a CLUSTER 20
External link:
http://arxiv.org/abs/2407.18004
Author:
Träff, Jesper Larsson
We give optimally fast $O(\log p)$ time (per processor) algorithms for computing round-optimal broadcast schedules for message-passing parallel computing systems. This affirmatively answers the questions posed in Träff (2022). The problem is to bro
External link:
http://arxiv.org/abs/2312.11236
Author:
Träff, Jesper Larsson
We give a fast(er), communication-free, parallel construction of optimal communication schedules that allow broadcasting of $n$ distinct blocks of data from a root processor to all other processors in $1$-ported, $p$-processor networks with fully bid
External link:
http://arxiv.org/abs/2205.10072
Author:
Träff, Jesper Larsson
We discuss a simple, binary tree-based algorithm for the collective allreduce (reduction-to-all, MPI_Allreduce) operation for parallel systems consisting of $p$ suitably interconnected processors. The algorithm can be doubly pipelined to exploit bidi
External link:
http://arxiv.org/abs/2109.12626
Authors:
Träff, Jesper Larsson, Pöter, Manuel
The lock-free, ordered, linked list is an important, standard example of a concurrent data structure. An obvious, practical drawback of textbook implementations is that failed compare-and-swap (CAS) operations lead to retraversal of the entire list (
External link:
http://arxiv.org/abs/2010.15755
Author:
Träff, Jesper Larsson
In $k$-ported message-passing systems, a processor can simultaneously receive $k$ different messages from $k$ other processors, and send $k$ different messages to $k$ other processors that may or may not be different from the processors from which me
External link:
http://arxiv.org/abs/2008.12144
Authors:
Hunold, Sascha, von Kirchbach, Konrad, Lehr, Markus, Schulz, Christian, Träff, Jesper Larsson
Good process-to-compute-node mappings can be decisive for well-performing HPC applications. A special, important class of process-to-node mapping problems is the problem of mapping processes that communicate in a sparse stencil pattern to Cartesian g
External link:
http://arxiv.org/abs/2005.09521