Showing 1 - 10 of 216 for search: '"van de Geijn, Robert"'
We apply the FLAME methodology to derive algorithms hand in hand with their proofs of correctness for the computation of the $ L T L^T $ decomposition (with and without pivoting) of a skew-symmetric matrix. The approach yields known as well as new al
External link:
http://arxiv.org/abs/2311.10700
Author:
van de Geijn, Robert, Myers, Maggie
The FLAME methodology for deriving linear algebra algorithms from specification, first introduced around 2000, has been successfully applied to a broad cross section of operations. An open question has been whether it can yield algorithms for the bes
External link:
http://arxiv.org/abs/2304.03068
This paper lays out insights and opportunities for implementing higher-precision matrix-matrix multiplication (GEMM) from (in terms of) lower-precision high-performance GEMM. The driving case study approximates double-double precision (FP64x2) GEMM i
External link:
http://arxiv.org/abs/2303.04353
Matrix libraries often focus on achieving high performance for problems considered to be either "small" or "large", as these two scenarios tend to respond best to different optimization strategies. We propose a unified technique for implementing matr
External link:
http://arxiv.org/abs/2302.08417
As the ratio between the rate of computation and rate with which data can be retrieved from various layers of memory continues to deteriorate, a question arises: Will the current best algorithms for computing matrix-matrix multiplication on future CP
External link:
http://arxiv.org/abs/1904.05717
We approach the problem of implementing mixed-datatype support within the general matrix multiplication (GEMM) operation of the BLIS framework, whereby each matrix operand A, B, and C may be stored as single- or double-precision real or complex value
External link:
http://arxiv.org/abs/1901.06015
Conventional GPU implementations of Strassen's algorithm (Strassen) typically rely on the existing high-performance matrix multiplication (GEMM), trading space for time. As a result, such approaches can only achieve practical speedup for relatively l
External link:
http://arxiv.org/abs/1808.07984
Discovering "good" algorithms for an operation is often considered an art best left to experts. What if there is a simple methodology, an algorithm, for systematically deriving a family of algorithms as well as their cost analyses, so that the best a
External link:
http://arxiv.org/abs/1808.07832
Dijkstra observed that verifying correctness of a program is difficult and conjectured that derivation of a program hand-in-hand with its proof of correctness was the answer. We illustrate this goal-oriented approach by applying it to the domain of d
External link:
http://arxiv.org/abs/1710.04286
Tensor contraction (TC) is an important computational kernel widely used in numerous applications. It is a multi-dimensional generalization of matrix multiplication (GEMM). While Strassen's algorithm for GEMM is well studied in theory and practice, e
External link:
http://arxiv.org/abs/1704.03092