'Short-Dot': Computing Large Linear Transforms Distributedly Using Coded Short Dot Products

Autor:	Sanghamitra Dutta, Pulkit Grover, Viveck R. Cadambe
Rok vydání:	2019
Předmět:	FOS: Computer and information sciences Computer science Computer Science - Information Theory Information Theory (cs.IT) Computation 020206 networking & telecommunications Dot product 02 engineering and technology Library and Information Sciences Condensed Matter::Mesoscopic Systems and Quantum Hall Effect Replication (computing) Computer Science Applications Exponential function Matrix (mathematics) Product (mathematics) 0202 electrical engineering electronic engineering information engineering Fraction (mathematics) Node (circuits) Erasure code Algorithm Information Systems
Zdroj:	IEEE Transactions on Information Theory. 65:6171-6193
ISSN:	1557-9654 0018-9448
DOI:	10.1109/tit.2019.2927558
Popis:	Faced with saturation of Moore's law and increasing dimension of data, system designers have increasingly resorted to parallel and distributed computing. However, distributed computing is often bottle necked by a small fraction of slow processors called "stragglers" that reduce the speed of computation because the fusion node has to wait for all processors to finish. To combat the effect of stragglers, recent literature introduces redundancy in computations across processors, e.g.,~using repetition-based strategies or erasure codes. The fusion node can exploit this redundancy by completing the computation using outputs from only a subset of the processors, ignoring the stragglers. In this paper, we propose a novel technique -- that we call "Short-Dot" -- to introduce redundant computations in a coding theory inspired fashion, for computing linear transforms of long vectors. Instead of computing long dot products as required in the original linear transform, we construct a larger number of redundant and short dot products that can be computed faster and more efficiently at individual processors. In reference to comparable schemes that introduce redundancy to tackle stragglers, Short-Dot reduces the cost of computation, storage and communication since shorter portions are stored and computed at each processor, and also shorter portions of the input is communicated to each processor. We demonstrate through probabilistic analysis as well as experiments that Short-Dot offers significant speed-up compared to existing techniques. We also derive trade-offs between the length of the dot-products and the resilience to stragglers (number of processors to wait for), for any such strategy and compare it to that achieved by our strategy. Comment: Presented at NIPS 2016, Barcelona, Spain
Databáze:	OpenAIRE
Externí odkaz:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b6933b133dc693faad108a733d92f15f https://doi.org/10.1109/tit.2019.2927558 Zobrazit plný text záznamu