High‐performance iterative dataflow abstractions in Twister2: TSet.

Authors: Wickramasinghe, Pulasthi; Perera, Niranda; Kamburugamuve, Supun; Govindarajan, Kannan; Abeykoon, Vibhatha; Widanage, Chathura; Uyar, Ahmet; Gunduz, Gurhan; Akkas, Selahattin; Fox, Geoffrey
Subject:
Source: Concurrency & Computation: Practice & Experience; 5/30/2022, Vol. 34 Issue 12, pp. 1-16 (16 pages)
Abstract: The dataflow model is gradually becoming the de facto standard for big data applications. While many popular frameworks are built around this model, very little research has been done on understanding its inner workings, which in turn has led to inefficiencies in existing frameworks. Understanding the relationship between dataflow and high-performance computing (HPC) building blocks allows us to address and alleviate many of these fundamental inefficiencies by learning from the extensive research literature in the HPC community. In this article, we present TSets, the dataflow abstraction of Twister2, which is a big data framework designed for high‐performance dataflow and iterative computations. We discuss the dataflow model adopted by TSets and the rationale behind implementing iteration handling at the worker level. Finally, we evaluate TSets to show the performance of the framework and the importance of the worker-level iteration model. [ABSTRACT FROM AUTHOR]
Database: Complementary Index