Showing 1 - 10 of 11 for search: '"Stephan Ewen"'
Authors:
Rico Bergmann, Alexander Alexandrov, Ulf Leser, Volker Markl, Johann-Christoph Freytag, Matthias J. Sax, Mareike Hoger, Felix Naumann, Stephan Ewen, Mathias Peters, Arvid Heise, Astrid Rheinländer, Marcus Leich, Kostas Tzoumas, Fabian Hueske, Sebastian Schelter, Odej Kao, Daniel Warneke
Published in:
The VLDB Journal. 23:939-964
We present Stratosphere, an open-source software stack for parallel data analysis. Stratosphere brings together a unique set of features that allow the expressive, easy, and efficient programming of analytical applications at very large scale. Strato
Authors:
Stephan Ewen, Max Heimel, Volker Markl, Dominic Battré, Odej Kao, Daniel Warneke, Fabian Hueske, Erik Nijkamp, Alexander Alexandrov
Published in:
Proceedings of the VLDB Endowment. 3:1625-1628
Large-scale data analysis applications require processing and analyzing of Terabytes or even Petabytes of data, particularly in the areas of web analysis or scientific data management. This trend has been discussed as "web-scale data management" in a
Authors:
Sergey Dudoladov, Volker Markl, Chen Xu, Stephan Ewen, Asterios Katsifodimos, Sebastian Schelter, Kostas Tzoumas
Published in:
SIGMOD Conference
Over the past years, parallel dataflow systems have been employed for advanced analytics in the field of data mining where many algorithms are iterative. These systems typically provide fault tolerance by periodically checkpointing the algorithm's st
Published in:
SIGMOD Conference
Iterative algorithms occur in many domains of data analysis, such as machine learning or graph analysis. With increasing interest to run those algorithms on very large data sets, we see a need for new techniques to execute iterations in a massively p
Published in:
CIKM
Executing data-parallel iterative algorithms on large datasets is crucial for many advanced analytical applications in the fields of data mining and machine learning. Current systems for executing iterative tasks in large clusters typically achieve f
Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk iterative algo
External links:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b3d152d70d699b7c90597878936d83f1
http://arxiv.org/abs/1208.0088
Authors:
Thomas O. Bodner, Alexander Alexandrov, Berni Schiefer, Volker Markl, John Poelman, Stephan Ewen
Published in:
Proceedings of the 1st Workshop on Architectures and Systems for Big Data.
The need for efficient data generation for the purposes of testing and benchmarking newly developed massively-parallel data processing systems has increased with the emergence of Big Data problems. As synthetic data model specifications evolve over t
Published in:
SoCC
We present a parallel data processor centered around a programming model of so called Parallelization Contracts (PACTs) and the scalable parallel execution engine Nephele [18]. The PACT programming model is a generalization of the well-known map/redu
Published in:
ICDE
Workflow languages like BPEL are broadly adopted by industry to integrate the heterogeneous applications and data stores of an enterprise. Leading vendors provide extensions to BPEL that allow for a tight integration of data processing capabilities i
Published in:
Lecture Notes in Computer Science ISBN: 9783540329602
EDBT
Database Management Systems (DBMS) perform query plan selection by mathematically modeling the execution cost of candidate execution plans and choosing the cheapest query execution plan (QEP) according to that cost model. The cost model requires accu
External links:
https://explore.openaire.eu/search/publication?articleId=doi_________::455464ada431cfa70db5026292e9b9bb
https://doi.org/10.1007/11687238_50