Showing 1 - 8 of 8 results for search: '"Yuri Tipikin"'
Published in:
HPCS
In this paper we describe a new framework for creating distributed programmes which are resilient to cluster node failures. Our main goal is to create a simple and reliable model that ensures continuous execution of parallel programmes without creat
Published in:
HPCS
ResearcherID
Nowadays, many job schedulers rely on checkpoint mechanisms to make long-running batch jobs resilient to node failures. At large scale, stopping a job and creating its image consumes a considerable amount of time. The aim of this study is to propose a me
Authors:
Vladimir Korkhov, Ivan Gankevich, Alexander B. Degtyarev, Vladimir Gaiduchok, Alexander V. Bogdanov, Yuri Tipikin
Published in:
Computational Science and Its Applications – ICCSA 2016 ISBN: 9783319421070
ICCSA (2)
Master node fault-tolerance is the topic that is often dimmed in the discussion of big data processing technologies. Although failure of a master node can take down the whole data processing pipeline, this is considered either improbable or too diffi
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::375e2510b39e72d187daff1a7749d4c5
https://doi.org/10.1007/978-3-319-42108-7_29
Published in:
HPCS
Nowadays, many cluster management systems rely on distributed consensus algorithms to elect a leader that orchestrates subordinate nodes. Contrary to these studies, we propose a consensus-free algorithm that arranges cluster nodes into multiple levels o
Published in:
Computational Science and Its Applications – ICCSA 2015 ISBN: 9783319214092
ICCSA (4)
Efficient management of a distributed system is a common problem for university and commercial computer centres, and handling node failures is a major aspect of it. Failures which are rare in a small commodity cluster, at large scale become common,
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::9eaac1e59e6ac008ba9b7cc6d29a166a
https://doi.org/10.1007/978-3-319-21410-8_20
Authors:
Vladimir Gaiduchok, Alexander B. Degtyarev, Vladimir Korkhov, Yuri Tipikin, Serob Balyan, Dmitry Gushchanskiy, Ivan Gankevich, Alexander V. Bogdanov
Published in:
Computational Science and Its Applications – ICCSA 2014 ISBN: 9783319091525
ICCSA (6)
One of the efficient ways to conduct experiments on HPC platforms is to create custom virtual computing environments tailored to the requirements of users and their applications. In this paper we investigate the virtual private supercomputer, an approach bas
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::d086fdfab8d7a17feab4d26b0bc03fd6
https://doi.org/10.1007/978-3-319-09153-2_26
Authors:
Alexander V. Bogdanov, Vladimir Gaiduchok, Vladimir Korkhov, Ivan Gankevich, Dmitry Gushchanskiy, Valeriy Zolotarev, Alexander B. Degtyarev, Yuri Tipikin
Published in:
Ninth International Conference on Computer Science and Information Technologies Revised Selected Papers.
A virtual private supercomputer is an efficient way of conducting experiments in a high-performance computational environment, and the main role in this approach is played by virtualization and data consolidation. During an experiment, virtualization is used
Authors:
Alexander V. Bogdanov, Alexander B. Degtyarev, Vladimir Korkhov, Vladimir Gaiduchok, Yuri Tipikin, Ivan Gankevich
Published in:
International Journal of Business Intelligence and Data Mining. 1:1
Distributed computing clusters are often built with commodity hardware, which leads to periodic failures of processing nodes due to the relatively low reliability of such hardware. While worker node fault-tolerance is straightforward, fault tolerance of m