Showing 1 - 2 of 2 for search: '"Romain Lion"'
Authors:
Emmanuel Agullo, Mirco Altenbernd, Hartwig Anzt, Leonardo Bautista-Gomez, Tommaso Benacchio, Luca Bonaventura, Hans-Joachim Bungartz, Sanjay Chatterjee, Florina M Ciorba, Nathan DeBardeleben, Daniel Drzisga, Sebastian Eibl, Christian Engelmann, Wilfried N Gansterer, Luc Giraud, Dominik Göddeke, Marco Heisig, Fabienne Jézéquel, Nils Kohl, Xiaoye Sherry Li, Romain Lion, Miriam Mehl, Paul Mycek, Michael Obersteiner, Enrique S Quintana-Ortí, Francesco Rizzi, Ulrich Rüde, Martin Schulz, Fred Fung, Robert Speck, Linda Stals, Keita Teranishi, Samuel Thibault, Dominik Thönnes, Andreas Wagner, Barbara Wohlmuth
Published in:
International Journal of High Performance Computing Applications
International Journal of High Performance Computing Applications, SAGE Publications, 2021
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
The international journal of high performance computing applications 36(2), 251-285 (2022). doi:10.1177/10943420211055188
International Journal of High Performance Computing Applications, 2021, pp.10943420211055188. ⟨10.1177/10943420211055188⟩
This work is based on the seminar titled "Resiliency in Numerical Algorithm Design for Extreme Scale Simulations" held March 1-6, 2020 at Schloss Dagstuhl, that was attended by all the authors. Naive versions of conventional resilience techniques w
External links:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b05b71a6a20ec124273ae24745504d60
https://hal.inria.fr/hal-03348787/file/2010.13342.pdf
Authors:
Samuel Thibault, Romain Lion
Published in:
FTXS 2020-IEEE/ACM 10th Workshop on Fault Tolerance for HPC at eXtreme Scale
FTXS 2020-IEEE/ACM 10th Workshop on Fault Tolerance for HPC at eXtreme Scale, Nov 2020, Atlanta / Virtual, United States. ⟨10.1109/FTXS51974.2020.00009⟩
FTXS@SC
The ever-increasing number of computation units assembled in current HPC platforms leads to a concerning increase in fault probability. Traditional checkpoint/restart strategies avoid wasting large amounts of computation time
External links:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3c4cb4ce1301840e666697cbb4ee77c4
https://hal.archives-ouvertes.fr/hal-02970529v2/file/2020001221.pdf