Preparation and optimization of a diverse workload for a large-scale heterogeneous system

Authors: Martin Schulz, Ulrike Meier Yang, David F. Richards, Tong Chen, Shiv Sundram, Todd Gamblin, Shelby Lockhart, Phil Regier, David Beckingsale, Ed Zywicz, Ruipeng Li, Giacomo Domeniconi, James C. Sexton, Bob Walkup, Jarom Nelson, Carlos Costa, Hui-Fang Wen, Ramesh Pankajakshan, John A. Gunnels, Xiaohua Zhang, Brian Van Essen, Kathryn M. O'Brien, I-Feng W. Kuo, Johann Dahm, Guillaume Thomas-Collignon, Bert Still, Naoya Maruyama, Jamie A. Bramwell, David Boehme, Kathleen Shoga, Carol S. Woodward, Howard A. Scott, M. P. Katz, Ian Karlin, T Epperly, Tzanio V. Kolev, Eun Kyung Lee, Steven H. Langer, Christopher Ward, David J. Gardner, Sara I. L. Kokkila-Schumacher, Christopher Young, Kevin O'Brien, Barry Chen, Björn Sjögreen, Jose R. Brunheroto, Claudia Misale, Roger Pearce, Guojing Cong, Matthew Legendre, Lu Wang, Jaime H. Moreno, Kathleen McCandless, Cyril Zeller, Rao Nimmakayala, Bronis R. de Supinski, Xinyu Que, Sorin Bastea, Robert D. Falgout, Peng Wang, Charway R. Cooper, Aaron Fisher, Jim Brase, R. Neely, David Appelhans, Alexey Voronin, James N. Glosli, Slaven Peles, Pei-Hung Lin, Tony Degroot, Hai Le, Daniel A. White, Levi Barnes, Steve Rennich, Yoonho Park, Peter D. Barnes, Bob Anderson, Jonathan J. Wong, Robert C. Blake
Publication year: 2019
Subject:
Source: SC
Description: Productivity from day one on supercomputers that leverage new technologies requires significant preparation. An institution that procures a novel system architecture often lacks sufficient institutional knowledge and skills to prepare for it. Thus, the "Center of Excellence" (CoE) concept has emerged to prepare for systems such as Summit and Sierra, currently the top two systems in the Top 500. This paper documents CoE experiences that prepared a workload of diverse applications and math libraries for a heterogeneous system. We describe our approach to this preparation, including our management and execution strategies, and detail our experiences with and reasons for using different programming approaches. Our early science and performance results show that the project enabled significant early seismic science with up to a 14X throughput increase over Cori. In addition to our successes, we discuss our challenges and failures so others may benefit from our experience.
Database: OpenAIRE