Best of Both Worlds: High Performance Interactive and Batch Launching

Authors: Byun, Chansup; Kepner, Jeremy; Arcand, William; Bestor, David; Bergeron, Bill; Gadepally, Vijay; Houle, Michael; Hubbell, Matthew; Jones, Michael; Kirby, Andrew; Klein, Anna; Michaleas, Peter; Milechin, Lauren; Mullen, Julie; Prout, Andrew; Rosa, Antonio; Samsi, Siddharth; Yee, Charles; Reuther, Albert
Year of publication: 2020
Subject:
Document type: Working Paper
DOI: 10.1109/HPEC43674.2020.9286142
Description: Rapid launch of thousands of jobs is essential for effective interactive supercomputing, big data analysis, and AI algorithm development. Achieving thousands of launches per second has required that hardware be available to receive these jobs. This paper presents a novel preemptive approach to implementing spot jobs on MIT SuperCloud systems, allowing the resources to be fully utilized for long-running batch jobs while still providing fast launch for interactive jobs. The new approach separates the job preemption and scheduling operations and can schedule a job with preemption 100 times faster than the standard scheduler-provided automatic preemption capability. The results demonstrate that the new approach can schedule interactive jobs preemptively with performance comparable to that achieved when the required computing resources are idle and available. The spot job capability can be deployed without disrupting the interactive user experience while increasing overall system utilization.
Database: arXiv