Author: | Kevin D. Colby, Amiya K. Maji, Joseph Bottum, Jason Rahman |
---|---|
Year of publication: | 2017 |
Subject: | Computer science; Context (language use); Software performance testing; Detect and avoid; Middleware (distributed applications); Cloud testing; Component-based software engineering; Regression testing; Operating system; Software engineering; User-centered design |
Source: | Proceedings of the Fourth International Workshop on HPC User Support Tools. |
DOI: | 10.1145/3152493.3152555 |
Description: | HPC systems are made of many complex hardware and software components, and interactions between these components can often break, leading to job failures and customer dissatisfaction. Testing focused on individual components is often inadequate to identify broken inter-component interactions. To detect and avoid them, a holistic testing framework is needed that can test the full functionality and performance of a cluster from a user's perspective. Existing tools for HPC cluster testing are either rigid (i.e., they work within the context of a single cluster) or focused on system components (i.e., OS and middleware). In this paper, we present Testpilot, a flexible, holistic, and user-centric testing framework that can be used by system administrators, support staff, or even users themselves. Testpilot can be used in various testing scenarios, such as application testing, application updates, OS updates, or continuous monitoring of cluster health. The authors have found Testpilot to be invaluable for regression testing at their HPC site, and it has caught many issues that would otherwise have gone into production unnoticed. |
Database: | OpenAIRE |
External link: |