Autonomic Management of Large Clusters and Their Integration into the Grid

Authors: Rafael Garcia Leiva, Thomas Röblitz, Thorsten Kleinwort, Jaroslaw Polok, Andrew Washbrook, Alan Silverman, Alastair Scobie, Jan Iven, Michele Michelotto, David Groep, B. Panzer-Steindel, Tim Colles, Piotr Poznański, Karim Chouikh, Alexander Reinefeld, Jan van Eldik, Alexander Holt, D. Front, Germán Cancio, T. J. Smith, T.M. Steinbeck, O. Barring, Enrico Ferro, Frank Pister, Maite Barroso Lopez, O Koeroo, Michael George, G. Venekamp, Wim Som de Cerff, Lord Hess, Catherine Rafflin, Andrea Chierici, Massimo Biasotto, Sylvain Chapeland, Marco Serra, Volker Lindenstruth, Lionel Cons, C. Aiftimiei, Philippe Defert, Paul Anderson, Gaetano Maron, L. dell'Agnello, Florian Schintke, M.F.M. Steenbakkers
Publication year: 2004
Subject:
Source: Journal of Grid Computing. 2:247-260
ISSN: 1572-9184, 1570-7873
DOI: 10.1007/s10723-004-7647-3
Description: We present a framework for the co-ordinated, autonomic management of multiple clusters in a compute center and their integration into a Grid environment. Site autonomy and the automation of administrative tasks are prime aspects of this framework. The system behavior is continuously monitored in a steering cycle, and appropriate actions are taken to resolve any problems. All presented components have been implemented in the course of the EU project DataGrid: the Lemon monitoring components, the FT fault-tolerance mechanism, the quattor system for software installation and configuration, the RMS job and resource management system, and the Gridification scheme that integrates clusters into the Grid.
Database: OpenAIRE