DRMaestro: orchestrating disaggregated resources on virtualized data-centers

Authors: Alaa Youssef, Jordà Polo, Chih-Chieh Yang, Bruce D'Amora, Marcelo Amaral, Alessandro Morari, David Carrera, Malgorzata Steinder, Nelson Mimura Gonzalez
Contributors: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
Language: English
Publication year: 2021
Subject:
Speedup
lcsh:Computer engineering. Computer hardware
Computació en núvol
Computer Networks and Communications
Computer science
Distributed computing
GPU
Cloud computing
lcsh:TK7885-7895
02 engineering and technology
01 natural sciences
lcsh:QA75.5-76.95
Resource (project management)
Resources disaggregation
0103 physical sciences
0202 electrical engineering, electronic engineering, information engineering
Composable architecture
Resource allocation
Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC]
010302 applied physics
020203 distributed computing
Job shop scheduling
business.industry
Quality of service
Orchestration
Provisioning
Assignació de recursos
Data center
lcsh:Electronic computers. Computer science
business
Centres informàtics
Software
Data processing service centers
Source: Journal of Cloud Computing: Advances, Systems and Applications, Vol 10, Iss 1, Pp 1-20 (2021)
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Description: Modern applications demand resources at an unprecedented level, and data centers must scale efficiently to cope with that demand. Resource disaggregation has the potential to improve resource efficiency by allowing workloads to be deployed in more flexible ways. The industry is therefore shifting towards disaggregated architectures, which enable new ways to structure hardware resources in data centers. However, determining the best-performing resource provisioning is a complicated task: the optimality of resource allocation in a disaggregated data center depends on its topology and on workload collocation. This paper presents DRMaestro, a framework that orchestrates disaggregated resources transparently to applications. DRMaestro uses a novel flow-network model to determine the optimal placement in multiple phases while making a best effort to prevent workload performance interference. We first evaluate the impact of disaggregation in terms of the additional network requirements under higher network load. The results show that for some applications the impact is minimal, but others can suffer a slowdown of up to 80% in the data transfer part. We then evaluate DRMaestro via a real prototype on Kubernetes and a trace-driven simulation. The results show that DRMaestro can reduce the total job makespan with a speedup of up to ≈1.20x and decrease QoS violations by up to ≈2.64x compared with another orchestrator that does not support resource disaggregation.
This project is supported by the IBM/BSC Technology Center for Supercomputing collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639595). It is also partially supported by the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051, by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program (SEV-2015-0493).
Database: OpenAIRE