HotROD: Managing grid storage with on-demand replication

Authors:	B. Reed, Adam Silberstein, Sriram Rao
Year of publication:	2013
Subject:	Computer science; Distributed computing; Cloud computing; Data loss; Replication (computing); Grid computing; On demand; Cluster (physics); Grid energy storage
Source:	ICDE Workshops
DOI:	10.1109/icdew.2013.6547458
Description:	Enterprises (such as Yahoo!, LinkedIn, and Facebook) operate their own compute and storage infrastructure, which is effectively a “private cloud”. The private cloud consists of multiple clusters, each of which is managed independently. With HDFS, whenever data is stored in a cluster, it is replicated within that cluster for availability. Unfortunately, for datasets shared across the enterprise, this leads to over-replication within the private cloud. An analysis of Yahoo!'s HDFS usage suggests that the disk space consumed by replicating shared datasets is substantial, on the order of petabytes. New datasets are typically popular and requested by many processing jobs in different clusters. This demand is satisfied by copying the dataset to each of those clusters. As datasets age, however, they are used less and become cold. We then have the opposite problem of data being over-replicated across clusters: each cluster has enough replicas to recover from data loss locally, so the total number of replicas is high. We address both problems, initial replication and cross-cluster recovery, in a private cloud setting using the same technique: on-demand replication, which we refer to as Hot Replication On-Demand (HotROD). By making files visible across HDFS clusters, we let a cluster pull in remote replicas as needed, both for initial replication and for later recovery. We implemented HotROD as an extension to a standard HDFS installation.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::1514b3cbe54095c320b3e414e9da24cd https://doi.org/10.1109/icdew.2013.6547458
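
Illustration:	The description above says that a cluster pulls in remote replicas as needed, both for initial replication and for later recovery. The following is a minimal toy sketch of that idea, not the HotROD implementation and not the HDFS API. The names `Cluster`, `PrivateCloud`, `read`, and `pull_count`, the example path, and the choice of pulling a single replica per miss are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """Hypothetical stand-in for one independently managed cluster."""
    name: str
    # Maps a file path to the number of replicas held locally.
    replicas: Dict[str, int] = field(default_factory=dict)

    def holds(self, path: str) -> bool:
        return self.replicas.get(path, 0) > 0


class PrivateCloud:
    """Hypothetical view that makes files visible across clusters."""

    def __init__(self, clusters: List[Cluster]) -> None:
        self.clusters = {c.name: c for c in clusters}

    def total_replicas(self, path: str) -> int:
        return sum(c.replicas.get(path, 0) for c in self.clusters.values())

    def read(self, cluster_name: str, path: str, pull_count: int = 1) -> str:
        """Return the name of the cluster that served the read.

        On a local miss, the first remote cluster holding the file serves
        the read and the local cluster keeps `pull_count` replicas
        (assumed policy). The same path covers recovery after local loss.
        """
        local = self.clusters[cluster_name]
        if local.holds(path):
            return local.name
        for remote in self.clusters.values():
            if remote.holds(path):
                local.replicas[path] = pull_count
                return remote.name
        raise FileNotFoundError(path)


# Cluster "a" ingests the dataset with three local replicas (assumed factor).
cloud = PrivateCloud([Cluster("a", {"/data/day1": 3}), Cluster("b"), Cluster("c")])
source = cloud.read("b", "/data/day1")  # miss in "b", served by "a"
```

In this toy model, copying the dataset eagerly to all three clusters at three replicas each would store nine replicas, whereas after the single on-demand read above only four exist, and cluster "c" stores none until one of its jobs asks for the file.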