Perform wordcount Map-Reduce Job in Single Node Apache Hadoop cluster and compress data using Lempel-Ziv-Oberhumer (LZO) algorithm

Author:	Mirajkar, Nandan; Bhujbal, Sandeep; Deshmukh, Aaradhana
Year of publication:	2013
Subject:	Computer Science - Distributed Parallel and Cluster Computing
Source:	IJCSI International Journal of Computer Science Issues, Vol. 10, Issue 1, No 2, January 2013 ISSN (Print): 1694-0784 | ISSN (Online): 1694-0814 www.IJCSI.org
Document type:	Working Paper
Description:	Applications such as Yahoo, Facebook, and Twitter hold huge volumes of data that must be stored and retrieved on client request. Storing this data requires very large databases, which increases physical storage and complicates the analysis needed for business growth. Storage requirements can be reduced, and large data sets processed in a distributed fashion, using Apache Hadoop, which uses the Map-Reduce algorithm and combines repeating data so that the entire data set is stored in reduced form. The paper describes performing a wordcount Map-Reduce job on a single-node Apache Hadoop cluster and compressing the data using the Lempel-Ziv-Oberhumer (LZO) algorithm. Comment: 10 pages, 17 figures, Journal
Database:	arXiv
External link:	http://arxiv.org/abs/1307.1517