Dynamic Merging based Small File Storage (DM-SFS) Architecture for Efficiently Storing Small Size Files in Hadoop
Author: | Mohd Abdul Ahad, Ranjit Biswas |
---|---|
Year of publication: | 2018 |
Subject: | business.industry; Computer science; 020206 networking & telecommunications; Cryptography; 02 engineering and technology; Hard disk drive performance characteristics; computer.software_genre; Encryption; Data_FILES; 0202 electrical engineering, electronic engineering, information engineering; Operating system; General Earth and Planetary Sciences; Overhead (computing); 020201 artificial intelligence & image processing; Routing (electronic design automation); Software-defined networking; Distributed File System; business; computer; File storage; Merge (version control); General Environmental Science |
Source: | Procedia Computer Science. 132:1626-1635 |
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2018.05.128 |
Description: | The voluminous data generated every moment needs special tools and techniques for effective and efficient handling and storage. This paper proposes a technique for efficiently storing small files in the Hadoop Distributed File System (HDFS). The proposal filters incoming files on the basis of two parameters: “file-type” (text, PDF, document, binary, etc.) and “file-size” (the amount of storage space the file requires). To secure the contents of the files, we also propose encrypting them with the Twofish cryptographic technique. Filtration and encryption are carried out before the files are passed on to HDFS. For efficient storage, small files are merged into a single unit. The merging criterion is a “dynamic merging technique” chosen according to file type, rather than a generalized merging strategy. Furthermore, for efficient routing of files from source to destination and vice versa, the proposal adopts the concept of Software Defined Networking (SDN). The empirical results show that the proposed architecture helps reduce Namenode memory overhead and reduces disk seek time to a considerable extent. |
Database: | OpenAIRE |
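
The filtering and type-wise merging described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the size threshold, the use of the file extension as the "file-type", the first-fit packing order, and the names `MergeUnit`, `file_type_of`, and `plan_merges` are all assumptions made for illustration. Twofish encryption and SDN routing are left out, since Python's standard library provides neither.

```python
from dataclasses import dataclass, field
from pathlib import PurePath

BLOCK_SIZE = 128 * 1024 * 1024           # assumed merge-unit capacity (one HDFS block)
SMALL_FILE_LIMIT = BLOCK_SIZE * 3 // 4   # hypothetical "small file" threshold


@dataclass
class MergeUnit:
    """One merged container holding small files of a single type."""
    file_type: str
    size: int = 0
    index: list = field(default_factory=list)  # entries: (name, offset, length)

    def add(self, name, length):
        # Record where the file starts inside the merged unit, then grow the unit.
        self.index.append((name, self.size, length))
        self.size += length


def file_type_of(name):
    # Assumption: the extension stands in for the paper's "file-type" parameter.
    suffix = PurePath(name).suffix.lower().lstrip(".")
    return suffix or "binary"


def plan_merges(files, small_limit=SMALL_FILE_LIMIT, capacity=BLOCK_SIZE):
    """Split (name, size) pairs into per-type merge units and untouched large files."""
    open_units = {}   # file type -> unit currently being filled
    units, large = [], []
    for name, size in files:
        if size >= small_limit:
            large.append(name)            # large files bypass merging
            continue
        ftype = file_type_of(name)
        unit = open_units.get(ftype)
        if unit is None or unit.size + size > capacity:
            unit = MergeUnit(ftype)       # start a new unit for this type
            open_units[ftype] = unit
            units.append(unit)
        unit.add(name, size)
    return units, large
```

Calling `plan_merges([("a.txt", 10), ("b.pdf", 20)])` yields one unit per file type, each carrying an offset index, so only the merged units (rather than every small file) would need an entry in the Namenode's metadata.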