A high-bandwidth and low-cost data processing approach with heterogeneous storage architectures
Author: | Bing Wei, Wei Wei, Limin Xiao, Song Yao, Baicheng Yan, Zhisheng Huo |
---|---|
Year of publication: | 2020 |
Subject: |
Data processing
Computer science; business.industry; Distributed computing; Big data; Volume (computing); Construct (python library); Management Science and Operations Research; Library and Information Sciences; Field (computer science); Computer Science Applications; Storage area network; Hardware and Architecture; Bandwidth (computing); Distributed File System; business |
Source: | Personal and Ubiquitous Computing. 27:159-176 |
ISSN: | 1617-4917 1617-4909 |
Description: | Efficiently processing big data at low cost is a substantial challenge. Many efficient and economical data processing approaches have been proposed in fields including business, scientific research, and public administration. Unfortunately, seismic data processing in oil exploration has not reached the same level of development. Although storage architectures such as network-attached storage (NAS) and storage area network (SAN) have been widely used to process massive amounts of seismic data, these architectures are expensive in terms of bandwidth and capacity. In this paper, we propose a high-bandwidth and low-cost approach to fill this gap. NASStore is our data store built on NAS for processing seismic data. However, in data-intensive computing scenarios it cannot provide high bandwidth at a low cost, because of the massive bandwidth requirement and the huge volume of data to be stored. Distributed file systems, such as the Hadoop Distributed File System (HDFS), offer an alternative approach to storing data: they deliver high aggregate performance to user applications while running on inexpensive commodity hardware. To overcome the shortcomings of NASStore, we first present HDFSStore, which is built on HDFS for processing seismic data. We then couple NASStore and HDFSStore to construct a new hybrid data store, called SeisStore, in which efficient parallel write, read, and update mechanisms are employed to improve system performance. The experimental results show that SeisStore reduces storage cost by up to 23.20% compared with NASStore and improves access bandwidth by up to 478.84% and 16.99% compared with NASStore and HDFSStore, respectively. |
Database: | OpenAIRE |
External link: |