Experimentation and Analysis of Dynamic Checkpoint on Apache Hadoop with Failure Scenarios
Author: | Paulo Vinicius Cardoso, Patricia Pitthan Barcelos |
---|---|
Year of publication: | 2018 |
Subject: |
020203 distributed computing; Process (engineering); Computer science; Reliability (computer networking); 020206 networking & telecommunications; Fault tolerance; 02 engineering and technology; computer.software_genre; Fault (power engineering); Data_FILES; 0202 electrical engineering, electronic engineering, information engineering; Operating system; Distributed File System; computer |
Source: | WSCAD |
DOI: | 10.1109/wscad.2018.00035 |
Description: | The growth of reliability problems in high-performance systems has motivated the search for fault tolerance mechanisms. The Apache Hadoop framework, created to store and process large amounts of data, implements Checkpoint and Recovery to aid the recovery of its distributed file system (Hadoop Distributed File System, HDFS) in the presence of failures. However, because configuration attributes cannot be changed at runtime, poor choices may cause performance and reliability problems. This work uses a dynamic configuration mechanism for checkpointing in Hadoop and evaluates its performance in scenarios with induced faults on the master element of HDFS. |
Database: | OpenAIRE |