Description: |
Despite the increasing popularity of shared-memory systems, there is a lack of tools that provide fault tolerance support for shared-memory applications. Checkpointing is one of the most popular fault tolerance techniques. However, its cost in terms of computing time, network utilization, or storage resources can limit its practical use. This work proposes several techniques for optimizing the I/O cost of checkpointing shared-memory parallel applications. The proposals are extensively evaluated using the OpenMP NAS Parallel Benchmarks. Results show a significant decrease in checkpointing overhead. |