Towards Practical Default-On Multi-Core Record/Replay
Authors: | Tal Garfinkel, Ali Mashtizadeh, David Terei, Mendel Rosenblum, David Mazières |
---|---|
Year of publication: | 2017 |
Subject: | Multi-core processor; Source code; Replay system; Computer science; Software engineering; Fault tolerance; General Medicine; Engineering and technology; Computer Graphics and Computer-Aided Design; Computer hardware & architecture; Information systems; Electrical engineering, electronic engineering, information engineering; Operating system; General Earth and Planetary Sciences; Compiler; Throughput (business); Software; General Environmental Science |
Source: | ASPLOS |
Description: | We present Castor, a record/replay system for multi-core applications that provides consistently low and predictable overheads. With Castor, developers can leave record and replay on by default, making it practical to record and reproduce production bugs, or employ fault tolerance to recover from hardware failures. Castor is inspired by several observations: First, an efficient mechanism for logging non-deterministic events is critical for recording demanding workloads with low overhead. Through careful use of hardware we were able to increase log throughput by 10x or more, e.g., we could record a server handling 10x more requests per second for the same record overhead. Second, most applications can be recorded without modifying source code by using the compiler to instrument language level sources of non-determinism, in conjunction with more familiar techniques like shared library interposition. Third, while Castor cannot deterministically replay all data races, this limitation is generally unimportant in practice, contrary to what prior work has assumed. Castor currently supports applications written in C, C++, and Go on FreeBSD. We have evaluated Castor on parallel and server workloads, including a commercial implementation of memcached in Go, which runs Castor in production. |
Database: | OpenAIRE |
External link: |