Postmortem accurate IR-level state recovery for deployed concurrent programs

Authors: Shinji Hoshino, Katsuhiko Gondow, Yoshitaka Arahori
Publication year: 2021
Subject:
Source: ACM SIGAPP Applied Computing Review. 21:33-48
ISSN: 1931-0161, 1559-6915
Description: Debugging failures of deployed concurrent software is important for quality assurance. However, such failures are difficult to debug because their behavior is non-deterministic and conventional means yield only limited information. Reverse debuggers such as REPT [11] assist with debugging by recovering data values from before the failure. This is achieved by using a hardware tracer to log control-flow information, then combining that information with a conventional coredump to recover data values via reverse execution at the machine level. REPT's algorithm for data value recovery is reliable and fast, but its implementation cost is high because it depends on the architecture. Applying REPT to more abstract IR (Intermediate Representation) level instructions to counter this yielded limited results, with low accuracy compared to the original x86_64 implementation. The main reason is that the stack layout is abstracted at the IR level. In this paper, we present STRAB (State Recovery at Abstract-level), a collection of our proposed methods to solve these problems. STRAB works in two phases. First, the data values in the coredump are lifted from the machine level to the IR level using rich debug information (DWARF3) and a novel technique we call mid-recovery lifting; the latter helps to recover more heap data values at the IR level. Second, our novel hybrid memory location resolution reduces the accuracy loss caused by the abstracted stack layout at the IR level. Experimental results on a variety of real-world concurrent programs show that STRAB has significantly higher accuracy than REPT at the IR level (+40% on average) with only minor slowdowns (2.7x on average), while also achieving architecture independence.
Database: OpenAIRE
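
Illustrative note: the description mentions recovering earlier data values from a final memory image plus a logged control-flow trace. The toy sketch below shows only that general idea and is not the algorithm of REPT or STRAB. The three-instruction mini-language (`const`, `mov`, `add`), the register names, and the function `reverse_execute` are all hypothetical and chosen for illustration; the sketch also assumes that `dst` and `src` are always different registers.

```python
def reverse_execute(trace, final_state):
    """Walk a recorded trace backward from the final state.

    trace: list of (op, dst, src) tuples in execution order (hypothetical format).
    final_state: dict of register -> value, standing in for the coredump.
    Returns states, where states[i] is the state just before trace[i]
    and states[-1] is the final state. None marks an unrecoverable value.
    """
    state = dict(final_state)
    states = [dict(state)]
    for op, dst, src in reversed(trace):
        before = dict(state)
        if op == "add":
            # dst = dst + src is invertible when both values are known.
            after_dst, src_val = state.get(dst), state.get(src)
            if after_dst is not None and src_val is not None:
                before[dst] = after_dst - src_val
            else:
                before[dst] = None
        elif op == "mov":
            # dst = src: src is unchanged, so it can be learned from dst.
            if state.get(src) is None and state.get(dst) is not None:
                before[src] = state[dst]
            # The old value of dst was overwritten and is lost.
            before[dst] = None
        elif op == "const":
            # dst = immediate: the old value of dst was overwritten.
            before[dst] = None
        else:
            raise ValueError(f"unknown op: {op}")
        states.append(before)
        state = before
    states.reverse()
    return states


# Hypothetical trace: r1 = 5; r2 = r1; r2 = r2 + r3
trace = [("const", "r1", 5), ("mov", "r2", "r1"), ("add", "r2", "r3")]
states = reverse_execute(trace, {"r1": 5, "r2": 12, "r3": 7})
for index, snapshot in enumerate(states):
    print(index, snapshot)
```

Values destroyed by an overwrite stay unknown (`None`) in this sketch, which gives a rough sense of why recovery accuracy is a meaningful metric for such tools.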