Generalized Points-to Graphs: A Precise and Scalable Abstraction for Points-to Analysis
Author: | Alan Mycroft, Pritam M. Gharat, Uday P. Khedker |
---|---|
Contributors: | Mycroft, Alan [0000-0001-7013-8572], Apollo - University of Cambridge Repository |
Year of publication: | 2020 |
Subject: |
Soundness; Theoretical computer science; Indirection; Computer science; Scale (chemistry); procedure summaries; 020207 software engineering; 02 engineering and technology; Construct (python library); bottom-up interprocedural analysis; Flow- and context-sensitive interprocedural analysis; Control flow; Flow (mathematics); 020204 information systems; Scalability; 0202 electrical engineering, electronic engineering, information engineering; Minification; points-to analysis; Software |
Description: | Computing precise (fully flow- and context-sensitive) and exhaustive (as opposed to demand-driven) points-to information is known to be expensive. Top-down approaches require repeated analysis of a procedure for separate contexts. Bottom-up approaches need to model unknown pointees accessed indirectly through pointers that may be defined in the callers, and hence do not scale while preserving precision. Therefore, most approaches to precise points-to analysis begin with a scalable but imprecise method and then seek to increase its precision. We take the opposite approach: we begin with a precise method and increase its scalability. In a nutshell, we create naive but possibly non-scalable procedure summaries and then use novel optimizations to compact them while retaining their soundness and precision. For this purpose, we propose a novel abstraction called the generalized points-to graph (GPG), which views points-to relations as memory updates and generalizes them using the counts of indirection levels, leaving the unknown pointees implicit. This allows us to construct GPGs as compact representations of bottom-up procedure summaries in terms of memory updates and control flow between them. Their compactness is ensured by strength reduction (which reduces the indirection levels), control flow minimization (which removes control flow edges while preserving soundness and precision), and call inlining (which enhances the opportunities for these optimizations). The effectiveness of GPGs lies in the fact that they discard as much control flow as possible without losing precision. This is the reason GPGs are very small even for main procedures that contain the effect of the entire program. This allows our implementation to scale to 158 kLoC for C programs. At a more general level, GPGs provide a convenient abstraction to represent and transform memory in the presence of pointers. Future investigations can try to combine GPGs with other abstractions for static analyses that can benefit from points-to information. |
Database: | OpenAIRE |
External link: |
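
The description mentions memory updates generalized by counts of indirection levels, and strength reduction that lowers those counts. The following Python sketch is one possible illustration of that idea, not the authors' implementation. The names `Update`, `reduce_target`, and `reduce_source`, the encoding of C statements as level pairs, and the validity conditions on the levels are all assumptions made for this example. The sketch also ignores control flow: it assumes the producer update reaches the consumer with no intervening redefinition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Update:
    """Hypothetical memory update 'src --(i, j)--> dst'.

    Level 0 stands for the address of a variable, level 1 for the
    variable itself, level 2 for one dereference, and so on.
    """
    src: str
    src_level: int
    dst: str
    dst_level: int


# Assumed encodings of the four basic C pointer assignments.
def address_of(x: str, y: str) -> Update:  # x = &y
    return Update(x, 1, y, 0)


def copy(x: str, y: str) -> Update:        # x = y
    return Update(x, 1, y, 1)


def load(x: str, y: str) -> Update:        # x = *y
    return Update(x, 1, y, 2)


def store(x: str, y: str) -> Update:       # *x = y
    return Update(x, 2, y, 1)


def reduce_target(consumer: Update, producer: Update) -> Optional[Update]:
    """Replace the variable read by `consumer` using what `producer` defines."""
    if consumer.dst != producer.src:
        return None
    i, j = consumer.src_level, consumer.dst_level
    k, l = producer.src_level, producer.dst_level
    if not (l <= k <= j):  # assumed condition: the result must not gain levels
        return None
    return Update(consumer.src, i, producer.dst, l + j - k)


def reduce_source(consumer: Update, producer: Update) -> Optional[Update]:
    """Replace the variable written through by `consumer` using `producer`."""
    if consumer.src != producer.src:
        return None
    i, j = consumer.src_level, consumer.dst_level
    k, l = producer.src_level, producer.dst_level
    if not (l <= k < i):  # assumed condition: consumer dereferences deeper
        return None
    return Update(producer.dst, l + i - k, consumer.dst, j)
```

Under these assumptions, composing `y = &a` with `x = *y` yields an update equivalent to `x = a`, and composing `x = &a` with `*x = z` yields one equivalent to `a = z`. In both cases the result has lower indirection levels than the original consumer.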