DFI：概括性价值流分析框架，可扩展到大型代码库

论文标题

DFI：概括性价值流分析框架，可扩展到大型代码库

DFI: An Interprocedural Value-Flow Analysis Framework that Scales to Large Codebases

论文作者

Hsu, Min-Yih, Hetzelt, Felicitas, Franz, Michael

论文摘要

上下文和流动敏感的价值流信息是许多静态分析工具的重要组成部分。不幸的是，由于高内存和运行时要求，当前计算价值流的方法不能扩展到大型代码库。本文提出了一种新的可扩展方法，以通过图形可及性计算值。为此，我们开发了一个新的图结构作为LLVM IR的扩展，其中包含两个其他操作，可显着简化指针混叠的建模。此外，通过处理与SSA def-链相反方向的节点，我们能够最大程度地减少所得图的树宽度。这使我们能够采用有效的树遍历算法来解决图形可达性。我们提出了一个实施我们的方法的价值分析框架DFI。我们将DFI与两个最先进的价值分析框架Phasar和SVF进行比较，以从4个真实世界软件项目中提取价值流。给定32GB的内存，Phasar和SVF无法完成对诸如OpenSSL或FFMPEG等较大项目的分析，而DFI可以完成所有评估。对于Phasar和SVF确实处理的基准子集的子集，DFI所需的记忆需要少得多（占Phasar的1.5％，平均占SVF的记忆足迹的6.4％），并且运行速度明显更快（与SVF相比，Phasar的23倍加速度为23倍，57倍的速度为57倍）。我们的分析表明，与以前的方法相比，DFI的内存和运行时需求几乎与分析指令的数量线性缩放。

Context- and flow-sensitive value-flow information is an important building block for many static analysis tools. Unfortunately, current approaches to compute value-flows do not scale to large codebases, due to high memory and runtime requirements. This paper proposes a new scalable approach to compute value-flows via graph reachability. To this end, we develop a new graph structure as an extension of LLVM IR that contains two additional operations which significantly simplify the modeling of pointer aliasing. Further, by processing nodes in the opposite direction of SSA def-use chains, we are able to minimize the tree width of the resulting graph. This allows us to employ efficient tree traversal algorithms in order to resolve graph reachability. We present a value-flow analysis framework,DFI, implementing our approach. We compare DFI against two state-of-the-art value-flow analysis frameworks, Phasar and SVF, to extract value-flows from 4 real-world software projects. Given 32GB of memory, Phasar and SVF are unable to complete analysis of larger projects such as OpenSSL or FFmpeg, while DFI is able to complete all evaluations. For the subset of benchmarks that Phasar and SVF do handle, DFI requires significantly less memory (1.5% of Phasar's, 6.4% of SVF's memory footprint on average) and runs significantly faster (23x speedup over Phasar, 57x compared to SVF). Our analysis shows that, in contrast to previous approaches, DFI's memory and runtime requirements scale almost linearly with the number of analyzed instructions.

下载PDF全文

下载文献需遵守相关版权规定

论文标题