TY - JOUR

T1 - On-the-Fly Static Analysis via Dynamic Bidirected Dyck Reachability

AU - Krishna, Shankaranarayanan

AU - Lal, Aniket

AU - Pavlogiannis, Andreas

AU - Tuppe, Omkar

N1 - Publisher Copyright:
© 2024 Owner/Author.

PY - 2024/1

Y1 - 2024/1

N2 - Dyck reachability is a principled, graph-based formulation of a plethora of static analyses. Bidirected graphs are used for capturing dataflow through mutable heap data, and are usual formalisms of demand-driven points-to and alias analyses. The best (offline) algorithm runs in O(m + n·α(n)) time, where n is the number of nodes and m is the number of edges in the flow graph, which becomes O(n^2) in the worst case. In the everyday practice of program analysis, the analyzed code is subject to continuous change, with source code being added and removed. On-the-fly static analysis under such continuous updates gives rise to dynamic Dyck reachability, where reachability queries run on a dynamically changing graph, following program updates. Naturally, executing the offline algorithm in this online setting is inadequate, as the time required to process a single update is prohibitively large. In this work we develop a novel dynamic algorithm for bidirected Dyck reachability that has O(n·α(n)) worst-case performance per update, thus beating the O(n^2) bound, and is also optimal in certain settings. We also implement our algorithm and evaluate its performance on on-the-fly data-dependence and alias analyses, and compare it with two best known alternatives, namely (i) the optimal offline algorithm, and (ii) a fully dynamic Datalog solver. Our experiments show that our dynamic algorithm is consistently, and by far, the top performing algorithm, exhibiting speedups in the order of 1000X. The running time of each update is almost always unnoticeable to the human eye, making it ideal for the on-the-fly analysis setting.

KW - CFL reachability

KW - dynamic algorithms

KW - static analysis

U2 - 10.1145/3632884

DO - 10.1145/3632884

M3 - Journal article

AN - SCOPUS:85182280076

SN - 2475-1421

VL - 8

SP - 1239

EP - 1268

JO - Proceedings of the ACM on Programming Languages

JF - Proceedings of the ACM on Programming Languages

IS - POPL

ER -