Globally Analyzing Million Lines of C, a General Sparse Global Analysis Framework

Kwangkeun Yi

Global static analysis of programs has been considered impractical, but not anymore. I will present a general method for achieving global static analyzers that are precise, sound, yet also scalable. Our method uses spatial and temporal localization techniques to globally analyze million lines of C programs for non-trivial value analysis of memory addresses.

The method generalizes existing sparse analysis techniques on top of the abstract interpretation framework to support arbitrary semantic analysis for C-like languages. Our method asks to design a global static analyzer (an abstract interpreter) whose scalability is unattended. Upon this sound static analyzer, the method prescribes steps to improve its scalability while preserving the precision of the underlying analysis. Existing sparse analyses are all restricted instances of our framework.

I will present performance numbers, a technique of guaranteeing the analysis correctenss by using a verified validator, and our Zoo system(on-going work) that automatically generate sparse anlayzers and their validators from high-level specs.

Bio: Kwangkeun Yi is a professor at the School of Computer Science and Engineering at Seoul National University. He has been working on semantic-based static analysis and type systems, publishing papers in leading conferences including POPL, PLDI, SAS, CAV, VMCAI and ICSE, and having successfully commercialized his group's static bug-finding tools. His homepage has more details.