SAS 2006

Unleashing the Power of Static Analysis

Manuvir Das (Microsoft)

Abstract

The last few years have seen a surge of activity in the static analysis community on the application of static analysis to program verification and defect detection. Researchers have long believed in the benefit of exposing and fixing potential defects in a program before it is ever run, especially when the program can be made correct by construction, as in the case of compiler-enforced type systems. But every static analysis tool (other than a compiler's type checker) ever built, no matter how precise, suffers from the same fatal flaw in the eyes of the programmer: Defect reports do not come with known user scenarios that expose the defects. Therefore, programmers have been loathe to examine and fix defect reports produced by static analysis tools as a routine part of the software development process. In spite of recent advancements in analysis techniques, there are no papers we are aware of that report programmers fixing more than a few dozen defects.

Like many others, we at the Program Analysis group at Microsoft have spent the last few years building defect detection tools based on static analysis. For the last two years, we have focused our efforts on pushing these tools into the regular software development process of the largest products groups at Microsoft, involving thousands of developers working on tens of millions of lines of code against strict deadlines. Our goal was to answer the following question: If we do enough engineering on the tools, will a large group of programmers who build software for money adopt the tools? In other words, will programmers recognize the inherent preventive value of static analysis?

We are now in a position to answer this question. Programmers have fixed over 25,000 defects reported by our tools in the last 12 months, and they have added over 500,000 formal specifications of function pre-conditions and post-conditions to their programs. Today, many of our tools are enabled by default on the desktop machines of every programmer in the organization. Programmers only enter properly specified, defect-free code into the source code repository. Some of our tools are based on heavyweight global static analysis; these tools are run periodically in a centralized manner, and the defects identified by the tools are filed automatically into the defect database of the product.

In order to achieve this result, we have leaned heavily on advancements from the static analysis community, including but not limited to abstract interpretation, inter-procedural dataflow analysis, linear constraint solving, memory alias analysis, and modular analysis with formal specifications.

We have used these ideas to build a suite of static analysis tools that includes: A global inter-procedural symbolic evaluator (PREfix) for detecting memory usage errors; a global inter-procedural path-sensitive dataflow analysis (ESP) for detecting security vulnerabilities and concurrency defects; a local intra-procedural abstract interpretation with widening and linear constraint solving (espX) for detecting buffer overruns; a global inter-procedural dataflow analysis (SALinfer) for inferring function pre-conditions and post-conditions; and a formal language of function specifications (SAL) that is understood and enforced by all of our tools. SAL is now available to programmers at large via the Visual Studio compiler and various developer kits that are released periodically by Microsoft.

Along the way, we have learnt important lessons about what it takes to convince programmers to adopt static analysis tools, and which areas of static analysis research would be of the most benefit to software developers.