Sound floating-point and non-convex static analysis using interval linear abstract domains

Liqian Chen

PhD Thesis, April 2010.
Abstract

Trustworthiness of software has become the center of attention when considering modern software quality. Finding bugs before release is fundamental to build trustworthy software, which is extremely important for safety-critical applications such as aerospace, defense and automotive. Scientific and engineering programs especially safety-critical embedded software are usually related to mathematics and physics, and thus often involve a lot of numerical computations. Hence, many program bugs including division by zero, arithmetic overflows and array out-of-bounds, are closely related to numerical properties in the program.

Abstract interpretation is a theory of semantics approximation, allowing the design of sound and efficient static analyses. Abstract domains are a key ingredient in this framework: They enable semantic choices (what properties to infer) and algorithmic choices (how to compute). In this thesis, we focus on design and implementation of numerical abstract domains that are used to automatically discover numerical properties over program variables. Over the past decades, a wide variety of numerical abstract domains have been proposed. However, most existing abstract domains (especially relational ones) need arbitrary precision rational numbers to build implementations, which may degrade the time and memory efficiency. Most of them can only represent convex properties, and hence have limitations in dealing with disjunctions. Besides, currently there is no abstract domain that supports interval variable coefficients which appear naturally in real-life program analysis.

The goal of this thesis is to explore more efficient numerical abstract domains, using sound floating-point constructions, and to design more precise ones, allowing non-convex properties and interval coefficients.

Major contributions of this thesis are listed as follows.

1) We present a so-called floating-point polyhedra abstract domain. The classical convex polyhedra domain is one of the most powerful and commonly used abstract domains in the field, but rational implementations may suffer from scalability problems. To solve this issue, we present an implementation using floating-point arithmetic without sacrificing soundness, based on a constraint-only representation using floating-point coefficients. Since its complexity mainly comes from the costly join operation (i.e., polyhedral convex hull), a series of cheap weak join operations are then proposed.

2) We introduce interval linear algebra to static analysis and propose interval linear abstract domains that support interval coefficients. Intervals are naturally suited to construct sound floating-point abstract domains. First, a so-called interval polyhedra domain is proposed to infer interval linear inequalities. Then, a restricted abstract domain is proposed based on a system of interval linear equalities in row echelon form, polynomial in time and memory. The two domains can be considered as interval extensions of the existing convex polyhedra domain and the affine equality domain respectively. By interpreting solutions as weak solutions of interval linear systems, both domains can express certain non-convex (even unconnected) properties. They are soundly implemented using floating-point numbers based on interval arithmetic with outward rounding.

3) We present linear absolute value abstract domains that natively allow to express non-convex properties. Absolute value (AV) is fundamental in mathematics and often used to describe piecewise linear behavior. The equivalence among linear AV inequality systems, interval linear inequality systems and extended linear complementarity problem (XLCP) systems is stated. We construct a double description method for XLCP on top of that for polyhedra, based on which a new method for solving AV linear programming problems is shown. On this basis, we propose an abstract domain of linear AV equalities, and an abstract domain of linear AV inequalities that has the same expressiveness as interval polyhedra domain but enjoys optimal transfer functions. They are implemented using rational numbers based on the double description method for XLCP.

The numerical abstract domains presented in this thesis provide different trade-offs between precision and efficiency, and can be applied to different problems. They are implemented in the APRON library, and experimental results are encouraging.