# Symbolic execution

In computer science, symbolic execution (also symbolic evaluation) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for inputs rather than obtaining actual inputs as normal execution of the program would, a case of abstract interpretation. It thus arrives at expressions in terms of those symbols for expressions and variables in the program, and constraints in terms of those symbols for the possible outcomes of each conditional branch.

The field of symbolic simulation applies the same concept to hardware. Symbolic computation applies the concept to the analysis of mathematical expressions.

## Example

Consider the program below, which reads in a value and fails if the input is 6.

y = read()
y = 2 * y
if (y == 12)
fail()
print("OK")


When a program analyzer symbolically executes this program, it does not have a concrete number for the input value — the result of read() — so assigns the symbol 's' to it. The statement "y = read()" thus assigns the value 's' to program variable y. Then the statement "y = 2 * y" assigns the value '2 * s' to y. The next statement has two possible control flows: the true branch and the false branch. Which gets taken when the program executes for real depends upon the input value ('s'). The analyzer associates the constraint '2 * s == 12' with the true branch, which means it recognizes that the "fail()" statement executes in real execution if and only if '2 * s == 12' is true. The contraint "2 * s == 12 is true" is known as a path constraint.

Assume the goal of the analysis is to determine what inputs cause the "fail()" statement to execute. The analyzer uses a constraint solver to determine what values of 's' make '2 * s == 12' true, and thus determines that '6' is the answer.

## Limitations

### Path Explosion

Symbolically executing all feasible program paths does not scale to large programs. The number of feasible paths in a program grows exponentially with an increase in program size and can even be infinite in the case of programs with unbounded loop iterations.[1] Solutions to the path explosion problem generally use either heuristics for path-finding to increase code coverage,[2] or reduce execution time by parallelizing independent paths.[3]

### Program-Dependent Efficacy

Symbolic execution is used to reason about a program path-by-path which is an advantage over reasoning about a program input-by-input as other testing paradigms use (e.g. Dynamic program analysis). However, if few inputs take the same path through the program, there is little savings over testing each of the inputs separately.

### Tools

Tool Arch/Lang url
KLEE LLVM http://klee.github.io/klee/
FuzzBALL VineIL/native http://bitblaze.cs.berkeley.edu/fuzzball.html
JPF java http://babelfish.arc.nasa.gov/trac/jpf
S2E llvm/qemu/x86 http://dslab.epfl.ch/proj/s2e

## History

The concept of symbolic execution was introduced academically with descriptions of: the Select system ,[4] the EFFIGY system ,[5] the DISSECT system [6] , and Clarke's system .[7] See a bibliography of more technical papers published on symbolic execution.

## References

1. ^ Anand, Saswat; Patrice Godefroid, Nikolai Tillmann (2008). "Demand-Driven Compositional Symbolic Execution". Tools and Algorithms for the Construction and Analysis of Systems, Lecture Notes in Computer Science 4963: 367–381. Retrieved 03/04/2013.
2. ^ Ma, Kin-Keng; Khoo Yit Phang, Jeffrey S. Foster, Michael Hicks (2011). "Directed Symbolic Execution". Proceedings of the 18th International Conference on Statis Analysis: 95–111. Retrieved 03/04/2013.
3. ^ Staats, Matt; Corina Pasareanu (2010). "Parallel symbolic execution for structural test generation". Proceedings of the 19th international symposium on Software testing and analysis: 183–194. Retrieved 03/04/2011.
4. ^ Robert S. Boyer and Bernard Elspas and Karl N. Levitt SELECT--a formal system for testing and debugging programs by symbolic execution, Proceedings of the International Conference on Reliable Software, 1975,page 234--245, Los Angeles, California
5. ^ James C. King,Symbolic execution and program testing, Communications of the ACM, volume 19, number 7, 1976, 385--394
6. ^ William E. Howden, Experiments with a symbolic evaluation system, Proceedings, National Computer Conference, 1976.
7. ^ Lori A. Clarke, A program testing system, ACM 76: Proceedings of the Annual Conference, 1976, pages 488-491, Houston, Texas, United States