Intermediate representation

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Sillyfolkboy at 13:45, 27 April 2009 (See also: clean up). It may differ significantly from the current revision.

In computer science, an intermediate language is the language of an abstract machine designed to aid in the analysis of computer programs. The term comes from the use of such languages in compilers, where a compiler first translates the source code of a program into a form more suitable for code-improving transformations, as an intermediate step before generating object or machine code for a target machine. The design of an intermediate language typically differs from that of a practical machine language in three fundamental ways:

  • Each instruction represents exactly one fundamental operation; e.g. "shift-add" addressing modes common in microprocessors are not present.
  • Control flow information may not be included in the instruction set.
  • The number of registers available may be large, even limitless.

A popular format for intermediate languages is three-address code, in which each instruction names at most three operands, typically two sources and one destination.
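As an illustration, the following minimal Python sketch lowers a small arithmetic expression to three-address instructions. It is not taken from any particular compiler: the function name `to_three_address` and the temporary names `t1`, `t2`, ... are assumptions made for the example, and Python's standard `ast` module is used only as a convenient expression parser. It shows two of the properties listed above: a combined "shift-add" is split into one instruction per operation, and temporaries are drawn from an unbounded supply.

```python
import ast
from itertools import count

# Map Python AST operator nodes to the symbols used in the emitted code.
_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.LShift: "<<"}


def to_three_address(expr: str) -> list[str]:
    """Lower an arithmetic expression to a list of three-address instructions."""
    code: list[str] = []
    temps = count(1)  # unlimited supply of virtual registers t1, t2, ...

    def lower(node: ast.expr) -> str:
        # Variables and constants are already valid operands.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        # Each binary operation becomes exactly one instruction.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            left = lower(node.left)
            right = lower(node.right)
            dest = f"t{next(temps)}"
            code.append(f"{dest} = {left} {_OPS[type(node.op)]} {right}")
            return dest
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    lower(ast.parse(expr, mode="eval").body)
    return code


for line in to_three_address("a + (b << 2)"):
    print(line)
# t1 = b << 2
# t2 = a + t1
```

A real compiler would go on to map the temporaries onto the finite register set of the target machine; that step is omitted here.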

The term is also used in a second sense, for a language that serves as the output of a compiler for some high-level programming language. Instead of producing object or machine code directly, such a compiler emits only the intermediate language and submits it to a compiler for that language, which then produces the finished object or machine code. This is usually done to gain optimization, as described above, or portability, by choosing an intermediate language such as C that has compilers for many processors and operating systems. Languages used for this purpose fall in complexity between high-level languages and low-level languages such as assembly languages.
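The second sense can be sketched with a toy translator that emits C source text rather than machine code. Everything here is hypothetical: the input format (a list of name and expression pairs), the function name `emit_c`, and the choice of `long` for every variable are assumptions made for the example, and expressions are passed through unchanged on the assumption that they are already valid C.

```python
def emit_c(assignments: list[tuple[str, str]], result: str) -> str:
    """Translate (name, expression) pairs into a C program that prints `result`."""
    lines = ["#include <stdio.h>", "", "int main(void) {"]
    for name, expr in assignments:
        # Each source-level binding becomes one C declaration.
        lines.append(f"    long {name} = {expr};")
    lines.append(f'    printf("%ld\\n", {result});')
    lines.append("    return 0;")
    lines.append("}")
    return "\n".join(lines)


# The generated text would then be handed to any C compiler.
print(emit_c([("x", "6"), ("y", "x * 7")], "y"))
```

The translator never needs to know the target processor; that knowledge lives entirely in whichever C compiler processes its output.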

Languages

C is used as an intermediate language by many programming languages including Eiffel, Sather, Esterel, some dialects of Lisp (Lush, Gambit), Haskell (Glasgow Haskell Compiler), Squeak's C-subset Slang, and others. Variants of C have been designed to provide C's features as a portable assembly language, including one of the two languages called C--, the C Intermediate Language and the Low Level Virtual Machine.

Sun Microsystems' Java bytecode is the intermediate language used by all compilers targeting the Java Virtual Machine (JVM). The JVM can then perform just-in-time compilation to produce executable machine code and improve performance. Similarly, Microsoft's Common Intermediate Language is an intermediate language designed to be shared by all compilers for the .NET Framework, before static or dynamic compilation to machine code.
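The idea of a shared bytecode executed by a virtual machine can be sketched with a very small stack machine. The opcodes `PUSH`, `ADD` and `MUL` and the function `run` are invented for this example; they are not real JVM or Common Intermediate Language instructions, and the sketch interprets the program directly rather than compiling it just in time.

```python
def run(program: list[tuple]) -> int:
    """Interpret a list of (opcode, *operands) tuples on an operand stack."""
    stack: list[int] = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


# Bytecode for (2 + 3) * 4, as any front end targeting this machine might emit it.
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]
print(run(program))  # 20
```

Any source language whose compiler emits this instruction list can run on the same machine, which is the sharing property described above.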

The GNU Compiler Collection (GCC) uses several intermediate languages internally to simplify portability and cross-compilation. Among these languages are

While most intermediate languages are designed to support statically typed languages, the Parrot intermediate representation is designed to support dynamically typed languages, initially Perl and Python.

The ILOC intermediate language[1] is used in classes on compiler design as a simple target language.[2]

References

  1. ^ "An ILOC Simulator" by W. A. Barrett, 2007, paraphrasing Keith Cooper and Linda Torczon, "Engineering a Compiler", Morgan Kaufmann, 2004. ISBN 1-55860-698-X
  2. ^ "CISC 471 Compiler Design" by Uli Kremer