Jump to content

Common Lisp

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 128.61.105.190 (talk) at 23:51, 27 August 2007 (→‎Scalar types). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Common Lisp, commonly abbreviated CL, is a dialect of the Lisp programming language, published in ANSI standard X3.226-1994. Developed to standardize the divergent variants of Lisp which predated it, it is not an implementation but rather a language specification. Several implementations of the Common Lisp standard are available, including commercial products and open source software.

Common Lisp is a multiparadigm, general-purpose programming language that:

Syntax

Common Lisp is a dialect of Lisp; it uses S-expressions to denote both code and data structure. Function and macro calls are written as lists, with the name of the function first, as in these examples:

(+ 2 2)           ; adds 2 and 2, yielding 4.
(setf p 3.1415)   ; sets the variable p equal to 3.1415
; Define a function that squares a number:
(defun square (x) 
  (* x x))
; Execute the function:
(square 3)        ; Returns 9
; construction - variables existing only in 'let' block, so you can easily give any new
; values to any existing variables without changing them. Old values are restored after
; the end of the block.
(let ((a 6) (b 4)) 
  (+ a b)) ; returns 10

Data types

Common Lisp has a plethora of data types, more than many other languages.

Scalar types

Number types include integers, ratios, floating-point numbers, and complex numbers. Common Lisp uses bignums to represent numerical values of arbitrary size and precision. The ratio type represents fractions exactly, a facility not available in many languages. Common Lisp automatically coerces numeric values among these types as appropriate.

The Common Lisp character type is not limited to ASCII characters -- unsurprising, as Lisp predates ASCII. Some modern implementations allow Unicode characters. [1]

The symbol type is common to Lisp languages, but largely unknown outside them. A symbol is a unique, named data object with several slots. Of these, value cell and function cell are the most important. Symbols in Lisp are often used similarly to identifiers in other languages: to hold value of a variable; however there are many other uses. Normally, when a symbol is evaluated, its value is returned. Some symbols evaluate to themselves, for example all symbols in keyword package are self-evaluating. Boolean values in Common Lisp are represented by the self-evaluating symbols T and NIL.

Data structures

Sequence types in Common Lisp include lists, vectors, bit-vectors, and strings. There are many operations which can work on any sequence type.

As in any other Lisp, lists in Common Lisp are composed of conses, sometimes called cons cells or pairs. A cons is a data structure with two slots, called its car and cdr. A list is a linked chain of conses. Each cons's car refers to a member of the list (possibly another list). Each cons's cdr refers to the next cons -- except for the last cons, whose cdr refers to the nil value. Conses can also easily be used to implement trees and other complex data structures; though it is usually advised to use structure or class instances instead.

Common Lisp supports multidimensional arrays, and can dynamically resize arrays if required. Multidimensional arrays can be used for matrix mathematics. A vector is a one-dimensional array. Arrays can carry any type as members (even mixed types in the same array) or can be specialized to contain a specific type of members, as in a vector of integers. Many implementations can optimize array functions when the array used is type-specialized. Two type-specialized array types are standard: a string is a vector of characters, while a bit-vector is a vector of bits.

Hash tables store associations between data objects. Any object may be used as key or value. Hash tables, like arrays, are automatically resized as needed.

Packages are collections of symbols, used chiefly to separate the parts of a program into namespaces. A package may export some symbols, marking them as part of a public interface.

Structures, similar in use to C structs and Pascal records, represent arbitrary complex data structures with any number and type of fields (called slots).

Class instances are similar to structures, but created by the object system, CLOS.

Functions

Common Lisp supports first-class functions. For instance, it is possible to write functions that take other functions as arguments or return functions as well. This makes it possible to describe very general operations.

The Common Lisp library relies heavily on such higher-order functions. For example, the sort function takes a relational operator as an argument. This can be used not only to sort any type of data, but also to sort data structures according to a key.

(sort (list 5 2 6 3 1 4) #'>)
; Sorts the list using the > function as the relational operator.
; Returns (6 5 4 3 2 1).
(sort (list '(9 a) '(3 b) '(4 c))
      (lambda (x y) (< (car x) (car y))))
; Sorts the list according to the first element (car) of each sub-list.
; Returns ((3 b) (4 c) (9 a)).

The evaluation model for functions is very simple. When the evaluator encounters a form (F A1 A2...) then it is to assume that the symbol named F is one of the following:

  1. A special operator (easily checked against a fixed list)
  2. A macro operator (must have been defined previously)
  3. The name of a function (default), which may either be a symbol, or a sub-form beginning with the symbol lambda.

If F is the name of a function, then the arguments A1, A2, ..., An are evaluated in left-to-right order, and the function is found and invoked with those values supplied as parameters.

Defining functions

The macro defun defines functions. A function definition gives the name of the function, the names of any arguments, and a function body:

(defun square (x)
   (* x x))

Function definitions may include declarations, which provide hints to the compiler about optimization settings or the data types of arguments. They may also include documentation strings (docstrings), which the Lisp system may use to provide interactive documentation:

(defun square (x)
   (declare (number x) (optimize (speed 3) (debug 0) (safety 1)))
   "Calculates the square of the number x."
   (* x x))

Anonymous functions are defined using the lambda expressions. Lisp programming style frequently uses higher-order functions for which it is useful to provide anonymous functions as arguments.

There are a number of other operators related to the definition and manipulation of functions. For instance, a function may be recompiled with the compile operator. (Some Lisp systems run functions in an interpreter by default unless instructed to compile; others compile every entered function on the fly.)

The function namespace

The namespace for function names is separate from the namespace for data variables. This is a key difference between Common Lisp and Scheme. Operators which define names in the function namespace include defun, flet, and labels.

To pass a function by name as an argument to another function, one must use the function special operator, commonly abbreviated as #'. The first sort example above refers to the function named by the symbol > in the function namespace, with the code #'>.

Scheme's evaluation model is simpler: there is only one namespace, and all positions in the form are evaluated (in any order) -- not just the arguments. Code written in one dialect is therefore sometimes confusing to programmers more experienced in the other. For instance, many CL programmers like to use descriptive variable names such as list or string which could cause problems in Scheme as they would locally shadow function names.

Whether a separate namespace for functions is an advantage is a source of contention in the Lisp community. It is usually referred to as the Lisp-1 vs. Lisp-2 debate. These names were coined in a 1988 paper by Richard P. Gabriel and Kent Pitman, which extensively compares the two approaches. [2]

Other types

Other data types in Common Lisp include:

  • Pathnames represent files and directories in the filesystem. The Common Lisp pathname facility is more general than most operating systems' file naming conventions, making Lisp programs' access to files broadly portable across diverse systems.
  • Input and output streams represent sources and sinks of binary or textual data, such as the terminal or open files.
  • Common Lisp has a built-in pseudo-random number generator (PRNG). Random state objects represent reusable sources of pseudo-random numbers, allowing the user to seed the PRNG or cause it to replay a sequence.
  • Conditions are a type used to represent errors, exceptions, and other "interesting" events to which a program may respond.
  • Classes are first-class objects, and are themselves instances of classes called metaclasses.

Macros

A macro in Lisp superficially resembles a function in usage. However, rather than representing an expression which is evaluated, it represents a transformation of the program source code.

Macros allow Lisp programmers to create new syntactic forms in the language. For instance, this macro provides the until loop form, which may be familiar from languages such as Perl:

 (defmacro until (test &body body)
   `(do ()
        (,test)
      ,@body))

 ;; example
 (until (= (random 10) 0) 
   (write-line "Hello"))

All macros must be expanded before the source code containing them can be evaluated or compiled normally. Macros can be considered functions that accept and return abstract syntax trees (Lisp S-expressions). These functions are invoked before the evaluator or compiler to produce the final source code. Macros are written in normal Common Lisp, and may use any Common Lisp (or third-party) operator available. The backquote notation used above is provided by Common Lisp specifically to simplify the common case of substitution into a code template.

Variable capture and shadowing

Common Lisp macros are capable of variable capture, where symbols in the macro-expansion body coincide with those in the calling context, allowing the programmer to create macros wherein various symbols have special meaning.

Variable capture can introduce unexpected and unusual errors. Some Lisp systems, such as Scheme, avoid variable capture by using macro syntax — so-called "hygienic macros" — that does not allow it. In Common Lisp, one can usually avoid unwanted capture by using gensyms: guaranteed-unique symbols which can be used in a macro-expansion without threat of capture.

Another issue is the inadvertent shadowing of operators used in a macroexpansion. For example, consider the following (incorrect) code:

(macrolet ((do (...) ... something else ...))
   (until (= (random 10) 0) (write-line "Hello")))

The UNTIL macro will expand into a form which calls DO which is intended to refer to the built-in macro DO. However, in this context, DO may have a completely different meaning.

Common Lisp ameliorates the problem of operator shadowing by forbidding the redefinition of built-in operators, such as DO in this example. Moreover, users may separate their own code into packages. Built-in symbols are found in the COMMON-LISP package, which will not be shadowed by a symbol in a user package.

Common Lisp Object System

Common Lisp includes a toolkit for object-oriented programming, the Common Lisp Object System or CLOS, which is one of the most powerful object systems available in any language. Originally proposed as an add-on, CLOS was adopted as part of the ANSI standard for Common Lisp. CLOS is a dynamic object system, and differs radically from the OOP facilities found in static languages such as C++ or Java.

Comparison with other Lisps

Common Lisp is most frequently compared with, and contrasted to, Scheme—if only because they are the two most popular Lisp dialects. Scheme predates CL, and comes not only from the same Lisp tradition but from some of the same engineers—Guy L. Steele, with whom Gerald Jay Sussman designed Scheme, chaired the standards committee for Common Lisp.

Common Lisp is a general-purpose programming language, in contrast to Lisp variants such as Emacs Lisp and AutoLISP which are embedded extension languages in particular products. Unlike many earlier Lisps, Common Lisp (like Scheme) uses lexical variable scope.

Most of the Lisp systems whose designs contributed to Common Lisp—such as ZetaLisp and Franz Lisp—used dynamically scoped variables in their interpreters and lexically scoped variables in their compilers. Scheme introduced the sole use of lexically-scoped variables to Lisp; an inspiration from ALGOL 68 which was widely recognized as a good idea. CL supports dynamically-scoped variables as well, but they must be explicitly declared as "special". There are no differences in scoping between ANSI CL interpreters and compilers.

Common Lisp is sometimes termed a Lisp-2 and Scheme a Lisp-1, referring to CL's use of separate namespaces for functions and variables. (In fact, CL has many namespaces, such as those for go tags, block names, and loop keywords.) There is a long-standing controversy between CL and Scheme advocates over the tradeoffs involved in multiple namespaces. In Scheme, it is (broadly) necessary to avoid giving variables names which clash with functions; Scheme functions frequently have arguments named lis, lst, or lyst so as not to conflict with the system function list. However, in CL it is necessary to explicitly refer to the function namespace when passing a function as an argument -- which is also a common occurrence, as in the sort example above.

CL also differs from Scheme in its handling of boolean values. Scheme uses the special values #t and #f to represent truth and falsity. CL follows the older Lisp convention of using the symbols T and NIL, with NIL standing also for the empty list. In CL, any non-NIL value is treated as true by conditionals such as if as are non-#f values in Scheme. This allows some operators to serve both as predicates (answering a boolean-valued question) and as returning a useful value for further computation.

Lastly, the Scheme standards documents require tail-call optimization, which the CL standard does not. Most CL implementations do offer tail-call optimization, although often only when the programmer uses an optimization directive. Nonetheless, common CL coding style does not favor the ubiquitous use of recursion that Scheme style prefers -- what a Scheme programmer would express with tail recursion, a CL user would usually express with an iterative expression in do, dolist, loop, or (more recently) with the iterate package.

Implementations

Common Lisp is defined by a specification (like Ada and C) rather than by a single implementation (like Perl). There are many implementations, and the standard spells out areas in which they may validly differ.

In addition, implementations tend to come with library packages, which provide functionality not covered in the standard. Free Software libraries have been created to support such features in a portable way, most notably Common-Lisp.net and the Common Lisp Open Code Collection project.

Common Lisp has been designed to be implemented by incremental compilers. Standard declarations to optimize compilation (such as function inlining) are proposed in the language specification. Most Common Lisp implementations compile functions to native machine code. Others compile to bytecode, which reduces speed but eases binary-code portability. The misconception that Lisp is a purely-interpreted language is most likely due to the fact that Common Lisp environments provide an interactive prompt and that functions are compiled one-by-one, in an incremental way.

Some Unix-based implementations, such as CLISP, can be used as script interpreters; that is, invoked by the system transparently in the way that a Perl or Unix shell interpreter is.

List of implementations

Freely redistributable implementations include:

  • CMUCL, originally from Carnegie Mellon University, now maintained as Free Software by a group of volunteers. CMUCL uses a fast native-code compiler. It is available on Linux and BSD for Intel x86; Linux for Alpha; and Solaris, IRIX, and HP-UX on their native platforms.
  • Steel Bank Common Lisp (SBCL), a branch from CMUCL. "Broadly speaking, SBCL is distinguished from CMU CL by a greater emphasis on maintainability." [3] SBCL runs on the platforms CMUCL does, except HP/UX; in addition, it runs on Linux for PowerPC, SPARC, and MIPS, and on Mac OS X. SBCL does not use an interpreter; all expressions are compiled to native code. Experimental Windows support.
  • CLISP, a bytecode-compiling implementation, portable and runs on a number of Unix and Unix-like systems (including Mac OS X), as well as Microsoft Windows and several other systems.
  • GNU Common Lisp (GCL), the GNU Project's Lisp compiler. Not yet fully ANSI-compliant, GCL is however the implementation of choice for several large projects including the mathematical tools Maxima, AXIOM and ACL2. GCL runs under GNU/Linux on eleven different architectures, and also under Windows, Solaris, and FreeBSD.
  • Embeddable Common Lisp (ECL), designed to be embedded in C programs.
  • OpenMCL [4], a free software / open source branch of Macintosh Common Lisp. As the name implies, OpenMCL was originally native to the Macintosh; it now runs on Mac OS X, Darwin, and GNU/Linux for PowerPC and Intel x86-64.
  • Movitz implements a Lisp environment for x86 computers without relying on any underlying OS.
  • The Poplog system implements a version of CL, with POP-11, and optionally Prolog, and Standard ML (SML), allowing mixed language programming. For all, the implementation language is POP-11, which is compiled incrementally. It also has an integrated Emacs-like editor that communicates with the compiler.
  • Java-oriented

Commercial implementations include:

Applications

Common Lisp is used in many successful commercial applications, the most famous (no doubt due to Paul Graham's promotion) was the Yahoo! Store web-commerce site, which later was rewritten in C++ and Perl. [1] Other notable examples include:

  • BioBike is a bioinformatics workflow development environment implemented entirely in Common Lisp. Its distinguishing feature is through-the-browser programmability.
  • Genworks International's General-purpose Declarative Language (GDL), a development tool for creating web-based engineering, design, and business applications.
  • Igor Engraver: a 1st-Class Music notation and engraving program written in Common Lisp.
  • Jak and Daxter video games for Playstation2
  • Knowledge Technologies International's ICAD mechanical design software.
  • Mirai, Izware LLC's fully integrated 2d/3d computer graphics content creation suite that features a polygonal modeler, an advanced IK/FK and non-linear animation system (later popularized by such products as Sega's Animanium and Softimage XSI, respectively), and advanced 2d and 3d painting. It was used in major motion pictures (most famously in New Line Cinema's The Lord of the Rings), video games and military simulations.
  • Orbitz, a major travel booking site.
  • Piano, a package for commercial aircraft preliminary design and competitor evaluation.
  • Xanalys Corp.'s line of investigation software, used by police, security and fraud prevention services worldwide.

There also exist successful open-source applications written in Common Lisp, such as:

As well, Common Lisp is used by many government and non-profit institutions. Examples of its use in NASA include:

See also

References

  1. ^ "In January 2003, Yahoo released a new version of the editor written in C++ and Perl. It's hard to say whether the program is no longer written in Lisp, though, because to translate this program into C++ they literally had to write a Lisp interpreter: the source files of all the page-generating templates are still, as far as I know, Lisp code." Paul Graham, Beating the Averages

Tutorials