Scheme (programming language): Difference between revisions

Revision as of 20:34, 2 October 2005

The Knights of the Lambda Calculus' recursive emblem celebrates Scheme's theoretical foundation, the lambda calculus.

Scheme is a functional programming language and a dialect of Lisp. It was developed by Guy L. Steele and Gerald Jay Sussman in the 1970s, initially as an attempt to understand the Actor model, and was introduced to the academic world via a series of papers now referred to as Sussman and Steele's Lambda Papers. Implementations tend to differ in minor details, so Scheme is sometimes described as a family of closely related programming languages.

Scheme's philosophy is unashamedly minimalist. Its goal is not to pile feature upon feature, but to remove weaknesses and restrictions that make new features appear necessary. Therefore, Scheme provides as few primitive notions as possible, and where this is practical in an implementation, tends to let everything else be provided by libraries that are built on top of them. For example, the main mechanism for governing control flow is tail recursion.

Scheme was the first dialect of Lisp to use lexical variable scoping (aka static scoping, as opposed to dynamic variable scoping) exclusively. It was also one of the first programming languages to support explicit continuations. Scheme also supports garbage collection of unreferenced data.
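
Both features can be illustrated with a small sketch. The names x, show-x and find-first are invented for this illustration; call-with-current-continuation is the standard R5RS procedure.

```scheme
;; Lexical scoping: show-x refers to the x visible where show-x was
;; defined, not to whatever x happens to be bound at the call site.
(define x 10)
(define (show-x) x)

(let ((x 20))
  (show-x))          ; the inner x is invisible to show-x
;; => 10

;; Explicit continuation: return is the continuation of the whole
;; call to find-first, so invoking it leaves for-each early.
(define (find-first pred lst)
  (call-with-current-continuation
    (lambda (return)
      (for-each (lambda (item)
                  (if (pred item) (return item)))
                lst)
      #f)))          ; reached only if no element satisfied pred

(find-first even? '(1 3 4 5))
;; => 4
```

Under dynamic scoping the first expression would yield 20 instead, because show-x would see the most recent binding of x.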

Scheme uses lists as the primary data structure, but also has good support for vectors (arrays). Owing to the minimalist specification, there is no standard syntax for creating structures with named fields, or for object-oriented programming, but many individual implementations provide such features.

Origin of Scheme

In their paper on the evolution of Lisp, Richard Gabriel and Guy Steele explain the origin of Scheme as follows:

The dialect of Lisp known as Scheme was originally an attempt by Gerald Jay Sussman and Guy Steele during Autumn 1975 to explicate for themselves some aspects of Carl Hewitt’s theory of actors as a model of computation. Hewitt’s model was object-oriented (and influenced by Smalltalk); every object was a computationally active entity capable of receiving and reacting to messages. The objects were called actors, and the messages themselves were also actors. An actor could have arbitrarily many acquaintances; that is, it could “know about” (in Hewitt’s language) other actors and send them messages or send acquaintances as (parts of) messages. Message-passing was the only means of interaction. Functional interactions were modeled with the use of continuations; one might send the actor named “factorial” the number 5 and another actor to which to send the eventually computed value (presumably 120).

Sussman and Steele had some trouble understanding some of the consequences of the model from Hewitt’s papers and language design, so they decided to construct a toy implementation of an actor language in order to experiment with it. Using MacLisp as a working environment, they decided to construct a tiny Lisp interpreter and then add the necessary mechanisms for creating actors and sending messages. The toy Lisp would provide the necessary primitives for implementing the internal behavior of primitive actors.

Scheme was originally called "Schemer", in the tradition of the languages Planner and Conniver. The current name resulted from the authors' use of the ITS operating system, which limited filenames to two components of at most six characters each. Currently, "Schemer" is commonly used to refer to a Scheme programmer.

Advantages

Scheme, like all Lisp dialects, has very little syntax compared to many other programming languages. It has no operator precedence rules because fully nested notation is used for all function calls, so it avoids the ambiguities found in infix notation, which mimics conventional algebraic notation.

Many people are put off by all the parentheses used in Scheme notation. However, Scheme is usually processed and displayed using editors which automatically indent the code in a conventional manner, and after a period of accommodation the parentheses supposedly become unobtrusive. Also, R6RS, the next version of the Scheme standard, will specify that square brackets (and possibly curly braces) may be used for S-expressions.

Scheme's macro facilities allow it to be adapted to many different problem domains. They can be used to add support for object-oriented programming. Scheme provides a hygienic macro system which, while not quite as powerful as Common Lisp's macro system, is much safer and often easier to work with. The advantage of a hygienic macro system (as found in Scheme and other languages such as Dylan) is that any name clashes in the macro and surrounding code will be automatically avoided. The hygienic macro system is always built on some low-level facility which provides the full power of non-hygienic macros, including arbitrary syntax-time computations.
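
A minimal sketch of what hygiene provides, assuming an implementation with R5RS syntax-rules. The macro swap! and the variables tmp and other are names invented for this illustration.

```scheme
;; swap! exchanges the values of two variables.
(define-syntax swap!
  (syntax-rules ()
    ((_ a b)
     (let ((tmp a))
       (set! a b)
       (set! b tmp)))))

;; The caller also uses a variable called tmp. With a non-hygienic
;; macro the tmp introduced by the expansion would capture it; the
;; hygienic expander renames the macro's tmp so the two never clash.
(define tmp 1)
(define other 2)
(swap! tmp other)
(list tmp other)
;; => (2 1)
```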

Scheme encourages functional programming. Purely functional programs have no state and don't have side-effects, and are therefore automatically thread-safe and considerably easier to verify than imperative programs.

In Scheme, functions are first-class objects. This allows for higher-order functions which can further abstract program logic. Functions can also be created anonymously.
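
As a hedged illustration, the sketch below passes functions as arguments, returns them as results, and creates them anonymously with lambda. The names compose-two, make-adder and add1-then-double are made up for this example.

```scheme
;; compose-two returns a new function that applies g, then f.
(define (compose-two f g)
  (lambda (x) (f (g x))))

;; make-adder returns an anonymous function that remembers n.
(define (make-adder n)
  (lambda (x) (+ x n)))

(define add1-then-double
  (compose-two (lambda (x) (* 2 x))   ; anonymous function as argument
               (make-adder 1)))

(add1-then-double 5)
;; => 12
```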

Scheme has a minimalistic standard. While this can be seen as a disadvantage, it can also be valuable. For example, writing a conforming Scheme compiler is easier (since there are fewer features to implement) than a Common Lisp one; embedding Lisp in low-memory hardware may also be more feasible with Scheme than Common Lisp. Schemers find it amusing to note that the whole Scheme standard is smaller than the index to Guy Steele's Common Lisp: The Language (that is, about 50 pages).

Disadvantages

The Scheme standard is very minimalist, specifying only the core language. This means that there are many different implementations, each with their own incompatible extensions to the language and libraries. The Scheme Requests for Implementation (SRFI) process tries to remedy this.

Some Common Lispers see the fact that functions and variables lie in the same namespace as a disadvantage. Scheme programmers do not see this as a problem, and prefer a language that encourages using higher-order functions rather than the extra verbosity that such code requires in Common Lisp. (The two communities often have long fights over such issues.)

Critics argue that Scheme has a cryptic and unintuitive syntax and is therefore relatively hard to learn, and that it is cumbersome and impractical for developing larger applications. They also claim that, because of the way the language is implemented, Scheme programs run at only a fraction of the speed of programs written in a modern compiled language.

Some schools still teach Scheme in their computer science classes, claiming that it is a good introductory language. Critics dispute this and emphasize that modern programming languages are easier and more fun to learn and are much more important for the majority of students who want to do serious software development later on.

Standards

There are two standards that define the Scheme language: the official IEEE standard, and a de facto standard called the Revisedⁿ Report on the Algorithmic Language Scheme, nearly always abbreviated RnRS, where n is the number of the revision. The latest RnRS version is R5RS, also available online.

A new language standardization process was begun at the 2003 Scheme workshop, with the remit of producing an R6RS standard in 2006. It breaks with the earlier RnRS approach of unanimity. Its current progress can be found on the web.

Possibly the most important new feature in R6RS will be a standard module system (currently being designed). This will allow a split between the core language and libraries.

Language elements

Comments

Comments are preceded by a semicolon (;) and continue for the rest of the line.

Variables

Variables are dynamically typed. Variables are bound by a define, a let expression, and a few other Scheme forms. Variables bound at the top level with a define are in global scope.

 (define var1 value)

Variables bound in a let are in scope for the body of the let.

 (let ((var1 value))
   ...
   scope of var1
   ...)
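
A concrete sketch of the two binding forms, with made-up names (greeting, describe, count, label) chosen only for illustration:

```scheme
(define greeting "hello")        ; top-level binding, global scope

(define (describe)
  (let ((count 3)                ; count and label exist only
        (label "items"))         ; inside the body of this let
    (list greeting count label)))

(describe)
;; => ("hello" 3 "items")

;; Dynamic typing: the same variable may later hold a value of a
;; different type.
(set! greeting 42)
(describe)
;; => (42 3 "items")
```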

Functions

Functions are first-class objects in Scheme. They can be assigned to variables. For example a function with two arguments arg1 and arg2 can be defined as

 (define fun
   (lambda (arg1 arg2)
     ...))

which can be abbreviated as follows:

 (define (fun arg1 arg2)
   ...)

Functions are applied with the following syntax:

 (fun value1 value2)

Note that the function being applied is in the first position of the list while the rest of the list contains the arguments. The apply function takes its first argument and applies it to a given list of arguments, so the previous function call can also be written as

 (apply fun (list value1 value2))
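
With concrete values, and a hypothetical two-argument function average standing in for fun, the two call styles look like this:

```scheme
(define (average a b)
  (/ (+ a b) 2))

(average 4 10)
;; => 7

(apply average (list 4 10))   ; same call, arguments supplied as a list
;; => 7

;; apply is most useful when the arguments are already in a list.
(apply + '(1 2 3 4))
;; => 10
```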

In Scheme, functions are called procedures, and they fall into two basic categories: primitives and user-defined procedures. Primitives are procedures pre-defined by the Scheme language, such as +, -, *, /, car, and cdr. (set! is often mentioned alongside them, but it is a special form rather than a procedure.) User-defined procedures are those written by the programmer. In several variations of Scheme, a user can redefine a primitive. For example, the code

 (define (+ x y)
   (- x y))

actually redefines the + primitive to perform subtraction, rather than addition.

Lists

Scheme uses the linked list data structure in the same form as it exists in Lisp.
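
A short sketch of how such a list is built from pairs and taken apart again; the variable name numbers is chosen only for this example.

```scheme
;; A list is a chain of pairs ending in the empty list.
(define numbers (cons 1 (cons 2 (cons 3 '()))))   ; same as (list 1 2 3)

numbers               ; => (1 2 3)
(car numbers)         ; => 1        first element
(cdr numbers)         ; => (2 3)    rest of the list
(cadr numbers)        ; => 2        shorthand for (car (cdr numbers))
(null? '())           ; => #t       the empty list ends every proper list
(length numbers)      ; => 3
```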

Data types

Other common data types in Scheme besides functions and lists are numbers (integer, rational, real, and complex), symbols, strings, and ports. Most Scheme implementations also offer association lists, hash tables, vectors, arrays and structures. Since the IEEE Scheme standard and the R4RS Scheme standard, the basic types have been disjoint, that is, no value can belong to more than one of them. The numeric types are the exception: they form a tower in which every integer is also a rational, every rational a real, and every real a complex number. Some ancient implementations of Scheme predate these standards and treat #f and '() as the same value, as is the case with nil and the empty list in Common Lisp.

Most Scheme implementations offer a full numerical tower as well as exact and inexact arithmetic.

True and false are represented by #t and #f. In fact, only #f counts as false where a Boolean is required; everything else is considered true, including the empty list.
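
The rule can be checked with a small helper, here called truthy? (a name made up for this sketch), assuming an implementation that follows R4RS or later:

```scheme
;; Returns the symbol true or false depending on how if treats value.
(define (truthy? value)
  (if value 'true 'false))

(truthy? #f)     ; => false
(truthy? #t)     ; => true
(truthy? '())    ; => true   the empty list is not false
(truthy? 0)      ; => true   neither is zero
(truthy? "")     ; => true   nor the empty string
```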

Symbols can be created in at least the following ways:

 'symbol
 (string->symbol "symbol")

Equality

Scheme has three different types of equality:

eq?: Returns #t if its parameters represent the same data object in memory.
eqv?: Generally the same as eq? but treats some objects (e.g. characters and numbers) specially, so that numbers of the same exactness that are = are eqv? even if they are not eq?.
equal?: Compares data structures such as lists, vectors and strings to determine if they have congruent structure and eqv? contents.

Type dependent equivalence operations also exist in Scheme:

string=?: To compare two strings
char=?: To compare characters
=: To compare numbers
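
The differences are easiest to see side by side. The sketch below uses two hypothetical variables a and b and only shows results that R5RS specifies; comparisons it leaves unspecified (such as eq? on numbers or on string literals) are omitted.

```scheme
(define a (list 1 2 3))
(define b (list 1 2 3))   ; a separate list with the same contents

(eq? a a)          ; => #t   same object
(eq? a b)          ; => #f   distinct objects in memory
(equal? a b)       ; => #t   same structure and contents
(eq? 'foo 'foo)    ; => #t   symbols with the same name are identical
(eqv? 100 100)     ; => #t   equal exact numbers
(eqv? 2 2.0)       ; => #f   exactness differs, although...
(= 2 2.0)          ; => #t   ...they are numerically equal
(string=? "abc" "abc")   ; => #t
(char=? #\a #\a)         ; => #t
```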

Control structures

Conditional evaluation

 (if test then-expr else-expr)

The test expression is evaluated, and if the evaluation result is true (anything other than #f) then the then-expr is evaluated, otherwise else-expr is evaluated.

A form that is more convenient when conditionals are nested is cond:

 (cond (test1 expr1)
       (test2 expr2)
       ...
       (else exprn))

The first expression for which the test evaluates to true will be evaluated. If all tests result in #f, the else clause is evaluated.

A variant of the cond clause is

 (cond ...
       (test => expr)
       ...)

In this case, expr should evaluate to a function that takes one argument. If test evaluates to true, the function is called with the return value of test.
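
A typical use of the => form is a lookup whose result is both the test and the value to process. In this sketch the association list prices and the function price-of are invented for illustration; assq and cadr are standard procedures.

```scheme
(define prices '((apple 3) (pear 5)))

(define (price-of item)
  (cond ((assq item prices) => cadr)   ; assq returns (item price) or #f
        (else 'unknown)))

(price-of 'pear)     ; => 5
(price-of 'plum)     ; => unknown
```

When assq finds an entry, cadr is called on that entry to extract the price; when it returns #f, control falls through to the else clause.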

Loops

Loops in Scheme usually take the form of tail recursion. Scheme implementations are required to optimize tail recursion so as to eliminate use of stack space where possible, so arbitrarily long loops can be executed using this technique.

A classical example is the factorial function, which can be defined non-tail-recursively:

 (define (factorial n)
   (if (= n 0)
     1
     (* n (factorial (- n 1)))))

 (factorial 5)
 ;; => 120

This is a direct translation of the mathematical recursive definition of the factorial: the factorial of zero (usually written 0!) is equal to 1, while the factorial of any greater natural number n is defined as $n! = n \cdot (n-1)!$.

However, plain recursion is by nature less efficient, since the Scheme system must keep track of the pending returns of all the nested function calls. A tail-recursive definition is one in which the recursive call is the last thing the function does, so that nothing remains to be computed after the call returns. In this case, we recur not on the factorial function itself, but on a helper routine with two parameters representing the state of the iteration:

 (define (factorial n)
   (let loop ((total 1)
              (n n))
     (if (= n 0)
       total
       (loop (* n total) (- n 1)))))

 (factorial 5)
 ;; => 120

A higher-order function like map, which applies a function to every element of a list, can also be defined non-tail-recursively:

 (define (map f lst)
   (if (null? lst)
     lst
     (cons (f (car lst))
           (map f (cdr lst)))))

 (map (lambda (x) (* x x)) '(1 2 3 4))
 ;;  => (1 4 9 16)

This can also be defined tail-recursively:

 (define (map f lst)
   (let loop ((lst lst)
              (res '()))
     (if (null? lst)
       (reverse res)
       (loop (cdr lst)
             (cons (f (car lst)) res)))))

 (map (lambda (x) (* x x)) '(1 2 3 4))
 ;; => (1 4 9 16)

In both cases the tail-recursive version is preferable due to its decreased use of stack space.

Input/output

Scheme has the concept of ports to read from or to write to. R5RS defines two default ports, accessible with the functions: current-input-port, current-output-port. Most implementations also provide current-error-port.
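
A minimal sketch of passing a port explicitly, using only R5RS procedures; say-line is a hypothetical helper written for this example.

```scheme
;; say-line writes text followed by a newline to the given port.
(define (say-line text port)
  (display text port)
  (newline port))

(say-line "to standard output" (current-output-port))

;; display and newline default to the current output port, so the
;; port argument can be omitted.
(display "also to standard output")
(newline)

;; Input procedures accept an input port in the same way, e.g.
;; (read-char (current-input-port))
```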

Examples

Hello World

 (define (hello-world)
   (display "Hello, World!") 
   (newline))
 (hello-world)

Or even simpler (most implementations accept the \n escape in string literals, although R5RS does not specify it):

 (display "Hello, World!\n")

Scheme code can be found in the following articles: Arithmetic-geometric mean, Church numeral, Continuation passing style, Currying, Fibonacci number program, Hello world program, Levenshtein distance, Tail recursion, Queue.

Implementations

Bigloo is a Scheme-to-C, Scheme-to-.NET and Scheme-to-Java compiler. It has much more to offer than just a compiler: Bigloo features an explicit type system which improves the readability and debugging of code. Bigloo is a good Scheme implementation if you are looking to write numerical applications.
Chez Scheme is a proprietary freeware Scheme interpreter and commercial Scheme compiler for Microsoft Windows, Mac OS X, Linux, and SunOS.
Chicken is a Scheme-to-C compiler.
The Gambit Scheme System is a Scheme interpreter and high-performance Scheme-to-C compiler.
Gauche is an R5RS Scheme implementation developed to be a handy script interpreter.
Guile is the GNU project's official extension language. This Scheme interpreter is packaged as a library to provide scripting to applications.
JScheme is a Scheme environment, implemented in Java, that provides a natural and transparent interface to Java called the Javadot notation.
Kawa is a Scheme environment, written in Java, that compiles Scheme source code into Java bytecode. Any Java library can be easily used in Kawa.
LispMe is an open-source Scheme environment for the PalmOS family of PDAs.
Lists and Lists is an implementation of Scheme as part of an adventure game, implemented on the Z-machine; it can be run on most platforms.
MIT/GNU Scheme is a free (GPL-licensed) implementation for the x86 architecture only. It runs on GNU/Linux, FreeBSD, IBM OS/2, and Microsoft Windows (95, 98, ME, NT, 2000, and XP).
Oaklisp is an object-oriented dialect of Scheme with first-class classes.
PLT Scheme is a suite of Scheme programs for Windows, Mac, and Unix platforms including an interpreter (MzScheme), a graphical toolkit (MrEd), a pedagogically-oriented graphical editor (DrScheme), and various other components including Component object model and ODBC libraries.
scsh (SCheme SHell) is a Unix shell language in Scheme. It is derived from Scheme48.
Scheme48 is an implementation of Scheme using a bytecode interpreter. It is designed for experimentation with implementation techniques.
SISC (Second Interpreter of Scheme Code) is a full R5RS Scheme environment written in Java, which can access Java libraries.
SIOD (Scheme In One Defun) is a small Scheme implementation. The GIMP currently embeds it very successfully for scripting image manipulation (scripts create a GUI and call plugins and internal functions), but the future plan is to replace SIOD with Guile.
STklos is a Scheme implementation which provides an object system similar to CLOS and a simple interface to the GTK toolkit.
T is an implementation of Scheme designed for efficiency.
Unlikely Scheme is an open-source lightweight implementation of Scheme in C++.
XLISP is a superset of Scheme developed by David Betz.
Many more implementations are listed in the schemers.org FAQ.

External links

Schemers.org Large set of resources.
- Home of current scheme standardization process.
Scheme Requests for Implementation (SRFI).
Community Scheme Wiki A wiki for all things Scheme.
Open Directory: Scheme Many resources.
Structure and Interpretation of Computer Programs by Abelson, Sussman and Sussman, considered a classic computer science text.
HTDP (How to Design Programs) by Felleisen et al. Intended to teach program design using Scheme.
The Scheme Programming Language by R. Kent Dybvig. Useful language reference.
Bibliography of Scheme-related research, with links to online versions of many academic papers, including all of the original Lambda Papers.
The Scheme Cookbook Wiki-based book of tasty recipes.
Scheme Related Elemental programming.

References

Gerald Sussman and Guy Steele. SCHEME: An Interpreter for Extended Lambda Calculus. AI Memo 349, MIT Artificial Intelligence Laboratory, Cambridge, Massachusetts, December 1975.
Richard Kelsey, William Clinger, Jonathan Rees (eds.). Revised⁵ Report on the Algorithmic Language Scheme.
Guy L. Steele, Jr., Richard P. Gabriel. The Evolution of Lisp.

@@ Line 21: / Line 21: @@
 Scheme, as all [[Lisp programming language|Lisp]] dialects, has very little syntax compared to many other programming languages. It has no [[operator precedence]] rules because [[fully nested notation]] is used for all function calls, and there are no ambiguities as are found in [[infix notation]], which mimics conventional algebraic notation.
-Some people are at first put off by all the parentheses used in Scheme notation. However, Scheme is usually processed and displayed using editors which [[Prettyprint|automatically indent]] the code in a conventional manner, and after a short period of accommodation the parentheses become unobtrusive. Also, R6RS, the next version of the Scheme [[#Standards|standard]], will specify that square brackets (and possibly curly braces) may be used for S-expressions.
+Many people are put off by all the parentheses used in Scheme notation. However, Scheme is usually processed and displayed using editors which [[Prettyprint|automatically indent]] the code in a conventional manner, and after a period of accommodation the parentheses supposedly become unobtrusive. Also, R6RS, the next version of the Scheme [[#Standards|standard]], will specify that square brackets (and possibly curly braces) may be used for S-expressions.
 Scheme's [[macro]] facilities allow it to be adapted to many different problem domains. They can be used to add support for [[object-oriented programming]]. Scheme provides a hygienic macro system which, while not quite as powerful as [[Common Lisp|Common Lisp's]] macro system, is much safer and often easier to work with.  The advantage of a hygienic macro system (as found in Scheme and other languages such as [[Dylan_programming_language|Dylan]]) is that any name clashes in the macro and surrounding code will be automatically avoided.  The hygienic macro system is always built on some low-level facility which provides the full power of non-hygienic macros, including arbitrary syntax-time computations.
@@ Line 42: / Line 42: @@
 Some [[Common Lisp|Common Lispers]] see the fact that functions and variables lie in the same namespace as a disadvantage.  Scheme programmers do not see this as a problem, and prefer a language that encourages using [[higher-order function]]s rather than the extra verbosity that such code requires in Common Lisp.  (The two communities often have long fights over such issues.)
+Scheme has a cryptic and unintuitive syntax is therefore relatively hard to learn. It is fairly cumbersome and not a practical language for developing larger applications. Because of the way the language is implemented, Scheme programs only run at a fraction of the speed of programs written in a modern compiler language.
+Some schools still teach Scheme in their computer schience classes, claiming that it was a good introductory language. This myth has been debunked and critics emphasize that modern programming languages are easier and more fun to learn and are much more important for the majority of students who want to do serious software development later on.
 ==Standards==