OCaml: Difference between revisions

From Wikipedia, the free encyclopedia
CanisRufus (talk | contribs)
m dab Java
improved links; academic and industrial applications; unified philosophy and features, mostly in a list
Line 4: Line 4:
'''Objective Caml''' ('''OCaml''') is a general-purpose [[programming language]] descended from the [[ML programming language|ML]] family, created by [[Xavier Leroy]], [[Jerome Vouillon]], [[Damien Doligez]], [[Didier Rémy]] and others in 1996. OCaml is an [[open source]] project managed and principally maintained by [[Institut National de Recherche en Informatique et en Automatique|INRIA]].
'''Objective Caml''' ('''OCaml''') is a general-purpose [[programming language]] descended from the [[ML programming language|ML]] family, created by [[Xavier Leroy]], [[Jerome Vouillon]], [[Damien Doligez]], [[Didier Rémy]] and others in 1996. OCaml is an [[open source]] project managed and principally maintained by [[Institut National de Recherche en Informatique et en Automatique|INRIA]].


OCaml shares the [[functional programming|functional]] and [[imperative programming|imperative]] features of ML, but adds [[object-oriented programming|object-oriented]] constructs and has minor [[syntax]] differences. It has a large standard library that makes it useful for many of the same applications as [[Python programming language|Python]] or [[Perl]], as well as robust modular and object -oriented programming constructs that make it applicable for large-scale software engineering.
OCaml shares the [[functional programming|functional]] and
[[imperative programming|imperative]] features of ML, but adds [[object-oriented programming|object-oriented]] constructs and has minor [[syntax]] differences. It has a large standard library that makes it useful for many of the same applications as [[Python programming language|Python]] or [[Perl]], as well as robust modular and object -oriented programming constructs that make it applicable for large-scale software engineering.


OCaml is a successor to [[CAML|Caml Light]]. The acronym [[CAML]] originally stood for ''Categorical Abstract Machine Language'', although OCaml abandons [[Categorical_Abstract_Machine_Language|this abstract machine]].
OCaml is a successor to [[CAML|Caml Light]]. The acronym [[CAML]] originally stood for ''Categorical Abstract Machine Language'', although OCaml has abandoned this [[Categorical_Abstract_Machine_Language|abstract machine]].


==Philosophy==
==Philosophy==


ML-derived languages are most well known for their strict type systems and [[type inference|type-inferring]] compilers. OCaml unifies functional, imperative, and object-oriented programming under an ML-like type system.
ML-derived languages are most well known for their [[type inference]] and strong, [[semantic analysis|static]] [[type systems]], brevity and performance. OCaml unifies functional, imperative, and object-oriented programming under an ML-like type system.


OCaml was designed to provide many useful features:
Like Python and Perl, OCaml provides an interactive toplevel and a script interpreter, but it also provides bytecode and optimizing native code compilers. Many high-level programming languages, even when compiled to native code, achieve slower performance than might be possible with [[C programming language|C/C++]] because of runtime type and safety checks. OCaml's strict type system renders runtime type mismatches impossible, and thus obviates the need for this overhead, while still guaranteeing runtime safety. Code generated by OCaml's native code compiler can achieve speeds comparable to [[C programming language|C/C++]] on algorithmic tasks [http://shootout.alioth.debian.org/].


* <b>Safety</b> OCaml programs are thoroughly checked at compile-time such that they are proven to be entirely safe to run, e.g. a compiled OCaml program should not be able to cause a segmentation fault.
In addition to reducing overhead, OCaml's strict type system eliminates a large class of programmer errors that may cause problems at runtime. However, it also forces the programmer to conform to the constraints of the type system, which can require careful thought and close attention. Like all ML-derived languages, the OCaml compiler can infer types, greatly reducing the need for manual type annotation (for example, the datatype of variables and the signature of functions usually do not need to be expressly declared, as they do in [[Java programming language|Java]]). Nonetheless, effective use of OCaml's type system can require some sophistication on the part of the programmer.


* <b>[[Closure_(computer_science)|First-class lexical closures]]</b> Functions may be nested, passed as arguments to other functions and stored in data structures as values.
==Features==


* <b>Strongly typed</b> The types of all values are checked during compilation to ensure that they are well defined and validly used.
OCaml features: a [[semantic analysis|static]]

[[type system]], [[type inference]],
* <b>Statically typed</b> Any typing errors in a program are picked up at compile-time by the compiler, instead of at run-time as in many other languages.
[[polymorphism (computer science)|parametric polymorphism]], [[tail recursion]],

[[pattern matching]],
* <b>Type inference</b> The types of values are automatically inferred during compilation by the context in which they occur. Therefore, the types of variables and functions in OCaml code does not need to be specified explicitly, dramatically reducing source code size whilst retaining excellent performance.
[[Closure_(computer_science)|first class lexical closures]],

[[function object|functors (parametric modules)]], [[exception handling]], and
* <b>[[polymorphism (computer science)|Parametric polymorphism]]</b> In cases where any of several different types may be valid, any such type can be used. This greatly simplifies the writing of generic, reusable code.
incremental generational [[garbage collection (computer science)|automatic garbage collection]].

* <b>Pattern matching</b> Values, particularly the contents of data structures, can be matched against arbitrarily-complicated patterns in order to determine the appropriate action.

* <b>Modules</b> Programs can be structured by grouping their data structures and related functions into modules.

* <b>Objects</b> Data structures and related functions can also be grouped into objects (object-oriented programming).

* <b>Separate compilation</b> Source files can be compiled separately into object files which are then linked together to form an executable. When linking, object files are automatically type checked and optimized before the final executable is created.

* <b>Brevity</b> OCaml's syntax was designed to be concise.

Also: [[tail recursion]], [[function object|functors (parametric modules)]], [[exception handling]], and incremental generational [[garbage collection (computer science)|automatic garbage collection]].


OCaml is particularly notable for extending ML-style type inference to an object system in a general purpose language. This permits structural subtyping, where object types are compatible if their method signatures are compatible, regardless of their declared inheritance; an unusual feature in statically-typed languages.
OCaml is particularly notable for extending ML-style type inference to an object system in a general purpose language. This permits structural subtyping, where object types are compatible if their method signatures are compatible, regardless of their declared inheritance; an unusual feature in statically-typed languages.
Line 31: Line 42:
A foreign function interface for easy [[linker|linking]] with [[C programming language|C]] primitives is provided, including language support for efficient numerical arrays in formats compatible with both C and [[FORTRAN]].
A foreign function interface for easy [[linker|linking]] with [[C programming language|C]] primitives is provided, including language support for efficient numerical arrays in formats compatible with both C and [[FORTRAN]].


OCaml distribution components:
The OCaml distribution contains:
* [[Preprocessor]] named [[Camlp4]] which permits syntactical extensions
* A [[Macro language]] named [[Camlp4]] which permits syntactical extensions
* [[Debugger]] which supports stepping backwards to investigate errors
* [[Debugger]] which supports stepping backwards to investigate errors
* [[Documentation generator]]
* [[Documentation generator]]
Line 38: Line 49:
* Numerous general purpose [[library (computer science)|libraries]]
* Numerous general purpose [[library (computer science)|libraries]]


The compiler is available for many platforms, including [[Unix]], [[Microsoft Windows|Windows]], and [[Apple Macintosh|Macintosh]]. Excellent portability is ensured through native code generation support for major architectures: [[IA32]], [[AMD64]], [[PowerPC]], [[Sparc]], [[IA64]], [[DEC Alpha|Alpha]], [[PA-RISC family|HP/PA]], [[MIPS architecture|MIPS]], and [[StrongARM]].
The native-code compiler ocamlopt is available for many platforms, including [[Unix]], [[Microsoft Windows|Windows]], and [[Apple Macintosh|Macintosh]]. Excellent portability is ensured through native code generation support for major architectures: [[IA32]], [[AMD64]], [[PowerPC]], [[Sparc]], [[IA64]], [[DEC Alpha|Alpha]], [[PA-RISC family|HP/PA]], [[MIPS architecture|MIPS]], and [[StrongARM]].


==Applications==
==Applications==
Line 242: Line 253:


===Programs written in OCaml===
===Programs written in OCaml===
Commonly used:
====Commonly used====
* [[MLdonkey]] - a popular multi-network P2P program
* [[MLdonkey]] - a popular multi-network P2P program
* [http://www.cis.upenn.edu/~bcpierce/unison/ Unison] file synchronizer
* [http://www.cis.upenn.edu/~bcpierce/unison/ Unison] file synchronizer


Fun:
====Fun====
* Several [[International Conference on Functional Programming Contest]] winners
* Several [[International Conference on Functional Programming Contest]] winners
* [http://handhelds.freshmeat.net/projects/planets/ Gravity simulator]
* [http://handhelds.freshmeat.net/projects/planets/ Gravity simulator]


Education:
====Education====
* [http://home.gna.org/geocaml/ Drgeocaml], a dynamic geometry software
* [http://home.gna.org/geocaml/ Drgeocaml], a dynamic geometry software
* [http://min-caml.sourceforge.net/index-e.html MinCaml] a small tutorial compiler written in OCaml.
* [http://min-caml.sourceforge.net/index-e.html MinCaml] a small tutorial compiler written in OCaml.


Engineering:
====Engineering====
* [http://www.confluent.org Confluence] is a language for synchronous reactive system design. A Confluence program can generate digital logic for an FPGA or ASIC platform, or C code for hard real-time software.
* [http://www.confluent.org Confluence] is a language for synchronous reactive system design. A Confluence program can generate digital logic for an FPGA or ASIC platform, or C code for hard real-time software.

====Academic====
The OCaml language is also widely used in academia outside computer science. Natural scientists are using OCaml for research into [[physics]], [[chemistry]], [[biology]], [[bioinformatics]] and many more-specialised subjects.

====Industrial====
[[Wolfram Research]] and [[The MathWorks]] are using OCaml in the development of their commercial systems for technical computing. The medical division of [[General Electric]] are using OCaml in the development of software embedded in MRI scanners. Other companies are using OCaml for tasks ranging from financial analysis to presentation software.


==See also==
==See also==
Line 267: Line 284:
* [http://caml.inria.fr/ Caml language family official website]
* [http://caml.inria.fr/ Caml language family official website]
** [http://caml.inria.fr/humps/caml_latest.html OCaml libraries]
** [http://caml.inria.fr/humps/caml_latest.html OCaml libraries]
** [http://caml.inria.fr/oreilly-book/ Developing applications with Objective CAML]
** [http://pauillac.inria.fr/caml/books-eng.html Books about Caml]
* [http://www.ocaml-tutorial.org/ OCaml tutorial for C, C++, Java and Perl programmers]
* [http://www.ocaml-tutorial.org/ OCaml tutorial for C, C++, Java and Perl programmers]
* [http://shootout.alioth.debian.org/ Comparison of the speed of various languages] (with favorable results for Ocaml)
* [http://shootout.alioth.debian.org/ Comparison of the speed of various languages] (with favorable results for Ocaml)
* [http://www.ffconsultancy.com/free/ray_tracer/languages.html Mini ray tracer benchmark measuring the verbosity and performance of different languages]
* [http://www.ffconsultancy.com/free/ray_tracer/languages.html Mini ray tracer benchmark] measuring the verbosity and performance of different languages
* [http://wwwfun.kurims.kyoto-u.ac.jp/soft/olabl/lablgl.html LablGL] ([[OpenGL]]+ interface)
* [http://wwwfun.kurims.kyoto-u.ac.jp/soft/olabl/ LablGL and LablGTK] [[OpenGL]]+ bindings (LablGL) and [[GTK]]+ bindings (LablGTK)
* [http://wwwfun.kurims.kyoto-u.ac.jp/soft/olabl/lablgtk.html LablGTK] ([[GTK]]+ interface)


[[Category:Functional languages]]
[[Category:Functional languages]]

Revision as of 05:27, 31 October 2005

Not to be confused with the Occam programming language.

Objective Caml (OCaml) is a general-purpose programming language descended from the ML family, created by Xavier Leroy, Jerome Vouillon, Damien Doligez, Didier Rémy and others in 1996. OCaml is an open source project managed and principally maintained by INRIA.

OCaml shares the functional and imperative features of ML, but adds object-oriented constructs and has minor syntax differences. It has a large standard library that makes it useful for many of the same applications as Python or Perl, as well as robust modular and object-oriented programming constructs that make it applicable to large-scale software engineering.

OCaml is a successor to Caml Light. The acronym CAML originally stood for Categorical Abstract Machine Language, although OCaml has abandoned this abstract machine.

Philosophy

ML-derived languages are best known for their strong, static type systems with type inference, and for their brevity and performance. OCaml unifies functional, imperative, and object-oriented programming under an ML-like type system.

OCaml was designed to provide many useful features:

  • Safety: OCaml programs are thoroughly checked at compile time so that they are safe to run; for example, a compiled OCaml program should not be able to cause a segmentation fault.
  • Strongly typed: The types of all values are checked during compilation to ensure that they are well defined and validly used.
  • Statically typed: Any typing errors in a program are picked up at compile time by the compiler, instead of at run time as in many other languages.
  • Type inference: The types of values are inferred automatically during compilation from the context in which they occur. The types of variables and functions in OCaml code therefore do not need to be specified explicitly, which dramatically reduces source code size whilst retaining excellent performance.
  • Parametric polymorphism: Code that does not depend on the specific type of the values it handles can be written once and used with any type. This greatly simplifies the writing of generic, reusable code.
  • Pattern matching: Values, particularly the contents of data structures, can be matched against arbitrarily complicated patterns in order to determine the appropriate action.
  • Modules: Programs can be structured by grouping their data structures and related functions into modules.
  • Objects: Data structures and related functions can also be grouped into objects (object-oriented programming).
  • Separate compilation: Source files can be compiled separately into object files which are then linked together to form an executable. When linking, object files are automatically type checked and optimized before the final executable is created.
  • Brevity: OCaml's syntax was designed to be concise.

Also: tail recursion, functors (parametric modules), exception handling, and incremental generational automatic garbage collection.
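
Several of the features listed above can be seen together in a few lines. The following is a minimal sketch rather than code from the OCaml distribution: the names tree, to_list, ints, strings, StringSet and distinct are chosen here for illustration, while Set.Make and String come from the standard library.

```ocaml
(* A polymorphic binary tree; the type parameter 'a is the element type. *)
type 'a tree =
  | Leaf
  | Node of 'a tree * 'a * 'a tree

(* No annotations needed: inferred as to_list : 'a tree -> 'a list.
   Pattern matching takes the tree apart case by case. *)
let rec to_list = function
  | Leaf -> []
  | Node (l, x, r) -> to_list l @ (x :: to_list r)

(* The same definitions work at any element type. *)
let ints = Node (Node (Leaf, 1, Leaf), 2, Node (Leaf, 3, Leaf))
let strings = Node (Leaf, "a", Node (Leaf, "b", Leaf))

(* Set.Make is a standard-library functor: applied to a module that
   provides a type t and a compare function, it yields a set module. *)
module StringSet = Set.Make (String)

(* Duplicates disappear once the elements pass through the set. *)
let distinct =
  StringSet.elements (StringSet.of_list ("b" :: to_list strings))

let () =
  List.iter (Printf.printf "%d ") (to_list ints);
  List.iter (Printf.printf "%s ") distinct;
  print_newline ()
```

The compiler checks every use of to_list statically, yet the source contains no type annotations apart from the declaration of the tree type itself.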

OCaml is particularly notable for extending ML-style type inference to an object system in a general purpose language. This permits structural subtyping, where object types are compatible if their method signatures are compatible, regardless of their declared inheritance; an unusual feature in statically-typed languages.
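
A small sketch may make structural subtyping concrete. The objects circle and square and the function area_of are hypothetical names used only for this illustration; no class or inheritance relationship is declared between them.

```ocaml
(* Two unrelated objects; neither a class nor inheritance is declared. *)
let circle r = object
  method area = 3.14159 *. r *. r
  method radius = r
end

let square s = object
  method area = s *. s
end

(* Inferred type: < area : 'a; .. > -> 'a
   The two dots mean: any object with at least an area method. *)
let area_of shape = shape#area

(* Both objects are accepted because both provide an area method. *)
let total = area_of (square 2.) +. area_of (circle 1.)

let () = Printf.printf "%f\n" total
```

The function area_of accepts both objects solely because their method signatures fit, which is the compatibility rule described above.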

A foreign function interface for easy linking with C primitives is provided, including language support for efficient numerical arrays in formats compatible with both C and FORTRAN.
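
The numerical arrays mentioned above are provided by the Bigarray module. The sketch below assumes OCaml 4.07 or later, where Bigarray is part of the standard library and needs no extra linking; in older releases such as the one shown in this article it had to be loaded separately. The names c_mat and f_mat are illustrative.

```ocaml
(* Assumes OCaml 4.07 or later, where Bigarray needs no extra linking. *)
open Bigarray

(* C layout: row-major storage, indices start at 0. *)
let c_mat = Array2.create float64 c_layout 2 3

(* Fortran layout: column-major storage, indices start at 1. *)
let f_mat = Array2.create float64 fortran_layout 2 3

let () =
  Array2.fill c_mat 0.;
  Array2.fill f_mat 0.;
  c_mat.{0, 0} <- 1.5;   (* first element in the C convention *)
  f_mat.{1, 1} <- 1.5;   (* first element in the Fortran convention *)
  Printf.printf "%g %g\n" c_mat.{0, 0} f_mat.{1, 1}
```

Because the data live outside the OCaml heap in the chosen layout, such arrays can be handed to C or Fortran routines without copying.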

The OCaml distribution contains:

The native-code compiler ocamlopt is available for many platforms, including Unix, Windows, and Macintosh. Excellent portability is ensured through native code generation support for major architectures: IA32, AMD64, PowerPC, Sparc, IA64, Alpha, HP/PA, MIPS, and StrongARM.

Applications

OCaml is a general-purpose programming language, but some of its more popular applications include:

Computer science

Natural science

OCaml is also widely used in physics, chemistry, biology and, more recently, bioinformatics:

Education

OCaml is used as an introductory language in many universities, including:

OCaml is also used to teach computer science (mainly algorithms and complexity theory) to students in the French Classes Préparatoires (preparatory classes), where it has almost replaced Pascal.

Code examples

Snippets of OCaml code are most easily studied by entering them into the "top-level". This is an interactive OCaml session that prints the inferred types of resulting or defined expressions. The OCaml top-level is started by simply executing the "ocaml" program:

  $ ocaml
          Objective Caml version 3.08.0
  
  #

Code can then be entered at the "#" prompt. For example, to calculate 1+2*3:

  # 1 + 2 * 3;;
  - : int = 7

OCaml infers the type of the expression to be "int" (a machine-precision integer) and gives the result "7".

Hello World

The following program "hello.ml":

  print_endline "Hello world!";;

can be compiled to bytecode:

  $ ocamlc hello.ml -o hello

and executed:

  $ ./hello
  Hello world!
  $

Birthday paradox

OCaml may be used as a scripting language. The following script calculates the smallest number of people in a room for which the probability of two sharing the same birthday exceeds 50% (the so-called birthday paradox).

On a Unix-like machine, save it to a file, make it executable (chmod 0755 birthday.ml) and run it from the command line (./birthday.ml).

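#!/usr/bin/ocamlrun ocaml

(* Number of days in a year. *)
let size = 365. ;;

(* p is the probability that the first i-1 people all have distinct
   birthdays; the updated value covers i people. Stop at the first i
   for which that probability falls below 0.5. *)
let rec loop p i =
  let p' = (size -. (float (i-1))) *. p /. size in
  if p' < 0.5 then
    Printf.printf "answer = %d\n" i
  else
    loop p' (i+1) ;;
loop 1.0 2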

Factorial function (recursion and purely functional programming)

Many mathematical functions, such as factorial, are most naturally represented in a purely functional form. The following recursive, purely-functional OCaml function implements factorial:

  # let rec fact n = if n=0 then 1 else n * fact(n-1);;
  val fact : int -> int = <fun>

The function can be written equivalently using pattern matching:

  # let rec fact = function
      | 0 -> 1
      | n -> n * fact(n-1);;

This latter form is the mathematical definition of factorial as a recurrence relation.

Note that the compiler inferred the type of this function to be "int -> int", meaning that this function maps ints onto ints. For example, 12! is:

  # fact 12;;
  - : int = 479001600
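
Tail recursion, listed among the language features, gives a third way to write the same function. The following is a minimal sketch; the helper name loop and the accumulator acc are chosen here for illustration.

```ocaml
(* Accumulator-passing factorial: the recursive call is the last
   action performed by loop, so it runs in constant stack space. *)
let fact n =
  let rec loop acc n =
    if n = 0 then acc else loop (acc * n) (n - 1)
  in
  loop 1 n

let () = Printf.printf "%d\n" (fact 12)
```

The inferred type is still int -> int, and the result for 12 is the same as above.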

Arbitrary-precision factorial function (libraries)

A wide variety of libraries are directly accessible from OCaml. For example, OCaml has a built-in library for arbitrary-precision arithmetic. As the factorial function grows very rapidly, it quickly overflows machine-precision integers (typically 32 or 64 bits). Thus, factorial is a suitable candidate for arbitrary-precision arithmetic.

In OCaml, the Num module provides arbitrary-precision arithmetic and can be loaded into a running top-level using:

  # #load "nums.cma";;
  # open Num;;

We begin by defining aliases for zero and one in this arithmetic:

  # let zero = num_of_int 0 and one = num_of_int 1;;
  val zero : Num.num = Int 0
  val one : Num.num = Int 1

The factorial function may then be written using the operators =/, */ and -/ that are the equivalent of =, * and - for arbitrary-precision numbers (of the type Num.num):

  # let rec fact n = if n =/ zero then one else n */ fact(n -/ one);;
  val fact : Num.num -> Num.num = <fun>

This function can compute much larger factorials, such as 120!:

  # string_of_num (fact (num_of_int 120));;
  - : string =
  "6689502913449127057588118054090372586752746333138029810295671352301633
  55724496298936687416527198498130815763789321409055253440858940812185989
  8481114389650005964960521256960000000000000000000000000000"

Numerical derivative (higher-order functions)

Because OCaml is a functional programming language, it is easy to create and pass around functions in OCaml programs. This capability has an enormous number of applications. Calculating the numerical derivative of a function is one such application. The following OCaml function "d" computes the numerical derivative of a given function "f" at a given point "x":

  # let d delta f x =
      (f (x +. delta) -. f (x -. delta)) /. (2. *. delta);;
  val d : float -> (float -> float) -> float -> float = <fun>

This function requires a small value "delta". A good choice for delta is the square root of the machine epsilon.

The type of the function "d" indicates that it maps a "float" onto another function with the type "(float -> float) -> float -> float". This allows us to partially apply arguments. This functional style is known as currying. In this case, it is useful to partially apply the first argument "delta" to "d", to obtain a more specialised function:

  # let d = d (sqrt epsilon_float);;
  val d : (float -> float) -> float -> float = <fun>

Note that the inferred type indicates that the replacement "d" is expecting a function with the type "float -> float" as its first argument. We can compute a numerical approximation to the derivative of x^3-x-1 at x=3 with:

  # d (fun x -> x *. x *. x -. x -. 1.) 3.;;
  - : float = 26.

The correct answer is f'(x) = 3x^2-1 => f'(3) = 27-1 = 26.

The function "d" is called a "higher-order function" because it accepts another function ("f") as an argument.

The concepts of curried and higher-order functions are clearly useful in mathematical programs. In fact, these concepts are equally applicable to most other forms of programming and can be used to factor code much more aggressively, resulting in shorter programs and fewer bugs.
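
The same two ideas apply outside numerical code. The sketch below is illustrative only: compose, scale and double_then_succ are hypothetical names, while List.map and List.fold_left are standard-library functions.

```ocaml
(* compose f g is the function that applies g first, then f. *)
let compose f g x = f (g x)

(* scale is curried: scale 2. is itself a function float -> float. *)
let scale k x = k *. x

(* Build a new function from existing ones without naming its argument. *)
let double_then_succ = compose (fun x -> x +. 1.) (scale 2.)

(* Higher-order library functions take such functions as arguments. *)
let results = List.map double_then_succ [1.; 2.; 3.]
let total = List.fold_left ( +. ) 0. results

let () = Printf.printf "%g\n" total
```

Here partial application (scale 2.) and function composition replace what would otherwise be an explicit loop with temporary variables.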

Discrete Wavelet Transform (pattern matching)

The 1D Haar wavelet transform of a list of numbers whose length is an integer power of two can be implemented very succinctly in OCaml. It is an excellent example of pattern matching over lists: pairs of elements ("h1" and "h2") are taken off the front, and their sums and differences are stored on the lists "s" and "d", respectively:

  # let haar l =
      let rec aux l s d = match l, s, d with
        [s], [], d -> s :: d
      | [], s, d -> aux s [] d
      | h1 :: h2 :: t, s, d -> aux t (h1 + h2 :: s) (h1 - h2 :: d)
      | _ -> invalid_arg "haar" in
      aux l [] [];;
  val haar : int list -> int list = <fun>

For example:

  # haar [1; 2; 3; 4; -4; -3; -2; -1];;
  - : int list = [0; 20; 4; 4; -1; -1; -1; -1]

Pattern matching is an incredibly useful construct that allows complicated transformations to be represented clearly and succinctly. Moreover, the OCaml compiler turns pattern matches into very efficient code, resulting in programs that are not only much shorter but also much faster.
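
Pattern matching is not limited to lists. As a minimal sketch, the following defines a small expression type and applies one pass of algebraic simplification; the type expr and the functions simplify and eval are hypothetical names introduced for this illustration.

```ocaml
(* A tiny arithmetic expression language. *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr

(* Nested patterns state the simplification rules directly.
   This is a single top-down pass, not a full normaliser. *)
let rec simplify = function
  | Add (Num 0, e) | Add (e, Num 0) -> simplify e
  | Mul (Num 1, e) | Mul (e, Num 1) -> simplify e
  | Mul (Num 0, _) | Mul (_, Num 0) -> Num 0
  | Add (a, b) -> Add (simplify a, simplify b)
  | Mul (a, b) -> Mul (simplify a, simplify b)
  | Num _ as e -> e

(* Evaluate an expression to an integer. *)
let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

(* Represents 1 * 7 + 0. *)
let e = Add (Mul (Num 1, Num 7), Num 0)

let () = Printf.printf "%d\n" (eval (simplify e))
```

The compiler also checks such matches for exhaustiveness, so omitting the Num case in either function would produce a compile-time warning.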

Triangle (graphics)

The following program "simple.ml" renders a rotating triangle in 2D using OpenGL:

  let _ =
    Glut.initDisplayMode ~double_buffer:true ();
    ignore (Glut.createWindow ~title:"OpenGL Demo");
    let render () =
      GlClear.clear [ `color ];
      GlMat.rotate ~angle:(Sys.time() *. 0.01) ~z:1. ();
      GlDraw.begins `triangles;
      List.iter GlDraw.vertex2 [-1., -1.; 0., 1.; 1., -1.];
      GlDraw.ends ();
      Glut.swapBuffers () in
    Glut.displayFunc ~cb:render;
    Glut.idleFunc ~cb:(Some Glut.postRedisplay);
    Glut.mainLoop ()

The LablGL bindings to OpenGL are required. The program may then be compiled to bytecode with:

  $ ocamlc -I +lablgl unix.cma lablglut.cma lablgl.cma simple.ml -o simple

and run:

  $ ./simple

Far more sophisticated, high-performance 2D and 3D graphical programs are easily developed in OCaml. Thanks to the use of OpenGL, the resulting programs are not only succinct and efficient but also cross-platform, compiling without any changes on all major platforms.

Programs written in OCaml

Commonly used

  • MLdonkey - a popular multi-network P2P program
  • Unison file synchronizer

Fun

Education

  • Drgeocaml, a dynamic geometry software
  • MinCaml a small tutorial compiler written in OCaml.

Engineering

  • Confluence is a language for synchronous reactive system design. A Confluence program can generate digital logic for an FPGA or ASIC platform, or C code for hard real-time software.

Academic

The OCaml language is also widely used in academia outside computer science. Natural scientists are using OCaml for research into physics, chemistry, biology, bioinformatics and many more-specialised subjects.

Industrial

Wolfram Research and The MathWorks are using OCaml in the development of their commercial systems for technical computing. The medical division of General Electric is using OCaml in the development of software embedded in MRI scanners. Other companies are using OCaml for tasks ranging from financial analysis to presentation software.

See also