Deterministic algorithm

In computer science, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states. Deterministic algorithms are by far the most studied and familiar kind of algorithm, as well as one of the most practical, since they can be run on real machines efficiently.

Formally, a deterministic algorithm computes a mathematical function; a function has a unique value for any given input, and the algorithm is a process that produces this particular value as output.

Formal definition

Deterministic algorithms can be defined in terms of a state machine: a state describes what a machine is doing at a particular instant in time. State machines pass in a discrete manner from one state to another. Just after we enter the input, the machine is in its initial state or start state. If the machine is deterministic, this means that from this point onwards, its current state determines what its next state will be; its course through the set of states is predetermined. Note that a machine can be deterministic and still never stop or finish, and therefore fail to deliver a result.

Examples of particular abstract machines which are deterministic include the deterministic Turing machine and deterministic finite automaton.
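
The state-machine view above can be illustrated with a small deterministic finite automaton. This is only a sketch: the automaton (accepting binary strings with an even number of 1s), the state names, and the helper names run and accepts are all invented for illustration.

```python
# Hypothetical DFA: accepts binary strings containing an even number of '1's.
TRANSITIONS = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}


def run(word, start="even"):
    """Return the full sequence of states visited while reading `word`."""
    states = [start]
    for symbol in word:
        # The current state and the input symbol fix the next state uniquely.
        states.append(TRANSITIONS[(states[-1], symbol)])
    return states


def accepts(word):
    """The word is accepted if the machine ends in the 'even' state."""
    return run(word)[-1] == "even"
```

Running the machine twice on the same word yields the same sequence of states, not just the same final answer, which is the property the definition asks for.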

What makes algorithms non-deterministic?

Several factors can cause an algorithm to behave non-deterministically:

If it uses external state other than the input, such as user input, a global variable, a hardware timer value, a random value, or stored disk data.
If it operates in a way that is timing-sensitive, for example if it has multiple processors writing to the same data at the same time. In this case, the precise order in which each processor writes its data will affect the result.
If a hardware error causes its state to change in an unexpected way.
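
The timing-sensitive case can be made concrete without real threads by simulating the order of operations explicitly. In the sketch below, two hypothetical workers each try to increment a shared counter using a separate read and write; the function run_schedule and the schedule format are assumptions made for this illustration.

```python
def run_schedule(schedule):
    """Simulate two workers that each perform `shared = shared + 1`
    as a separate read step and write step, in the given order."""
    shared = 0
    local = {}  # the value each worker saw at its last read
    for worker, action in schedule:
        if action == "read":
            local[worker] = shared
        else:  # "write"
            shared = local[worker] + 1
    return shared


# Worker A finishes before worker B starts: both increments are kept.
sequential = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

# Both workers read before either writes: one increment is lost.
interleaved = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]
```

The two schedules contain the same operations, yet the first leaves the counter at 2 and the second at 1. On real hardware the schedule is chosen by timing rather than by the program, which is what makes the result non-deterministic.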

Although real programs are rarely purely deterministic, it is easier for humans as well as other programs to reason about programs that are. For this reason, most programming languages and especially functional programming languages make an effort to prevent the above events from happening except under controlled conditions.
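
One common way of keeping external state under control is to pass it in as an explicit argument. The following sketch contrasts a function that reads a hardware clock with one that receives the clock value as input; both function names are hypothetical.

```python
import time


def stamp_hidden(message):
    # Reads a hardware clock: the same input can give a different output on each call.
    return f"{time.time_ns()}: {message}"


def stamp(message, now_ns):
    # The clock value is an explicit input, so the result depends only on the arguments.
    return f"{now_ns}: {message}"
```

The second function is deterministic in the sense used in this article, and it is easier to test, because the caller decides what time it is.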

The prevalence of multi-core processors has resulted in a surge of interest in determinism in parallel programming, and the challenges of non-determinism have been well documented.^[1]^[2] A number of tools have been proposed^[3]^[4]^[5]^[6]^[7] to help deal with deadlocks and race conditions.

Problems with deterministic algorithms

Unfortunately, for some problems deterministic algorithms are also hard to find. For example, there are simple and efficient probabilistic algorithms that determine whether a given number is prime and have a very small chance of being wrong. These have been known since the 1970s (see for example Fermat primality test); the known deterministic algorithms remain considerably slower in practice.
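
A minimal sketch of the Fermat primality test mentioned above, assuming a hypothetical function name is_probable_prime and an arbitrary default of 20 rounds. It relies on Fermat's little theorem: if n is prime, then a^(n-1) ≡ 1 (mod n) for every a with 1 < a < n - 1.

```python
import random


def is_probable_prime(n, rounds=20, rng=None):
    """Fermat test: False means certainly composite, True means probably prime."""
    rng = rng or random.Random()
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base with 2 <= a <= n - 2
        if pow(a, n - 1, n) != 1:
            return False  # a is a witness that n is composite
    return True
```

The test never rejects a prime, but it can accept a composite number if every randomly chosen base happens to satisfy the congruence. The result for such inputs depends on the random choices, so the algorithm is not deterministic.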

As another example, NP-complete problems, which include many of the most important practical problems, can be solved quickly using a machine called a nondeterministic Turing machine, but no efficient deterministic algorithm has been found for any of them. At best, we can currently only find approximate solutions or solutions in special cases.

Another major problem with deterministic algorithms is that sometimes the results should not be predictable. For example, if an online blackjack game shuffles its deck using a pseudorandom number generator, a clever gambler might work out exactly which numbers the generator will choose and so determine the entire contents of the deck ahead of time, allowing them to cheat. The Software Security Group at Reliable Software Technologies was able to do this for an implementation of Texas Hold 'em Poker distributed by ASF Software, Inc., consistently predicting the outcome of hands ahead of time.^[8] Similar problems arise in cryptography, where private keys are often generated using such a generator. This sort of problem is generally avoided by using a cryptographically secure pseudo-random number generator.
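
The predictability problem can be sketched as follows, assuming an attacker who learns or guesses the seed. The function names are hypothetical, and the code is not a reconstruction of the poker implementation cited above.

```python
import random


def shuffled_deck(seed):
    """Shuffle a 52-card deck with a seeded, deterministic generator."""
    deck = list(range(52))
    random.Random(seed).shuffle(deck)
    return deck


def secure_shuffled_deck():
    """Shuffle using the operating system's entropy source instead of a seed."""
    deck = list(range(52))
    random.SystemRandom().shuffle(deck)
    return deck
```

Anyone who calls shuffled_deck with the same seed obtains the same deck order, which is exactly the determinism a gambler could exploit. random.SystemRandom takes no usable seed, so its output cannot be reproduced this way.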

Failure / Success in algorithms

Exceptions

Throwing an exception is a common mechanism for signalling failure caused by an unexpected or undesired state.

Failure as a return value

To avoid the problem of unhandled exceptions, which may prevent a program from terminating normally, the approach of total functional programming is to wrap the result of a partial function in an option type.

The option type in ML and the Maybe type in Haskell:

(* Standard ML *)
datatype 'a option = NONE | SOME of 'a

(* OCaml *)
type 'a option = None | Some of 'a

-- Haskell
data Maybe a = Nothing | Just a

The Either type in Haskell also includes the reason for the failure:

data Either errorType resultType = Left errorType | Right resultType

Failure in Monads, the Left zero property

Because monads model sequential composition, the left zero property (z * s = z) in a monad means that the right side of the sequence will not be evaluated.

-- Left zero in the Maybe monad
  Nothing >> k = Nothing
  Nothing >>= f = Nothing

-- Left zero in the Either monad
  Left err >> k = Left err
  Left err >>= f = Left err
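
The left zero behaviour can also be imitated outside Haskell. The sketch below is a loose Python analogue, not Haskell's implementation: None stands in for Nothing, and the helper names safe_div, bind and record_and_halve are invented for illustration.

```python
def safe_div(x, y):
    """Partial function made total: None signals an out-of-domain input."""
    return None if y == 0 else x / y


def bind(value, f):
    # Left zero: a failed value short-circuits and `f` is never called.
    return None if value is None else f(value)


calls = []  # records every value the second step actually receives


def record_and_halve(v):
    calls.append(v)
    return safe_div(v, 2)


ok = bind(safe_div(8, 2), record_and_halve)      # first step succeeds
failed = bind(safe_div(8, 0), record_and_halve)  # first step fails, second is skipped
```

After both calls, the list calls holds only the value from the successful chain, showing that the right side of the failed sequence was not evaluated. Unlike Maybe, this encoding cannot distinguish failure from a legitimate result of None.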

Determinism categories in languages

Mercury

This logic-functional programming language establishes different determinism categories for predicate modes, as explained in the references.^[9]^[10]

Haskell

Haskell provides several mechanisms:

Non-determinism, or the notion of failure:

The Maybe and Either types include the notion of success in the result.
The fail method, originally in the class Monad and in current versions of GHC in the class MonadFail, may be used to signal failure as an exception.
The Maybe monad and the MaybeT monad transformer provide for failed computations (they stop the computation sequence and return Nothing).^[11]

Determinism or non-determinism with multiple solutions: all possible outcomes of a computation with multiple results can be retrieved by wrapping its result type in a MonadPlus monad (its method mzero makes an outcome fail and mplus collects the successful results).^[12]
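
The idea of collecting every outcome can be approximated with plain lists. This is a loose Python analogue under stated assumptions: an empty list plays the role of mzero, list concatenation plays the role of mplus, and the names bind_all and pairs_summing_to are hypothetical.

```python
def bind_all(outcomes, f):
    # Apply f to every outcome and concatenate the resulting lists.
    return [y for x in outcomes for y in f(x)]


def pairs_summing_to(total, candidates):
    """Return every pair (a, b) with a < b drawn from candidates and a + b == total."""
    def for_a(a):
        def for_b(b):
            # [] plays the role of mzero: this branch contributes no outcome.
            return [(a, b)] if a < b and a + b == total else []
        return bind_all(candidates, for_b)
    return bind_all(candidates, for_a)
```

For example, pairs_summing_to(5, [1, 2, 3, 4]) returns [(1, 4), (2, 3)], and a total that no pair reaches gives the empty list, meaning that every branch failed.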

ML family and derived languages

In Standard ML, OCaml and Scala, the option type includes the notion of success.

Java

The null reference value may represent an unsuccessful (out-of-domain) result.

References

[1] Edward A. Lee. "The Problem with Threads" (PDF). Retrieved 2009-05-29.
[2] James Reinders. "Parallel terminology definitions". Retrieved 2009-05-29.
[3] "Intel Parallel Inspector Thread Checker". Retrieved 2009-05-29.
[4] Yuan Lin. "Data Race and Deadlock Detection with Sun Studio Thread Analyzer" (PDF). Retrieved 2009-05-29.
[5] Intel. "Intel Parallel Inspector". Retrieved 2009-05-29.
[6] David Worthington. "Intel addresses development life cycle with Parallel Studio". Retrieved 2009-05-26.
[7] Parallel Studio
[8] Gary McGraw and John Viega. "Make your software behave: Playing the numbers: How to cheat in online gambling". http://www.ibm.com/developerworks/library/s-playing/#h4
[9] Determinism categories in the Mercury programming language
[10] Mercury predicate modes
[11] Representing failure using the Maybe monad
[12] The class MonadPlus
