Concurrent computing

For a more theoretical discussion, see Concurrency (computer science).

Concurrent computing is a form of computing in which several computations are executing during overlapping time periods – concurrently – instead of sequentially (one completing before the next starts). This is a property of a system – this may be an individual program, a computer, or a network – and there is a separate execution point or "thread of control" for each computation ("process"). A concurrent system is one where a computation can make progress without waiting for all other computations to complete – where more than one computation can make progress at "the same time" (see definition, below).[1]

As a programming paradigm, concurrent computing is a form of modular programming, namely factoring an overall computation into subcomputations that may be executed concurrently. Pioneers in the field of concurrent computing include Edsger Dijkstra, Per Brinch Hansen, and C.A.R. Hoare.

Definition

Concurrent computing is related to but distinct from parallel computing, though these concepts are frequently confused,[2] and both can be described as "multiple processes executing at the same time". In parallel computing, execution literally occurs at the same instant, for example on separate processors of a multi-processor machine – parallel computing is impossible on a (single-core) single processor, as only one computation can occur at any instant (during any single clock cycle).[a] By contrast, concurrent computing consists of process lifetimes overlapping, but execution need not happen at the same instant.

For example, concurrent processes can be executed on a single core by interleaving the execution steps of each process via time slices: only one process runs at a time, and if it does not complete during its time slice, it is paused, another process begins or resumes, and then later the original process is resumed. In this way multiple processes are part-way through execution at a single instant, but only one process is being executed at that instant.
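The interleaving described above can be sketched with Python generators standing in for processes. This is only an illustration under stated assumptions: `process` and `round_robin` are hypothetical names, each `yield` marks the end of one time slice, and a real scheduler would preempt on a timer rather than rely on the process to yield.

```python
from collections import deque

def process(name, steps):
    """A hypothetical process: each yield marks the end of one time slice."""
    for i in range(1, steps + 1):
        yield f"{name}:step{i}"

def round_robin(processes):
    """Run generator-based processes one slice at a time on a single 'core'."""
    ready = deque(processes)
    trace = []
    while ready:
        current = ready.popleft()
        try:
            trace.append(next(current))  # run exactly one time slice
            ready.append(current)        # not finished: pause it and requeue
        except StopIteration:
            pass                         # finished: drop it from the ready queue
    return trace

print(round_robin([process("A", 3), process("B", 2)]))
# ['A:step1', 'B:step1', 'A:step2', 'B:step2', 'A:step3']
```

Both processes are part-way through execution for most of the run, yet only one of them executes a step at any moment.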

Concurrent computations may be executed in parallel,[2][3] for example by assigning each process to a separate processor or processor core, or distributing a computation across a network. This is known as task parallelism, and this type of parallel computing is a form of concurrent computing.

By contrast, parallel computing by data parallelism may or may not be concurrent computing – a single process may control all computations, in which case it is not concurrent, or the computations may be spread across several processes, in which case this is concurrent. For example, SIMD (single instruction, multiple data) processing is (data) parallel but not concurrent – multiple computations are happening at the same instant (in parallel), but there is only a single process. Examples of this include vector processors and graphics processing units (GPUs). By contrast, MIMD (multiple instruction, multiple data) processing is both data parallel and task parallel, and is concurrent; this is commonly implemented as SPMD (single program, multiple data), where multiple instances of the same program execute concurrently and in parallel on different data.

The exact timing of when tasks in a concurrent system are executed depends on the scheduling, and tasks need not always be executed concurrently. For example, given two tasks, T1 and T2:

  • T1 may be executed and finished before T2
  • T2 may be executed and finished before T1
  • T1 and T2 may be executed alternately (time-slicing)
  • T1 and T2 may be executed simultaneously at the same instant of time (parallelism)
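The first three possibilities can be enumerated mechanically. The sketch below assumes each task consists of two indivisible steps (the step names are invented for illustration) and lists every single-core schedule that keeps each task's own steps in order. True simultaneous execution cannot be represented as a single sequence, so it is not modelled here.

```python
def interleavings(t1, t2):
    """Yield every schedule that preserves each task's internal step order."""
    if not t1 or not t2:
        yield list(t1) + list(t2)   # one task is done: the other runs to completion
        return
    for rest in interleavings(t1[1:], t2):
        yield [t1[0]] + rest        # the scheduler picks T1's next step
    for rest in interleavings(t1, t2[1:]):
        yield [t2[0]] + rest        # the scheduler picks T2's next step

T1 = ["T1.a", "T1.b"]
T2 = ["T2.a", "T2.b"]
schedules = list(interleavings(T1, T2))

for schedule in schedules:
    print(schedule)
```

With two steps per task there are six schedules: the two serial orders (T1 then T2, T2 then T1) and four time-sliced ones.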

The word "sequential" is used as an antonym for both "concurrent" and "parallel"; when these are explicitly distinguished, concurrent/sequential and parallel/serial are used as opposing pairs.[4]

Examples

Concurrency is pervasive in computing, occurring from low-level hardware on a single chip to world-wide networks. Examples follow.

At the hardware level:

At the programming language level:

At the operating system level:

At the network level, networked systems are generally concurrent by their nature, as they consist of separate devices.

A widespread basic example of concurrency in software engineering is a software pipeline, which treats the individual steps in the pipeline as filters operating on a stream. For example, some or all phases of a compiler may be structured as a pipeline. Today this is most common for the compiler frontend, structured as a lexer followed by a parser: the lexer converts a stream of source code into a stream of tokens, which the parser then converts into a parse tree. The same structure can be used for other computer languages such as HTML. Further pipeline stages may follow, notably in a one-pass compiler, though most modern compilers in most uses are multi-pass and do not operate as strict pipelines.
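As a rough illustration of a lexer feeding a parser, the following sketch uses Python generators so that the parser pulls tokens on demand and the two stages interleave. The toy grammar (parenthesized lists of atoms) and the names `lex` and `parse` are assumptions made for this example, not part of any real compiler.

```python
import re

# One token: an opening paren, a closing paren, or a run of other characters.
TOKEN = re.compile(r"\s*(\(|\)|[^\s()]+)")

def lex(source):
    """Lexer stage: turn a source string into a lazy stream of tokens."""
    text = source.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        yield match.group(1)
        pos = match.end()

def parse(tokens):
    """Parser stage: consume tokens one at a time and build a nested-list tree."""
    tokens = iter(tokens)

    def expr(token):
        if token == "(":
            node = []
            for t in tokens:            # keeps pulling from the same stream
                if t == ")":
                    return node
                node.append(expr(t))
            raise SyntaxError("missing ')'")
        if token == ")":
            raise SyntaxError("unexpected ')'")
        return token

    return expr(next(tokens))

tree = parse(lex("(add 1 (mul 2 3))"))
print(tree)  # ['add', '1', ['mul', '2', '3']]
```

The lexer never materializes the full token list; each token is produced only when the parser asks for it.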

Interactions

Concurrent computing can occur without interactions between the processes, for example with isolated virtual machines or time-sharing systems. This type of concurrency is transparent to users (other than speed), and poses no added complexity to programmers, though scheduling must be done by the operating system.

The simplest non-trivial interaction between concurrent processes is a pipeline – narrowly speaking, a linear, one-way process, where the output of one stage is the input for the next. This is of the same complexity as function composition of a fixed set of functions (without recursion, for instance), and can be analyzed similarly.
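The correspondence with function composition can be sketched as follows. Each stage runs in its own thread and is connected to the next by a FIFO queue; `stage`, `run_pipeline`, `double` and `increment` are hypothetical names chosen for this example.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream

def stage(func, inbox, outbox):
    """Apply func to each item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))

def run_pipeline(funcs, items):
    """Connect one thread per function with queues; return outputs in order."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results

def double(x):
    return x * 2

def increment(x):
    return x + 1

print(run_pipeline([double, increment], [1, 2, 3]))  # [3, 5, 7]
```

Because every stage is a single thread reading from a FIFO queue, the output order matches the input order, and the result equals applying `increment(double(x))` to each item sequentially.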

Many concurrent systems feature more complex interactions between the processes, analogous to mutual recursion of functions, which can result in significant complexity and even nondeterminism in results due to race conditions, since the order of individual steps may vary, depending on how the processes are scheduled. These interactions are often communication via message passing, which may be synchronous or asynchronous, or access to shared resources. The main challenge in designing concurrent programs is concurrency control: ensuring the correct sequencing of the interactions or communications between different computational executions, and coordinating access to resources that are shared among executions.[3] Problems that may occur include nondeterminism (from race conditions), deadlock, and resource starvation. In the case of multi-threaded programming, interaction occurs via shared memory, and thread safety refers to code that behaves correctly when executed concurrently.

Implementation

A number of different methods can be used to implement concurrent programs, such as implementing each computational execution as an operating system process, or implementing the computational processes as a set of threads within a single operating system process.

Concurrent interaction and communication

In some concurrent computing systems, communication between the concurrent components is hidden from the programmer (e.g., by using futures), while in others it must be handled explicitly. Explicit communication can be divided into two classes:

Shared memory communication 
Concurrent components communicate by altering the contents of shared memory locations (exemplified by Java and C#). This style of concurrent programming usually requires the application of some form of locking (e.g., mutexes, semaphores, or monitors) to coordinate between threads.
Message passing communication 
Concurrent components communicate by exchanging messages (exemplified by Scala, Erlang and occam). The exchange of messages may be carried out asynchronously, or may use a rendezvous style in which the sender blocks until the message is received. Asynchronous message passing may be reliable or unreliable (sometimes referred to as "send and pray"). Message-passing concurrency tends to be far easier to reason about than shared-memory concurrency, and is typically considered a more robust form of concurrent programming.[citation needed] A wide variety of mathematical theories for understanding and analyzing message-passing systems are available, including the Actor model, and various process calculi. Message passing can be efficiently implemented on symmetric multiprocessors, with or without shared coherent memory.
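The two explicit styles can be contrasted on the same small task, summing several chunks of numbers. This is a sketch under assumptions: the function names are hypothetical, and because Python threads always share an address space, the message-passing variant only imitates a shared-nothing design by having workers communicate solely through a queue.

```python
import queue
import threading

def shared_memory_sum(chunks):
    """Workers add into one shared variable, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:          # only one thread may update total at a time
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

def message_passing_sum(chunks):
    """Workers touch no shared state; each sends its partial sum as a message."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))  # asynchronous send: does not wait for the receiver

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    total = sum(mailbox.get() for _ in chunks)  # receive one message per worker
    for t in threads:
        t.join()
    return total

chunks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(shared_memory_sum(chunks), message_passing_sum(chunks))  # 45 45
```

In the first version correctness depends on every writer remembering to take the lock; in the second, the queue is the only point of contact between workers and the collector.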

Shared memory and message passing concurrency have different performance characteristics. Typically (although not always), the per-process memory overhead and task switching overhead are lower in a message passing system, but the overhead of message passing itself is greater than that of a procedure call. These differences are often overwhelmed by other performance factors.

Coordinating access to resources

One of the major issues in concurrent computing is preventing concurrent processes from interfering with each other. For example, consider the following algorithm for making withdrawals from a checking account represented by the shared resource balance:

  1.   bool withdraw( int withdrawal )
  2.   {
  3.      if ( balance >= withdrawal )
  4.      {
  5.          balance -= withdrawal;
  6.          return true;
  7.      }
  8.      return false;
  9.   }

Suppose balance=500, and two concurrent threads make the calls withdraw(300) and withdraw(350). If line 3 in both operations executes before line 5, both operations will find that balance >= withdrawal evaluates to true, and execution will proceed to subtracting the withdrawal amount. However, since both threads perform their withdrawals, the total amount withdrawn will end up being more than the original balance. These sorts of problems with shared resources require the use of concurrency control, or non-blocking algorithms.
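One common form of concurrency control is to make the check and the subtraction a single critical section. The sketch below is one possible approach, not the only one: the `Account` class and `run_two_withdrawals` helper are hypothetical, and a mutex (`threading.Lock`) is assumed as the locking primitive.

```python
import threading

class Account:
    """A balance whose check-then-update sequence is protected by a lock."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        with self._lock:                 # check and update happen atomically
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False

def run_two_withdrawals():
    """Replay the scenario from the text: 500 in the account, 300 and 350 requested."""
    account = Account(500)
    results = {}

    def attempt(amount):
        results[amount] = account.withdraw(amount)

    threads = [threading.Thread(target=attempt, args=(a,)) for a in (300, 350)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance, results

balance, results = run_two_withdrawals()
print(balance, results)
```

Which withdrawal succeeds still depends on scheduling, so the final balance is either 200 or 150, but the two can no longer both succeed and the balance cannot go negative.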

Because concurrent systems rely on the use of shared resources (including communication media), concurrent computing in general requires the use of some form of arbiter somewhere in the implementation to mediate access to these resources.

Unfortunately, while many solutions exist to the problem of a conflict over one resource, many of those "solutions" have their own concurrency problems such as deadlock when more than one resource is involved.
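A typical case is a transfer between two accounts, each guarded by its own lock: if one thread locks A then B while another locks B then A, each can end up waiting for the other forever. One widely used remedy, sketched here under assumptions, is to acquire locks in a fixed global order. The `Account` class, the `transfer` function, and the choice of ordering by a unique `name` are all hypothetical.

```python
import threading

class Account:
    def __init__(self, name, balance):
        self.name = name              # assumed unique; used only to order locks
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    """Move amount from src to dst, always locking accounts in name order."""
    if src is dst:
        raise ValueError("cannot transfer to the same account")
    first, second = sorted((src, dst), key=lambda account: account.name)
    with first.lock, second.lock:     # same order in every thread: no lock cycle
        if src.balance < amount:
            return False
        src.balance -= amount
        dst.balance += amount
        return True

def demo(rounds=1000):
    """Two threads transfer in opposite directions at the same time."""
    a, b = Account("A", 100), Account("B", 100)

    def shuttle(src, dst):
        for _ in range(rounds):
            transfer(src, dst, 1)

    t1 = threading.Thread(target=shuttle, args=(a, b))
    t2 = threading.Thread(target=shuttle, args=(b, a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a.balance, b.balance

print(demo())
```

The individual balances at the end depend on scheduling, but the run always terminates and the total of 200 is preserved.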

Advantages

  • Increased application throughput – parallel execution of a concurrent program allows the number of tasks completed in a given time period to increase.
  • High responsiveness for input/output – input/output-intensive applications mostly wait for input or output operations to complete. Concurrent programming allows the time that would be spent waiting to be used for another task.
  • More appropriate program structure – some problems and problem domains are well-suited to representation as concurrent tasks or processes.
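The input/output point can be illustrated with Python's `asyncio`. In this sketch `asyncio.sleep` stands in for a network or disk wait, and the task names and delays are invented for the example. While one task is waiting, the event loop runs the others, so the waits overlap instead of adding up.

```python
import asyncio

async def fetch(name, delay, log):
    """Hypothetical I/O-bound task; asyncio.sleep stands in for the wait."""
    await asyncio.sleep(delay)   # yields control to other tasks while waiting
    log.append(name)             # record the order in which tasks finish
    return name

async def main():
    log = []
    results = await asyncio.gather(
        fetch("slow", 0.05, log),
        fetch("medium", 0.03, log),
        fetch("fast", 0.01, log),
    )
    return results, log

results, completion_order = asyncio.run(main())
print(results)           # results are returned in the order the tasks were passed
print(completion_order)  # tasks finish in order of their delays
```

All three tasks are started before any of them finishes, so the shortest wait completes first even though it was submitted last.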

Concurrent programming languages

Concurrent programming languages are programming languages that use language constructs for concurrency. These constructs may involve multi-threading, support for distributed computing, message passing, shared resources (including shared memory) or futures and promises. Such languages are sometimes described as Concurrency Oriented Languages or Concurrency Oriented Programming Languages (COPL).[5]
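As an example of one such construct, a future is a placeholder for a result that is still being computed. The sketch below uses Python's standard `concurrent.futures` library rather than a dedicated language construct; `square` and `squares_via_futures` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

def squares_via_futures(numbers):
    """Submit each computation and collect the results through futures."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # submit() returns a Future immediately; the work runs in the pool
        futures = [pool.submit(square, n) for n in numbers]
        # result() blocks until that particular value is available
        return [future.result() for future in futures]

print(squares_via_futures([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

The caller never handles threads, locks, or messages directly; the synchronization is hidden behind `result()`.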

Today, the most commonly used programming languages that have specific constructs for concurrency are Java and C#. Both of these languages fundamentally use a shared-memory concurrency model, with locking provided by monitors (although message-passing models can be, and have been, implemented on top of the underlying shared-memory model). Of the languages that use a message-passing concurrency model, Erlang is probably the most widely used in industry at present.[citation needed]

Many concurrent programming languages have been developed more as research languages (e.g. Pict) rather than as languages for production use. However, languages such as Erlang, Limbo, and occam have seen industrial use at various times in the last 20 years. Languages in which concurrency plays an important role include:

  • Ada – general-purpose programming language with native support for message passing and monitor-based concurrency.
  • Alef – concurrent language with threads and message passing, used for systems programming in early versions of Plan 9 from Bell Labs
  • Alice – extension to Standard ML, adds support for concurrency via futures.
  • Ateji PX – an extension to Java with parallel primitives inspired from pi-calculus
  • Axum – domain specific concurrent programming language, based on the Actor model and on the .NET Common Language Runtime using a C-like syntax.
  • Chapel – a parallel programming language being developed by Cray Inc.
  • Charm++ – C++-like language for thousands of processors.
  • Cilk – a concurrent C
  • Cω – C Omega, a research language extending C#, uses asynchronous communication
  • C# – supports concurrent computing since version 5.0 using lock, yield, async and await keywords, as well as the TPL
  • Clojure – a modern Lisp targeting the JVM
  • Concurrent Clean – a functional programming language, similar to Haskell
  • Concurrent Collections (CnC) – achieves implicit parallelism independent of memory model by explicitly defining data and control flow
  • Concurrent Haskell – lazy, pure functional language operating concurrent processes on shared memory
  • Concurrent ML – a concurrent extension of Standard ML
  • Concurrent Pascal – by Per Brinch Hansen
  • Curry
  • D – multi-paradigm system programming language with explicit support for concurrent programming (Actor model)
  • E – uses promises, ensures deadlocks cannot occur
  • ECMAScript – promises available in various libraries, proposed for inclusion in standard in ECMAScript 6
  • Eiffel – through its SCOOP mechanism based on the concepts of Design by Contract
  • Elixir – dynamic and functional meta-programming aware language running on the Erlang VM.
  • Erlang – uses asynchronous message passing with nothing shared
  • Faust – Realtime functional programming language for signal processing. The Faust compiler provides automatic parallelization using either OpenMP or a specific work-stealing scheduler.
  • Fortran – coarrays and "do concurrent" are part of the Fortran 2008 standard
  • Go – systems programming language with explicit support for concurrent programming
  • Hume – functional, concurrent language for bounded space and time environments, where automata processes are described by synchronous channel patterns and message passing.
  • Io – actor-based concurrency
  • Janus – features distinct "askers" and "tellers" to logical variables, bag channels; is purely declarative
  • JoCaml – concurrent and distributed channel-based language (extension of OCaml) that implements the join-calculus of processes.
  • Join Java – concurrent language based on the Java programming language
  • Joule – dataflow language, communicates by message passing
  • Joyce – a concurrent teaching language built on Concurrent Pascal with features from CSP by Per Brinch Hansen
  • LabVIEW – graphical, dataflow programming language, in which functions are nodes in a graph and data is wires between those nodes. Includes object oriented language extensions.
  • Limbo – relative of Alef, used for systems programming in Inferno (operating system)
  • MultiLisp – Scheme variant extended to support parallelism
  • Modula-2 – systems programming language by N.Wirth as a successor to Pascal with native support for coroutines.
  • Modula-3 – modern language in Algol family with extensive support for threads, mutexes, condition variables.
  • Newsqueak – research language with channels as first-class values; predecessor of Alef
  • occam – influenced heavily by Communicating Sequential Processes (CSP).
  • Orc – a heavily concurrent, nondeterministic language based on Kleene algebra.
  • Oz – multiparadigm language, supports shared-state and message-passing concurrency, and futures
  • ParaSail – a pointer-free, data-race-free, object-oriented parallel programming language
  • Pict – essentially an executable implementation of Milner's π-calculus
  • Perl with AnyEvent and Coro
  • Python with Twisted, greenlet and gevent.
  • Reia – uses asynchronous message passing between shared-nothing objects
  • Rust – a systems language with a focus on massive concurrency, utilizing message-passing with move semantics, shared immutable memory, and shared mutable memory that is provably free of data races.[6]
  • SALSA – actor language with token-passing, join, and first-class continuations for distributed computing over the Internet
  • Scala – a general purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way
  • SequenceL – general purpose functional programming language whose primary design objectives are ease of programming, code clarity/readability, and automatic parallelization for performance on multicore hardware, and which is provably free of race conditions
  • SR – research language
  • Stackless Python
  • StratifiedJS – a combinator-based concurrency language based on JavaScript
  • SuperPascal – a concurrent teaching language built on Concurrent Pascal and Joyce by Per Brinch Hansen
  • Swift (parallel scripting language) – a concurrent programming language with a C syntax for massively parallel architectures
  • Unicon – Research language.
  • Termite Scheme – adds Erlang-like concurrency to Scheme
  • TNSDL – a language used for developing telecommunication exchanges, uses asynchronous message passing
  • VHDL – VHSIC Hardware Description Language, aka IEEE STD-1076
  • XC – a concurrency-extended subset of the C programming language developed by XMOS based on Communicating Sequential Processes. The language also offers built-in constructs for programmable I/O.

Many other languages provide support for concurrency in the form of libraries (at a level roughly comparable with the above list).

Models of concurrency

There are several models of concurrent computing, which can be used to understand and analyze concurrent systems. These models include:

History

Concurrent computing developed out of earlier work on railroads and telegraphy, from the 19th and early 20th century, and some terms date to this period, such as semaphores. These arose to address the question of how to handle multiple trains on the same railroad system (avoiding collisions and maximizing efficiency) and how to handle multiple transmissions over a given set of wires (improving efficiency), such as via time-division multiplexing (1870s).

The academic study of concurrent algorithms started in the 1960s, with Dijkstra (1965) credited as the first paper in this field, identifying and solving the mutual exclusion problem.[7]

Notes

  a. This is discounting parallelism internal to a processor core, such as pipelining or vectorized instructions. A single-core, single-processor machine may be capable of some parallelism, such as with a coprocessor, but the processor itself is not.

References

  1. Silberschatz, Abraham. Operating System Concepts (9th ed.). "Chapter 4: Threads".
  2. Pike, Rob (January 11, 2012). "Concurrency is not Parallelism". Waza conference.
  3. Ben-Ari, Mordechai (2006). Principles of Concurrent and Distributed Programming (2nd ed.). Addison-Wesley. ISBN 978-0-321-31283-9.
  4. Patterson & Hennessy 2013, p. 503.
  5. Armstrong, Joe (2003). "Making reliable distributed systems in the presence of software errors".
  6. Blum, Ben (2012). "Typesafe Shared Mutable State". Retrieved 2012-11-14.
  7. "PODC Influential Paper Award: 2002". ACM Symposium on Principles of Distributed Computing. Retrieved 2009-08-24.
  • Patterson, David A.; Hennessy, John L. (2013). Computer Organization and Design: The Hardware/Software Interface. The Morgan Kaufmann Series in Computer Architecture and Design (5th ed.). Morgan Kaufmann. ISBN 978-0-12407886-4.
