In computing, SPMD (single program, multiple data) is a technique employed to achieve parallelism; it is a subcategory of MIMD. Tasks are split up and run simultaneously on multiple processors with different input in order to obtain results faster. SPMD is the most common style of parallel programming. It is also a prerequisite for research concepts such as active messages and distributed shared memory.
|Single instruction||Multiple instruction|
SPMD vs SIMD 
In SPMD, multiple autonomous processors simultaneously execute the same program at independent points, rather than in the lockstep that SIMD imposes on different data. With SPMD, tasks can be executed on general purpose CPUs; SIMD requires vector processors to manipulate data streams. Note that the two are not mutually exclusive.
Distributed memory 
SPMD usually refers to message passing programming on distributed memory computer architectures. A distributed memory computer consists of a collection of independent computers, called nodes. Each node starts its own program and communicates with other nodes by sending and receiving messages, calling send/receive routines for that purpose. Barrier synchronization may also be implemented by messages. The messages can be sent by a number of communication mechanisms, such as TCP/IP over Ethernet, or specialized high-speed interconnects such as Myrinet and Supercomputer Interconnect. Serial sections of the program are implemented by identical computation on all nodes rather than computing the result on one node and sending it to the others.
On a shared memory machine (a computer with several CPUs that access the same memory space), messages can be sent by depositing their contents in a shared memory area. This is often the most efficient way to program shared memory computers with large number of processors, especially on NUMA machines, where memory is local to processors and accessing memory of another processor takes longer. SPMD on a shared memory machine is usually implemented by standard (heavyweight) processes.
Unlike SPMD, shared memory multiprocessing, also called symmetric multiprocessing or SMP, presents the programmer with a common memory space and the possibility to parallelize execution by having the program take different paths on different processors. The program starts executing on one processor and the execution splits in a parallel region, which is started when parallel directives are encountered. In a parallel region, the processors execute a single program on different data. A typical example is the parallel DO loop, where different processors work on separate parts of the arrays involved in the loop. At the end of the loop, execution is synchronized, only one processor continues, and the others wait. The current standard interface for shared memory multiprocessing is OpenMP. It usually implemented by lightweight processes, called threads.
Combination of levels of parallelism 
Current computers allow exploiting of many parallel modes at the same time for maximum combined effect. A distributed memory program using MPI may run on a collection of nodes. Each node may be a shared memory computer and execute in parallel on multiple CPUs using OpenMP. Within each CPU, SIMD vector instructions (usually generated automatically by the compiler) and superscalar instruction execution (usually handled transparently by the CPU itself), such as pipelining and the use of multiple parallel functional units, are used for maximum single CPU speed.
SPMD was proposed in 1984 by Frederica Darema at IBM for highly parallel machines like the RP3 (the IBM Research Parallel Processor Prototype), in an unpublished IBM memo. By late 1980s, there were many distributed computers with proprietary message passing libraries. The first SPMD standard was PVM. The current de facto standard is MPI.
- F. Darema, SPMD model: past, present and future, Recent Advances in Parallel Virtual Machine and Message Passing Interface: 8th European PVM/MPI Users' Group Meeting, Santorini/Thera, Greece, September 23–26, 2001. Lecture Notes in Computer Science 2131, p. 1, 2001.
- Parallel job management and message passing
- Single Program Multiple Data stream
- Distributed-memory programming
See also 
- Flynn's taxonomy - View diagram comparing classifications.