Synchronization (computer science)
In computer science, synchronization refers to one of two distinct but related concepts: synchronization of processes and synchronization of data. In process synchronization, multiple processes join up or handshake at a certain point in order to reach an agreement or commit to a certain sequence of actions. Data synchronization means keeping multiple copies of a dataset coherent with one another, or maintaining data integrity. Process synchronization primitives are commonly used to implement data synchronization.
Thread or process synchronization
Thread synchronization or serialization, strictly defined, is the application of particular mechanisms to ensure that two or more concurrently executing threads or processes do not execute specific portions of a program at the same time. This property is referred to as mutual exclusion. A serialized portion of the program is called a critical section: if one thread has begun to execute it, any other thread trying to execute the same portion must wait until the first thread finishes. Without such synchronization measures, a race condition can occur, in which the values of variables depend on the timing of thread or process context switches.
Synchronization is used to control access to state both in small-scale multiprocessing systems (multithreaded environments and multiprocessor computers) and in distributed computers consisting of thousands of units (banking and database systems, web servers, and so on).
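As an illustration of mutual exclusion, the following minimal Python sketch protects a shared counter with a lock from the standard `threading` module. The names `counter`, `counter_lock`, `increment` and `run` are chosen for this example only. The read-modify-write step inside the `with` block is the critical section; without the lock, updates from different threads could interleave and some increments could be lost.

```python
import threading

counter = 0                      # shared state
counter_lock = threading.Lock()  # guards every access to counter


def increment(times):
    """Add 1 to the shared counter `times` times."""
    global counter
    for _ in range(times):
        # Critical section: only one thread at a time may run this block.
        with counter_lock:
            counter += 1


def run(num_threads=4, times=10_000):
    """Reset the counter, run the worker threads, and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=increment, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every worker has finished
    return counter
```

Because every increment happens while the lock is held, `run()` returns exactly `num_threads * times` regardless of how the threads are scheduled.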
- Lock (computer science) and mutex
- Monitor (synchronization)
- Semaphore (programming)
- Simple Concurrent Object-Oriented Programming (SCOOP)
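Of the primitives listed above, a counting semaphore generalizes a lock by admitting up to a fixed number of threads at once. The sketch below uses `threading.Semaphore` from the Python standard library; the function `run_workers` and the bookkeeping dictionary `state` are hypothetical names used only to observe how many workers are inside the guarded region at the same time.

```python
import threading
import time


def run_workers(num_workers=6, max_concurrent=2):
    """Run workers gated by a semaphore; return the peak number inside at once."""
    slots = threading.Semaphore(max_concurrent)
    state_lock = threading.Lock()       # protects the bookkeeping below
    state = {"inside": 0, "peak": 0}

    def worker():
        with slots:  # blocks while max_concurrent workers are already inside
            with state_lock:
                state["inside"] += 1
                state["peak"] = max(state["peak"], state["inside"])
            time.sleep(0.01)            # stand-in for real work
            with state_lock:
                state["inside"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]
```

The returned peak never exceeds `max_concurrent`; with `max_concurrent=1` the semaphore behaves like a mutex.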
A distinctly different (but related) concept is that of data synchronization. This refers to the need to keep multiple copies of a set of data coherent with one another.
- File synchronization, such as syncing a hand-held MP3 player to a desktop computer.
- Cluster file systems, which are file systems that maintain data or indexes in a coherent fashion across a whole computing cluster.
- Cache coherency, maintaining multiple copies of data in sync across multiple caches.
- RAID, where data is written in a redundant fashion across multiple disks, so that the loss of any one disk does not lead to a loss of data.
- Database replication, where copies of data on a database are kept in sync, despite possible large geographical separation.
- Journaling, a technique used by many modern file systems to make sure that file metadata are updated on a disk in a coherent, consistent manner.
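A very simplified picture of one-way file synchronization is sketched below in Python. It is an illustrative assumption rather than the algorithm of any particular tool: each copy of the data is modelled as a dictionary mapping file names to contents, and content digests from `hashlib` are compared to decide which entries must be copied or removed. The names `digest` and `sync_one_way` are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return a content fingerprint used to detect differing copies."""
    return hashlib.sha256(data).hexdigest()


def sync_one_way(source: dict, replica: dict) -> list:
    """Make `replica` match `source`; return the sorted names that changed."""
    changed = []
    # Copy entries that are missing from the replica or whose content differs.
    for name, data in source.items():
        if name not in replica or digest(replica[name]) != digest(data):
            replica[name] = data
            changed.append(name)
    # Remove entries that no longer exist in the source.
    for name in list(replica):
        if name not in source:
            del replica[name]
            changed.append(name)
    return sorted(changed)
```

After one call the two copies are coherent, so a second call reports no changes. Real systems must additionally handle concurrent updates on both sides, which this sketch does not attempt.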
Synchronization was originally a process-based concept whereby a lock could be obtained on an object. Its primary usage was in databases. There are two types of (file) lock: read-only and read-write. A read-only lock may be held by many processes or threads at once. A read-write lock is exclusive: it may be held by only a single process or thread at a time.
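The distinction between shared and exclusive locks can be sketched in Python as follows. The standard `threading` module does not provide a readers-writer lock, so the `ReadWriteLock` class below is a hypothetical minimal implementation built on `threading.Condition`, not a library API. It makes no fairness guarantees: a steady stream of readers can starve a waiting writer.

```python
import threading


class ReadWriteLock:
    """Allow many concurrent readers, or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of threads holding the read lock
        self._writer = False    # True while a thread holds the write lock

    def acquire_read(self):
        with self._cond:
            while self._writer:             # readers wait only for a writer
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()     # a waiting writer may proceed

    def acquire_write(self):
        with self._cond:
            # A writer needs exclusive access: no readers and no other writer.
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()         # wake waiting readers and writers
```

Any number of threads may call `acquire_read` at the same time, whereas `acquire_write` blocks until all readers and any other writer have released the lock.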
Although locks were originally devised for file databases, data is also shared in memory between processes and threads. Sometimes more than one object (or file) must be locked at a time. If the locks are not acquired together, or at least in a consistent order, two lock holders can each end up waiting for a lock that the other holds, which is a deadlock.
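One common way to avoid this kind of deadlock is to acquire multiple locks in a fixed global order. The Python sketch below assumes a hypothetical `Account` class whose `account_id` values are unique; `transfer` always locks the account with the smaller identifier first, so two opposite transfers can never each hold one lock while waiting for the other.

```python
import threading


class Account:
    """Hypothetical account with its own lock; ids are assumed unique."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move `amount` from src to dst while holding both account locks."""
    if src is dst:
        return  # nothing to do, and locking the same lock twice would block
    # Acquire the locks in a consistent order (by id), not in argument order.
    first, second = sorted((src, dst), key=lambda account: account.account_id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount
```

If the locks were instead taken in argument order, a transfer from A to B running concurrently with a transfer from B to A could block forever.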
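In Java, the built-in monitor locks acquired with the synchronized keyword are exclusive, and such locks are typically built on atomic processor instructions such as compare-and-swap (see mutex). Shared read locks are available separately, for example through ReentrantReadWriteLock in the java.util.concurrent.locks package. In Ada, protected objects allow concurrent read-only access through protected functions, while protected procedures and entries are exclusive.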
An abstract mathematical foundation for synchronization primitives is given by the history monoid. There are also many higher-level theoretical devices, such as process calculi and Petri nets, which can be built on top of the history monoid.
- Futures and promises, synchronization mechanisms in pure functional paradigms.
- Anatomy of Linux synchronization methods at IBM developerWorks
- The Little Book of Semaphores, by Allen B. Downey