Synchronization (computer science)

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In computer science, synchronization refers to one of two distinct but related concepts: synchronization of processes, and synchronization of data. Process synchronization refers to the idea that multiple processes are to join up or handshake at a certain point, in order to reach an agreement or commit to a certain sequence of action. Data synchronization refers to the idea of keeping multiple copies of a dataset in coherence with one another, or to maintain data integrity. Process synchronization primitives are commonly used to implement data synchronization.

Thread or process synchronization[edit]

Thread synchronization or serialization, strictly defined, is the application of particular mechanisms to ensure that two concurrently-executing threads or processes do not execute specific portions of a program at the same time, referred to as mutual exclusion. If one thread has begun to execute a serialized portion of the program called a critical section, any other thread trying to execute this portion must wait until the first thread finishes. If such synchronization measures are not taken, it can result in a race condition where variable values depend on the timings of the thread or process context switch.

Synchronization is used to control access to state both in small-scale multiprocessing systems -- in multithreaded environments and multiprocessor computers -- and in distributed computers consisting of thousands of units -- in banking and database systems, in web servers, and so on.

See[edit]

Data synchronization[edit]

Main article: Data synchronization

A distinctly different (but related) concept is that of data synchronization. This refers to the need to keep multiple copies of a set of data coherent with one another.

Examples include:

  • File synchronization, such as syncing a hand-held MP3 player to a desktop computer.
  • Cluster file systems, which are file systems that maintain data or indexes in a coherent fashion across a whole computing cluster.
  • Cache coherency, maintaining multiple copies of data in sync across multiple caches.
  • RAID, where data is written in a redundant fashion across multiple disks, so that the loss of any one disk does not lead to a loss of data.
  • Database replication, where copies of data on a database are kept in sync, despite possible large geographical separation.
  • Journaling, a technique used by many modern file systems to make sure that file metadata are updated on a disk in a coherent, consistent manner.

Mathematical foundations[edit]

Synchronization was originally a process based concept whereby a lock could be obtained on an object. Its primary usage was in databases. There are two types of (file) lock; read-only and read-write. Read-only locks may be obtained by many processes or threads. Read-write locks are exclusive, as they may only be used by a single process/thread at a time.
Although locks were derived for file databases, data is also shared in memory between processes and threads. Sometimes more than one object (or file) is locked at a time. If they are not locked simultaneously they can overlap, causing a deadlock exception.
Java and Ada only have exclusive locks because they are thread based and rely on the compare-and-swap processor instruction (see mutex).
An abstract mathematical foundation for synchronization primitives is given by the history monoid. There are also many higher-level theoretical devices, such as process calculi and Petri nets, which can be built on top of the history monoid.

See also[edit]

References[edit]

External links[edit]