ACID

From Wikipedia, the free encyclopedia


In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee that database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction. An example of a transaction is a transfer of funds from one bank account to another, even though it might consist of multiple individual operations (such as debiting one account and crediting another).

Although Jim Gray is credited with defining, in the late 1970s, these key transaction properties of a reliable system, and with helping to develop the technologies that automatically achieve these,[1] the acronym ACID was coined by Andreas Reuter and Theo Haerder in 1983.[2]


Properties

Atomicity

Atomicity refers to the ability of the DBMS to guarantee that either all of the tasks of a transaction are performed or none of them are. For example, the transfer of funds from one account to another can be completed or it can fail for a multitude of reasons, but atomicity guarantees that one account won't be debited if the other is not credited.

Atomicity states that database modifications must follow an “all or nothing” rule. A transaction is said to be “atomic” if, when any one part of it fails, the entire transaction fails and the database is left unchanged. It is critical that the database management system maintain the atomic nature of transactions in spite of any DBMS, operating system or hardware failure.

Either all of the operations in a transaction are carried out or none are. A transaction cannot be subdivided; it must be processed in its entirety or not at all. Users should not have to worry about the effect of incomplete transactions if a system crash occurs. Transactions can be left incomplete for three kinds of reasons:

  1. Abort or unsuccessful termination: this happens when some anomaly arises during execution. If a transaction is aborted by the DBMS for some internal reason, it may be automatically restarted and executed as a new transaction.
  2. System crash: this may happen, for example, when the power supply fails while one or more transactions are in execution.
  3. Unexpected situations: this may happen because of an unexpected data value or an inability to access some disk.

Consistency

The consistency property ensures that the database remains in a consistent state; more precisely, it says that any transaction will take the database from one consistent state to another consistent state.

The consistency property does not say how the DBMS should handle an inconsistency, other than to ensure the database is clean at the end of the transaction. If, for some reason, a transaction is executed that violates the database’s consistency rules, the entire transaction could be rolled back to the pre-transactional state - or it would be equally valid for the DBMS to take some patch-up action to bring the database into a consistent state. Thus, if the database schema says that a particular field is for holding integer numbers, the DBMS could decide to reject attempts to put fractional values in there, or it could round the supplied values to the nearest whole number: both options maintain consistency.

A DBMS that claims to enforce consistency is only responsible for those rules that are known to it. Thus, if a DBMS allows fields of a record to act as references to another record, then consistency implies the DBMS should enforce referential integrity: by the time any transaction ends, each and every reference in the database must be valid. If a transaction consisted of an attempt to delete a record referenced by another, each of the following mechanisms would maintain consistency:

  • abort the transaction, rolling back to the consistent, pre-transactional state;
  • delete all records that point at the deleted record (this is known as cascaded deletes); or,
  • clear the relevant fields for all records that point at the deleted record.

(These are examples of propagation constraints; some database systems allow the database designer to specify which option to choose when setting up the schema for a database.) There are other choices available to the DBMS in the way it enforces consistency: for example, it could perform the checks 'on the fly' as the transaction proceeds (so that the database is consistent at all times within a transaction) or it could do the checks at the end of the transaction (so that the database can be in an inconsistent state mid-transaction, and one relies on the isolation principle to ensure inconsistencies are not visible to other clients).
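The three mechanisms listed above can be tried out with Python's standard sqlite3 module. This is a minimal sketch, not a description of how any particular DBMS must behave: the table names `parent` and `child` and the helper `delete_parent` are made up for illustration, and SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3

def delete_parent(on_delete: str):
    """Delete a referenced row under the given ON DELETE rule and report the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks references only when enabled
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child (id INTEGER PRIMARY KEY, "
        f"parent_id INTEGER REFERENCES parent(id) ON DELETE {on_delete})"
    )
    conn.execute("INSERT INTO parent (id) VALUES (1)")
    conn.execute("INSERT INTO child (id, parent_id) VALUES (10, 1)")
    conn.commit()
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("DELETE FROM parent WHERE id = 1")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "aborted"
    parents = conn.execute("SELECT id FROM parent").fetchall()
    children = conn.execute("SELECT id, parent_id FROM child").fetchall()
    conn.close()
    return outcome, parents, children

# RESTRICT aborts the transaction, CASCADE deletes the referencing record,
# SET NULL clears the referencing field.
for rule in ("RESTRICT", "CASCADE", "SET NULL"):
    print(rule, delete_parent(rule))
```

In each case every reference that remains in the database is valid once the transaction has ended, which is all the consistency property asks for.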

Application developers are usually responsible for ensuring an 'application level' transaction consistency, over and above that offered by the DBMS. Thus, if a transfer of money between two accounts “A” and “B” is carried out manually by the user, the user must deduct the amount (say $100) from account “A” and add the same amount to account “B.” The DBMS does not know whether the amount added to account “B” matches the amount deducted from account “A”; the user has to get this right. If the user deducted $100 from account “A” but added only $99 to account “B,” the DBMS is not responsible for that; as far as the DBMS is concerned, the database is in a consistent state even though external rules (not known to the DBMS) have been violated.

Isolation

Isolation refers to the requirement that other operations cannot access or see data in an intermediate state during a transaction. This constraint is required to maintain consistency between transactions while still allowing them to run concurrently. In a DBMS, many transactions may be executed simultaneously, and each should behave as if it were unaware of the others: one transaction's execution should not affect the execution of other transactions. To enforce this, the DBMS relies on concurrency control mechanisms such as scheduling algorithms.

Durability

Durability refers to the guarantee that once the user has been notified of success, the transaction will persist, and not be undone. This means it will survive system failure, and that the database system has checked the integrity constraints and won't need to abort the transaction. Many databases implement durability by writing all transactions into a transaction log that can be played back to recreate the system state right before a failure. A transaction can only be deemed committed after it is safely entered in the log.

Durability does not imply a permanent state of the database. Another transaction may overwrite any changes made by the current transaction without hindering durability.

Examples

The following examples are used to further explain the ACID properties. In these examples, the database has two fields, A and B. As a consistency rule, the value in A and the value in B must be integers that add up to 100.

Atomicity failure

Assume that a transaction attempts to subtract 10 from A and add 10 to B. If it were to succeed, this would be a valid transaction because A+B would still be 100. However, assume there is a problem with the system: after 10 is removed from A, the attempt to add 10 to B fails. The problem might be network failure, disk failure, power outage, program bug, etc. Atomicity requires that both parts of this transaction complete or that neither does. There are two options: attempt to add 10 to B again, or undo the change to A. By undoing the change to A (adding 10 back to A), atomicity is accomplished.
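The "undo the change to A" option can be sketched with Python's standard sqlite3 module. The table layout, the starting values A = 60 and B = 40, and the `fail_midway` switch are assumptions made for illustration; the text only requires that A and B add up to 100.

```python
import sqlite3

def make_db():
    """Create an in-memory database holding the two fields A and B."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fields (name TEXT PRIMARY KEY, value INTEGER NOT NULL)")
    conn.executemany("INSERT INTO fields VALUES (?, ?)", [("A", 60), ("B", 40)])
    conn.commit()
    return conn

def transfer(conn, amount, fail_midway=False):
    """Move `amount` from A to B as one transaction; return True if it committed."""
    try:
        with conn:  # commits if the block finishes, rolls back if it raises
            conn.execute("UPDATE fields SET value = value - ? WHERE name = 'A'", (amount,))
            if fail_midway:
                # Hypothetical fault injected between the two updates.
                raise RuntimeError("simulated failure before B is credited")
            conn.execute("UPDATE fields SET value = value + ? WHERE name = 'B'", (amount,))
        return True
    except RuntimeError:
        return False

def snapshot(conn):
    """Return the current values as a dictionary, e.g. {'A': 60, 'B': 40}."""
    return dict(conn.execute("SELECT name, value FROM fields"))

conn = make_db()
transfer(conn, 10, fail_midway=True)
print(snapshot(conn))  # the debit of A was rolled back
transfer(conn, 10)
print(snapshot(conn))  # both updates were applied together
```

After the failed attempt the database still holds the starting values, so the partial debit of A is never visible once the transaction has ended.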

Consistency failure

Consistency is a very general term that demands the data meets all validation rules that the overall application expects - but to satisfy the consistency property a database system only needs to enforce those rules that are within its scope. In the previous example, one rule was a requirement that A + B = 100; most database systems would not allow such a rule to be specified, and so would have no responsibility to enforce it - but they would be able to ensure the values were whole numbers. Examples of rules that can be enforced by the database system are that the primary key values of a record uniquely identify that record, that the values stored in fields are of the right type (the schema might require that both A and B are integers, say) and in the right range, and that foreign keys are all valid.

Validation rules that cannot be enforced by the database system are the responsibility of the application programs using the database.
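As a rough illustration of rules that fall within the database system's scope, the sketch below declares a primary key, an integer type check and a range check using Python's standard sqlite3 module. The table layout, the 0 to 100 range and the helper `try_insert` are assumptions for illustration. Because SQLite is flexible about column types, the integer rule is written as an explicit `typeof` check. The rule A + B = 100 is deliberately not declared and stays with the application.

```python
import sqlite3

def make_db():
    """Create a table whose schema declares key, type and range rules."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fields ("
        " name TEXT PRIMARY KEY,"
        " value INTEGER NOT NULL"
        "   CHECK (typeof(value) = 'integer' AND value BETWEEN 0 AND 100))"
    )
    return conn

def try_insert(conn, name, value):
    """Return True if the DBMS accepts the row, False if a declared rule rejects it."""
    try:
        with conn:  # a rejected insert is rolled back
            conn.execute("INSERT INTO fields (name, value) VALUES (?, ?)", (name, value))
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_db()
print(try_insert(conn, "A", 60))   # accepted
print(try_insert(conn, "A", 40))   # rejected: duplicate primary key
print(try_insert(conn, "B", 2.5))  # rejected: not an integer
print(try_insert(conn, "B", 140))  # rejected: out of range
print(try_insert(conn, "B", 40))   # accepted
```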

Isolation failure

To demonstrate isolation, at least two transactions must be executed at the same time. Isolation is easy to achieve if only one transaction is executed at a time. However, an extremely long transaction will block access to the database if it must run to completion before other transactions may begin. Therefore, the independent actions of each transaction are run in an interleaved manner.

Consider two transactions. One will transfer 10 from A to B; the other will transfer 10 from B to A. There are four actions in total: the first transaction subtracts 10 from A and adds 10 to B, and the second subtracts 10 from B and adds 10 to A. If the transactions are interleaved, the actual order of actions might be: A-10, B-10, B+10, A+10. If isolation is maintained, the state seen when the first transaction finishes adding 10 to B is identical to the state that would result if the second transaction were not run. Here, however, B is 10 less than that because of the first action of the second transaction. This is known as a write-write failure because two transactions attempted to write to the same data field.
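The interleaving described above can be traced with a few lines of plain Python. This is only a simulation of the schedule, not of a real DBMS; the starting values A = 60 and B = 40 and the names `run`, `T1` and `T2` are assumptions chosen for illustration.

```python
def run(schedule, state):
    """Apply (transaction, field, delta) actions in order; record the state after each."""
    state = dict(state)
    trace = []
    for txn, field, delta in schedule:
        state[field] += delta
        trace.append((txn, dict(state)))
    return trace

START = {"A": 60, "B": 40}  # assumed values; the text only requires A + B = 100
T1 = [("T1", "A", -10), ("T1", "B", +10)]  # transfer 10 from A to B
T2 = [("T2", "B", -10), ("T2", "A", +10)]  # transfer 10 from B to A

serial_t1 = run(T1, START)
interleaved = run([T1[0], T2[0], T1[1], T2[1]], START)  # A-10, B-10, B+10, A+10

# State visible right after T1 finishes adding 10 to B.
alone = serial_t1[-1][1]   # what T1 would see if T2 were not run
mixed = interleaved[2][1]  # what T1 actually sees in the interleaved schedule
print(alone, mixed)
```

With these starting values, T1 running alone would end with B at 50, while in the interleaved schedule B stands at 40 at the same point, which is the discrepancy the paragraph describes.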

Durability failure

Assume that a transaction transfers 10 from A to B. It removes 10 from A. It then adds 10 to B. At this point, a "success" message is sent to the user. However, the changes are still queued in the disk buffer waiting to be committed to the disk. Power fails and the changes are lost. The user assumes that the changes have been made, but they are lost.

To satisfy the durability constraint, the database system must ensure the success message is delayed until the transaction is safely on disk. (Depending on the architecture of the database system, it may be enough to ensure that a transaction log has been fully written to disk; in the event of a crash and restart, the log will be replayed as far as possible before allowing applications to query or update the database.)
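The idea of delaying the success message until the log is on disk can be sketched as follows. The log format (one JSON object of field deltas per line) and the function names `commit` and `recover` are invented for illustration, and `os.fsync` only asks the operating system to flush its buffers, so this is a sketch of the principle rather than a guarantee on every storage stack.

```python
import json
import os
import tempfile

def commit(log_path, record):
    """Append a record to the log and force it to disk before reporting success."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()             # push Python's buffer to the operating system
        os.fsync(log.fileno())  # ask the operating system to write to the device
    return "success"            # only now is the user notified

def recover(log_path, initial):
    """Rebuild the state by replaying every complete log record in order."""
    state = dict(initial)
    if not os.path.exists(log_path):
        return state
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            if not line.endswith("\n"):
                break  # a torn final record was never acknowledged, so skip it
            for field, delta in json.loads(line).items():
                state[field] += delta
    return state

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "txn.log")
    print(commit(path, {"A": -10, "B": 10}))
    print(recover(path, {"A": 60, "B": 40}))
```

Replaying the log after a restart reproduces every acknowledged transfer, while a record that was only partly written at the time of the crash is ignored because its success message was never sent.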

Implementation

Implementing the ACID properties correctly is not simple. Processing a transaction often requires a number of small changes to be made, including updating indices that are used by the system to speed up searches. This sequence of operations is subject to failure for a number of reasons; for instance, the system may have no room left on its disk drives, or it may have used up its allocated CPU time.

ACID suggests that the database be able to perform all of these operations at once. In fact this is difficult to arrange. There are two popular families of techniques: write-ahead logging and shadow paging. In both cases, locks must be acquired on all information that is updated, and depending on the implementation, possibly on all data that is being read as well. In write-ahead logging, atomicity is guaranteed by ensuring that information about all changes is written to a log before it is written to the database. That allows the database to return to a consistent state in the event of a crash. In shadowing, updates are applied to a copy of the database, and the new copy is activated when the transaction commits. The copy refers to unchanged parts of the old version of the database, rather than being an entire duplicate.
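A toy model of shadowing is sketched below. It keeps everything in memory and ignores locking and disk layout; the class `ShadowPagedDB` and its methods are hypothetical names. What it does show is that the copy shares unchanged pages with the old version and that the new version becomes active in a single step.

```python
class ShadowPagedDB:
    """Toy shadow paging: the database is a page table pointing at immutable pages."""

    def __init__(self, pages):
        self.page_table = [tuple(page) for page in pages]

    def begin(self):
        """Copy the page table only; the pages themselves are still shared."""
        return list(self.page_table)

    @staticmethod
    def update(shadow, page_no, new_page):
        """Point the shadow table at a fresh page, leaving the old page untouched."""
        shadow[page_no] = tuple(new_page)

    def commit(self, shadow):
        """Activate the new version by swapping in the shadow table in one step."""
        self.page_table = shadow


db = ShadowPagedDB([[60], [40]])
shadow = db.begin()
ShadowPagedDB.update(shadow, 0, [50])
print(db.page_table)  # readers still see the old version before commit
db.commit(shadow)
print(db.page_table)  # the new version is now active
```

If the transaction were abandoned before `commit`, the shadow table would simply be discarded and the active page table would never have changed.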

Most databases rely upon locking to provide ACID capabilities. This means that a lock must always be acquired before processing data in a database, even on read operations. Maintaining a large number of locks, however, results in substantial overhead as well as hurting concurrency. If user A is running a transaction that has to read a row of data that user B wants to modify, for example, user B must wait until user A's transaction is finished. Two-phase locking is often applied to guarantee full isolation.[citation needed]

An alternative to locking is multiversion concurrency control, in which the database maintains separate copies of any data that is modified. This allows users to read data without acquiring any locks. Going back to the example of user A and user B, when user A's transaction gets to data that user B has modified, the database is able to retrieve the exact version of that data that existed when user A started their transaction. This ensures that user A gets a consistent view of the database even if other users are changing data that user A needs to read. A natural implementation of this idea results in a relaxation of the isolation property, namely snapshot isolation.
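The user A and user B scenario can be modelled with a small in-memory version store. This is a simplified sketch of the idea rather than the design of any real database: the class `VersionedStore`, its commit counter and the key `"row"` are assumptions for illustration, and garbage collection of old versions and write conflicts are left out.

```python
class VersionedStore:
    """Toy multiversion store: every write adds a version stamped with a commit number."""

    def __init__(self, initial):
        self.clock = 0
        self.versions = {key: [(0, value)] for key, value in initial.items()}

    def begin(self):
        """Start a reader; its snapshot is the commit number current right now."""
        return self.clock

    def write(self, key, value):
        """Commit a new version without touching older ones."""
        self.clock += 1
        self.versions[key].append((self.clock, value))

    def read(self, snapshot, key):
        """Return the newest version committed at or before the reader's snapshot."""
        for commit_no, value in reversed(self.versions[key]):
            if commit_no <= snapshot:
                return value
        raise KeyError(key)


store = VersionedStore({"row": "old"})
user_a = store.begin()        # user A starts a transaction
store.write("row", "new")     # user B modifies the row and commits
print(store.read(user_a, "row"))         # user A still reads the version from its start
print(store.read(store.begin(), "row"))  # a transaction started now reads B's version
```

Neither the read nor the write waits on a lock here; user A keeps reading the version that existed when its transaction began.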

It is difficult to guarantee ACID properties in a distributed transaction across a distributed database where no single node is responsible for all data affecting a transaction. Network connections might fail, or one node might successfully complete its part of the transaction and then be required to roll back its changes because of a failure on another node. Two-phase commit (not to be confused with two-phase locking) is typically applied in distributed transactions to ensure that each participant in the transaction agrees on whether the transaction should be committed or not.[citation needed]
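The voting structure of two-phase commit can be outlined as below. This sketch covers only the happy path and a "no" vote; the names `Participant` and `two_phase_commit` are hypothetical, and the parts that make real implementations hard (timeouts, logging of votes, coordinator failure) are omitted.

```python
class Participant:
    """Toy node that can vote on, then commit or roll back, its part of a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        """Phase 1: vote yes only if this node is able to commit its part."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Phase 1: collect votes. Phase 2: commit only if every vote was yes."""
    votes = [p.prepare() for p in participants]  # every node is asked, even after a no
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()  # nodes that voted yes must undo their prepared work
    return "aborted"


print(two_phase_commit([Participant("n1"), Participant("n2")]))
print(two_phase_commit([Participant("n1"), Participant("n2", can_commit=False)]))
```

In the second call, node n1 was able to complete its part but still rolls back because n2 could not, which is the situation described in the paragraph above.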


Notes

  1. "Gray to be Honored With A. M. Turing Award This Spring". Microsoft PressPass. 1998-11-23. http://www.microsoft.com/presspass/features/1998/11-23gray.mspx. Retrieved 2009-01-16.
  2. Reuter, Andreas; Haerder, Theo (December 1983). "Principles of Transaction-Oriented Database Recovery" (PDF). ACM Computing Surveys 15 (4): 287–317. doi:10.1145/289.291. http://portal.acm.org/ft_gateway.cfm?id=291&type=pdf&coll=GUIDE&dl=GUIDE&CFID=18545439&CFTOKEN=99113095. Retrieved 2009-01-16. "These four properties, atomicity, consistency, isolation, and durability (ACID), describe the major highlights of the transaction paradigm, which has influenced many aspects of development in database systems."
