Transaction processing

In computer science, transaction processing is information processing[1] that is divided into individual, indivisible operations called transactions. Each transaction must succeed or fail as a complete unit; it can never be only partially complete.

For example, when you purchase a book from an online bookstore, you exchange money (in the form of credit) for a book. If your credit is good, a series of related operations ensures that you get the book and the bookstore gets your money. However, if a single operation in the series fails during the exchange, the entire exchange fails. You do not get the book and the bookstore does not get your money. The technology responsible for making the exchange balanced and predictable is called transaction processing. Transactions ensure that data-oriented resources are not permanently updated unless all operations within the transactional unit complete successfully. By combining a set of related operations into a unit that either completely succeeds or completely fails, one can simplify error recovery and make one's application more reliable.

Transaction processing systems consist of computer hardware and software hosting a transaction-oriented application that performs the routine transactions necessary to conduct business. Examples include systems that manage sales order entry, airline reservations, payroll, employee records, manufacturing, and shipping.

Since most, though not necessarily all, transaction processing today is interactive, the term is often treated as synonymous with online transaction processing.

Description

Transaction processing is designed to maintain a system's integrity (typically a database or some modern filesystems) in a known, consistent state, by ensuring that interdependent operations on the system are either all completed successfully or all canceled successfully.

For example, consider a typical banking transaction that involves moving $700 from a customer's savings account to the same customer's checking account. This transaction involves at least two separate operations in computer terms: debiting the savings account by $700, and crediting the checking account by $700. If one operation succeeds but the other does not, the books of the bank will not balance at the end of the day. There must, therefore, be a way to ensure that either both operations succeed or both fail so that there is never any inconsistency in the bank's database as a whole.

Transaction processing links multiple individual operations in a single, indivisible transaction, and ensures that either all operations in a transaction are completed without error, or none of them are. If some of the operations are completed but errors occur when the others are attempted, the transaction-processing system "rolls back" all of the operations of the transaction (including the successful ones), thereby erasing all traces of the transaction and restoring the system to the consistent, known state that it was in before processing of the transaction began. If all operations of a transaction are completed successfully, the transaction is committed by the system, and all changes to the database are made permanent; the transaction cannot be rolled back once this is done.
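
As a minimal sketch of this commit-or-roll-back pattern, the following Python fragment uses the standard-library sqlite3 module to perform the $700 transfer from the example above as a single transaction; the database file, table layout, and account names are illustrative assumptions rather than part of any particular system:

    import sqlite3

    # Illustrative schema: one row per account, balances in whole dollars.
    conn = sqlite3.connect("bank.db", isolation_level=None)  # autocommit mode; transactions are managed explicitly below
    conn.execute("CREATE TABLE IF NOT EXISTS accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute("INSERT OR IGNORE INTO accounts VALUES ('savings', 1000), ('checking', 0)")

    def transfer(conn, source, target, amount):
        """Debit `source` and credit `target` as one indivisible transaction."""
        try:
            conn.execute("BEGIN")
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, source))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, target))
            conn.execute("COMMIT")    # both updates become permanent together
        except Exception:
            conn.execute("ROLLBACK")  # neither update takes effect
            raise

    transfer(conn, "savings", "checking", 700)

If either UPDATE fails, the ROLLBACK restores the database to its state before the BEGIN, so the two balances never go out of step.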

Transaction processing guards against hardware and software errors that might leave a transaction partially completed. If the computer system crashes in the middle of a transaction, the transaction processing system guarantees that all operations in any uncommitted transactions are cancelled.

Generally, transactions are issued concurrently. If they overlap (i.e. need to touch the same portion of the database), this can create conflicts. For example, if the customer mentioned in the example above has $150 in his savings account and attempts to transfer $100 to a different person while at the same time moving $100 to the checking account, only one of these transactions can succeed. However, forcing transactions to be processed sequentially is inefficient. Therefore, concurrent implementations of transaction processing are programmed to guarantee that the end result reflects a conflict-free outcome, the same as could be reached by executing the transactions sequentially in some order (a property called serializability). In our example, this means that no matter which transaction was issued first, either the transfer to a different person or the move to the checking account succeeds, while the other one fails.
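
The end state a serializable system must guarantee can be sketched by simply running the two transfers one after the other; the CHECK (balance >= 0) constraint below is an assumption added for illustration (real systems reach the equivalent result under true concurrency using techniques such as locking or multiversioning):

    import sqlite3

    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, "
                 "balance INTEGER NOT NULL CHECK (balance >= 0))")
    conn.execute("INSERT INTO accounts VALUES ('savings', 150), ('checking', 0), ('friend', 0)")

    def transfer(source, target, amount):
        try:
            conn.execute("BEGIN")
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, source))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, target))
            conn.execute("COMMIT")
            return True
        except sqlite3.IntegrityError:
            conn.execute("ROLLBACK")
            return False

    # Whichever transfer is serialized second would overdraw the $150 balance and is rolled back.
    print(transfer("savings", "checking", 100))  # True
    print(transfer("savings", "friend", 100))    # False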

Methodology

The basic principles of all transaction-processing systems are the same. However, the terminology may vary from one transaction-processing system to another, and the terms used below are not necessarily universal.

Rollback

Transaction-processing systems ensure database integrity by recording intermediate states of the database as it is modified, then using these records to restore the database to a known state if a transaction cannot be committed. For example, copies of information on the database prior to its modification by a transaction are set aside by the system before the transaction can make any modifications (this is sometimes called a before image). If any part of the transaction fails before it is committed, these copies are used to restore the database to the state it was in before the transaction began.
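
A toy sketch of the before-image idea, using a plain Python dictionary as a stand-in for the database (the helper name and the simulated failure are illustrative assumptions, not a real recovery mechanism):

    # The "database" is a dict; the transaction saves each item's prior value
    # (its before image) before overwriting it, so a failure can be undone.
    database = {"savings": 1000, "checking": 0}

    def run_transaction(db, updates):
        before_images = {}
        try:
            for key, new_value in updates:
                before_images.setdefault(key, db[key])  # keep only the first, pre-transaction value
                db[key] = new_value
                if new_value < 0:
                    raise ValueError("integrity violation")  # simulated mid-transaction failure
        except Exception:
            db.update(before_images)  # restore every modified item from its before image
            raise

    run_transaction(database, [("savings", 300), ("checking", 700)])  # both updates survive
    # run_transaction(database, [("savings", -400)]) would raise and leave the dict unchanged.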

Rollforward

It is also possible to keep a separate journal of all modifications to a database management system (sometimes called after images). This is not required for rollback of failed transactions, but it is useful for updating the database management system in the event of a database failure, so some transaction-processing systems provide it. If the database management system fails entirely, it must be restored from the most recent back-up. The back-up will not reflect transactions committed since the back-up was made. However, once the database management system is restored, the journal of after images can be applied to the database (rollforward) to bring the database management system up to date. Any transactions in progress at the time of the failure can then be rolled back. The result is a database in a consistent, known state that includes the results of all transactions committed up to the moment of failure.
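
The after-image journal can be sketched in the same toy style; the dictionaries and helper below are illustrative assumptions, not the layout of any real log:

    # Every committed change is appended to a journal, so a restored back-up
    # can be rolled forward to the state at the moment of failure.
    backup  = {"savings": 1000, "checking": 0}  # state captured when the back-up was taken
    journal = []                                # after images written since the back-up

    live = dict(backup)

    def commit(db, key, new_value):
        db[key] = new_value
        journal.append((key, new_value))        # after image of the committed change

    commit(live, "savings", 300)
    commit(live, "checking", 700)

    # ... the database fails; restore the back-up and replay the journal ...
    recovered = dict(backup)
    for key, value in journal:
        recovered[key] = value                  # rollforward

    assert recovered == live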

Deadlocks

In some cases, two transactions may, in the course of their processing, attempt to access the same portion of a database at the same time, in a way that prevents them from proceeding. For example, transaction A may access portion X of the database, and transaction B may access portion Y of the database. If, at that point, transaction A tries to access portion Y of the database while transaction B tries to access portion X, a deadlock occurs, and neither transaction can move forward. Transaction-processing systems are designed to detect these deadlocks when they occur. Typically both transactions will be cancelled and rolled back, and then they will be started again in a different order, automatically, so that the deadlock does not occur again. Or sometimes, just one of the deadlocked transactions will be cancelled, rolled back, and automatically restarted after a short delay.

Deadlocks can also occur among three or more transactions. The more transactions involved, the more difficult they are to detect, to the point that there is a practical limit to the deadlocks a transaction-processing system can detect.
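
The two-transaction deadlock described above can be imitated with two ordinary locks acquired in opposite orders; in this toy sketch a timeout stands in for deadlock detection, and the losing transaction "rolls back" (releases its lock) and retries after a random delay:

    import random
    import threading
    import time

    lock_x = threading.Lock()  # guards portion X of the database
    lock_y = threading.Lock()  # guards portion Y of the database

    def transaction(name, first, second):
        while True:
            with first:
                time.sleep(0.1)                  # make the overlap (and the deadlock) likely
                if second.acquire(timeout=0.5):  # timeout as a crude form of deadlock detection
                    try:
                        print(f"{name} committed")
                        return
                    finally:
                        second.release()
            # Could not get the second lock: give up, back off briefly, and restart.
            print(f"{name} rolled back, retrying")
            time.sleep(random.uniform(0.05, 0.2))

    a = threading.Thread(target=transaction, args=("A", lock_x, lock_y))
    b = threading.Thread(target=transaction, args=("B", lock_y, lock_x))
    a.start(); b.start(); a.join(); b.join()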

Compensating transaction

In systems where commit and rollback mechanisms are not available or undesirable, a compensating transaction is often used to undo failed transactions and restore the system to a previous state.
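
A minimal sketch of the compensating-transaction idea: each step is paired with an action that undoes it, and on failure the work already done is compensated in reverse order (the debit and credit functions are placeholders assumed for illustration):

    def debit(account, amount):
        print(f"debit {account} by {amount}")

    def credit(account, amount):
        print(f"credit {account} by {amount}")

    # Each forward step is paired with the compensating step that undoes it.
    steps = [
        (lambda: debit("savings", 700),   lambda: credit("savings", 700)),
        (lambda: credit("checking", 700), lambda: debit("checking", 700)),
    ]

    completed = []
    try:
        for step, compensation in steps:
            step()
            completed.append(compensation)
    except Exception:
        for compensation in reversed(completed):
            compensation()  # undo the work already done, newest first
        raise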

ACID criteria

Jim Gray defined the properties of a reliable transaction system in the late 1970s; these properties were later summarized by the acronym ACID (atomicity, consistency, isolation, and durability).[1]

Atomicity

A transaction's changes to the state are atomic: either all happen or none happen. These changes include database changes, messages, and actions on transducers.

Consistency

A transaction is a correct transformation of the state. The actions taken as a group do not violate any of the integrity constraints associated with the state.

Isolation

Even though transactions execute concurrently, it appears to each transaction T that the others executed either before T or after T, but not both.
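
Isolation can be observed with two connections to the same SQLite database file standing in for two concurrent transactions; the file name and schema are illustrative assumptions:

    import os
    import sqlite3
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "bank.db")
    writer = sqlite3.connect(path, isolation_level=None)  # autocommit; BEGIN/COMMIT issued by hand
    reader = sqlite3.connect(path, isolation_level=None)

    writer.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    writer.execute("INSERT INTO accounts VALUES ('checking', 0)")

    writer.execute("BEGIN")
    writer.execute("UPDATE accounts SET balance = 700 WHERE name = 'checking'")

    # The reader does not see the writer's uncommitted change...
    print(reader.execute("SELECT balance FROM accounts").fetchone())  # (0,)

    writer.execute("COMMIT")

    # ...but does see it once the writer has committed.
    print(reader.execute("SELECT balance FROM accounts").fetchone())  # (700,)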

Durability

Once a transaction completes successfully (commits), its changes to the database survive subsequent failures.

Benefits

Transaction processing has these benefits:

  • It allows sharing of computer resources among many users
  • It shifts the time of job processing to when the computing resources are less busy
  • It avoids idling the computing resources without minute-by-minute human interaction and supervision
  • It is used on expensive classes of computers to help amortize the cost by keeping high rates of utilization of those expensive resources

Disadvantages

  • Relatively expensive setup costs
  • Lack of standard formats
  • Hardware and software incompatibility

Implementations

Standard transaction-processing software, such as IBM's Information Management System, was first developed in the 1960s, and was often closely coupled to particular database management systems. Client–server computing implemented similar principles in the 1980s with mixed success. However, in more recent years, the distributed client–server model has become considerably more difficult to maintain. As the number of transactions grew in response to various online services (especially the Web), a single distributed database was not a practical solution. In addition, most online systems consist of a whole suite of programs operating together, as opposed to a strict client–server model where the single server could handle the transaction processing. Today a number of transaction processing systems are available that work at the inter-program level and which scale to large systems, including mainframes.

One such effort is the X/Open Distributed Transaction Processing (DTP) model (see also the Java Transaction API (JTA)). However, proprietary transaction-processing environments such as IBM's CICS are still very popular,[citation needed] although CICS has evolved to include open industry standards as well.

The term extreme transaction processing (XTP) was used to describe transaction processing systems with uncommonly challenging requirements, particularly throughput requirements (transactions per second). Such systems may be implemented via distributed or cluster-style architectures. The term was in use by at least 2011.[2][3]

References

  1. Gray, Jim; Reuter, Andreas. "Transaction Processing – Concepts and Techniques (PowerPoint)". Retrieved Nov 12, 2012.
  2. Koen Vanderkimpen and Dirk Deridder. "Going eXtreme for Health Care". Devoxx 2011 presentation. Retrieved March 18, 2017.
  3. Kevin Roebuck (2011). Extreme Transaction Processing. Lightning Source. ISBN 978-1-74304-266-3.

Further reading

  • Gerhard Weikum, Gottfried Vossen, Transactional information systems: theory, algorithms, and the practice of concurrency control and recovery, Morgan Kaufmann, 2002, ISBN 1-55860-508-8
  • Jim Gray, Andreas Reuter, Transaction Processing—Concepts and Techniques, 1993, Morgan Kaufmann, ISBN 1-55860-190-2
  • Philip A. Bernstein, Eric Newcomer, Principles of Transaction Processing, 1997, Morgan Kaufmann, ISBN 1-55860-415-4
  • Ahmed K. Elmagarmid (Editor), Transaction Models for Advanced Database Applications, Morgan-Kaufmann, 1992, ISBN 1-55860-214-3
