User:Tyler.norton12/Sandbox/Art2

A deadlock is a situation wherein two or more competing actions are each waiting for the other to finish, and thus neither ever does.

In an operating system, a deadlock occurs when a process enters a waiting state because a resource it has requested is held by another waiting process, which in turn is waiting for another resource. If a process is unable to change its state indefinitely because the resources it has requested are being used by other waiting processes, the system is said to be in a deadlock.^[1]

Deadlock is a common problem in multiprocessing systems, parallel computing and distributed systems, where software and hardware locks are used to handle shared resources and implement process synchronization.^[2]

In telecommunication systems, deadlocks occur mainly due to lost or corrupt signals instead of resource contention.^[3]

Examples

A deadlock situation can be compared to the classic "chicken or egg" problem.^[4]

It can also be considered a paradoxical "Catch-22" situation.^[5]

A real-world analogy is an illogical statute passed by the Kansas legislature in the early 20th century, which stated:

When two trains approach each other at a crossing, both shall come to a full stop and neither shall start up again until the other has gone. ^[6]^[1]

A simple computer-based example is as follows. Suppose a computer has three CD drives and three processes, and each process holds one of the drives. If each process now requests another drive, the three processes will be in a deadlocked state. Each process will be waiting for the "CD drive released" event, which can only be caused by one of the other waiting processes. The result is a circular chain.
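The circular chain in the three-drive example can be traced with a small model. This is only an illustrative sketch: the process and drive names, and the choice of which drive each process requests, are assumptions that the example above does not specify.

```python
# Hypothetical model of the three-drive example; all names are illustrative.
holds = {"P1": "drive1", "P2": "drive2", "P3": "drive3"}   # process -> drive it holds
wants = {"P1": "drive2", "P2": "drive3", "P3": "drive1"}   # process -> drive it requests

def waits_on(holds, wants):
    """Map each process to the process that holds the drive it requested."""
    owner = {drive: proc for proc, drive in holds.items()}
    return {proc: owner[drive] for proc, drive in wants.items() if drive in owner}

def follow_chain(start, waiting):
    """Follow waits-on links from start until a process repeats or the chain ends."""
    chain = [start]
    while chain[-1] in waiting:
        nxt = waiting[chain[-1]]
        if nxt in chain:
            return chain + [nxt]   # the chain has closed on itself
        chain.append(nxt)
    return chain

waiting = waits_on(holds, wants)
print(follow_chain("P1", waiting))   # ['P1', 'P2', 'P3', 'P1']
```

The chain returns to its starting process, so no "CD drive released" event can ever be produced.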

Necessary Conditions

A deadlock situation can arise only if all of the following conditions hold simultaneously in a system:^[1]

Mutual Exclusion: At least one resource must be non-shareable.^[1] Only one process can use the resource at any given instant of time.
Hold and Wait: A process is currently holding at least one resource and requesting additional resources which are being held by other processes.
No Preemption: The operating system must not de-allocate resources once they have been allocated. They must be released voluntarily by the holding process.
Circular Wait: A process is waiting for a resource which is being held by another process, which in turn is waiting for the first process to release a resource. In general, there is a set of waiting processes, P = {P₁, P₂, ... , P_N}, such that P₁ is waiting for a resource held by P₂, P₂ is waiting for a resource held by P₃, and so on until P_N is waiting for a resource held by P₁.^[1]^[7]

These four conditions are known as the Coffman conditions from their first description in a 1971 article by Edward G. Coffman, Jr.^[7] If any one of these conditions does not hold, a deadlock cannot occur.
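The four conditions can be seen together in a two-thread sketch. It is an illustration rather than a definitive demonstration: the thread and lock names are hypothetical, the barriers exist only to force the unlucky interleaving, and the timeout on the second acquire is an assumption added so that the program reports the stalemate instead of hanging forever.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
both_hold_first = threading.Barrier(2)   # forces both threads to hold one lock
both_tried = threading.Barrier(2)        # keeps the first lock held until both have tried
results = {}

def worker(name, first, second):
    with first:                              # mutual exclusion: first is now held
        both_hold_first.wait()
        # Hold and wait: request the lock the other thread holds.
        # No preemption: nothing can take that lock away from its holder.
        got = second.acquire(timeout=0.2)    # timeout stands in for waiting forever
        results[name] = got
        both_tried.wait()
        if got:
            second.release()

t1 = threading.Thread(target=worker, args=("T1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("T2", lock_b, lock_a))   # opposite order: circular wait
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(results.items()))   # [('T1', False), ('T2', False)]
```

Neither thread obtains its second lock. Without the timeout, both would wait indefinitely.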

Deadlock Handling

Most current operating systems cannot prevent a deadlock from occurring.^[1] When a deadlock occurs, different operating systems respond to it in different, non-standard ways. Most approaches work by preventing one of the four Coffman conditions from occurring, especially the fourth one.^[8] The major approaches are as follows.

Ignoring Deadlocks

In this approach, it is assumed that a deadlock will never occur. This is also called the Ostrich algorithm.^[8]^[9] This approach was initially used by MINIX and UNIX.^[7] It is used when the time intervals between occurrences of deadlocks are large and the data loss incurred each time is tolerable. It is avoided in very critical systems.

Deadlock Detection

Under deadlock detection, deadlocks are allowed to occur. The state of the system is then examined to detect that a deadlock has occurred, and the deadlock is subsequently corrected. An algorithm tracks resource allocation and process states, and rolls back and restarts one or more of the processes in order to remove the detected deadlock. Detecting a deadlock that has already occurred is straightforward, since the resources that each process has locked or currently requested are known to the resource scheduler of the operating system.^[9]
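Because the scheduler knows who holds and who requests each resource, detection can be sketched as a reduction of the wait-for relation. The code below is a minimal illustration under stated assumptions: the `wait_for` mapping and the process names are hypothetical, each resource is assumed to have a single instance, and a process is assumed to need every resource it has requested.

```python
def blocked_processes(wait_for):
    """Return the processes that can never proceed.

    wait_for maps each process to the processes holding resources it has
    requested. A process is removed once none of the processes it waits on
    remain; whatever cannot be removed is deadlocked or stuck behind a deadlock.
    """
    remaining = set(wait_for)
    for holders in wait_for.values():
        remaining.update(holders)
    changed = True
    while changed:
        changed = False
        for proc in sorted(remaining):
            if not any(h in remaining for h in wait_for.get(proc, ())):
                remaining.discard(proc)   # proc can finish and release what it holds
                changed = True
    return remaining

# P1 -> P2 -> P3 -> P1 is a cycle; P4 only waits on P5, which is not waiting at all.
wait_for = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"], "P4": ["P5"], "P5": []}
print(sorted(blocked_processes(wait_for)))   # ['P1', 'P2', 'P3']
```

An empty result means every process can eventually run to completion, so no deadlock is present.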

This approach is simpler than deadlock avoidance or deadlock prevention, because predicting a deadlock before it happens is difficult: in general it is an undecidable problem, closely tied to the halting problem. However, in specific environments, using specific means of locking resources, deadlock detection may be decidable. In the general case, it is not possible to distinguish between algorithms that are merely waiting for a very unlikely set of circumstances to occur and algorithms that will never finish because of deadlock.

Deadlock detection techniques include, but are not limited to, model checking. This approach constructs a finite-state model on which it performs a progress analysis and finds all possible terminal sets in the model. Each of these represents a deadlock.

After a deadlock is detected, it can be corrected by one of the following methods:

Process Termination: One or more processes involved in the deadlock may be aborted. One option is to abort all processes involved in the deadlock. This resolves the deadlock with certainty and speed, but the expense is high, as partial computations will be lost. Alternatively, processes can be aborted one at a time until the deadlock is resolved. This approach has high overhead, because after each abort an algorithm must determine whether the system is still in deadlock. Several factors must be considered when choosing a process for termination, such as the priority and age of the process.
Resource Preemption: Resources allocated to various processes may be successively preempted and allocated to other processes until the deadlock is broken.
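The one-at-a-time termination strategy can be sketched as a loop that re-runs detection after every abort. This is an illustrative sketch, not a prescribed policy: the graph representation, the `priority` table and the rule "abort the lowest-priority process on the cycle" are all assumptions chosen for the example.

```python
def find_cycle(graph):
    """Return a list of processes forming a cycle in the wait-for graph, or None."""
    visited = set()

    def dfs(node, path):
        if node in path:
            return path[path.index(node):]   # back edge: cycle found
        if node in visited:
            return None                      # already fully explored
        visited.add(node)
        for nxt in graph.get(node, ()):
            found = dfs(nxt, path + [node])
            if found:
                return found
        return None

    for start in graph:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

def break_deadlock(wait_for, priority):
    """Abort the lowest-priority process on a cycle until no cycle remains."""
    graph = {proc: set(targets) for proc, targets in wait_for.items()}
    aborted = []
    while True:
        cycle = find_cycle(graph)            # detection is repeated after each abort
        if cycle is None:
            return aborted
        victim = min(cycle, key=lambda proc: priority[proc])
        aborted.append(victim)
        graph.pop(victim, None)              # the victim no longer waits on anything
        for targets in graph.values():
            targets.discard(victim)          # and nobody waits on it any more

wait_for = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
priority = {"P1": 3, "P2": 1, "P3": 2}       # hypothetical: higher means more important
print(break_deadlock(wait_for, priority))    # ['P2']
```

The repeated call to `find_cycle` is the overhead the text refers to; aborting every process on the cycle at once would avoid it at the cost of more lost work.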

Deadlock Prevention

This approach works by preventing one of the four Coffman conditions from occurring.

Removing the mutual exclusion condition means that no process will have exclusive access to a resource. This proves impossible for resources that cannot be spooled. But even with spooled resources, deadlock could still occur. Algorithms that avoid mutual exclusion are called non-blocking synchronization algorithms.
The hold and wait condition may be removed by requiring processes to request all the resources they will need before starting up (or before embarking upon a particular set of operations). This advance knowledge is frequently difficult to obtain and, in any case, is an inefficient use of resources. Another way is to require processes to request resources only when they hold none. Thus, they must first release all the resources they currently hold before requesting all the resources they will need from scratch. This too is often impractical, because resources may be allocated and remain unused for long periods. Also, a process requiring a popular resource may have to wait indefinitely, as such a resource may always be allocated to some other process, resulting in resource starvation.^[1] (These algorithms, such as serializing tokens, are known as all-or-none algorithms.)
The no preemption condition may also be difficult or impossible to avoid, as a process has to be able to hold a resource for a certain amount of time, or the processing outcome may be inconsistent or thrashing may occur. However, inability to enforce preemption may interfere with a priority algorithm. Preemption of a "locked out" resource generally implies a rollback, and is to be avoided, since it is very costly in overhead. Algorithms that allow preemption include lock-free and wait-free algorithms and optimistic concurrency control.
The final condition is the circular wait condition. Approaches that avoid circular waits include disabling interrupts during critical sections and using a hierarchy to determine a partial ordering of resources. If no obvious hierarchy exists, even the memory address of resources has been used to determine ordering, and resources are requested in increasing order of the enumeration.^[1] Dijkstra's solution can also be used.
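Requesting resources in increasing order of an enumeration can be sketched with ordinary locks. The helper name `acquire_in_order` and the account example are hypothetical, and Python's `id()` is used here only as a stand-in for an ordering by memory address; any fixed, globally agreed ordering would serve.

```python
import threading
from contextlib import ExitStack

def acquire_in_order(*locks):
    """Acquire the locks in one global order and return a stack that releases them."""
    stack = ExitStack()
    for lock in sorted(locks, key=id):   # assumption: id() defines the enumeration
        stack.enter_context(lock)
    return stack

account_locks = {"a": threading.Lock(), "b": threading.Lock()}
balances = {"a": 100, "b": 100}

def transfer(src, dst, amount):
    # Callers name the accounts in either order; the locks are still taken
    # in the same order every time, so a circular wait cannot form.
    with acquire_in_order(account_locks[src], account_locks[dst]):
        balances[src] -= amount
        balances[dst] += amount

def run(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)

threads = [threading.Thread(target=run, args=("a", "b")),
           threading.Thread(target=run, args=("b", "a"))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balances)   # {'a': 100, 'b': 100}
```

If each thread instead locked its source account first and its destination second, the two threads could each hold one lock while waiting for the other.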

Deadlock Avoidance

Deadlock can be avoided if certain information about processes is available to the operating system before allocation of resources, such as which resources a process will consume in its lifetime. For every resource request, the system checks whether granting the request would put the system into an unsafe state, meaning a state that could result in deadlock. The system then grants only those requests that lead to safe states.^[1] In order for the system to be able to determine whether the next state will be safe or unsafe, it must know in advance at any time:

resources currently available
resources currently allocated to each process
resources that will be required and released by these processes in the future

One known algorithm that is used for deadlock avoidance is the Banker's algorithm, which requires the resource usage limit of each process to be known in advance.^[1] However, for many systems it is impossible to know in advance what every process will request. This means that deadlock avoidance is often impossible.

Two other algorithms are Wait/Die and Wound/Wait, each of which uses a symmetry-breaking technique. In both these algorithms there exists an older process (O) and a younger process (Y). Process age can be determined by a timestamp at process creation time. Smaller timestamps indicate older processes, while larger timestamps indicate younger processes.
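The safety test at the heart of the Banker's algorithm can be sketched as follows. This is a minimal illustration with made-up numbers: the function name `is_safe`, the list-of-lists layout and the example figures are assumptions, and a full implementation would also handle tentatively granting and rolling back individual requests.

```python
def is_safe(available, allocation, maximum):
    """Return (safe, order): whether every process can finish, and one order in which they can.

    available[j]      units of resource j currently free
    allocation[i][j]  units of resource j held by process i
    maximum[i][j]     declared usage limit of process i for resource j
    """
    work = list(available)
    need = [[m - a for m, a in zip(mx, al)] for mx, al in zip(maximum, allocation)]
    finished = [False] * len(allocation)
    order = []
    progress = True
    while progress:
        progress = False
        for i in range(len(allocation)):
            if not finished[i] and all(n <= w for n, w in zip(need[i], work)):
                # Process i can obtain everything it still needs, run, and
                # then return all of its resources to the pool.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                order.append(i)
                progress = True
    return all(finished), order

available = [3, 3, 2]
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
maximum = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
print(is_safe(available, allocation, maximum))   # (True, [1, 3, 4, 0, 2])
print(is_safe([1], [[1], [1]], [[3], [3]]))      # (False, [])
```

In the second call each process may still need two more units while only one is free, so no process is guaranteed to finish and the state is unsafe.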

	Wait/Die	Wound/Wait
O needs a resource held by Y	O waits	Y dies
Y needs a resource held by O	Y dies	Y waits
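The table can be restated as a small decision function. The function name, the scheme labels and the returned strings are hypothetical; only the four outcomes come from the table, with smaller timestamps meaning older processes.

```python
def resolve(scheme, requester_ts, holder_ts):
    """Decide what happens when the requester needs a resource held by the holder."""
    requester_is_older = requester_ts < holder_ts   # smaller timestamp means older
    if scheme == "wait-die":
        # Older requesters wait; younger requesters are aborted ("die").
        return "requester waits" if requester_is_older else "requester dies"
    if scheme == "wound-wait":
        # Older requesters abort ("wound") the younger holder; younger requesters wait.
        return "holder dies" if requester_is_older else "requester waits"
    raise ValueError(f"unknown scheme: {scheme}")

old, young = 1, 2   # illustrative timestamps
print(resolve("wait-die", old, young))     # requester waits  (O waits)
print(resolve("wait-die", young, old))     # requester dies   (Y dies)
print(resolve("wound-wait", old, young))   # holder dies      (Y dies)
print(resolve("wound-wait", young, old))   # requester waits  (Y waits)
```

In both schemes a process only ever waits in one direction of the age ordering (older on younger in Wait/Die, younger on older in Wound/Wait), which is what rules out a circular wait.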

It is important to note that a system may be in an unsafe state without this resulting in a deadlock. The notion of safe and unsafe states refers only to whether the system is able to enter a deadlock state. For example, if a process requests A, which would result in an unsafe state, but releases B, which would prevent circular wait, then the state is unsafe but the system is not in deadlock.

Livelock

A livelock is similar to a deadlock, except that the states of the processes involved in the livelock constantly change with regard to one another, none progressing.^[10] Livelock is a special case of resource starvation; the general definition only states that a specific process is not progressing.^[11]

A real-world example of livelock occurs when two people meet in a narrow corridor, and each tries to be polite by moving aside to let the other pass, but they end up swaying from side to side without making any progress because they both repeatedly move the same way at the same time.

Livelock is a risk with some algorithms that detect and recover from deadlock. If more than one process takes action, the deadlock detection algorithm can be repeatedly triggered. This can be avoided by ensuring that only one process (chosen randomly or by priority) takes action.^[12]
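The corridor example, and the remedy of letting only one party act, can be simulated in a few lines. This is a toy sketch: the function name, the two-sided corridor and the rule that a blocked walker steps to the opposite side are assumptions made for illustration.

```python
def rounds_until_pass(max_rounds, only_one_yields):
    """Return the round in which the two walkers end up on different sides, or None."""
    flip = {"left": "right", "right": "left"}
    a = b = "left"                  # both start on the same side, blocking each other
    for rounds in range(1, max_rounds + 1):
        a = flip[a]                 # walker A steps aside
        if not only_one_yields:
            b = flip[b]             # walker B steps aside at the same moment, the same way
        if a != b:
            return rounds           # they are on different sides and can pass
    return None                     # states kept changing, but nobody progressed

print(rounds_until_pass(1000, only_one_yields=False))   # None
print(rounds_until_pass(1000, only_one_yields=True))    # 1
```

When both walkers react, their positions change every round yet they never pass; when only one is chosen to act, the conflict ends in the first round.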

Distributed deadlock

Distributed deadlocks can occur in distributed systems when distributed transactions or concurrency control are being used. Distributed deadlocks can be detected either by constructing a global wait-for graph from local wait-for graphs at a deadlock detector, or by a distributed algorithm such as edge chasing.
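Building a global wait-for graph from local ones can be sketched as a merge followed by a cycle check. The site and transaction names are hypothetical, and the sketch assumes the detector receives a consistent snapshot from every site; it does not model message delays.

```python
def merge_wait_for(local_graphs):
    """Combine per-site wait-for graphs into one global graph."""
    merged = {}
    for graph in local_graphs:
        for proc, targets in graph.items():
            merged.setdefault(proc, set()).update(targets)
    return merged

def has_cycle(graph):
    """Depth-first search; a node reached again while still active closes a cycle."""
    state = {}   # node -> "active" (on the current path) or "done"

    def visit(node):
        if state.get(node) == "active":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "active"
        if any(visit(nxt) for nxt in graph.get(node, ())):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in list(graph))

site_1 = {"T1": {"T2"}}   # at site 1, transaction T1 waits for T2
site_2 = {"T2": {"T1"}}   # at site 2, transaction T2 waits for T1
print(has_cycle(site_1), has_cycle(site_2))        # False False
print(has_cycle(merge_wait_for([site_1, site_2])))  # True
```

Neither site sees a cycle on its own; the deadlock only becomes visible in the merged graph.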

In a commitment ordering-based distributed environment (including the strong strict two-phase locking (SS2PL, or rigorous) special case) distributed deadlocks are resolved automatically by the atomic commitment protocol (like a two-phase commit (2PC)), and no global wait-for graph or other resolution mechanism is needed. Similar automatic global deadlock resolution occurs also in environments that employ 2PL that is not SS2PL (and typically not CO; see Deadlocks in 2PL). However, 2PL that is not SS2PL is rarely utilized in practice.

Phantom deadlocks are deadlocks that are detected in a distributed system due to system internal delays but no longer actually exist at the time of detection.

Distributed deadlock prevention

Consider the "when two trains approach each other at a crossing" example described above. Just-in-time prevention works like having a person standing at the crossing (the crossing guard) with a switch that lets only one train onto "super tracks" which run above and over the other waiting train(s).

For non-recursive locks, a lock may be entered only once (a single thread entering twice without unlocking will cause a deadlock, or throw an exception to enforce circular wait prevention).
For recursive locks, only one thread is allowed to pass through a lock. If any other threads enter the lock, they must wait until the initial thread that passed through has exited as many times as it has entered.

The issue with the first one is that it does no deadlock prevention at all. The second does not do distributed deadlock prevention. But the second one can be redefined to prevent a deadlock scenario that the first one does not address.
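The difference between the two kinds of lock can be observed with Python's `threading.Lock` (non-recursive) and `threading.RLock` (recursive). This is an illustrative sketch only: the helper `other_thread_can_enter` is hypothetical, and non-blocking acquires are used so that the self-deadlock case is reported rather than actually hanging.

```python
import threading

def other_thread_can_enter(lock):
    """Try the lock from a second thread without blocking; report whether it got in."""
    outcome = []

    def attempt():
        got = lock.acquire(blocking=False)
        if got:
            lock.release()
        outcome.append(got)

    t = threading.Thread(target=attempt)
    t.start()
    t.join()
    return outcome[0]

# Non-recursive lock: the owner cannot enter a second time.
plain = threading.Lock()
plain.acquire()
reentered_plain = plain.acquire(blocking=False)   # a blocking call here would deadlock
plain.release()

# Recursive lock: the owner may enter again, and must exit as many times as it entered.
recursive = threading.RLock()
recursive.acquire()
reentered_recursive = recursive.acquire(blocking=False)
recursive.release()                                # exited once out of two entries
blocked_after_one_exit = not other_thread_can_enter(recursive)
recursive.release()                                # exited twice: lock is free
free_after_two_exits = other_thread_can_enter(recursive)

print(reentered_plain, reentered_recursive, blocked_after_one_exit, free_after_two_exits)
# False True True True
```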

Recursively, only one thread is allowed to pass through a lock. If other threads enter the lock, they must wait until the initial thread that passed through has exited as many times as it has entered. But if the number of threads that have entered locking equals the number that are locked, one thread is assigned as the super-thread, and only it is allowed to run (tracking the number of times it enters and exits locking) until it completes.

After a super-thread is finished, the condition changes back to using the logic from the recursive lock, and the exiting super-thread:

sets itself as not being a super-thread
notifies the locker that other locked, waiting threads need to re-check this condition

If a deadlock scenario exists, set a new super-thread and follow that logic. Otherwise, resume regular locking.

Issues not addressed above

A lot of confusion revolves around the halting problem. This logic does not solve the halting problem, because the conditions under which locking occurs are known, which allows a specific solution (instead of the general solution that the halting problem would require). Still, this locker prevents all deadlocks only among locks that use this logic. If it is used together with other locking mechanisms, and a lock that is acquired is never released (an exception thrown that jumps out without unlocking, an indefinite loop within a lock, or a coding error that forgets to call unlock), deadlock is very possible. Extending the conditions to include these cases would require solving the halting problem, since one would be dealing with conditions that one knows nothing about and is unable to change.

Another issue is that it does not address the temporary deadlocking issue (not really a deadlock, but a performance killer), where two or more threads lock on each other while another, unrelated thread is running. These temporary deadlocks could have a thread running exclusively within them, increasing parallelism. But because the distributed deadlock detection works on all locks, and not on subsets of them, the unrelated running thread must complete before the super-thread logic can be performed to remove the temporary deadlock.

One can see the temporary live-lock scenario in the above. If another unrelated running thread begins before the first unrelated thread exits, another duration of temporary deadlocking will occur. If this happens continuously (extremely rare), the temporary deadlock can be extended until right before the program exits, when the other unrelated threads are guaranteed to finish (because of the guarantee that one thread will always run to completion).

Further expansion

This can be further expanded to involve additional logic to increase parallelism where temporary deadlocks might otherwise occur. But for each step of adding more logic, we add more overhead.

A few examples include: expanding the distributed super-thread locking mechanism to consider each subset of existing locks; Wait-For-Graph (WFG) [1] algorithms, which track all cycles that cause deadlocks (including temporary deadlocks); and heuristic algorithms, which do not necessarily increase parallelism in 100% of the places where temporary deadlocks are possible, but instead compromise by solving them in enough places that the trade-off between performance overhead and parallelism is acceptable (e.g. for each processor available, work towards finding deadlock cycles less than the number of processors + 1 deep).

References

[1] Silberschatz, Abraham (2006). Operating System Principles (7th ed.). Wiley-India. p. 237. ISBN 9788126509621. Retrieved 29 January 2012.
[2] Padua, David (2011). Encyclopedia of Parallel Computing. Springer. p. 524. ISBN 9780387097657. Retrieved 28 January 2012.
[3] Schneider, G. Michael (2009). Invitation to Computer Science. Cengage Learning. p. 271. ISBN 9780324788594. Retrieved 28 January 2012.
[4] Rollings, Andrew (2009). Andrew Rollings and Ernest Adams on game design. New Riders. p. 421. ISBN 9781592730018. Retrieved 28 January 2012.
[5] Oaks, Scott (2004). Java Threads. O'Reilly. p. 64. ISBN 9780596007829. Retrieved 28 January 2012.
[6] Botkin, B.A.; Harlow, A.F. A Treasury of Railroad Folklore. p. 381.
[7] Shibu (2009). Intro To Embedded Systems (1st ed.). McGraw Hill Education. p. 446. ISBN 9780070145894. Retrieved 28 January 2012.
[8] Stuart, Brian L. (2008). Principles of operating systems (1st ed.). Cengage Learning. p. 446. Retrieved 28 January 2012.
[9] Tanenbaum, Andrew S. (1995). Distributed Operating Systems (1st ed.). Pearson Education. p. 117. Retrieved 28 January 2012.
[10] Mogul, Jeffrey C. (1996). "Eliminating receive livelock in an interrupt-driven kernel".
[11] Anderson, James H. (2001). "Shared-memory mutual exclusion: Major research trends since 1986".
[12] Zöbel, Dieter (October 1983). "The Deadlock problem: a classifying bibliography". ACM SIGOPS Operating Systems Review. 17 (4): 6–15. doi:10.1145/850752.850753. ISSN 0163-5980.

External links

"Advanced Synchronization in Java Threads" by Scott Oaks and Henry Wong
Deadlock Detection Agents
DeadLock at the Portland Pattern Repository
Etymology of "Deadlock"
ARCS - A Web Service approach to alleviating deadlock
Non-Hard Locking Read-Write Locker

Category:Concurrency (computer science) Category:Software bugs Category:Software anomalies Category:Distributed computing problems
