|This article does not cite any references or sources. (December 2009)|
A lock convoy occurs when multiple threads of equal priority contend repeatedly for the same lock. Unlike deadlock and livelock situations, the threads in a lock convoy do progress; however, each time a thread attempts to acquire the lock and fails, it relinquishes the remainder of its scheduling quantum and forces a context switch. The overhead of repeated context switches and underutilization of scheduling quanta degrade overall performance.
Lock convoys often occur when concurrency control primitives such as critical sections serialize access to a commonly used resource, such as a memory heap or a thread pool. They can sometimes be addressed by using non-locking alternatives such as lock-free algorithms or by altering the relative priorities of the contending threads.
Critical sections as implemented in Microsoft Windows operating systems provide a good example of how lock convoys can occur. In Windows, critical sections use a combination of a spinlock and a kernel synchronization object called an "event" to ensure mutual exclusion. For low-contention critical sections, the spinlock will provide mutual exclusion most of the time, falling back on the event only when a thread fails to acquire the spinlock within a certain amount of time. When contention is high, however, it is possible for many threads to fail to acquire the spinlock and enter a waiting state, all waiting on the same event.
When the event is signaled, all threads that are waiting on the event are woken, but only one will be allowed to acquire the critical section and continue execution; the remaining threads will each block again.
As of Windows 2003, a thread waiting on an event is boosted to 1 priority level more than the thread which "set" (i.e. signaled) the event associated to the critical section (aka, the thread releasing the critical section, which notifies other waiters by signaling the event). On the other hand, the setting thread will also lose the boost it may have requested while calling the "Set Event" API, which takes such a boost as a parameter.
These two improvements help against a lock convoy, because now, each waiting thread should be able to run its full quantum, while the thread releasing the lock will probably have to wait more before being able to acquire the resource again.