Talk:Thread safety


Difficulties

Earlier versions of this page contained the following advice:

One approach to making data thread-safe that combines several of the above elements is to make changes to a private copy of the shared data and then atomically update the shared data from the private copy. Thus, most of the code is concurrent, and little time is spent serialized.

As stated above, that advice is incorrect and will lead to lost updates of the shared data. Consider two threads each maintaining a private copy of a shared integer. If each thread increments its own private copy and then copies that back to the shared integer, the net result will be an increment of the shared integer by 1 instead of 2. The threads must check that there have been no updates to the shared data since they took their private copies before updating the shared data. If available, the Compare-and-swap instruction may be useful.
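For illustration, here is a minimal sketch of that check-and-retry update (hypothetical code, not taken from the article; the names are made up), using a C11 compare-and-swap loop: each thread works on a private copy and only publishes it if the shared value has not changed in the meantime, retrying otherwise.

  #include <stdatomic.h>

  atomic_int shared_counter;

  void increment_shared(void)
  {
      int observed = atomic_load(&shared_counter);   // take a private copy
      int desired;
      do {
          desired = observed + 1;                    // work on the private copy
          // Publish only if nobody updated the shared value in the meantime;
          // on failure, 'observed' is refreshed and the loop retries.
      } while (!atomic_compare_exchange_weak(&shared_counter, &observed, desired));
  }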


There are cases where the above-mentioned advice is not wrong, for example when the shared data is only read by some threads and written by a single thread.

Debates belong on the talk page, not in the article. 96.60.118.196 (talk) 22:12, 15 February 2010 (UTC)

Actually, I wanted the old, bad advice on the main page so that readers could see that even people who think they understand thread safety find it hard. As the page stands, the advice is presented as general, although you claim here it only applies in a special case - one writer, multiple readers. Actually, each reader thread needs to access the shared data atomically as well - so the local copies provide no benefit whatsoever.

I'm not going to get into a revert war, so I suggest you think this through carefully. The best advice is not to use shared data but rather to communicate by message passing only. —Preceding unsigned comment added by 217.155.175.25 (talk) 15:15, 7 March 2010 (UTC)
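As an illustration of that message-passing suggestion, here is a minimal hypothetical sketch (not from the article) of a one-slot mailbox built on POSIX threads; threads keep their data private and exchange values only through the mailbox. Error handling is omitted for brevity.

  #include <pthread.h>
  #include <stdbool.h>

  typedef struct {
      pthread_mutex_t lock;
      pthread_cond_t  changed;
      int             value;
      bool            full;
  } mailbox;

  static mailbox box = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, false };

  void mailbox_send(mailbox *m, int value)
  {
      pthread_mutex_lock(&m->lock);
      while (m->full)                        // wait for the slot to empty
          pthread_cond_wait(&m->changed, &m->lock);
      m->value = value;
      m->full  = true;
      pthread_cond_broadcast(&m->changed);   // wake any waiting receiver
      pthread_mutex_unlock(&m->lock);
  }

  int mailbox_receive(mailbox *m)
  {
      pthread_mutex_lock(&m->lock);
      while (!m->full)                       // wait for a value to arrive
          pthread_cond_wait(&m->changed, &m->lock);
      int v   = m->value;
      m->full = false;
      pthread_cond_broadcast(&m->changed);   // wake any waiting sender
      pthread_mutex_unlock(&m->lock);
      return v;
  }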

Layout

This article needs to be formatted a little more clearly to allow easy dissemination of what it is saying. Falls End (T, C) 00:36, 1 December 2005 (UTC)

I've rewikified a bit. I have removed this phrase from the intro, as it's a repeat of what is found in the (new) section 'Achieving thread safety'. "Common ways of creating thread-safe code include writing reentrant code, using Thread-local storage to localize data to each thread, guarding shared data with mutual exclusion so that only one thread uses it at a time, and modifying shared data with atomic operations." In a longer article it's probably worth keeping this repetition, but I don't think it's justified yet.
What do you think of my changes? :) --Stevage 02:57, 1 December 2005 (UTC)

Page move

I think this article should be renamed to 'Thread safety'. Can anyone work out how to do it? Jimbletang 03:05, 8 December 2006 (UTC)

I thought the same thing, so I moved it. - furrykef (Talk at me) 00:58, 17 June 2007 (UTC)


Reentrancy does not always imply thread-safety

I think the article is wrong when it implies that reentrancy is a sufficient condition to ensure thread safety. It's not, and the two concepts of thread safety and reentrancy are distinct. It's true that reentrant functions are often thread-safe too, but it's easy to construct examples of reentrant functions that are not thread-safe.

So, I think it would be better to eliminate phrases like "A subroutine is reentrant, and thus thread-safe ..." because they give the incorrect impression that reentrancy is some sort of stronger guarantee of thread safety.

The reentrant (subroutine) article also claims that "Every reentrant function is thread-safe, however, not every thread-safe function is reentrant.".
Since it is so "easy to construct examples of reentrant functions that are not thread-safe", could you -- or anyone -- give an example "reentrant function that is not thread-safe"?
Something kind of like this (but, of course, going the other direction):
  // thread-safe function that is not re-entrant
  #include <pthread.h>

  Tuple temporary_buffer;   // global variable (type Tuple assumed defined elsewhere)
  pthread_mutex_t buffer_lock = PTHREAD_MUTEX_INITIALIZER;

  void swap_tuples(Tuple *pa, Tuple *pb)
  {
      pthread_mutex_lock(&buffer_lock);    // a nested call that interrupts the lock holder would deadlock here
      temporary_buffer = *pa;
      *pa = *pb;
      *pb = temporary_buffer;
      pthread_mutex_unlock(&buffer_lock);
  }
--68.0.124.33 (talk) 03:22, 21 October 2008 (UTC)
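One commonly used illustration in the other direction (a hypothetical sketch, not from the article) relies on the Qt/Oracle-style definitions, under which a reentrant function is safe as long as each invocation works on its own data, while a thread-safe function must be safe even when invocations share data:
  // reentrant function that is not thread-safe
  // It uses only its argument and locals (no static or global data), so it can
  // safely be re-entered from a signal or interrupt handler.  But the
  // read-modify-write below is not atomic, so two threads calling it at the
  // same time on the same counter can lose an update.
  void increment_counter(int *counter)
  {
      int temp = *counter;   // read
      temp = temp + 1;       // modify
      *counter = temp;       // write - may overwrite another thread's update
  }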
I think that the problem is that re-entrant isn't a black-and-white thing. In particular, recursive functions exhibit a form of re-entrancy; however, because the writer has full control of where the overlapping execution occurs, he/she can take advantage of that knowledge. For example, here's a (crappy) function that counts how many nodes in a binary tree are greater than a particular target value. It's re-entrant in the sense that there are multiple stack frames for this function on the stack at the same time, but it's not thread-safe/generally re-entrant.
  int count_greater(Node *node, int target, int level = 0) {
    static bool found_it;                        // shared by every invocation
    if (level == 0) found_it = false;            // outermost call resets the flag
    if (node == NULL) return 0;
    int left = count_greater(node->left, target, level + 1);   // recursive call
    if (node->value == target) found_it = true;
    return left + (found_it ? 1 : 0) + count_greater(node->right, target, level + 1);
  }
Acertain (talk) 04:16, 28 December 2009 (UTC)

Difference between mutual exclusion and atomic operations?

The article shows four ways to make functions thread-safe, two of which are guarding shared data with mutual exclusion and modifying shared data with atomic operations. How are atomic operations not a subclass of mutual exclusion? Acertain (talk) 04:16, 28 December 2009 (UTC)
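For comparison, a minimal hypothetical sketch (not from the article) of the same increment done both ways, assuming POSIX threads and C11 <stdatomic.h>:

  #include <pthread.h>
  #include <stdatomic.h>

  // Mutual exclusion: other threads attempting the update are blocked
  // until the lock holder releases the mutex.
  static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
  static int guarded_counter;

  void increment_with_mutex(void)
  {
      pthread_mutex_lock(&counter_lock);
      guarded_counter++;
      pthread_mutex_unlock(&counter_lock);
  }

  // Atomic operation: the read-modify-write is a single indivisible step;
  // no lock is taken, so no thread can ever be preempted while "holding" it.
  static atomic_int atomic_counter;

  void increment_with_atomic(void)
  {
      atomic_fetch_add(&atomic_counter, 1);
  }

Both exclude simultaneous updates for that single step, but the atomic version is lock-free, which is presumably why the article lists it separately.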

Intro

Thread safety is a key challenge in multi-threaded programming. It was not a concern for most application programmers of little home applications, but since the 1990s, as Windows became multithreaded, and with the expansion of BSD and Linux operating systems, it has become a commonplace issue.

The world didn't start to spin with the invention of PC's.--195.113.23.117 (talk) 14:10, 25 May 2011 (UTC)

Vagueness

There is in my opinion an unnecessary vagueness in the current definition of "thread safety" as "[being] usable in a multi-threaded environment", as even functions that are not thread-safe can be used in a multi-threaded environment, provided sufficient precautions are taken (such as calling them from the same thread). Some of the examples push this confusion even further, as they show examples of code that is not "process-safe" (code that fails when other code deletes a file, including when there are no threads involved) as being thread-unsafe. I edited the article to use the definition found both in the well-known book "The Linux Programming Interface" and in the Oracle documentation. I also updated the corresponding example. --un_brice (talk) 08:07, 9 June 2011 (UTC)

Circular definition

The first paragraph:

Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it only manipulates shared data structures in a thread-safe manner, which enables safe execution by multiple threads at the same time. There are various strategies for making thread-safe data structures [1].

This is a completely circular definition. "Thread-safe code is thread-safe." Who knew? --Cromas (talk) 22:04, 19 October 2011 (UTC)

Incomplete explanation?

At present the page includes: "

Examples

In the following piece of C code, the function is thread-safe, but not reentrant:

int function()
{
	mutex_lock();
	...
	function body
	...
	mutex_unlock();
}

In the above, function can be called by different threads without any problem. But if the function is used in a reentrant interrupt handler and a second interrupt arises inside the function, the second routine will hang forever. As interrupt servicing can disable other interrupts, the whole system could suffer. "

The sentence "But if the function is used in a reentrant interrupt handler and a second interrupt arises inside the function, the second routine will hang forever" seems incorrect and incomplete to me, and therefore confusing.

- There's no "second routine"; there's only a second invocation of a single routine. This, surely, is the whole point of thread-safety.
- In the scenario described, certainly the second [invocation of the] routine will hang forever - no argument with that. But there's no mention of the fact that, for it to be true, the first invocation must also hang forever; nor is there any explanation of why either invocation hangs at all. In fact they hang for quite different reasons.
What's really happening? The first invocation acquires the lock and enters the function body. The caller's described as a "re-entrant interrupt handler", so we know interrupts are enabled. Therefore when another interrupt occurs it too can invoke the handler, which again calls the function. This is the second invocation; it hangs because the first invocation is holding the lock.
But why does it hang forever? The reason is that the first invocation has also hung forever; it's prevented from continuing and eventually releasing the lock. That happens because it was interrupted. In order to invoke the handler, the second interrupt must have been given priority over the first and allowed to interrupt it. The first invocation is therefore sitting in the function body, waiting for whatever interrupted it to finish and return. There's the problem - a deadlock: the second invocation is blocked by the first which holds the lock; but the first invocation has been interrupted so can't release the lock until the second invocation finishes and returns.

I think the first point - about there being no "second routine" - needs correcting.

Re. the second point, anyone reading that "the second routine will hang forever" could be forgiven for inferring that only the second invocation hangs, not the first also. Deducing (incorrectly) from that that the first routine doesn't hang, they might then conclude that the lock would eventually be released and wonder why the second routine hangs forever. 118.92.40.55 (talk) 01:45, 29 April 2012 (UTC) L Blythen
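To make the sequence described above concrete, here is the quoted example again with hypothetical annotations marking where the deadlock arises (the schematic mutex_lock/mutex_unlock calls are kept as in the article):

  int function()
  {
      mutex_lock();      // 1st invocation acquires the lock
      // ... function body ...
      // <-- a 2nd interrupt arrives here; its handler runs on top of this
      //     stack frame and calls function() again.  The 2nd invocation
      //     blocks in mutex_lock(), and the 1st cannot resume to reach
      //     mutex_unlock() until the handler returns: both hang (deadlock).
      mutex_unlock();
  }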