Talk:Thread (computing)/Archive 1
This is an archive of past discussions about Thread (computing). Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
threads vs processes
Processes are not always provided only by the operating system. The Erlang programming language supports processes at the language level: they are completely isolated from one another while running within a single OS process. Pavel Vozenilek 22:02, 29 July 2005 (UTC)
- The question is, are these truly "processes" if not provided by the operating system? For instance .NET provides AppDomains that are very "process-like", but are not true processes. That is to say that the AppDomains provide memory isolation, context, and fault tolerance (in that it can be unloaded or ab-end without affecting other appdomains in the system). These are still not considered processes, however, because they still run in the same (although isolated) address space as the rest of the process. Is this the same thing with Erlang? Being that I've never worked with it at all, I couldn't make a valid argument one way or another. - Sleepnomore 15:53, August 26, 2005 (UTC)
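For comparison with the language-level isolation discussed above, here is a minimal Python sketch of the fault isolation an OS-provided process gives: the child terminates abruptly and the parent carries on with its own state untouched. The function name `run_abrupt_child` and the exit code 7 are illustrative choices only, and nothing here models Erlang processes or .NET AppDomains.

```python
import subprocess
import sys

def run_abrupt_child():
    """Start a separate OS process that dies abruptly; report what the parent sees."""
    parent_state = ["intact"]               # lives only in the parent's address space
    child_code = "import os; os._exit(7)"   # child exits at once, skipping all cleanup
    completed = subprocess.run([sys.executable, "-c", child_code])
    return completed.returncode, parent_state
```

The parent only observes the child's exit status; whether a language-level "process" offers the same guarantee depends on the runtime.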
Threads are also used by web servers
Sure, many web servers use threads in some way. However, I don't find those two paragraphs very useful. In my opinion, they give the wrong impression that either multi-threading or multi-processing is necessary for web servers and similar kinds of servers in general. That's far from the truth. At least on single-CPU machines, an approach using select, poll, kevent, epoll etc. is usually far more efficient, and it is also a fairly popular approach for all kinds of servers, at least in C/C++. Even in Java, using tons of threads does not seem to scale very well, and using non-blocking I/O instead is actually better, even if the naive approach of one thread per connection (or, in C/C++, even one process per connection or client) is easier to code.
The part about Apache 1.3 using multi-processing almost qualifies as FUD. At the very least a lot of context or explanations are missing and most of those "dangers" apply to multi-threading equivalently. In short, I think the article was better without those two paragraphs. --82.141.49.144 05:14, 4 December 2005 (UTC)
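The single-threaded, readiness-based approach mentioned in this comment can be sketched roughly as follows. This is only an illustration in Python, using the standard `selectors` module (which wraps epoll, kqueue, poll or select, depending on the platform) and in-process socket pairs standing in for network clients. The names `serve_echo` and `demo` are hypothetical.

```python
import selectors
import socket

def serve_echo(server_ends, expected_messages):
    """Serve several connections from one thread using a readiness loop."""
    sel = selectors.DefaultSelector()
    for sock in server_ends:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)
    handled = 0
    while handled < expected_messages:
        events = sel.select(timeout=1.0)
        if not events:                      # nothing became readable: give up
            break
        for key, _mask in events:
            data = key.fileobj.recv(4096)
            if data:
                key.fileobj.sendall(data.upper())   # reply on the same connection
                handled += 1
    sel.close()
    return handled

def demo():
    pairs = [socket.socketpair() for _ in range(3)]
    clients = [client for client, _server in pairs]
    servers = [server for _client, server in pairs]
    for i, client in enumerate(clients):
        client.sendall(f"hello {i}".encode())
    handled = serve_echo(servers, expected_messages=len(clients))
    replies = [client.recv(4096).decode() for client in clients]
    for client, server in pairs:
        client.close()
        server.close()
    return handled, replies
```

No thread or process is created per connection; the loop simply services whichever descriptor is ready. A real server would also need to handle partial writes, disconnects and listening sockets.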
- You should get better informed before calling this FUD. Having extensive server-side programming experience myself, I can say that, depending on the application, one of the following is the best approach: 1) single-threaded, single-process, using non-blocking I/O (e.g. an IRCd, where a lot of memory needs to be shared among clients without the overhead of locks, and without need for any CPU-intensive tasks or expensive system calls other than non-blocking I/O); 2) multiple processes using a pool of processes (ideal when you need very solid production quality: avoiding memory leaks in system libraries, preventing one client from crashing the whole application, using processes with different credentials, and avoiding thread-related issues if the application is non-trivial and a fair number of syscalls and processing is needed to serve the clients); 3) a multithreaded approach, which is prone to some of the problems of (1), but is more powerful and allows more processing to be done to serve a client, though without some of the obvious security protections of (2) (at least if not also using multiple processes). I have had to program applications using all of these models. For every situation, one model is clearly superior to the others; these are not merely matters of programming style or paradigm. Of course, this doesn't mean that you can't appropriately use both multiple processes and threads in an application. You just can't sanely avoid using multiple processes for some applications.
- It should also be noted that the launching time of processes or threads is a less important factor (because pools can be used, and because process creation is usually light enough on modern OSs using Copy-on-write) than a) the stability/security implications, b) frequency of shared resources access and c) type of processing needed, which will generally define which of the models an application should ideally use. Also note that systems with Symmetric multiprocessing permit both multiple processes and multiple threads to utilize more than a single processor.
- Back on topic with Apache and PHP: it took a fair amount of time for PHP to run decently under Apache 2 in multithreaded mode, and depending on the PHP extensions and OS you're using, you can still experience issues. This is partly because many libraries expect to run under a normal process and use blocking system calls or calls with process-wide effects, and most PHP extensions just map to those library calls. As for the OS specifics, despite the POSIX threads standard, which defines a widespread API, implementations vary widely. There shouldn't be many issues if, on your particular operating system, every system call is thread-friendly and the C library uses all the necessary hacks to mutex-wrap thread-unfriendly functions (other than the security/stability/memory-leak issues solved by multiple processes). The PHP wrappers can also limit problems by lock-wrapping calls made to a non-reentrant C function (losing the concurrency advantage of threads when using this function, of course). Also, whether using threads provides greater concurrency than processes depends on your OS and its thread implementation.
- A few real-world examples of implementation specifics: some of my own multithreaded applications couldn't run properly under LinuxThreads (while they ran fine under Solaris and NetBSD) because LinuxThreads wasn't up to the POSIX standard (other than providing a compatible API). These now run fine under Linux 2.6 with the Native POSIX Thread Library. Still today, however, a number of people decide to keep 2.4 on their production servers for stability reasons. Another example: several years back, on NetBSD it was only possible to use user-space threads (what some call fibers), through the GNU Pth, unproven-threads or unreal-threads libraries. Under these libraries, some system calls had to be avoided altogether, and heavy processing loops had to be done in a separate process or had to explicitly yield control back to the thread scheduler frequently, because these particular implementations did not use a preemptive scheduler (not worthwhile to provide without special kernel support). Another aspect is that POSIX did not properly define the relation between threads and the standard I/O subsystem of Unix systems, which uses file descriptors. For instance, while a thread waits in select(2) or poll(2), it cannot at the same time use a more efficient mode of inter-thread messaging based on shared-memory queues and notification through condition variables, without some hackery, like dedicating a thread to file descriptor polling and having that thread communicate in a thread-efficient manner with the other threads of the process. Therefore, if your application relies heavily on interprocess communication because of a need like privilege separation, is using threads still worth it considering the hassle it adds? (An implementation of a test project I wrote to fuse inter-thread messages and file descriptor polling can be found at [1]; it does not work with LinuxThreads or Pth.)
I am certain that there are many other examples and that many programmers can relate. As a closing note, interestingly, the Apache 1.3 branch isn't about to die, and is still being actively maintained and used worldwide. The very powerful PostgreSQL database also benefits from multiple processes, and there have been heated discussions about the process vs thread models between its community and that of MySQL. MySQL uses a mysqld_safe process whose role is to restart the main multithreaded MySQL process if it crashes (not that it crashes often in my personal experience, though :) ), an example where an extra process was required.
- This was a lengthy reply, but points which are worthwhile to consider. I'll leave for others to relate and if necessary update the article if they consider it useful. It's probably sufficient to keep these notes on this discussion page, though. --66.11.179.30 03:36, 3 March 2006 (UTC)
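The workaround described in this reply, dedicating one thread to descriptor polling and having it hand events to the other threads through a shared queue, might look roughly like this in Python. It is a sketch under assumptions, not the poster's linked project: `queue.Queue` (which is built on a lock and condition variables) stands in for the shared-memory queue, a socket pair stands in for the descriptor, and the names `poller` and `demo` are hypothetical.

```python
import queue
import selectors
import socket
import threading

def poller(sock, inbox):
    """Dedicated thread: wait on the descriptor and forward its data to the queue."""
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    while True:
        sel.select()                        # blocks until sock is readable
        data = sock.recv(4096)
        if not data:                        # peer closed the connection
            break
        inbox.put(("fd", data.decode()))
    sel.close()
    inbox.put(("eof", None))

def demo():
    inbox = queue.Queue()                   # single inbox for both kinds of event
    writer, reader = socket.socketpair()
    thread = threading.Thread(target=poller, args=(reader, inbox))
    thread.start()
    inbox.put(("thread", "in-process message"))    # ordinary inter-thread message
    writer.sendall(b"descriptor message")          # event arriving via a descriptor
    writer.close()
    received = []
    while True:
        kind, payload = inbox.get()         # the consumer waits in one place only
        if kind == "eof":
            break
        received.append((kind, payload))
    thread.join()
    reader.close()
    return received
```

The consumer never calls select itself; it blocks on the queue and receives both in-process messages and descriptor events through the same channel.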
- I think all your questions are answered in the Process, Thread, Fiber Table that was deleted. Why the hell was that table deleted? It was the most useful part of the article.
Ambiguity
"""while in other operating systems there is not so big a difference"""
- What exactly are the other operating systems? Linux, BeOS. It's nice to be specific.
- Probably most other commonly used operating systems, those that I know well being all of the BSDs, Linux, Solaris, MacOS (I did not use BeOS, but it well might be using COW (Copy-on-write) in fork(2) to avoid actually creating all pages of a newly created process too). I would not be surprised if IRIX, HPUX, Tru64 also do, but someone who knows should confirm this. --66.11.179.30 09:54, 3 March 2006 (UTC)
process--thread--fiber interrelation
The concept of a process, thread, and fiber are interrelated by a sense of "ownership" and of containment. .... A fiber can be scheduled to run in any thread in the same process.
There seems to be a contradiction. If a fiber can be run in any thread, then it's not contained within any specific thread, nor owned by any specific thread. --tyomitch 15:25, 8 May 2006 (UTC)
For context enrichment, it could be nice to add some thread history (OS related, but it makes sense in this context too). --User:faragon 12:52, 7 Jun 2006 (GMT+1)
Example
A while ago I changed the example to a C# example that I believed showed purely the concept of multi-threading more clearly than our current example, which would be better suited to Algorithms For Prime Number Generation ;) I just feel that the example should be very closely tied to threading as opposed to having an exterior purpose. Here is the reverted example I proposed before: [2] Any thoughts? Martin Hinks 17:43, 29 January 2006 (UTC)
The commentary pertaining to the example says, "of course, this problem [the race condition] is easily corrected using standard programming techniques", which raises the question: How? 68.50.203.109 04:17, 9 July 2006 (UTC)
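One common answer to the "How?" above is mutual exclusion: guard the shared data with a lock so that only one thread at a time performs the read-modify-write. The following is a minimal Python sketch of that technique, not the article's example; the name `count_with_lock` and its parameters are illustrative.

```python
import threading

def count_with_lock(n_threads=4, increments=10_000):
    """Increment a shared counter from several threads, serialised by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:                      # only one thread updates at a time
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

With the lock held around every update, no increment is lost, so the final value equals the number of threads times the number of increments per thread.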
Getting It Str8
Ey people, i'm kind of a newbie to CPU's. The terminology used in this article (and most wikipedia articles) is kind of complex. So, i'm wondering if someone can tell me a little about it in plain english like;
1) What exactly are Threads, what do they do, and how do they work?
2) What is the difference between a thread, process, and the other thing?
3) What exactly does a kernel do? Is it embedded in an Operating System? Why 'Linux Kernel' and not just 'Linux'?
4) Tell me some other things i should know about this stuff please.
Any help would be greatly appreciated and if you do help me, ill make it worth a lot of other people's time by editing this main article into simpler words. Thanks. KittenKiller
It seems to me that the terminology used in most computer science wikipedia articles is kind of complex. However, the same cannot be said for most wikipedia articles. Care to comment? 68.50.203.109 04:21, 9 July 2006 (UTC)
table changes
Personally I like the old table format more -- the new format is too compact and is difficult to read. Neilc 00:48, 23 Jun 2005 (UTC)
- Okay, I made another attempt. What do you think? Dianne Hackborn 02:59, 23 Jun 2005 (UTC)
- I noticed that "modern" operating systems (second-to-last row in the chart) are marked as supporting multiprocessing, multithreading but not fibers (Y Y N), but then it says that nearly all operating systems after 1995 support all three (Y Y Y). This seems to be self-contradictory; for example, Mac OS X is a modern operating system, and also (if I remember correctly) came out after 1995. Does it fall in the second-to-last category (supporting multiprocessing, multithreading but not fibers), or does it fall in the last category (supporting all three)? 68.50.203.109 04:14, 9 July 2006 (UTC)
- Why is the fibers column even in this table? By definition, fibers can be implemented in a user program and thus don't rely on the existence of either threads or processes (so one could, say, implement them in DOS). I personally think the fibers column should be removed from the table entirely or changed to all Y letters. --NotQuiteEXPComplete 12:40, 14 July 2006 (UTC)
- Indeed, some programming languages (those with coroutines or continuations) support "fibers" out of the box, with little effort. In other languages (like C), it requires some systems-level wizardry, but such can be hidden behind a library. About the only use of the fibers column is documenting which OS's provide API calls in the standard library; I can't think of any modern systems which do not. --EngineerScotty 23:30, 18 July 2006 (UTC)
- Why is Microsoft Windows listed as "Y Y N"? There are native functions to create and schedule fibers (CreateFiber, ConvertThreadToFiber,SwitchToFiber), contained within kernel32.dll, and supported since Win9x. Goffrie 16:54, 20 July 2006 (UTC)
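To illustrate the point made in this thread that languages with coroutines get "fibers" almost for free, here is a hedged sketch using Python generators as cooperatively scheduled units. The names `fiber` and `run_round_robin` are hypothetical, and this is not how the Windows fiber API works internally; it only shows user-level scheduling with explicit yields.

```python
from collections import deque

def fiber(name, steps, log):
    """A cooperatively scheduled unit: does one step, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                               # explicit yield back to the scheduler

def run_round_robin(fibers):
    """User-level scheduler: resume each fiber in turn until all have finished."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)                   # run until the fiber's next yield
        except StopIteration:
            continue                        # finished: do not reschedule it
        ready.append(current)

def demo():
    log = []
    run_round_robin([fiber("a", 2, log), fiber("b", 3, log)])
    return log
```

Nothing here needs kernel support or more than one thread, which is the argument for saying the fibers column does not depend on the operating system.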
OK...?
I don't see around here (The referenced talk page) who added the {{contradict}} tag, and more to the point, why? 68.39.174.238 03:40, 18 July 2006 (UTC)
- The tag appears to have been added in this edit by an anon user. If there's no obvious reason for it (and I don't see one) then I suggest we just remove the tag. --Allan McInnes (talk) 05:11, 18 July 2006 (UTC)
- They are right though. The table is, in some cases, blatantly wrong. Since fibers seem to be just a Windows name for userspace threads from what I can gather, it does appear to be wrong since one should be able to implement them in all of these systems in userspace or one's program. --NotQuiteEXPComplete 17:39, 18 July 2006 (UTC)
- Ok. But wrong isn't the same as self-contradictory, which is, I think, what caused some confusion. Nor did the person who did the tagging explain their reasons here on the talk page, as they were supposed to. Was there a reason for the tagging, or was it just idle vandalism? If the table's wrong, then by all means fix it (personally I'd be in favor of cutting it completely, since I don't think it adds much value). --Allan McInnes (talk) 18:40, 18 July 2006 (UTC)
- I think the contradiction is pointed out under the "Fibers and NPOV" header in the talk page. The last row of the table says "Almost all operating systems after 1995 fall into this category." but many operating systems released after '95 are mentioned in previous rows. Maybe not completely self-contradictory, but at least confusing. And I agree with the people who think we should get rid of all this fiber stuff, which makes no sense outside NT-land. --Lost Goblin 08:28, 19 July 2006 (UTC)
- Yeah, I'm going to go ahead and kill the table. --NotQuiteEXPComplete 19:17, 26 July 2006 (UTC)
Article restructuring
Forgive me if this is poorly stated ... I'm rather tired. I think a good restructuring would go a long way toward improving this article. The section headers are also inappropriate in some places where the discussion strays from what the section header talks about. In general, I think there should be sections comparing (or at least saying a small amount before linking to another article) the entire spectrum of: (1) implementation: kernel-space vs. user-space, (2) for kernel space, the model for interfacing kernel-threads and user-threads: many-to-one, one-to-one, one-to-many, (3) cooperative scheduling (i.e. by the process) vs. preemptive scheduling (i.e. by the kernel), and showing how the desirable properties of threads, namely continuing to execute the process in the face of blocking calls and being able to run on multiprocessor machines, pretty much follow directly from these three things. The article mentions most of this stuff, but the flow and cohesiveness are pretty poor. --NotQuiteEXPComplete 06:46, 9 August 2006 (UTC)
Green threads
I don't think "green threads" came from SunOS. Instead, I think they came from the Green project:
http://today.java.net/jag/old/green
http://www.jguru.com/faq/view.jsp?EID=416246
Calling user-level threads "green threads" probably resulted from the early Java JDK using that name for the thread implementation inherited from the Green project.
Booskunk 06:53, 23 October 2006 (UTC)
Language error
Some sentences in the introduction do not seem to make grammatical sense:
"On a multiprocessor or multi-core system, which are beginning into general use, threading can be achieved via multiprocessing, wherein different threads and processes can run literally simultaneously on different processors or cores."
Should be replaced with something like "which are beginning to see general use"
Also "Absent that, programs can still implement threading by using timer..." is wrong. 155.232.128.10 11:27, 23 January 2007 (UTC)
- Yes. Feel free to fix it. –EdC 12:00, 23 January 2007 (UTC)
C example
Can someone please provide an example in C? --Pradeep.v 06:43, 7 April 2007 (UTC)
Explain
Can someone add an explanation for 1×1, M×N, 1×M or 1:1, M:N, 1:M or whatever... —The preceding unsigned comment was added by Frap (talk • contribs) 21:26, 26 April 2007 (UTC).
- I believe that the usage with a colon indicates whether there is a 1:1 correspondence between (user space) "threads" and (kernel space) lightweight processes. On OSes where LWPs are expensive, it can make sense for the language or threading library to create more "threads" than there are LWPs backing them, and juggle them as threads enter wait states. So M:1 indicates that there are M fibers per thread (LWP), while M:N indicates a situation where the process as a whole has N LWPs backing M fibers (where M > N). Evidently, in the latter case at most N fibers can be running concurrently; I suppose the rest must have yielded or (perhaps) be engaged in simulated blocking I/O. Try this link: Understanding Threads. –EdC 20:11, 27 April 2007 (UTC)
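As a loose illustration of the M:N idea (M schedulable units multiplexed over N backing threads), the sketch below runs M tasks on a pool of N worker threads and records how many were ever running at once. It is only an analogy: a pool runs each task to completion, whereas a real M:N threading library can suspend and resume threads. The name `run_m_on_n` and the 10 ms sleep are illustrative assumptions.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_m_on_n(m_tasks, n_workers):
    """Run m_tasks units of work on n_workers threads; report peak concurrency."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task(i):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)                    # stand-in for real work
        with lock:
            running -= 1
        return i * i

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(task, range(m_tasks)))
    return results, peak
```

However many tasks are submitted, the observed peak never exceeds the number of backing threads.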
Multithreaded programming in dual vs quad+ cores
Someone please address the issue of multithreaded programming in dual vs quad+ cores. Nowhere does it talk about whether or not it is more difficult for the programmer to create a program to take advantage of >2 processors. We understand from the article that you must write the application with multiple threads in mind BUT nowhere does it talk about the issue of programming difficulty or lack thereof for >2 processors.
Basically: Will a multithreaded program of today have to be modified for a computer with 4, 8, 32 cpus?
—The preceding unsigned comment was added by 64.233.242.55 (talk • contribs) 10:54, 8 November 2006 (UTC)
- In many cases, not if it's coded intelligently. See thread pool. However, I think there are algorithms that only admit so much parallelization, so sometimes it may only be possible to parallelize something up to a point. --NotQuiteEXPComplete 10:08, 20 June 2007 (UTC)
- Correct, see Amdahl's Law. Regarding the question: it depends. If you are using multithreading as means to parallelize an algorithm (as opposed to using multiple machines with single CPUs via MPI) then you will have to adjust your algorithm for caching, shared memory, etc. If you're creating something to distribute workload (like a web server) then it shouldn't matter much how many processors you have since each request is essentially independent. Cburnett 12:54, 20 June 2007 (UTC)
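For reference, Amdahl's law gives the best-case speedup as 1 / ((1 - p) + p / n), where p is the fraction of the work that can be parallelized and n is the number of processors. A small Python helper makes the 4, 8, 32 CPU question concrete; the function name `amdahl_speedup` is an illustrative choice, and the formula ignores caching, synchronization and memory effects.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Best-case speedup under Amdahl's law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)
```

For a program that is 90% parallelizable, the speedup stays below 10 no matter how many processors are added, because the serial 10% always remains.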
Ambiguity in para 2.
Para 2 starts "Multiple threads can be executed in parallel on many computer systems."
I believe you mean this in the sense of: On many computer systems, multiple threads can be executed in parallel on a single processor.
and are not referring to multiprocessing. However, the existing sentence can be read either way. --Preceding unsigned comment added by 209.157.48.1 (talk) 23:19, 26 March 2008 (UTC)
- Deleted it. Cburnett (talk) 23:24, 26 March 2008 (UTC)
Fiber, Fiber, Fiber?
Huh? Well, I'm no thread expert but I've never heard this term in this context before. Furthermore, it's spread all over the article, so one would think it's something widely known and used. However, when I search on the web, all I find is that it seems to be some seemingly unpopular concept used by Windows NT. It seems to be what I'd call user-level (or userland) threads e.g. GNU Pth or maybe it refers to anything thread-alike as used in programs with a main-loop which delegates/schedules short-lived tasks using state machines? So is "fiber" common terminology or just some marketing resp. product-specific term? --82.141.49.144 05:01, 4 December 2005 (UTC)
- The term "fiber" does indeed seem very specific to the NT world. The current article tends to say that user-level threads would be called fibers, but that is wrong: there are N:M user-level implementations of "threads" which are really threads, like PM2's Marcel library: they can be preempted, are scheduled by the library's scheduler, etc. They really behave like POSIX threads from the point of view of the application. The "fiber" concept introduced by NT is quite different: it's more about the application managing contexts. That could be compared to POSIX's makecontext/swapcontext. If nobody objects, I'd rather reword the whole page to fix the confusion between user-level threads and fibers. SamuelThibault 15:57, 1 December 2008 (UTC)
- Fibers could also be compared with what a lot of programming environment implement: tasks SamuelThibault 15:59, 1 December 2008 (UTC)
Thread join redirects here, but isn't mentioned in the article
Thread join, which is wiki-linked from some other synchronization articles, redirects to this article, but the article never mentions joining threads. I think it would be useful to have an explanation of that technique -- it would certainly have helped me. Npdoty (talk) 03:10, 29 April 2010 (UTC)
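For readers arriving from the redirect: joining a thread means blocking until that thread has finished, after which its results can safely be read. A minimal Python sketch follows; the name `compute_in_thread` is hypothetical, and other APIs (for example POSIX `pthread_join`) differ in detail.

```python
import threading

def compute_in_thread():
    """Start a worker thread, join it, then read the result it produced."""
    result = {}

    def worker():
        result["value"] = sum(range(1_000))

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()                           # wait here until worker() has returned
    return result["value"], thread.is_alive()
```

Without the join, the caller could read `result` before the worker had written to it.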
Samples
Recently there were attempts to insert sample of "how to create thread in Java" into this article. Personally, I don't think it belongs to this article (if providing only a sample of creating thread, and only in Java, it is pretty much useless for the article, and if going into all thread-related functions for all languages/OSs, it will become unmanageable), and also it is IMHO a borderline WP:NOTHOWTO, so I've removed the sample, but if there are arguments for including such samples - let's discuss them. If it would be an article about a feature of the programming language - it would be a different story, but for features which can be adequately explained without referring to specifics of the language - IMHO it is unnecessary to introduce such samples. Ipsign (talk) 07:27, 14 September 2011 (UTC)
Daemon thread
Daemon threads are not mentioned in the article.--Wisamzaqoot (talk) 22:49, 8 October 2011 (UTC)
Fiber performance
"Fibers are an even lighter unit of scheduling which are cooperatively scheduled: a running fiber must explicitly "yield" to allow another fiber to run"
Ok.
"which makes their implementation much easier than kernel or user threads"
This is a little ambiguous to me. By reading the rest of the section I understand that it means that it is easier for the OS to implement fibers when compared to threads. But at first I thought it meant that it was easier for the application programmer to use fibers instead of threads, which is not true. Threads are easier because you don't have to worry about scheduling.
"A fiber can be scheduled to run in any thread in the same process."
I don't think so. A fiber can only be scheduled to run in the thread that created it. You can however switch to a fiber that was created by a different thread, but still the fiber will run in the thread that created it.
"This permits applications to gain performance improvements by managing scheduling themselves, instead of relying on the kernel scheduler (which may not be tuned for the application)."
How can you get performance improvement by using fibers? Fibers can't run concurrently (unless they are fibers from different threads). If a fiber makes a blocking call, it won't be able to switch to another fiber until the call unblocks. If you have a single process and thread, just adding fibers won't improve performance.
"Parallel programming environments such as OpenMP typically implement their tasks through fibers."
OpenMP uses fibers? It must also use threads/processes or it wouldn't be able to use multiple processors/cores. Citation, please.
Check this out:
http://msdn.microsoft.com/en-us/library/aa915371.aspx
"In general, fibers do not provide advantages over a well-designed multithreaded application."
"From a system standpoint, a fiber assumes the identity of the thread that created it. For example, if a fiber accesses thread local storage (TLS), it is accessing the TLS of the thread that created it. In addition, if a fiber calls the ExitThread function, the thread that created it exits."
"You can call SwitchToFiber with the address of a fiber created by a different thread. To do this, you must have the address returned to the other thread when it called CreateFiber and you must use proper synchronization."
Italo Tasso (talk) 10:42, 9 January 2010 (UTC)
I agree. Fibers alone simply do not run in parallel and OpenMP is intended to exploit parallel systems. Intel is a prominent contributor to and supporter of the OpenMP project, and a paragraph on this Intel page seems to cast further doubt on the connection: "In the Win32 threading API, there is a threading option called fibers that enables users to write their own thread scheduler and so exert fine-grained control over threading operations. This too is not possible in OpenMP."
JohnMcF (talk) 17:46, 11 February 2012 (UTC)
Analogy with cooking
The text mentions an analogy, namely:
To give an analogy, multiple threads in a process are like multiple cooks reading off the same cook book and following its instructions, not necessarily from the same page.
Would the next one be better to clarify the differences between multithreading and multitasking?
To give an analogy, multiple threads in a process are like multiple cooks reading off the same cook book, sharing their ingredients and following its instructions (not necessarily from the same page). Multitasking would be multiple cooks each reading off their own cook book, each with their own ingredients.
What do you think? --Mattias.Campe (talk) 11:21, 16 April 2012 (UTC)
Missing word ?
In the article "Thread (computing)" - http://en.wikipedia.org/wiki/Thread_%28computing%29
There may be a word missing. Reference the third paragraph from the top...
Original... Many modern operating systems directly support both time-sliced and multiprocessor threading with a process scheduler. The kernel of an operating system allows programmers to manipulate threads via the system call interface. Some implementations are called a kernel thread, whereas a lightweight process (LWP) is a specific type of kernel thread that shares the same state and information. (This last sentence does not make sense. "... shares the same and information with what?")
Looks like the last fragment (quotation), which may actually be an editorial comment, is missing the word "state" after the word "same". It also mistakenly ends the quotation after "with what" which is NOT part of the original statement. 184.36.175.67 (talk) 15:00, 5 September 2012 (UTC)
Rewrite
substitute "the smallest" with "a" confidence level 5/10 — Preceding unsigned comment added by 97.117.5.49 (talk) 04:15, 29 April 2013 (UTC)
History
As a reader, a history section would be useful here. Who first proposed the concept? Who first implemented it? — Preceding unsigned comment added by 101.109.245.117 (talk) 06:48, 17 November 2011 (UTC)
- Absolutely agree. When I started off big systems had processes, and little ones didn't. Unix had fork() which created processes. Then I went off into Windows-only for a few years, and while I was there threads got invented as a concept. They'd always been there of course - except that a process was a VM with a single thread. — Preceding unsigned comment added by Number774 (talk • contribs) 10:36, 18 November 2013 (UTC)
Event-driven programming and Verilog
Could somebody with expertise clarify/elaborate this final part? It's not clear to me in this context what application-level event-driven programming and hardware description languages like Verilog have to do with threading in general, or with each other.
Skwuent (talk) 01:33, 13 August 2014 (UTC)
Propose merge/redirect a related article
The article Thread (computer science) is a stub and does not contain any information that cannot be found in Thread (computing). I suggest doing away with the stub and just having it redirect here instead. If anyone believes that the wording of the stub article is superior to or provides additional insight to topics in this article, then the articles can be merged with a redirect. 18:33, 12 November 2014 (UTC)
- Sorry, but that doesn't make sense as Thread (computer science) was moved to Thread (computing) over a redirect on January 1, 2012. — Dsimic (talk | contribs) 03:00, 13 November 2014 (UTC)
Hardware threads
Recent edits (diff) have introduced "hardware threads" into the lead. The text is not correct as it suggests hardware is involved in the topic of this article (well, it is involved, but the threading mentioned here can be performed on a very simple CPU from the 1970s—tricky hardware is not involved). I would have reverted the changes but there is also a bunch of bullet points for advantages and drawbacks that need thought. Thoughts? Johnuniq (talk) 07:31, 5 September 2014 (UTC)
- I've made some edits to decouple "software threads" (the topic of this article) from "hardware threads". "Hardware threads" aren't necessary for concurrent execution of "software threads", and can be used to concurrently execute multiple single-threaded processes. There's a little bit more to clean up. Guy Harris (talk) 00:04, 12 June 2015 (UTC)
- And other editors appear to have done more work previous to mine; the edit you list was a bit confused, but when I started the article was already no longer claiming that hardware threads were used to implement software multiprocessing or multithreading, so somebody'd already addressed some of the issues. Guy Harris (talk) 00:07, 12 June 2015 (UTC)
History
This is a great article. I only wish someone could add the origin of the thread model. Like what year did it first arrive, who developed it, etc. Anyway very nice article. — Preceding unsigned comment added by Overhere2000 (talk • contribs) 22:42, 26 August 2015 (UTC)
Definition of single-threading
I believe the definition of "single-threading" provided in this article is incorrect. It reads:
"In computer programming, single-threading is the processing of one command at a time.[3]"
1. Its citation is from a book about CICS programming, not programming in general. That is not such a great source for something about computing in general, but that's not really my point.
2. Computers do not process commands. Computers, if you want to use the word computers (as opposed to CPU), process instructions. A "command" is a statement in a high-level language that is meant to indicate an action to be performed. (yes, I didn't spend all day on that definition!) An interpreter or compiler transforms commands into a set of machine instructions. These are what the computer processes.
A computer will not necessarily process all the code of a single "command" at once. (1) It can't recognize what parts of the machine code are which commands from the high-level language. (2) In multi-tasking a process may be swapped out (correct me if I'm wrong) at almost any given instruction.
I have no sources to cite for this.
thank you Joe Petree — Preceding unsigned comment added by Jsp314159 (talk • contribs) 15:21, 11 February 2018 (UTC)
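The point that one high-level "command" becomes several lower-level instructions can be seen, by analogy, in CPython: the standard `dis` module lists the bytecode a function compiles to. Note the assumptions: this shows interpreter bytecode rather than machine instructions, the exact instruction names vary between Python versions, and `increment` and `instructions_for` are hypothetical names.

```python
import dis

def increment(counter):
    counter += 1                            # one source-level statement
    return counter

def instructions_for(func):
    """Return the names of the bytecode instructions a function compiles to."""
    return [ins.opname for ins in dis.get_instructions(func)]
```

Calling `instructions_for(increment)` returns more than one instruction for the single `counter += 1` statement, and a thread switch can in principle occur between any two of them.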
Threads vs Tasks
"IBM PL/I(F) included support for multithreading (called multitasking) in the late 1960s" - okay, I was around when there were no threads. It was called multi-tasking because, well, they were called tasks. The term remains with us -- one multitasks by doing multiple things. My issue is not with the change in the phrase (we do that all the time), it's that throughout this document it is "tasks" or "(called multitasking)" or "in which context they were called 'tasks'." I suggest that a section titled something like "The Change from Task or Process to Thread" and the reasoning for it (with docs). Otherwise it just seems like someone one day started calling it threads without rhyme or reason. Cheers. Lloydsargent (talk) —Preceding undated comment added 22:27, 19 June 2018 (UTC)
Fibers and NPOV
Yeah, this whole Fiber stuff needs rewriting, the term is not used for much outside of the NTcentric world. Also curious is the table. What the hell is the last entry? It says most OSs since 1995 use a process/thread/fiber model. But above it, it says that OS X, Win 2k etc (Major OSes released AFTER 1995) use a process/thread model. What came in 1995? Win95?? What about NT (released 93) or Macs which were stuck with classic MacOS (system 7, I believe) until OSX. Seems a little Windowscentric. So much for NPOV! —The preceding unsigned comment was added by 58.107.87.183 (talk • contribs) 23:37, 15 June 2006 (UTC)
- This edit got rid of the "Comparison between models" section, including the offending table.
- The next-to-last entry said "Although each of these operating systems allows the programmer to implement fibers or use a fiber library, most programmers do not use fibers in their applications." So, yeah, in most OSes released after 1995, you could write your own fiber package and use it, but I don't know what OSes other than Windows NT provided any support for even third-party fiber packages. Guy Harris (talk) 22:39, 19 June 2018 (UTC)
Multiprocess vs. Multithreaded vs. Fibers
Would anyone mind if I did some cleanup of this table? It seems to me that all the long discussion about particular systems (especially the big one on AmigaOS) hides the comparison it is trying to make. What about moving discussion about specific operating systems to a section below?
Also, I kind-of disagree with the definition of multiprocess that is being implied here -- that it means something about more than one user-level "application" running. I think it should be much more tied to the idea of memory protection: that is, an OS with memory protection has multiple processes. From this perspective, for example, the traditional AmigaOS (pre-PPC) would be better classified as multi-threaded only, and would serve as a good example to illustrate the difference between that and processes.
Anyway, I don't want to step on anyone's toes. :)
- Also, I want to say that the text in the table is really hard to read; not everyone has a big display. -- Taku 05:59, Jun 20, 2005 (UTC)
"Parallel programming environments such as OpenMP typically implement their tasks through fibers" Doing a github search for a popular openmp implementation (I understand there are multiple implementations) shows that there are no fibers in the implementation. But it does have pthreads. So, this statement is false, at least for using the word typically. --Resonant cacophony (talk) 06:04, 19 December 2018 (UTC)
Windows services and processes
Could you explain what category MS Windows Services fit into this? I've seen descriptions of Windows services as "processes" themselves.
I take it that in Windows Task Manager, what are described as "Services" are actually processes in general computing lexicon, and Task Manager's "Processes" are ?super-processes or programs. "Threads" as such are not displayed in Task Manager (understandably - a too low level of detail outside of technical work/debugging).
If I'm wrong about this, then I'm clearly more confused than I thought! — Preceding unsigned comment added by 80.189.81.20 (talk) 13:11, 3 July 2019 (UTC)
- To quote "Service Programs" in Microsoft's documentation:
A service program contains executable code for one or more services. A service program created with the type SERVICE_WIN32_OWN_PROCESS contains the code for only one service. A service program created with the type SERVICE_WIN32_SHARE_PROCESS contains code for more than one service, enabling them to share code. An example of a service program that does this is the generic service host process, Svchost.exe, which hosts internal Windows services. Note that Svchost.exe is reserved for use by the operating system and should not be used by non-Windows services. Instead, developers should implement their own service hosting programs.
- and
A service runs as a background process that can affect system performance, responsiveness, energy efficiency, and security.
- So a given service can run within a single process, or several services can run within a process. A service process can be multi-threaded, as per "Multithreaded Services" in Microsoft's documentation. A server for a network protocol running over a connection-oriented transport layer, for example, might have a thread for each connection (that might be done with separate processes per connection on a UN*X system).
- Windows 7's Task Manager's "Processes" appear to be processes in the general computing lexicon, in that they are associated with a particular executable image. Windows 10's Task Manager's "Processes" appears to show groups of processes, with individual processes under it; perhaps that's what you're referring to as "super-processes". The "Details" view in the W10 Task Manager is similar to the "Processes" view in W7's Task Manager, showing processes in the general computing lexicon - and Windows API - sense. Guy Harris (talk) 19:11, 3 July 2019 (UTC)
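The "thread for each connection" arrangement mentioned above can be sketched in a few lines. This is a minimal illustration using Python's standard `socketserver` module, not a Windows service; the names `EchoHandler`, `ThreadedEchoServer`, and `ask` are hypothetical, and the loopback address with an OS-chosen port is an assumption made so the example is self-contained.

```python
import socket
import socketserver
import threading

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # ThreadingMixIn runs this method in a new thread per connection.
        data = self.request.recv(1024)
        name = threading.current_thread().name.encode()
        self.request.sendall(name + b": " + data)

class ThreadedEchoServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    daemon_threads = True        # handler threads do not block interpreter exit
    allow_reuse_address = True

def ask(address, payload):
    # Open one connection, send a payload, and return the server's reply.
    with socket.create_connection(address) as conn:
        conn.sendall(payload)
        return conn.recv(1024)

# Port 0 asks the OS for any free port.
server = ThreadedEchoServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

reply = ask(server.server_address, b"hello")
print(reply)

server.shutdown()
server.server_close()
```

Swapping `ThreadingMixIn` for `socketserver.ForkingMixIn` (where the platform supports it) gives the process-per-connection variant described for UN*X systems.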
Kernel changes for M:N threading
In Thread (computing)#M:N (hybrid threading), somebody asked for a citation in this edit, with the reason being that "no kernel modifications are needed".
I changed that to a request for clarification, because that section says
M:N maps some M number of application threads onto some N number of kernel entities, or "virtual processors."
and, if your kernel has no such kernel entities that can be used, clearly implementing M:N threading will require kernel changes. However, if, for example, it already has such entities - perhaps using them for 1:1 threading - it may be possible to implement M:N threading on top of those entities, with only changes to the userland threading library.
So, if, in the discussions of the various threading models, the implementation complexity refers to the complexity of implementing the threading model atop a system with no support for threading, whether in the kernel or in userland, M:N threading involves changes in both places, whereas 1:1 would require kernel changes but would probably only require relatively simple userland changes to use the new kernel facilities, rather than requiring the userland scheduling atop the kernel changes that M:N threading would require, and N:1 threading would require just a userland threading library and maybe some limited kernel mechanisms, such as a "wait for multiple events" call, if not already present. Guy Harris (talk) 15:54, 12 June 2020 (UTC)
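As a rough analogy for the userland half of M:N threading, the sketch below multiplexes M units of work over N operating-system threads using only a library, with no kernel changes. It assumes that each worker in Python's `ThreadPoolExecutor` is an ordinary kernel-scheduled thread, which is the case in CPython. The units of work here run to completion rather than being preemptible user threads, so this is not a real M:N implementation; `M_TASKS`, `N_WORKERS`, and `work` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

M_TASKS = 8     # application-level units of work
N_WORKERS = 2   # kernel-scheduled threads that carry them

def work(task_id):
    # Report which carrier thread ended up running this unit of work.
    return task_id, threading.get_ident()

with ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
    results = list(pool.map(work, range(M_TASKS)))

carriers = {ident for _, ident in results}
print(f"{M_TASKS} tasks ran on {len(carriers)} carrier thread(s)")
```

The userland library decides which unit of work runs on which carrier, while the kernel only ever sees the N carrier threads.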
Authors
The section titled "Processes, Threads, and Fibers" and its subsections were started by Daniel Barbalace.
Website: http://www.clearthought.info
- Thanks for the excellent work, Daniel. -- The Anome 14:59, 24 Oct 2004 (UTC)
Process creation cost
From the article:
- "Systems like Windows NT, OS/2 and Linux (2.5 kernel or higher) are said to have "cheap" threads and "expensive" processes, while in other systems there is not so big a difference."
Eh? I thought one major advantage of Linux was its relatively low process creation cost, and the new NPTL implementation of Linux threads is a 1:1 implementation where there is little difference between a thread and a process at the lowest level. -- The Anome 22:33, 22 Oct 2004 (UTC)
If you take a look at NetBSD 2.x or DragonFly BSD you will notice that they don't use 1:1, but instead N:M, which is a lot more efficient than Linux 1:1. I've added a link about NetBSD SA (Scheduler Activations) in this page that I advise you to look at. -- RuiPaulo 23:54, 24 Oct 2004 (UTC)
The process with threads image
The image potentially leads to misinterpretations.
As it is, unwary readers may think processes are always made up of threads, which is not true. Actually, in many contexts, threads break processes into tasks.
I propose the image, as it is now, to be removed. M. B., Jr. (talk) 23:58, 18 December 2022 (UTC)
- And in the context of the Mach kernel, a task has multiple threads - and a task corresponds somewhat to a process; Darwin's XNU kernel constructs each UN*X process atop a Mach task (and Darwin's pthread library constructs pthreads atop Mach threads).
- So what are your definitions of process, thread, and task here? Guy Harris (talk) 01:05, 19 December 2022 (UTC)
- Hi, Guy Harris. I'm glad we agree, since I did not state the opposite whatsoever. As for definitions, I'm not sure what you mean by "your". This is an article about the concept, not about particular implementations. In Modern Operating Systems, Tanenbaum writes that "several different models are possible". Notwithstanding, implementations may be provided in a proper context-wise manner, e.g., precisely explaining the case in the image's subtitle. This way, readers are able to understand the Mach way is not a general rule. Regards, M. B., Jr. (talk) 20:14, 19 December 2022 (UTC)
- You said "many contexts, threads break processes into tasks". I noted that, in Mach, one task, in Mach terminology, can have multiple threads, in Mach terminology, so, in that context, a thread doesn't "break processes into tasks"; at least in Darwin (and probably other systems with Mach-based kernels), there's a one-to-one correspondence between Mach tasks and processes.
- And what I mean by "your definitions of process, thread, and task here" is "If you mean by "process" that which is described in process (computing), and if you mean by "thread" that which is described in thread (computing), what do you mean by "task" when you say that "threads break processes into tasks"? And if that isn't what you mean by "process" or "thread", what do you mean?"
- Where, and in which edition, does Tanenbaum write that "several different models are possible"? I need to know precisely what he was referring to. He could be referring to the different types of threads mentioned in thread (computing) § Processes, kernel threads, user threads, and fibers, or to the threading models mentioned in thread (computing) § Threading models, or to something else.
- And the image is pretty straightforward - a process has multiple threads of control, so that the process can be doing multiple things in parallel. By your use of the word "task", are you referring to the "things" being performed by the threads?
- I only brought up Mach because it uses the term "task" as a technical term; OS/360 and its successors use "task" in a different sense, where it seems to cover both what we might think of as processes and what we might think of as threads. The point is that "task" doesn't seem to have a commonly-used technical definition in the fashion that "process" and "thread" appear to; if you want to be clearly understood here, a good first step might be to indicate what you mean by "task", so we can determine how "processes are made up of threads" and "threads break processes into tasks" mean something different. Guy Harris (talk) 22:55, 19 December 2022 (UTC)
- Hi, Guy Harris. Thanks for your feedback. Discussions like this one make Wikipedia awesome. Indeed, the concept of task is context-subordinated. I'm referring to the OS concept, not the application one. And yes, the process to task conversion constitutes a thread hijack technique (and maybe a thread rescue technique as well?). Because processes themselves live in the main memory. Despite the name, the CPU deals with threads and tasks, not processes. If one wants processes' data to be actually processed, such data should be somehow turned into threads and/or tasks. Anyway, the point is: it is probably in the best interest of Wikipedia's readers to understand concepts, and not stick to particular definitions and/or implementations. The book is ISBN 0-13-595752-4, chapter 12 Processes and Processors in Distributed Systems. Regards, M. B., Jr. (talk) 17:11, 20 December 2022 (UTC)
So where is the OS concept of a "task", as distinct from a "thread" or a "process", defined?
And what is a "process to task conversion", and how does it "constitute a thread hijack technique (and maybe a thread rescue technique as well?)"
Processes "live in the main memory" in a number of senses: 1) in many operating systems, one resource that a process has and that is shared by all the threads in the process is its address space - not all of which necessarily resides in main memory - and 2) the OS maintains state about the process, typically in wired main memory. Point 2), however, also applies to threads and, in the case of OSes that use the term "task", tasks.
And most CPUs deal with instructions, not threads (in the thread (computing) sense) or tasks. The notion that a CPU is working on a particular "thread", "process", or "task" is, on most CPUs and operating systems, defined by the operating system, not by the CPU. Guy Harris (talk) 20:50, 20 December 2022 (UTC)
- Hi, Guy Harris. Although fun, I guess this is the wrong talk spot for digressions. The article's content should be improved. You have mentioned pthreads (IEEE's POSIX documents for threads) and Mach stuff earlier. Their definitions of task seem to converge. Regards, M. B., Jr. (talk) 21:32, 20 December 2022 (UTC)
- For IEEE's POSIX documents:
- The current Single UNIX Standard doesn't define "task" in its Definitions section. It uses the word "task" in its definitions of "batch job", "batch submission", "command" ("A directive to the shell to perform a particular task."), "program", and "Utility"; that uses the term in the common English sense of "something to be done", in the sense that shopping for groceries, for example, is a task.
- The SUS's page for pthread_create() only uses the word "task" in the Rationale, where it refers to the Ada notion of a "task". That's discussed in Ada (programming language) § Concurrency, where it says that "Depending on the implementation, Ada tasks are either mapped to operating system threads or processes, or are scheduled internally by the Ada runtime." The Rationale talks about how to implement Ada tasks atop threads.
- The SUS's page for pthread_cond_broadcast() uses the word "task" only in the common English sense ("The pthread_cond_broadcast() function is used whenever the shared-variable state has been changed in a way that more than one thread can proceed with its task.").
- So I see no "definition of task" in POSIX to converge with anything else.
- As for Mach, the Mach 3 Kernel Principles document says on page 7:
- The Mach kernel provides an environment consisting of the following elements:
- thread — An execution point of control. A thread is a light-weight entity; most of the state pertinent to a thread is associated with its containing task.
- task — A container to hold references to resources in the form of a port name space, a virtual address space and a set of threads.
- so that's Mach's "definition of task".
- That's different from, say, the Ada definition; the beginning of Section 9 of the 1995 Ada Reference Manual says that "Each task represents a separate thread of control that proceeds independently and concurrently between the points where it interacts with other tasks." and that "In addition, tasks can communicate indirectly by reading and updating (unprotected) shared variables, presuming the access is properly synchronized through some other kind of task interaction.". That could be implemented atop multiple processes, each with its own address space, with parts of those address spaces being mapped to shared memory; however, it could also be implemented atop multiple threads in a single process, with all the threads in a single address space (and, as the quote above from Ada (programming language) § Concurrency says, those threads could be known to the OS or could be implemented as user-mode threads in an Ada support library).
- In an early specification for PL/I, "A task is an identifiable execution of a set of instructions. A task is dynamic, and only exists during the execution of a program or part of a program." The "Asynchronous Operations and Tasks" section of IBM System/360 Operating System PL/I Language Specifications from 1966, which begins on page 77, describes tasks in a fashion that sounds similar to Ada - for example, on page 79, it says that "If storage is allocated for a variable in the attaching task, this allocation may apply to the attached task, so that the variable may appear as a reference in the attached task."
- Tasks in OS/360, as per section 6 "Task Management" of OS/360 Concepts and Facilities, also seem somewhat thread-like, although, as OS/360 ran everything in a single non-virtual address space, there's no notion of "process" in the sense of an entity corresponding to an address space. A "job step" might somewhat correspond to a process, but that's not an exact match. OS/VS1 added virtual memory support, but only had a single address space in which it ran all jobs, the same way OS/360 MFT ran all jobs in physical memory, and OS/VS2 (SVS) added virtual memory support, but only had a single address space in which it ran all jobs, the same way OS/360 MVT ran all jobs in physical memory. OS/VS2 (MVS), eventually called just MVS, was the first OS in the OS/360 line to support multiple virtual address spaces; address spaces somewhat corresponded to processes, with each address space starting out with a single task that could create subtasks running in the same address space, which somewhat correspond to threads.
- So "task" does not have the same technical meaning for all operating systems and programming languages in which the term is used. The Task (computing) page explicitly acknowledges that:
In computing, a task is a unit of execution or a unit of work. The term is ambiguous; precise alternative terms include process, light-weight process, thread (for execution), step, request, or query (for work).
- As such, it's probably best if the term "task" is used only when talking about systems where it's used in the same, or similar, senses; it shouldn't be used when talking about processes and threads in general. Guy Harris (talk) 07:23, 21 December 2022 (UTC)
- Hi, Guy Harris. Thank you for interacting for the benefit of knowledge. I have searched some of POSIX active standards' documents. In IEEE/ISO/IEC 9945-2009 (not the 2017 Technical Corrigendum 2, I mean the original one), in page 3523, line 118726, it seems that process and task are supposed to be interchangeable concepts. The same in page 3532, line 119135. Also, for the definition of task in the POSIX domain, there is an Ottawa Linux Symposium 2002 paper, which is made available by kernel.org. In its very first page, the author puts it like this:
- "... In Linux, the basic unit is a task. In a program that only calls fork() and/or exec(), a Linux task is identical to a POSIX process. The difference arises when a task uses the clone() system call to implement multithreading. The program then becomes a cooperating set of tasks which share some resources. We will use the term task to mean a Linux task...".
- A notable Mach resemblance seems to be reasonable then. Regards, M. B., Jr. (talk) 21:04, 21 December 2022 (UTC)
- The POSIX stuff is from the Rationale, not the Standard, so it's not a formal definition of "task" in the POSIX context.
- As for the Linux paper, that's a definition of task in the Linux domain, not in the POSIX domain, just as any use of the Mach term "task" in Darwin is a definition of task in the Darwin domain, not in the POSIX domain. Furthermore, while, in Darwin, a UNIX process has one Mach task associated with it, a Linux task does not always correspond to a single process - as the author notes, when clone() is used, a task doesn't necessarily correspond to a process; the paper goes on to note that multiple tasks are used to implement multiple threads within a Linux process.
- So, again, there's no Mach resemblance; "task" means different things in different operating systems and programming languages, and the term "task" should not be used in a general discussion, so as to avoid confusion. Guy Harris (talk) 22:05, 21 December 2022 (UTC)
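The distinction drawn here, several kernel-level tasks implementing the threads of one process, can be observed with a minimal sketch. It assumes Python 3.8 or later on a platform where `threading.get_native_id()` is available; on Linux that call reports the kernel's thread (task) ID, while `os.getpid()` reports the ID of the process as a whole. The names `observe`, `seen`, and `barrier` are hypothetical.

```python
import os
import threading

N_THREADS = 3
barrier = threading.Barrier(N_THREADS)
lock = threading.Lock()
seen = []

def observe():
    # Record the process ID and the kernel-assigned ID of this thread.
    with lock:
        seen.append((os.getpid(), threading.get_native_id()))
    # Keep every thread alive until all have recorded their IDs, so the
    # kernel cannot recycle an ID while the others are still running.
    barrier.wait()

workers = [threading.Thread(target=observe) for _ in range(N_THREADS)]
for w in workers:
    w.start()
for w in workers:
    w.join()

pids = {pid for pid, _ in seen}
native_ids = {tid for _, tid in seen}
print(f"{len(pids)} process ID, {len(native_ids)} distinct kernel thread IDs")
```

All threads report the same process ID but distinct kernel IDs, which matches the paper's description of one Linux process being implemented by a cooperating set of tasks.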
- > The POSIX stuff is from the Rationale, not the Standard, so it's not a formal definition of "task" in the POSIX context.
- Hi, Guy Harris. You're looking for bullets inside documents. Not every specification defines ideas in the atomic perspective a reader wants. Regards, M. B., Jr. (talk) 00:37, 22 December 2022 (UTC)
- The POSIX Rationale is not a specification, it's an explanation of why some choices were made when developing the specification. The POSIX Standard is the specification, and it does not define "task". Guy Harris (talk) 00:59, 22 December 2022 (UTC)
- Hi, Guy Harris. The document I provided you with is a part of the current POSIX standard. Regards, M. B., Jr. (talk) 01:16, 22 December 2022 (UTC)
- Not all parts of the standard specify things. Many standards indicate that some parts are "normative", meaning that they standardize things, and other parts are "informative", meaning that they provide additional information but do not standardize anything. The Rationales in various versions of POSIX are informative, not normative. Guy Harris (talk) 01:40, 22 December 2022 (UTC)
As for the quote from Tanenbaum, at least one instance of it is on page 507 of the first edition of Modern Operating Systems, in the beginning of chapter 12 "Processes and Processors in Distributed Systems":
In many distributed systems, it is possible to have multiple threads of control within a process. This ability provides some important advantages, but also introduces various problems. We will study those issues first. Then we come to the subject of how the processors and processes are organized, and see that several different models are possible. Finally we look at processor allocation and scheduling in distributed systems.
Presumably the "several different models" are discussed in section 12.2 "System Models". Two models he describes are the "workstation model" and the "processor pool model".
The examples he gives of the first model involve several workstations on a network, possibly with one or more file servers on the same network, with each user doing their own work on their workstation and, if there are file servers, using them for files shared between users. In addition, he describes various schemes for sending jobs to idle workstations, to use those workstations' resources when they're not being used by the user of the workstation.
The second model involves a collection of processors in a machine room, with users having X terminals on their desks, with work assigned to the processors as appropriate. (Think of it as timesharing with a GUI and with the timesharing machine being a cluster.)
Neither of these models has to do with how threads exist within a process.
So if you're using Tanenbaum's quote to say something about the organization of processes and threads, you must be referring to another place where he speaks of "several different models"; if so, could you give the page number, so I can find it in the archive.org version of the book? There's a little bit about threading models in section 12.1.4 "Implementing a Threads Package". Guy Harris (talk) 08:39, 21 December 2022 (UTC)
"Thread(OS)" listed at Redirects for discussion
The redirect Thread(OS) has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2023 November 28 § Thread(OS) until a consensus is reached. Steel1943 (talk) 06:33, 28 November 2023 (UTC)
"Thread(computing)" listed at Redirects for discussion
The redirect Thread(computing) has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2023 December 18 § Thread(computing) until a consensus is reached. Steel1943 (talk) 16:46, 18 December 2023 (UTC)
"Thread(computer science)" listed at Redirects for discussion
The redirect Thread(computer science) has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2023 December 18 § Thread(computer science) until a consensus is reached. Steel1943 (talk) 16:48, 18 December 2023 (UTC)
"Execution abstraction" listed at Redirects for discussion
The redirect Execution abstraction has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2023 December 18 § Execution abstraction until a consensus is reached. Steel1943 (talk) 16:59, 18 December 2023 (UTC)
"Current running thread" listed at Redirects for discussion
The redirect Current running thread has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2023 December 18 § Current running thread until a consensus is reached. Steel1943 (talk) 17:02, 18 December 2023 (UTC)