Talk:Fork (system call)

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Qwertyus (talk | contribs) at 11:51, 14 March 2015 (→‎Origin of vfork(): actually 3 years). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

WikiProject Computing (Start-class)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-class on Wikipedia's content assessment scale.
This article has not yet received a rating on the project's importance scale.

Mythos

What about deleting that paragraph? I'm not sure whether the author was aware that he was describing the infamous "forkbomb", which has its own article and is already linked below. Furthermore, "mythos" is never actually explained. I also doubt this is the right place for "honorable mentions" of fork() or anything else; there would be no end if you did this for all articles. If anything, the article about the Matrix should link to fork, not vice versa. Anyway, I have removed it from the article. --82.141.49.144 03:55, 4 December 2005 (UTC)[reply]

While I agree with much of this, rather than saying mythos, I think that "geek culture" is probably a bit better, as the term has seeped into "pop" culture. McKay 00:07, 9 December 2005 (UTC)[reply]

Make code compileable

I can't compile the code. Could someone add the appropriate header files? That would invite readers to experiment with fork(). The line with the /* Note */ should be changed and the note removed; I have no clue what the author meant by it. --Bernard François 14:58, 15 January 2006 (UTC)[reply]

From what I can see, this code only needs the unistd.h header file (or stdlib.h if that doesn't work); sometimes it also needs an extra include such as sys/types.h, but this is system dependent. Also, the code should be in a main function. I'm not going to change the code on the page because it is only sample code. Regarding the /* Note... */, this line uses _exit(), which is different from exit() (which has no underscore). These may be effectively the same (actually they probably are; an exit() works fine here), but you'd need to see the source code for each of them. --Pyrofysh 06:36, 25 April 2006 (UTC)[reply]
But why did the author use _exit() instead of exit() at all in the child? --84.188.202.21 14:31, 12 October 2006 (UTC)[reply]
The standard library does some clean-ups with library call exit() (flushes buffers, closes file descriptors). Also, any files created with tmpfile() are closed, and some other things may occur also that one would prefer not to happen until the parent exits. The system call _exit() unconditionally exits. In addition, had vfork() been used instead of fork(), very unexpected things can happen because the memory is shared, if exit() is called before an exec; but sensible people don't use vfork. Agarvin 20:03, 12 November 2006 (UTC)[reply]

SMP

Forks fully utilize SMP, right?

If you mean Symmetric MultiProcessing, this depends more on the operating system's implementation of process handling than on the fork call. All fork does in most implementations is make a copy of a current process. --Pyrofysh 06:41, 25 April 2006 (UTC)[reply]

Clean code

Why is there a declaration of "int i" at the start of the code when it is only used in one of the for loops? Why not declare int i & j at the start, and forget about "i" at the beginning? Consistency? Maybe I am missing something?

Nope, not missing anything - this is fixed. Steeltoe 05:13, 19 April 2006 (UTC)[reply]
On that note, some compilers will not compile the code unless all variables are initialised at some point in the function. Currently i is only initialised under the "else if(pid <0)" condition, and j is only initialised under the "if(pid==0)" condition. As neither of these variables will be initialised under all cases, those compilers will report an error. Ideally both int i and j should be declared at the start, because C initialises ints on declaration (even if it is sometimes to nonsense).
I'm just putting this note here in case someone doesn't understand why their code doesn't compile. I think the current code should remain unchanged in this respect, because it is simpler to have variables declared where they are used (in this case). Additionally, most modern compilers don't suffer from this problem. --Pyrofysh 10:22, 7 June 2006 (UTC)[reply]

exit()

I've created an article for the inverse operation, exit. - Loadmaster 16:36, 18 August 2006 (UTC)[reply]

Fork is not critical to the Unix design philosophy

Forking is an important part of Unix, critical to the support of its design philosophy, which encourages the development of filters.

Concurrent processes are necessary for the development of filters. Whether they are implemented using fork, or some other primitive such as spawn, is surely an implementation detail? --DavidHopwood 18:26, 5 April 2007 (UTC)[reply]

Yet fork predates spawn by at least a decade or two. I'll wager that many Unix programmers don't even know there is a spawn function. Likewise, a huge amount of Unix code relies on fork/exec to operate. — Loadmaster 03:51, 20 June 2007 (UTC)[reply]
No, I don't think it is just an implementation detail. I believe fork conforms well to some basic Unix principles, if not intentionally then at least coincidentally, and so has been preserved by natural selection over time.
Among others, Unix has two important principles:
1. Each component should do one thing, and do it well.
2. Fundamental components (especially the kernel) should provide mechanism rather than policy.
Among the kernel's jobs, creating a new task and deciding what that task should do are two separate things. Although most newly created tasks load another executable from disk, that is not always the case. A newly created task may run the same code as its parent, with the same or different input; it may load new executable code from the network or from media other than disk; or it may even compute new code on its own.
Therefore, implementing a system call that creates a new task AND loads new executable code is a bad idea. If you insist on doing so, you run into a second difficulty: how to determine the way the new executable code is loaded. Forcing the new code always to be loaded from disk is crude. Providing several predefined ways is better, but it still imposes policy on users.
On the other hand, creating an empty task is not practical either. Can an empty task exist at all? I don't know. But how would an empty task decide what to do next? It can't, so either the kernel or another task has to make the decision for it. The kernel should not provide policy, so it shouldn't make such a decision; another task doing so would basically violate process separation. Letting the newly created task duplicate its parent is a handy and elegant choice. It involves only memory operations (and with COW those are also minimized). It does not violate process separation, and it gives the child process maximal flexibility to decide what to do next. So fork() does one thing only, does it well, and pushes as much policy as possible up to user space. -- Feng Dong 11:44, 28 May 2008 (+0800)
I don't really know how to reply in a wiki, sorry.
It may be nice to imagine that UNIX has some fundamental principles that it does not violate, but sadly that's not true. There are examples in fork/exec itself.
FD_CLOEXEC: ancient Unix programs would have a little for loop to close all fds from 3 to 255 in the child, but then it was realized that, say, a library may want to indicate that some should not be closed; this is an example of exec now doing more than one thing.
alarm() and fork: old versions had the bug that the child could get the signal, so not only is fork doing peculiar things about masking certain signals, it is now also canceling timers.
stdio did not mesh well with exit (say the exec failed): exit was doing more than just terminating the process, so _exit was added.
Soon people realized that fork was expensive (even with COW) when it was only followed by an exec of a new image shortly afterwards, so vfork was added, again doing less than fork.
Then there was signal, which was originally racy, so sigaction was added to do more.
wait, later waitpid and wait4, SIGCHLD, sigaltstack, termio differences, all the work that went into the STREAMS dead end: the list goes on of how things were extended to do more or less over the years as programmers discovered the need.
A spawn-like call is needed in UNIX, simply because vfork is pretty gross; old man pages even say it is not safe to do much more than _exit or exec from the child. There are now huge VM spaces, which are wasteful to mark COW only to tear them down again during the exec. Moreover, on pretty much anything now (even Linux has a tunable for this) there needs to be enough swap available for the fork to succeed. This is just another case of systems programmers seeing, over time, that something was lacking.
It's sort of a mess, though: look at the SUSv3 posix_spawn and the related interfaces around posix_spawn_file_actions_init and posix_spawnattr_init. All of this is just to deal with some of the typical things a programmer might wish to do between a fork and an exec. It is also poorly designed: say you are in a multi-threaded program and you close an fd that happens to get closed by another thread; the spawn fails with errno EBADF, so you might as well use FD_CLOEXEC.
That alludes to another issue with UNIX: there were some poor decisions that, after 20 or so years, made other things more complicated. Think about when pthreads were added and the semantics of address lookup in BSD sockets, errno workarounds, signal issues, and fork itself (look at the mess of fork, fork1, and forkall in Solaris in particular). Actually, something should be added to the article itself about locked mutexes in the child, and about the standard now specifying that fork duplicates only the calling thread in the child; it is a detail like the stdio one already in the article.
So no matter how much we wish it to be true, UNIX is not pretty, and systems programming is hairy and always will be. Yes, fork is a do-one-simple-thing concept, but the later vfork did even less, and when you run into not having enough swap to fork that 8GB db process to run a shell script, you realize that something like system() that takes exec-like array args and does not do all the VM work was a sorely lacking feature. --Preceding unsigned comment added by 131.225.103.35 (talk) 22:33, 11 June 2009 (UTC)[reply]
Fork, IIRC, was done in response to the hairy spawn-equivalent task-starter in MULTICS. (I believe this is citable, though I don't have the citation.) The idea was that rather than create an N-argument task starter that was always in a state of flux as people came up with new OS features, that a simple fork call in conjunction with inheritance and optional overriding would be much simpler, especially when it came to creating pipelines. As it was. It has nothing whatsoever to do with the ability to create filters, shells, pipelines, etc., and the opening section of this wiki entry is misleading. Also, original Unix generally favored small over efficient due to the machines upon which it was developed. The downside of fork is that this simplicity is largely predicated on Unix's single-threaded process model, and once that went away we got into the current situation. ISC PB (talk) 00:28, 8 March 2011 (UTC)[reply]

Merge with Fork Paging

It has been suggested to merge Fork Paging with this article. Add your comments below.

Splitting of section

This page is good and helps me on how to program fork in C. But I would suggest first to put the discussion of more technical aspects like vfork in a separate page. Secondly, could some more examples in C be provided? What about communication between forked processes for example. —Preceding unsigned comment added by 161.53.128.59 (talk) 09:01, 21 August 2009 (UTC)[reply]

Does vFork() really fork ?

vfork seems to be a very broken implementation. I cannot fork with it: the master always blocks on the child as soon as the child is started, defeating the purpose of forking at all (no concurrency). I do not see this mentioned in the article. Is that behaviour so "normal" that it does not even need mentioning? --89.247.43.241 (talk) 01:49, 21 November 2011 (UTC)[reply]

vfork is a rather special case of fork, and isn't applicable to all the uses where fork could be used. I believe the linux man-pages describe it rather well. Note that there is some controversy over whether or not vfork should be in libc at all; just look for the heading Bugs on the page linked. Imho, and this is a rather subjective opinion, vfork has its niche use cases but shouldn't be undertaken lightly by the novice programmer; without a proper understanding of the limitations and advantages it's likely to run up more debugging hours than it's worth. --Knneth (talk) 21:02, 11 December 2011 (UTC)[reply]

Copyright problem removed

Prior content in this article duplicated one or more previously published sources. Copied or closely paraphrased material has been rewritten or removed and must not be restored, unless it is duly released under a compatible license. (For more information, please see "using copyrighted works from others" if you are not the copyright holder of this material, or "donating copyrighted materials" if you are.) For legal reasons, we cannot accept copyrighted text or images borrowed from other web sites or published material; such additions will be deleted. Contributors may use copyrighted publications as a source of information, but not as a source of sentences or phrases. Accordingly, the material may be rewritten, but only if it does not infringe on the copyright of the original or plagiarize from that source. Please see our guideline on non-free text for how to properly implement limited quotations of copyrighted text. Wikipedia takes copyright violations very seriously, and persistent violators will be blocked from editing. While we appreciate contributions, we must require all contributors to understand and comply with these policies. Thank you. Osiris (talk) 07:02, 7 July 2012 (UTC)[reply]

Fork only duplicates the calling thread

At least with POSIX threads, forking a process creates a process with a single thread, i.e. only the thread that called fork(2) is duplicated, see e.g. Is it safe to fork from within a thread? or Chapter 6 in David R. Butenhof's book "Programming with POSIX Threads". Should the article mention this or would this be out of scope? — Tobias Bergemann (talk) 13:39, 8 August 2013 (UTC)[reply]

Importance of forking in Unix

The section "Importance of forking in Unix" claims (without a source) that

Forking is an important part of Unix, critical to the support of its design philosophy

I don't think this is true. The ability to run multiple programs is critical to the Unix philosophy, and for that matter to any multiprocessing operating system. The section does have a point when it describes the combination of forks and pipes, which is quite simple in Unix but then again, (1) fork is older than pipe (V1 vs. V3 Research Unix, IIRC) and (2) the pipelining mechanism could have been implemented in a different way, e.g. by a combined fork/exec system call that can also set file descriptors. Maybe that would violate worse is better, but I see nothing particularly important about fork for the support of shell pipelines. QVVERTYVS (hm?) 14:08, 16 October 2013 (UTC)[reply]

Origin of vfork()

The article says that vfork() first appeared in 3BSD, but FreeBSD's manual page claims that vfork() first appeared in 2.9BSD. Stevens and Rago, in Advanced Programming in the Unix Environment, also claim that vfork() originated in 2.9BSD (p. 234). I think this conflict arises because the vfork() system call was removed in 4.4BSD (according to Stevens and Rago), but later systems added it back (as mentioned in the Linux manual page). -- Bkouhi (talk) 02:35, 14 March 2015 (UTC)[reply]

Seems like they're wrong. 3BSD has a vfork(2) manual page, and 3BSD predates 2.9BSD by three years. (The 2.9 series was continued into the 1990s, though not by Berkeley; 2.9 was a 4.1cBSD backported to the PDP-11.) The NetBSD people know this too. QVVERTYVS (hm?) 11:39, 14 March 2015 (UTC)[reply]