Talk:Asynchronous I/O


Asynchronous vs. Non-blocking

  • Non-blocking refers to using normal I/O primitives in non-blocking mode, e.g. the O_NONBLOCK flag to open(2), where calls to read(2) and write(2) return -1 with errno set to EAGAIN if I/O would block.
  • Asynchronous refers to separating checks for the availability of I/O, potentially on many fds, and actually doing the I/O. For example, one may use the O_ASYNC flag to open(2), wait for the kernel to send SIGIO, and then use non-blocking calls to actually do the I/O; or call select(2) with a set of fds to wait for I/O to become available on one of them, then use read(2) and write(2) to do I/O on any fds which have data waiting.
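
For illustration, the two usages described above can be sketched in Python, whose os and select modules are thin wrappers over the POSIX calls named here. This is only a sketch under assumptions: it uses a pipe made non-blocking with os.set_blocking() rather than a file opened with O_NONBLOCK, and the function name demo is hypothetical.

```python
import os
import select

def demo():
    """Contrast a bare non-blocking read with a select()-then-read sequence."""
    r, w = os.pipe()
    os.set_blocking(r, False)  # same effect as setting O_NONBLOCK on the fd

    # Non-blocking: with nothing to read, read(2) fails with EAGAIN,
    # which Python surfaces as BlockingIOError.
    try:
        os.read(r, 1024)
        would_block = False
    except BlockingIOError:
        would_block = True

    os.write(w, b"hello")

    # Readiness check separated from the transfer: select(2) reports which
    # fds are readable, then read(2) does the actual I/O.
    readable, _, _ = select.select([r], [], [], 1.0)
    data = os.read(r, 1024) if r in readable else b""

    os.close(r)
    os.close(w)
    return would_block, data
```

In both halves the bytes only reach the application inside the read call itself, which is the point the replies below argue about.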
In fact, they are completely different things. Non-blocking I/O is completely synchronous: data is transferred to the application only during the I/O call. With asynchronous I/O, the data is actually transferred while the application does other things. For example, Boost.Asio doesn't do any asynchronous I/O; it's fully synchronous (one has to call an asio function to actually do the I/O), while overlapped I/O under e.g. Windows is indeed fully asynchronous: the function returns immediately and the buffer is filled asynchronously. This matters all the more because, programmatically, asynchronous I/O needs very different program structures and usage than non-blocking I/O.
No, during a non-blocking I/O call the data is being transferred to/from an OS buffer. I fail to see how this differs from an asynchronous callback executed when a buffer is full/empty. EdC 23:26, 25 October 2007 (UTC)

The technique of spawning threads that each block on a single fd waiting for I/O, while the main thread carries on with processing, can also be described as async, even though it uses blocking calls. Hairy Dude 19:31, 24 May 2006 (UTC)
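
A minimal sketch of that thread-per-fd approach, assuming a pipe as the I/O source; the names read_in_background and demo_threaded are hypothetical and not taken from any library.

```python
import os
import threading

def read_in_background(fd, results):
    """Issue one blocking read(2) on fd; intended to run in its own thread."""
    results[fd] = os.read(fd, 1024)  # blocks until data is available

def demo_threaded():
    r, w = os.pipe()
    results = {}
    worker = threading.Thread(target=read_in_background, args=(r, results))
    worker.start()

    # The main thread carries on with unrelated work instead of waiting.
    total = sum(range(1000))

    os.write(w, b"ping")  # data arrives, so the worker's read() can return
    worker.join()
    os.close(r)
    os.close(w)
    return total, results[r]
```

The read call itself is an ordinary blocking one; only the arrangement around it lets the program as a whole keep going.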

That's how the terms are used in C/POSIX. I suspect that some other culture (probably Java?) calls (emulated) asynchronous IO "non-blocking IO" and thus the confusion. Any Java programmers who can enlighten us? --Doc aberdeen 14:13, 27 February 2007 (UTC)

Hey Guys. This [article] may clear things up a bit. --68.253.58.171 (talk) 15:42, 11 January 2008 (UTC)

I don’t agree with this article in that it classifies a select() loop as a type of asynchronous I/O. I think the IBM article referenced above is wrong. It confuses asynchronous I/O with non-blocking I/O. Non-blocking I/O is not asynchronous. It categorizes a select() loop as “asynchronous blocking I/O”. In my opinion, using a non-blocking write call with a select loop is NOT asynchronous I/O in any way. The non-blocking write to the socket is performed synchronously (the result is known as soon as the call returns). An asynchronous operation requires context information to be passed so that the request to perform I/O is decoupled from the result of that request. The result is usually communicated to the application on a different thread than the thread that makes the request. In my opinion, a select loop approach can be non-blocking and allows multiplexed I/O, but is not asynchronous. Asynchronous I/O APIs allow for zero-copy implementations and allow for multiple I/O events to be processed simultaneously by multiple threads. Non-blocking I/O with a select loop allows neither. -- Preceding unsigned comment added by 12.53.191.4 (talk) 17:11, 11 January 2010 (UTC)

That's not true: in fact, non-blocking I/O with a select loop allows for zero-copy implementations and for multiple I/O events to be processed simultaneously by multiple threads. One can open a number of fds with O_DIRECT for zero-copy and, once select returns, delegate processing to other threads. This is asynchronous and involves parallelism across multiple I/Os. But it's not the usual way of using select. -- Juergen 91.52.170.222 (talk) 20:08, 22 March 2010 (UTC)
You (and many others here) confuse asynchronous I/O done e.g. in the kernel with I/O done in an application. OS kernels almost always do all I/O asynchronously, but that doesn't mean that the userspace I/O is asynchronous (for example, disk I/O is always asynchronous in GNU/Linux, but will happily block your app for ages). This article should either get the distinction right, or it will simply be wrong and useless forever. 109.193.183.13 (talk) 12:26, 28 August 2014 (UTC)

This article should also mention asynchronous I/O as done in UNIX with real-time extensions by using the aio_read() call and the aio_write() call. I'll be glad to contribute some words on that topic - I'm surprised they are not here already. -- Preceding unsigned comment added by Jcnoble (talk · contribs) 22:30, 11 March 2012 (UTC)

Windows I/O Completion Ports

Windows I/O Completion Ports seem to be missing from the list. They are similar to the Solaris Completion Queues (and may pre-date them?), so they can go under that sub-heading. See http://www.microsoft.com/technet/sysinternals/information/IoCompletionPorts.mspx and http://msdn2.microsoft.com/en-us/library/aa365198.aspx for more info. --Edouard.

Uncontrolled stack growth in question

I am unsure about other operating systems, but on Microsoft Windows platforms, completion routines for a particular file descriptor are not stacked, making uncontrolled stack growth impossible. This is clearly documented in the MSDN. See FileIOCompletionRoutine, WSARecv, etc. Does this problem exist with other OSes (I suspect that this is not the case) or is this just wrong? Karl McClendon (talk) 18:41, 10 November 2009 (UTC)

mmap and madvise

Asynchronous reading is, at least on Linux, also possible with memory-mapped file I/O combined with prefetching via madvise (MADV_WILLNEED), which works without the overhead of copying and with fewer system calls.

  • Am I allowed to add this to the article without being reverted for citing no references?
  • Are there other OSs with the same capabilities?

-- Juergen 91.52.170.222 (talk) 20:08, 22 March 2010 (UTC)

This is actually wrong - madvise doesn't guarantee that an access will be non-blocking or asynchronous (neither on return, nor at any later time) - if the data is in the cache at access time, it might be, otherwise it won't be. madvise can't ensure that data is in the cache at a later time, and doesn't even attempt to (e.g. on GNU/Linux). Besides, every page fault is a system call, so whether there are actual savings on system calls with this technique is not shown. 109.193.183.13 (talk) 12:30, 28 August 2014 (UTC)
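
To make the technique under dispute concrete, here is a hedged sketch using Python's mmap module (mmap.madvise needs Python 3.8 or later, and MADV_WILLNEED is only defined on platforms that provide it, so the code checks for both). The function names are hypothetical. As the objection above notes, the hint guarantees nothing: the later access may still block.

```python
import mmap
import os
import tempfile

def read_with_prefetch_hint(path):
    """Map a file, hint that it will be needed soon, then touch the pages."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # MADV_WILLNEED is only a hint: the kernel may start readahead,
            # but nothing guarantees the pages are resident later.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
                m.madvise(mmap.MADV_WILLNEED)
            # ... other work could happen here while readahead proceeds ...
            # This access can still page-fault and block if the data is
            # not in the page cache.
            return bytes(m[:])

def demo_prefetch():
    """Write a small temporary file and read it back through the mapping."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 8192)
    try:
        return read_with_prefetch_hint(tmp.name)
    finally:
        os.unlink(tmp.name)
```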

Mixing different levels of asynchrony

Shouldn't we differentiate between these types of asynchrony for the article:

  • the one between device and OS (which is common with all modern OSs, implemented by interrupts, but not usually called asynchronous I/O unless combined with one of the following approaches in the application),
  • the one between OS and application (which needs a particular approach, e.g., threads, SIGIO, select, O_NONBLOCK, mmap with madvise, etc.), and
  • O_DIRECT, bypassing the OS buffers, still using asynchronous interrupts, but to be combined with one of the above application approaches to usually be called asynchronous?

This Wikipedia article calls the first type asynchronous, which is acceptable; the IBM article does not, which is acceptable too, but contradicts the article's point of view. Different contributors hold different positions here. How can we arrive at a consistent one? -- Juergen 91.52.170.222 (talk) 20:08, 22 March 2010 (UTC)

Please include a reference regarding O_DIRECT. It looks like O_DIRECT is related to whether or not the OS buffers I/O. That is different than whether or not reads and writes are async. If I schedule an async read, I register a callback. This allows the device driver to use the buffer I supplied. My callback gets called when the data has been read. Does the use of O_DIRECT require a callback to be specified? If not, then it's not async. The application would still have to make a system call to see if the data has been read into the buffer. The read from the driver to the buffer could be async, but without being completely async from app to device driver, efficiency is lost by needless context switches. -- Scott S. —Preceding unsigned comment added by 12.53.191.4 (talk) 13:46, 27 May 2010 (UTC)
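
The "schedule a read, register a callback" shape described above can be sketched as follows. This is an emulation only, assuming a worker thread that performs a blocking read; it is not kernel-level asynchronous I/O of the kind aio_read() or overlapped I/O provides, and submit_read and demo_callback are hypothetical names.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def submit_read(pool, fd, nbytes, on_complete):
    """Queue a read and return at once; on_complete(data) runs when it finishes.

    Emulation only: a worker thread performs a blocking read(2). A kernel
    interface would fill the caller's buffer without this helper thread.
    """
    future = pool.submit(os.read, fd, nbytes)
    future.add_done_callback(lambda f: on_complete(f.result()))
    return future

def demo_callback():
    r, w = os.pipe()
    received = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        submit_read(pool, r, 1024, received.append)
        # The request is issued, but the result is delivered later, through
        # the callback, not through the return value of submit_read().
        os.write(w, b"done")
    # Leaving the with-block waits for the worker, so the callback has run.
    os.close(r)
    os.close(w)
    return received
```

The request and its result are decoupled here, but the extra thread and copy are exactly the kind of overhead the comment above says a true end-to-end asynchronous path avoids.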

At the lowest level, all I/O is async (interrupt handlers). IMO, the term async I/O implies the I/O is async from end to end (app to device driver). If the operations are not truly async from end to end, you end up with impedance mismatches that cause data to be copied and/or context switches. -- Scott S. —Preceding unsigned comment added by 12.53.191.4 (talk) 13:54, 27 May 2010 (UTC)

Forms

This really bothers me: "All forms of asynchronous I/O open applications up to potential resource conflicts and associated failure. Careful programming (often using mutual exclusion, semaphores, etc.) is required to prevent this." Sure, I suppose that's the case sometimes. But in my experience, most actual uses of async IO are selected to *avoid* this sort of stuff. When you have only one thread that does stuff, then eventually returns to the top of a runloop and does select(), there's little reason to lock anything, since there's no other thread to lock against.

I'm also wondering what the "processes" section is all about. In what way are "processes" a form of async IO? Isn't async IO all about eliminating the need for one process per IO stream? 69.14.204.77 (talk) 12:45, 8 January 2013 (UTC)
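
The single-threaded run loop mentioned in the first paragraph of the comment above can be sketched like this, assuming pipes as the I/O sources and Python's selectors module as the wrapper around select() or its platform equivalent. The names run_loop and demo_loop are hypothetical.

```python
import os
import selectors

def run_loop(read_fds):
    """Single-threaded loop: wait for readiness, handle one event at a time."""
    sel = selectors.DefaultSelector()
    for fd in read_fds:
        sel.register(fd, selectors.EVENT_READ)

    totals = {fd: 0 for fd in read_fds}  # shared state, no lock needed
    open_fds = set(read_fds)
    while open_fds:
        for key, _ in sel.select():
            data = os.read(key.fd, 4096)
            if data:
                totals[key.fd] += len(data)
            else:  # EOF: the writer closed its end
                sel.unregister(key.fd)
                open_fds.discard(key.fd)
    sel.close()
    return totals

def demo_loop():
    pipes = [os.pipe() for _ in range(3)]
    for i, (_, w) in enumerate(pipes):
        os.write(w, b"x" * (i + 1))
        os.close(w)  # lets the loop see EOF on the read end
    totals = run_loop([r for r, _ in pipes])
    for r, _ in pipes:
        os.close(r)
    return sorted(totals.values())
```

Because every handler runs to completion on the one thread before the next event is picked up, the totals dictionary is never touched concurrently, which is why no mutex appears.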