Talk:Virtual address space

WikiProject Computing (Rated Start-class, Mid-importance)

Assumptions that need a fix

This article has references that are specific to IA32. For example, it talks about a 4 GB address space, which could be different on other architectures. Raanoo 13:30, 24 July 2007 (UTC)

Merge proposal

The article could be merged with virtual memory, as its entire content barring the first sentence is about virtual memory, not specifically virtual address space. —Preceding unsigned comment added by 132.204.221.34 (talk) 06:18, 29 May 2008 (UTC)

Certainly, the concept of virtual memory is what gave prominence to virtual address spaces, but the two are no more the same thing than a rocket engine and space exploration. A "virtual address space" is one of several hardware implementations of memory addresses, a case of a "logical address space", as opposed to physical addresses. "Virtual memory" is a specific OS-level mechanism to enlarge the memory usable by a process. It is the most common and most famous such mechanism, but not the only one (cf. memory-mapped file). Although on modern systems virtual memory operates in the context of a VAS, it does not necessarily have to use one (cf. the ancient "swapping" mechanism, in which a process was written out to disk in its entirety). Keep the articles distinct; just specify what is what. Incnis Mrsi (talk) 12:43, 26 August 2012 (UTC)
OK, there are several concepts to discuss here.
There's the concept of instructions referring to addresses other than physical addresses. This could be used in a system that keeps all code and data in memory, without even swapping entire processes, much less swapping segments or paging. That's what, for example, "base and bounds" registers provide, and is arguably what the logical address page is referring to when it discusses memory addresses.
There's the concept of some or all of those "not-physical" addresses not necessarily being mapped to physical addresses, with attempts by the machine to refer to them trapping to code that would, for example, load data from backing store into main memory and change the map so that the address range in question refers to the physical memory in question. That's what's generally considered "virtual memory", and is what the virtual memory page covers.
There's the notion of the range of not-physical addresses to which instructions, on a machine with some form of address mapping, can refer. On a machine with virtual memory, that would be called the virtual address space; I'm not sure what it would be called on a machine without virtual memory (so that if any of the addresses in that address space are backed by physical memory, all are - i.e., a system where all code and data is in physical memory at all times, or where an entity given its own address space is either entirely in physical memory or entirely out of physical memory, i.e. swapped in or swapped out, at any given time). There is no requirement that each process, if the OS even has a notion of processes, have a separate virtual address space, as the current second paragraph of virtual memory points out:
Most modern operating systems that support virtual memory also run each process in its own dedicated address space. Each program thus appears to have sole access to the virtual memory. However, some older operating systems (such as OS/VS1 and OS/VS2 SVS) and even modern ones (such as IBM i) are single address space operating systems that run all processes in a single address space composed of virtualized memory.
There's the notion of memory-mapped files, which are generally implemented by assigning a section of the virtual address space to a region of a file and arranging that page faults in that region fetch backing store from the file and, optionally, arranging that dirty pages in that region be written back to the file. Whether that's distinct from "virtual memory" is a question of whether "virtual memory" refers only to what, in a system with paged virtual memory, are sometimes called "anonymous pages", i.e. pages not backed by a memory-mapped file (such as, say, stack and "heap" pages) but, instead, backed by pages in "swap space" (swap partition, swap file/pagefile, whatever) or refers to all cases where a portion of the address space needn't be backed by physical memory and a reference to that portion of the address space causes a fault where the fault handler fetches the data from some form of backing store, which might be a memory-mapped file. I'm unaware of any consensus that "virtual memory" refers only to the former. Guy Harris (talk) 20:10, 26 August 2012 (UTC)
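To make the memory-mapped file case above concrete, here is a minimal sketch using Python's standard mmap module. The helper name demo_file_mapping and the temporary file are made up for illustration. The sketch only shows that a region of the address space can be backed by a file; when, or whether, any particular access faults is up to the OS and is not visible from this code.

```python
import mmap
import os
import tempfile


def demo_file_mapping():
    """Map a small temporary file and modify it through memory (illustrative helper)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello, mapped world\n")
        # Length 0 maps the whole file into this process's address space.
        with mmap.mmap(fd, 0) as view:
            first = bytes(view[:5])   # an ordinary memory read; it may or may not fault
            view[:5] = b"HELLO"       # a store into the mapped region
            view.flush()              # ask the OS to write modified pages back to the file
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 5)      # a normal file read now sees the stored bytes
    finally:
        os.close(fd)
        os.remove(path)
    return first, on_disk
```

Calling demo_file_mapping() returns the bytes read through the mapping before the store and the bytes read back from the file afterwards. Anonymous (swap-backed) pages behave the same way from the program's point of view; only the backing store differs.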
I do not insist that "virtual memory" and "memory-mapped files" never intersect. I insist only that they are distinct. Accessing a memory-mapped file does not necessarily lead to a page fault. On reading the file, an entire page may already be downloaded. On writing to an already downloaded page, the page is just marked as altered (no fault). Then, after a suitable amount of time, the kernel uploads it (again, no fault). Then, the page is not necessarily discarded but may remain in the RAM as a file cache. It is not exactly the way paging usually operates, but it likewise expands the usable portion of a VAS. Could you offer a definition of virtual memory that entirely absorbs memory-mapped files? Incnis Mrsi (talk) 07:17, 27 August 2012 (UTC)
Accessing, for example, the stack or heap does not necessarily lead to a page fault, either. On reading from the stack or heap, an entire page may be downloaded. On writing to an already downloaded page, the page is just marked as altered (no fault). Then, if the page frame containing that page is needed for another purpose, the kernel uploads it (again, no fault). The only differences there are 1) it's read from or written to the "swap area" (swap partition or swap file/page file/whatever your OS calls it), 2) it's not written back unless the page frame is needed for another purpose, and 3) the pages, being anonymous, don't serve as entries in a cache. So "accessing a memory-mapped file does not necessarily lead to a page fault" is not a point of distinction between "virtual memory" in the sense of "anonymous pages don't all have to be in physical memory" and memory-mapped files.
The page on paging says, in the introductory paragraph, "In computer operating systems, paging is one of the memory-management schemes by which a computer can store and retrieve data from secondary storage for use in main memory. In the paging memory-management scheme, the operating system retrieves data from secondary storage in same-size blocks called pages." That applies exactly to memory-mapped files - the pages in the file are retrieved from secondary storage (the file) for use in main memory, and are retrieved in units of pages.
And a definition of "virtual memory" that includes both anonymous pages and pages from memory-mapped files would be one where the following sentence from the opening paragraph of the virtual memory article:
This technique virtualizes a computer architecture's various forms of computer data storage (such as random-access memory and disk storage), allowing a program to be designed as though there is only one kind of memory, "virtual" memory, which behaves like directly addressable read/write memory (RAM).
includes files in disk storage as well as "swap space". In most OSes that support memory-mapped files, explicit calls must be made to map regions of files into the address space (but I don't think that's the case in IBM i - I think every file has a region of the (single) address space permanently assigned to it), which might be considered a difference, and, in some systems, you can't necessarily map all of a file into the address space, e.g. you can't map all of a file > 4GB into the address space if pointers are 32 bits, which might be considered another difference, but, in both cases, the addresses backed by "swap space" or by a file are "virtual" in that there's no guarantee that they're in memory and a fetch from or store into the region might result in a page fault. Guy Harris (talk) 09:12, 27 August 2012 (UTC)
You persuaded me that there is no essential difference (probably, I distinguished "virtual memory" from "paging" insufficiently). But a redirect from VAS to the "virtual memory" article can be a bit confusing: a VAS does not necessarily have to contain any "virtual" (i.e. not yet mapped to physical) memory. Should "virtual address space" become a quasi-independent section with a definition of that kind of memory address space? Incnis Mrsi (talk) 19:12, 28 August 2012 (UTC)
The "virtual address space" page is a bit of a mess:
I've never heard "virtual address space", without an "a" or a "the" in front of it, used to refer to a mechanism, in the fashion of the first sentence of the opening paragraph, rather than used to refer to the range of not-physical addresses to which instructions can refer.
The concepts discussed on this page are not restricted to "modern" operating systems such as the ones listed (and I'm not sure what renders OpenVMS or UNIX or even Windows NT "modern", as they date back to somewhere between the late 1970s and the early-to-mid 1990s); they date back at least to Multics and TSS/360 and TENEX.
There are systems that do not provide "security through process isolation" but that do provide virtual memory and a global virtual address space, such as IBM i and the "licensed internal code" on which it runs; IBM i, at least, uses other mechanisms to isolate processes (as did, say, OS/VS2 SVS, which I think used the "storage key" mechanism).
The example is too NT-oriented, and should probably be rewritten to discuss the general concepts of anonymous pages and file-mapped pages either without referring to the NT specifics or with references to specifics from other OSes as well. The "on 32-bit MS Windows" and "on 64-bit MS Windows" stuff should probably be removed entirely; Wikipedia is not a manual, guidebook, textbook, or scientific journal - if people want to know that level of detail about NT, they should look in Microsoft's documentation. Guy Harris (talk) 20:10, 26 August 2012 (UTC)
Yes it is a mess. I fixed the lead, but more work is needed. Peter Flass (talk) 02:04, 26 August 2013 (UTC)

Is there a consensus on the merge proposal? Peter Flass (talk) 22:01, 12 December 2014 (UTC)

So, um, shouldn't this proposal be on the target talk page, Talk:Virtual memory? —SamB (talk) 17:26, 20 July 2015 (UTC)

What is, and what isn't, part of the virtual address space?

Is the virtual address space defined by the hardware (as in "all addresses for which the OS could provide a mapping") or by the hardware and the OS (as in "all addresses for which the OS actually would provide such a mapping if a reference is made to the address")? In the former case, the address space, for most if not all non-segmented systems, would include address 0, and would consist of a small number of regions (typically 1 or 2, e.g. 2 in x86-64 with current implementations); in the latter case, for most OSes, it would not include 0, and might include several regions.

Or, to put it another way, in the diagram on the page, are the gray areas part of the virtual address space (in which case it's defined by the hardware) or not (in which case it's defined by the software)? Guy Harris (talk) 18:03, 26 August 2013 (UTC)
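As a rough illustration of the hardware-defined reading of the question, the sketch below computes the two regions mentioned for x86-64. It assumes 48 implemented virtual-address bits in a 64-bit register, which is an assumption about a particular implementation rather than anything the architecture fixes for all time, and the function names are made up for this example. It says nothing about which addresses an OS would actually map; under the software-defined reading the answer would be a narrower set that typically excludes address 0.

```python
def canonical_regions(implemented_bits=48, register_bits=64):
    """Return the (lower, upper) inclusive address ranges the hardware can translate.

    Assumes the x86-64 "canonical address" rule: the unimplemented high bits
    must all equal the top implemented bit.
    """
    half = 1 << (implemented_bits - 1)
    top = 1 << register_bits
    lower = (0, half - 1)            # high bits all zero
    upper = (top - half, top - 1)    # high bits all one
    return lower, upper


def is_canonical(addr, implemented_bits=48, register_bits=64):
    """True if addr falls in one of the two hardware-defined regions."""
    lower, upper = canonical_regions(implemented_bits, register_bits)
    return lower[0] <= addr <= lower[1] or upper[0] <= addr <= upper[1]
```

With the default parameters, address 0 is inside the lower region, which matches the point above that a hardware-defined address space would include address 0 even if most OSes never map it.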

It seems to me that it's a combination. The architecture defines the maximum possible size and the software decides what to actually make available. "Would" or "does" is more appropriate. Peter Flass (talk) 15:39, 27 August 2013 (UTC)
Yes, "by the hardware and the OS" is "a combination", so you're saying it's "by the hardware and the OS". Guy Harris (talk) 16:29, 27 August 2013 (UTC)
You could have an address space smaller than the maximum amount of memory the CPU can address, or as you mention an address space can have holes. The hardware defines the maximum and the OS can further restrict it. Another question - is the concept of "address space" relevant to segmented systems such as Multics? Still another - are there any systems besides SVS that use a single shared address space for all processes? Peter Flass (talk) 19:07, 27 August 2013 (UTC)
"are there any systems besides SVS that use a single shared address space for all processes?" Yes. Guy Harris (talk) 19:52, 27 August 2013 (UTC)

The answer to the OS woulda-coulda question is: No. Virtual addressing is not an OS concept. Virtual addressing is implemented in hardware. As far as I know, virtual addresses have never been implemented in software, except perhaps as a research project. The OS relies on the processor to supply the mechanism. In short, you can have a VAX without VMS, but not VMS without a VAX.

The virtual address space is the range of addresses the processor is capable of addressing. It's not to be confused with the process space allocated by the OS, or the demand-paged memory services the OS provides. For example, on the i386 architecture the virtual address space is 2^32 bytes, but in ye olde Windows no single process was ever allotted more than 2^31 bytes because 2 GB was reserved for the OS.

Every reference in this article referring to files and virtual memory should be removed. Some are completely wrong. (The best that can be said for the reference to malloc is that it is not even wrong.) In its place should be a discussion of address translation. Cf. this page from caldera and this discussion from bottom up cs. Jklowden (talk) 23:18, 30 October 2014 (UTC)
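For readers following the address-translation point, here is a toy sketch of what such a discussion might illustrate. It is not how any real processor is implemented: the single-level dictionary standing in for a page table, the 4 KiB page size, and all the names are assumptions made for this example (the i386 does the lookup in hardware with multi-level tables).

```python
ADDRESS_BITS = 32                  # i386-style 32-bit virtual addresses
PAGE_SHIFT = 12                    # assumed 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT

ADDRESS_SPACE_SIZE = 1 << ADDRESS_BITS          # 2^32 bytes
USER_PORTION = ADDRESS_SPACE_SIZE // 2          # 2^31 bytes, the split described above


class PageFault(Exception):
    """Raised when a virtual page has no entry in the (toy) page table."""


def translate(vaddr, page_table):
    """Translate a virtual address using a dict of virtual page number -> frame number."""
    if not 0 <= vaddr < ADDRESS_SPACE_SIZE:
        raise ValueError("address outside the 32-bit virtual address space")
    vpn = vaddr >> PAGE_SHIFT               # which virtual page
    offset = vaddr & (PAGE_SIZE - 1)        # position within the page
    frame = page_table.get(vpn)
    if frame is None:
        raise PageFault(hex(vaddr))         # a real OS handler would decide what to do here
    return (frame << PAGE_SHIFT) | offset
```

For example, with the page table {0x401: 0x7}, the virtual address 0x00401234 splits into page 0x401 and offset 0x234 and translates to physical address 0x7234. The size of the address space comes from ADDRESS_BITS alone; which pages have table entries is a separate matter.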

Example

Even in the Windows-specific example, I don't believe the references to the page file are required to make the point. Peter Flass (talk) 07:31, 28 August 2013 (UTC)