Talk:Single system image
|WikiProject Computing||(Rated C-class, Mid-importance)|
|Collapse thread about the article in a previous life. Has changed since then.|
This article exposes several troubling issues
Distributed operating system has absolutely no business being redirected to this article
A Distributed operating system is the operating system of a specifically designed distributed system, while the single system image is an attribute of some systems that employ a distributed computing model. Why would a search for a specific variety of operating system redirect here, to an attribute of a system? A single system image is not a cluster of machines! Single system image is a paradigm of design in distributed computing.
This article makes grossly negligent and misrepresentative statements in its lead and throughout
The wholesale concept of the single system image is that of an image, a perception, an illusion that is created for the user of the system. In the opening statement of the lead, this article attempts to morph the single system image into a cluster of machines. If this is an article about a Single system image (SSI) cluster, it should be titled accordingly.
The article continues throughout, using the metaphoric terms of SSI system and SSI cluster. Only three times is SSI used in the whole word sense: between the parentheses in the lead sentence, (correctly) in the title of the table Properties, and finally in the reference for the DragonFly entry in that same table (usage of single system image is the same). This article is not about SSI, it is about systems/clusters that express the SSI.
This article uses multiple references in an inappropriate, inaccurate, and fallacious manner
The three references that attempt to congeal a loose idea that the SSI is synonymous with a distributed operating system, are embarrassingly fallacious in their usage. Since I am primarily concerned with the DOS relationship, I have not checked the other half of the references. I outline this claim in detail below.
Since no page numbers were included in the references, I took the liberty of including every instance of the context of: single system image and distributed operating system, in the source.
Distributed Systems: Concepts and Design
These passages state that a single system image is something that removes a system’s users from all concerns regarding where any process runs or where any resource resides. They also say that the distributed operating system produces a single system image for all processes and resources. Absolutely nothing in these passages either implicitly or explicitly correlates SSI and DOS as synonymous. This reference, in relation to the statement, “The concept [of SSI] is often considered synonymous with that of a distributed operating system,” is fallacious.
Scheduling in Distributed Computing Systems: Analysis, Design and Models
This passage states clearly that the concept of Single system image is an encapsulating quality of a system, hiding its internal complexity, causing it to be perceived as a single entity. DOS provides this quality. Absolutely nothing in this passage either implicitly or explicitly correlates SSI and DOS as synonymous. This reference, in relation to the statement, “The concept [of SSI] is often considered synonymous with that of a distributed operating system,” is fallacious.
These passages state transparency is one of the most important design goals in DCS. It continues to define three important sub-aspects of transparency. More importantly, the earlier passage reveals that transparency provides DCS with SSI. Absolutely nothing in these passages either implicitly or explicitly correlates SSI and DOS as synonymous. This reference, in relation to the statement, “The concept [of SSI] is often considered synonymous with that of a distributed operating system,” is fallacious.
Operating System Directions for the Next Millennium
This passage states transparency is one of the most important design goals in DCS. It continues to define three important sub-aspects of transparency. More importantly, the earlier passage reveals that transparency provides DCS with SSI. Absolutely nothing in this passage either implicitly or explicitly correlates SSI and DOS as synonymous. This reference, in relation to the statement, “The concept [of SSI] is often considered synonymous with that of a distributed operating system,” is fallacious.
This passage involves outlining the goals of the Millennium Project (a Microsoft Operating System); under the bullet-point of Security, the author speaks to "SSI" having no bearing on the application of Security in a Distributed Operating system. Absolutely nothing in this passage either implicitly or explicitly correlates SSI and DOS as synonymous. This reference, in relation to the statement, “The concept [of SSI] is often considered synonymous with that of a distributed operating system,” is fallacious.
The use of the term "seamless appearance" IS synonymous with "Single System Image"; and as such, is not synonymous with a distributed operating system. Absolutely nothing in this passage either implicitly or explicitly correlates SSI and DOS as synonymous. This reference, in relation to the statement, “The concept [of SSI] is often considered synonymous with that of a distributed operating system,” is fallacious.
This passage speaks to a specific project system, "Legion" and a proposal to use an Object-model that would provide a SSI as an aid to extensibility. Absolutely nothing in these passages either implicitly or explicitly correlates SSI and DOS as synonymous. This reference, in relation to the statement, “The concept [of SSI] is often considered synonymous with that of a distributed operating system,” is fallacious.
According to some of the article’s other sources:
Grid and Cluster Computing
The initial passage presents SSI as inseparable from the cluster. The latter passage states the cluster may or may not involve a distributed operating system. These passages directly contradict the correlation of SSI and DOS as synonymous. If SSI is required, but DOS is not, how can they be synonymous? If a = b, and b = c sometimes; can it be considered that a = c? No; if the transitive link is broken – in any way – the rule (or synonym, in this case) no longer holds. This reference, in relation to the statement, “The concept [of SSI] is often considered synonymous with that of a distributed operating system,” is fallacious.
In search of clusters
Your search - "distributed operating system" - did not match any documents.
Single System Image (SSI)
This passage represents the opening remark or first sentence of the paper, and states the point, "A single system image (SSI) is [a] property of a system;" and as such, a property of a system cannot be synonymous with an operating system.
Your insanely anal rant notwithstanding, the two notions are not universally or precisely defined. The list of systems given in this article identify themselves as either "distributed operating system", "cluster operating system" or "single system image", despite having very similar traits and purpose. It's mostly a matter of marketing whether something is labeled "cluster OS", "distributed OS" or "SSI OS". Check the original documentation of the systems listed.
I've read your draft User:JLSjr/Distributed operating system, and your brain, my friend, completely lacks any power of abstraction. Yes, this article is crap (like most of Wikipedia) for it insists too much on clusters. Until a couple of weeks ago, distributed operating system redirected to parallel computing, which did not even mention the concept. And that state of affairs lasted for 3 years! Your article is however far worse than this one because it attempts to build a textbook architecture of a distributed OS by cropping together loosely-related sources. The bottom line is that the main trait of a distributed OS is the provision of a SSI, and this is supported by the various sources cited, your attempts to split hairs aside. Pcap ping 12:38, 28 April 2010 (UTC)
The main improvement that this article really needs is some more historical examples, and some connections with other notions. Your draft has some historical distributed OS examples. Those would be nice. But please, use some survey papers or books that treat these in perspective. The SSI notion precedes the moniker and "cluster computing" terminology, never mind the newer grid computing / cloud computing (off topic: read that talk page for a classic example of massive wiki circle jerk.) The notion of distributed OS or SSI is not limited to clusters, understood as machines in the same location (possibly HPC cluster if the interconnect is beefier, especially in terms of latency), but the SSI / DOS concept has been mostly implemented in clusters for practical reasons. Although infrastructure like PlanetLab is now available to researchers (never mind Google or Amazon's own infrastructure), Internet-level "full" distributed OS has few practical benefits, so most of the "cloud computing" stuff is usually limited to subsystems like storage, e.g., Google File System. So, a certain focus on clusters is inevitable in the discussion of DOS / SSI, because that's the level of scaling they've been capable of, so far. Pcap ping 13:25, 28 April 2010 (UTC)
"Your insanely anal rant"
"Your brain ... completely lacks any power of abstraction"
This type of remark is unnecessary. I will continue to refer to the article, as it is the subject around which my issues revolve.
References used in the article:
Each of these entities being compared do share certain aspects of relevance; despite our differences of opinion, we do agree on that much. It is critical to understand, this matters not. If the entities in question were identical twins, it would not matter. If the entities in question were two discrete pointers to the same instantiated object, it would not matter. The references do not support it; period.
I would like for it to be noticed, that in my earlier remarks, I did not indicate ignorance, stupidity, or "insanity" were possible as a cause. Please notice as well, I did not insinuate dishonest, surreptitious, illicit, or fraudulent intent. I indicated negligence: a simple lack of effort or diligence.
In any event, there can be no doubt to the refutation of the references; and they should be removed. This is where I would normally say, "And hopefully replaced." I actually did begin to type those words, and then remembered... This leads us to point number 2.
Misrepresentative statements are made throughout the article:
The only possible argument would be that SSI is the object of the sentence, and it is directly followed by the stand-alone noun, "cluster." I will be more than happy to debate the stochastic opportunities in this instance. How about, we entitle an article "Semi-metallic paint;" and it could have a lead line such as, "A 'Semi-metallic paint' brush has bristles impregnated with inorganic metal ions in order to..." The paint is on the brush. The brush spreads the paint. It’s not about the relationship, or the activities, or anything else; it is about the disparity between the article’s title and what the lead sentence indicates the subject of the article is. It is misleading, confusing, and just plain inaccurate. Is more required?
"A single system image (SSI) is the property of a system that hides the heterogeneous and distributed nature..." —RAJKUMAR BUYYA, Single System Image (SSI), pg. 1
"A [single system image] can be defined as the illusion created by hardware or software, that presents a collection of resources..." —RAJKUMAR BUYYA, Single System Image (SSI), pg. 1
Twice on the first page Buyya describes (and defines) SSI as an illusion, or a property of a system. In other areas of the paper, the term Single system image Cluster is used. The SSI cluster is not an idea. It is a network of, a collection of, AND a cluster of computers. Trouble is, we are talking about two completely different things; a SSI, and a SSI cluster.
Either change the title or change the wording; but don't get confused and think I am trying to impugn character that must be defended; I am not. Just be careful, and maybe not too abstract. Factual righteous truth carries the overwhelming load of that "crappy writing" out there. On the other hand, one can have the most eloquent and well-written piece ever; and if it is not accurate, it's toast. If you want my assistance, just ask; but yes, I will be man-handling that beast I'm wrestling currently. Maybe you would curb this, and come help me???
Exactly 50 words, you may never know how difficult this was...
how about adding SGI to the list?
Ericfluger 19:49, 10 March 2007 (UTC)
- I agree. I came to this page expecting references to SGI machines that are generally not considered to be 'clusters' but are called SSI, but instead read all about clusters that appear to be a single machine. IMO, SSI is more than just *pretending* to be a single image...they are a single computer, or are so at a much lower level than clusters. Of course, I'm mostly talking about the old IRIX machines. Perhaps things have changed.
- The above URL has gone. Here's a new one : http://www.sgi.com/products/servers/uv/
Since everyone is clear that this article is a mess I'm starting a proposed rewrite at /Rewrite. In the interests of full disclosure I acknowledge that I'm an OpenSSI developer. If anyone thinks I'm giving undue weight to OpenSSI please note it here. HughesJohn (talk) 20:49, 25 September 2008 (UTC)
- A single system image (SSI) is the property of a system that hides the heterogeneous and distributed nature of the available resources and presents them to users and applications as a single unified computing resource.
Keep in mind: "The lead section, lead, or introduction of a Wikipedia article is the section before the table of contents and first heading. The lead serves both as an introduction to the article below and as a short, independent summary of the important aspects of the article's topic." HughesJohn (talk) 13:08, 2 October 2008 (UTC)
I have filled in the OpenVMS entries in the table.
I would like to discuss/suggest/add some additional aspects for SSI clusters:
Locking resources is cluster wide. Locks survive a node being removed from the cluster, regardless of whether that node was the master of the lock, or had some interest in the lock. Note that in some SSI systems such as OpenVMS, the DLM is one of the fastest (if not the fastest) means of cluster communication.
A single security model
All security mechanisms are cluster wide. A single set of /etc/passwd or SYSUAF and related files are used.
The cluster software is directly integrated into the kernel. A standalone node is effectively a cluster of one node. Turning cluster software on or off is a configuration option not a re-install/rebuild. System services are mainly cluster centric. Cluster membership is present before normal file system and user access is possible. This also means before most daemons are possible.
Cluster communication versus IPC
Much if not most of the communication within the cluster is not specifically IPC. This is particularly true if the cluster software is fully integrated with the kernel, in which case most of the traffic is kernel to kernel, not specifically process to process. Thus we might use spinlocks for inter processor intra node coordination and communication, distributed lock manager for inter node kernel coordination and communication, and pipes for inter process communication. Simon L Jackson (talk) 02:52, 11 January 2009 (UTC)
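To make the locking point above concrete, here is a minimal sketch of a lock table whose entries outlive the node that mastered them. It is a toy, not how OpenVMS or any real DLM works: the class name `ToyLockManager`, its methods, and the rule "re-master to the first surviving node" are all assumptions made for illustration, and the whole table lives in one Python object rather than being distributed. Freeing the locks a departed node held is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLockManager:
    """Toy cluster-wide lock table; every name here is hypothetical."""
    nodes: set = field(default_factory=set)
    master: dict = field(default_factory=dict)  # resource -> node mastering the lock
    holder: dict = field(default_factory=dict)  # resource -> node currently holding it

    def join(self, node):
        self.nodes.add(node)

    def acquire(self, resource, node):
        """Grant the lock if it is free; the first requester becomes its master."""
        if node not in self.nodes:
            raise ValueError(f"{node} is not a cluster member")
        if resource in self.holder:
            return False
        self.master.setdefault(resource, node)
        self.holder[resource] = node
        return True

    def release(self, resource, node):
        """Release the lock, but only if this node actually holds it."""
        if self.holder.get(resource) == node:
            del self.holder[resource]

    def remove_node(self, node):
        """Drop a node: free locks it held, re-master locks it mastered."""
        self.nodes.discard(node)
        for resource in [r for r, h in self.holder.items() if h == node]:
            del self.holder[resource]
        survivors = sorted(self.nodes)
        for resource in [r for r, m in self.master.items() if m == node]:
            if survivors:
                # Assumed policy: hand mastership to the first surviving node.
                self.master[resource] = survivors[0]
            else:
                del self.master[resource]
```

In this sketch, if node "a" masters a lock that node "b" currently holds and "a" is removed, "b" keeps the lock and mastership moves to a survivor, which is the behaviour the paragraph on cluster-wide locking describes.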
"Shared Roots" might not belong with SSI - A shared root cluster is intermediate between a SSI and an "incoherent" conventional Linux "bunch of boxes" cluster. More precisely, all SSIs must have real or virtual shared roots, but not all shared roots need be SSIs: A shared root cluster has a unified filesystem between nodes, but not necessarily a unified process/memory space. A shared root without unification of other aspects is a convenient compromise between scalability and ease of management.
For Linux clusters, the largest clusters (10000s of nodes+) are "bunch of boxes" clusters, shared roots to date are used up to 1000s of nodes, SSIs up to 100s. But SSIs are the easiest to manage/use, then shared roots, then bunches of boxes.
- Historically, the term cluster, as used by DEC from the early 1980s, meant SSI cluster.
- Shared roots are often not particularly shared. Should we distinguish shared boot from shared root? Both OpenVMS and TruCluster can boot off the shared root regardless of whether it is a directly available disk (eg Y cabled bus, iSCSI or SAN) or via a network boot (sometimes referred to as a "satellite" node).
- To share a root, a single security model should be considered.
- To have a fully shared root (or perhaps this should be shared boot) means cluster communications needs to be present very early in the boot process, before the shared root is formally mounted, and therefore before a full IP stack can be running. Thus some SSI clusters either don't or prefer not to use IP for communications. If they do, they use a separate simplified IP stack. The SCS protocol used by OpenVMS is specifically designed to provide flexible cluster communication and is significantly more efficient than IPv4.
- Followups from HughesJohn (talk) 13:47, 12 January 2009 (UTC) about "shared roots"
- Yeah, when I did my rewrite I was of the opinion (influenced by my OpenSSI background I expect) that a cluster was SSI, and to be SSI it had to have the whole kit and caboodle. I came up with the idea of splitting the discussion into a set of features, which were more or less provided by different systems as a way of avoiding flamewars about which were "real" SSI systems (for example I was originally very dubious about the openMosix claims to be SSI). However as time passes I think I stumbled on the right idea - SSI is not an absolute, different systems include different SSI features.
- Yes, shared boot is different from shared root. For example OpenSSI usually doesn't do shared boot - each node boots from its own local disk, then joins the cluster to find the root. (It can do shared boot with Etherboot or PXE, but that has the disadvantage of serialising the boot process).
- It seems to me that shared root implies a single security model. Maybe we should discuss this in that section.
- These days a full IP stack could be in a network card's boot rom (Etherboot/PXE) so that doesn't seem to be much of a problem. As I said above OpenSSI usually boots the full Linux kernel from a local device, so it can handle fairly complex protocols for in cluster communications (Infiniband, for example).
- HughesJohn (talk) 13:47, 12 January 2009 (UTC)
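One way to picture the distinctions drawn in this thread is to treat each cluster as a set of features rather than as "SSI or not". The sketch below is only an illustration under assumptions: the flag names and the `classify` function are invented here, and using a unified process space as the dividing line between "shared root" and "SSI" is a simplification of the description above (a unified filesystem without a unified process/memory space). Shared boot is kept as a separate flag because it was argued above to be independent of shared root.

```python
from enum import Flag, auto


class Feature(Flag):
    """Hypothetical feature flags for a cluster; not taken from any real system."""
    NONE = 0
    SHARED_ROOT = auto()           # unified filesystem across nodes
    SINGLE_PROCESS_SPACE = auto()  # unified process/memory space
    SHARED_BOOT = auto()           # nodes boot from the shared root


def classify(features):
    """Rough three-way label following the shared-root discussion above."""
    if Feature.SHARED_ROOT not in features:
        # Every SSI needs a real or virtual shared root, so without one
        # the cluster is treated as a plain "bunch of boxes".
        return "bunch of boxes"
    if Feature.SINGLE_PROCESS_SPACE in features:
        return "SSI"
    return "shared root"
```

For example, `classify(Feature.SHARED_ROOT | Feature.SHARED_BOOT)` gives "shared root", while adding `Feature.SINGLE_PROCESS_SPACE` to the same set gives "SSI". Shared boot never changes the label, which matches the point that it is a separate question.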
Hot node addition/removal?
This is the ability to add or remove nodes at runtime, rather than at cluster start time. I know that OpenSSI can do it, Kerrighed can't (but is working on it), not sure about others. Somewhat important because it affects the purpose of the cluster: Is it designed to be highly available, such that individual machines can fail but the cluster lives? Or does adding additional systems increase performance but decrease reliability, since node failure means cluster failure? Essentially the same kind of difference as (say) RAID-1 and RAID-0, respectively.
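The RAID-1 versus RAID-0 comparison can be put into rough numbers. The sketch below assumes every node fails independently with the same probability over some period, which is a simplification, and the function names are made up for this example. "All required" models a cluster where any node failure kills the cluster; "any suffices" is the most optimistic reading of a highly available cluster, one that lives as long as at least one node does.

```python
def survival_all_required(p_fail, n):
    """Cluster survives only if every one of n nodes survives (RAID-0-like)."""
    return (1 - p_fail) ** n


def survival_any_suffices(p_fail, n):
    """Cluster survives if at least one of n nodes survives (RAID-1-like)."""
    return 1 - p_fail ** n


if __name__ == "__main__":
    # Illustrative per-node failure probability; not measured data.
    p = 0.1
    for n in (1, 2, 4, 8):
        print(n, survival_all_required(p, n), survival_any_suffices(p, n))
```

Under these assumptions, adding nodes lowers the survival probability in the first model and raises it in the second, which is the trade-off described above.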
On a similar note, there's also the question of whether processes can live independently of their initial node. In Mosix-based systems, one node may be doing the heavy lifting CPU-wise, but all I/O and IPC has to be proxied back to the starting node; if it fails, the process is dead. By contrast, I believe OpenSSI tries to translate as many resources as it can to equivalent resources on the local machine, so only hardware-reliant processes die if the corresponding starter node dies. This feature is obviously only relevant if a system supports hot removal, since if a node dies in (say) Kerrighed, your cluster is dead and the whole issue is moot.
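The two behaviours described above can be contrasted with a small model. This is a sketch of the two policies as stated in this comment, not a description of how Mosix or OpenSSI are actually implemented; the `Process` type, its fields, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """Hypothetical migrated process; field names are illustrative only."""
    name: str
    home_node: str                     # node where the process was started
    current_node: str                  # node where it currently runs
    needs_home_hardware: bool = False  # tied to a device on the home node


def survives_home_proxy(proc, failed_node):
    """Policy A: all I/O and IPC is proxied through the home node."""
    return failed_node not in (proc.home_node, proc.current_node)


def survives_local_translation(proc, failed_node):
    """Policy B: resources are re-homed locally wherever possible."""
    if failed_node == proc.current_node:
        return False
    if failed_node == proc.home_node:
        # Only processes tied to hardware on the home node are lost.
        return not proc.needs_home_hardware
    return True
```

With a process started on "n1" and migrated to "n2", a failure of "n1" kills it under policy A but not under policy B, unless the process depends on hardware attached to "n1".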