Talk:Google File System
|WikiProject Google||(Rated Start-class, Low-importance)|
Access control vs. caching
How can users cache metadata if the master-server is used for file-access control? This makes little sense and someone knowledgeable should make corrections. —Preceding unsigned comment added by 220.127.116.11 (talk • contribs)
- My understanding is that each client asks the master whether it can access the chunk, which is different from asking where to find the chunk. It is not an especially secure system, but on the other hand it usually runs on a private, secure network... --maru (talk) contribs 17:57, 13 April 2006 (UTC)
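- To make the distinction in the reply above concrete, here is a minimal sketch of a client that caches chunk locations but still asks for permission on every read. This is not GFS's actual protocol: the `Master` and `Client` classes, their method names, and the access list are all hypothetical and only model the reading given above.

```python
from typing import Dict, List, Set, Tuple


class Master:
    """Hypothetical master: answers permission checks and location lookups."""

    def __init__(self, locations: Dict[str, List[str]], acl: Set[Tuple[str, str]]):
        self.locations = locations      # chunk id -> servers holding it
        self.acl = acl                  # allowed (client id, chunk id) pairs
        self.permission_checks = 0      # counters, only for illustration
        self.location_lookups = 0

    def may_access(self, client_id: str, chunk: str) -> bool:
        self.permission_checks += 1
        return (client_id, chunk) in self.acl

    def locate(self, chunk: str) -> List[str]:
        self.location_lookups += 1
        return self.locations[chunk]


class Client:
    """Hypothetical client that caches locations but never permissions."""

    def __init__(self, client_id: str, master: Master):
        self.client_id = client_id
        self.master = master
        self.location_cache: Dict[str, List[str]] = {}

    def read(self, chunk: str) -> List[str]:
        # Permission is asked on every read; it is never cached.
        if not self.master.may_access(self.client_id, chunk):
            raise PermissionError(chunk)
        # Location metadata is fetched once and then served from the cache.
        if chunk not in self.location_cache:
            self.location_cache[chunk] = self.master.locate(chunk)
        return self.location_cache[chunk]


master = Master({"chunk-1": ["server-a", "server-b"]}, {("alice", "chunk-1")})
alice = Client("alice", master)
alice.read("chunk-1")
alice.read("chunk-1")
print(master.permission_checks, master.location_lookups)  # 2 1
```

Under these assumptions, two reads cost two permission checks but only one location lookup, so cached metadata and master-side access control do not conflict.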
Request for Elaboration
It would be nice if someone more knowledgeable than I could point to real-world applications or similar systems, and add them to the body of the article. --Maru Dubshinki 09:30 PM Sunday, 06 March 2005
Is it 'high data throughput, at the expense of low latency' or 'high data throughput, at the expense of high latency'? The meaning is that in order to get a high data throughput, low latency performance is sacrificed. But now that someone else has edited it to the latter, I am no longer so sure of my grammar. (hmm... 'high latency' or 'low latency'? ...) --maru 01:26, 28 Apr 2005 (UTC)
Additional External Link
Please consider adding the following link as the article discusses at length GFS implementation: http://www.baselinemag.com/article2/0,1540,1985040,00.asp --Todd B --July 11, 2006
- Working on it. Thanks for the link- http://www.baselinemag.com/article2/0,1540,1985046,00.asp and http://www.baselinemag.com/article2/0,1540,1985047,00.asp mention "BigFiles", which I'd not heard of before. --maru (talk) contribs 13:21, 11 July 2006 (UTC)
The links to the GFS paper .pdf and the ZDnet article are dead. http://news.zdnet.com/2100-9588_22-5596811.html http://labs.google.com/papers/gfs.html Both return file not found errors. 22 Nov 2011 18.104.22.168 (talk) 02:57, 23 November 2011 (UTC)
- I've fixed the ZDnet link. I don't know about the Lab page & PDF - they both still register in a Google search for the paper, so maybe it's a temporary error? --Gwern (contribs) 03:42 23 November 2011 (GMT)
- The GFS .pdf link is gone with everything else that was Google Labs. http://googleblog.blogspot.com/2011/10/fall-sweep.html Google shut down the site without even correcting its own links as Gwern notes (thanks for the quick fix). One copy of the .pdf exists here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.125.789&rep=rep1&type=pdf --22.214.171.124 (talk) 23:47, 23 November 2011 (UTC)
- The Google link to the GFS paper, along with some historical influences, still exists via the following page - http://research.google.com/people/hgobioff/ 5 February 2012 — Preceding unsigned comment added by 126.96.36.199 (talk) 22:25, 5 February 2012 (UTC)
the issue of the GPL
The part about the GPL in the first sentence of the article is misleading, at best. Regardless of how exactly something is (or isn't) derived from a GPL'd piece of software, no entity (corporation or individual) is compelled by the GPL to release anything unless they redistribute their altered code in binary form. Google (or anyone else) is perfectly free to mercilessly hack away at any GPL'd piece of code they want, and as long as they only use it "in-house", for their own purposes, and don't redistribute it (whether for a fee or not is immaterial), then they have no obligation to release source. Considering what a sticky area this is, I didn't want to just hack up that sentence, but something along the lines of "Google has shown no interest in releasing their filesystem, either for profit or for the good of the Internet community" would be more accurate. Reference to the GPL is probably superfluous, and should simply link to another appropriate article if it needs to remain.
- Embedded or not, even if "The only way it is available to another enterprise is in embedded form--if you buy a high-end version of the Google Search Appliance", shouldn't they still release the GFS under the GPL (this is distribution after all)? Chutz 10:19, 14 February 2007 (UTC)
Ok, is it really necessary to include a section criticizing a product that is not even available to the public? I mean, the program may be very bad or very good, but is there any real point in saying anything about its quality if it's only used internally by Google, and not by anyone else? If no one but Google is using this, then why would there be any public criticism in the first place? Avador 18:02, 23 December 2006 (UTC)
I removed the section; the criticism is not for a released product (as you mentioned) and was entirely inaccurate (as the first line in the section stated). Osmaker 00:10, 18 January 2007 (UTC)
netgfs listed on googlecode.com
The project is listed here: netgsf on code.google.com but I wonder if it's really going to be open-sourced someday, or if it's only there to be accessed for internal use. Self Torture 03:52, 2 February 2007 (UTC)
generic no size file system
How about a no size file system?
Entry type:File name:file location on disk:file size
....
....
....
Entry type=pointer to next table:File name:file location on disk:file size
—The preceding unsigned comment was added by 188.8.131.52 (talk) 12:39, 11 May 2007 (UTC).
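- One way to read the layout proposed above is as a chain of fixed-size tables, where the last entry of each table points to the next one, so the directory has no preset size limit. The following is only a sketch of that reading: the `Entry` and `Table` types, the `"file"`/`"next"` kinds, and the integer offsets are assumptions made for illustration and do not come from the original comment.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class Entry:
    kind: str       # "file", or "next" for a pointer to the next table
    name: str
    location: int   # disk offset of the file, or of the next table
    size: int


@dataclass
class Table:
    entries: List[Entry] = field(default_factory=list)


def walk(tables: Dict[int, Table], start: int) -> Iterator[Entry]:
    """Yield file entries, following "next" pointers from table to table."""
    offset: Optional[int] = start
    seen = set()  # guards against a pointer cycle
    while offset is not None and offset not in seen:
        seen.add(offset)
        next_offset = None
        for entry in tables[offset].entries:
            if entry.kind == "next":
                next_offset = entry.location
            else:
                yield entry
        offset = next_offset


# Two tables: the first ends with a pointer to the table at offset 1.
tables = {
    0: Table([Entry("file", "a.txt", 100, 10), Entry("next", "", 1, 0)]),
    1: Table([Entry("file", "b.txt", 200, 20)]),
}
names = [entry.name for entry in walk(tables, 0)]
print(names)  # ['a.txt', 'b.txt']
```

Growing the directory under this scheme means allocating a new table and appending one pointer entry to the last existing table; the cost is that a lookup may have to walk the whole chain.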
I would like to propose that this article be merged into a generic Google technology article (perhaps Google platform, though that suffers from WP:V problems as well). A closed-source filesystem with no open source analogue and no use outside of a specific company (and which has limited applications outside a few types of read-centric applications) doesn't seem like something Wikipedia needs an article for. Don't get me wrong. I love the technical implications of this as much as the next guy, but an internal-only product can't really satisfy WP:V. JRP (talk) 23:05, 29 October 2008 (UTC)
- I don't think it's true that there is currently no FLOSS analogue (the see alsos seem to have at least 2). As for WP:V issues - it seems quite well documented to me. Multiple MSM articles and research papers, which is more than most articles on software.
- And it's widely used inside Google: all their prominent services like search and GMail are running on GFS one way or another. (The interview I linked makes this clear.) --Gwern (contribs) 09:29 12 August 2009 (GMT)
Google File System is a specific entity that is disjoint from the Google Platform. Although it is a proprietary system, its methods are of scientific interest (cf. Hadoop) completely apart from the marketing and usage interests of Google Platform. Claiming that Wikipedia should not support a concept simply because it's, at present, only used internally to one company, seems provincial and shortsighted to me. I recommend keeping the articles separate. 184.108.40.206 (talk) 19:44, 31 March 2010 (UTC)John
high level or low level
- The physical operations are handled by Linux filesystems like ext3. --Gwern (contribs) 09:25 12 August 2009 (GMT)
Possible Typo in Performance Section
In the Performance section the speed for a small number of nodes is listed as 80-100 MB/s, and the speed for a large number of nodes is 583 Mb/s (note: megabits, not megabytes). This equates to 72.875 MB/s. This doesn't check out logically. Can somebody help me verify the correct measurement?
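- The unit conversion in the comment above can be checked in a few lines. This sketch assumes 8 bits per byte and the same decimal prefix for both units; the function name is only illustrative, and the 80 MB/s figure is the low end of the small-cluster range quoted above.

```python
BITS_PER_BYTE = 8


def megabits_to_megabytes(mbit_per_s: float) -> float:
    """Convert a rate in Mb/s (megabits) to MB/s (megabytes)."""
    return mbit_per_s / BITS_PER_BYTE


large_cluster_mbytes = megabits_to_megabytes(583)
small_cluster_low_mbytes = 80  # low end of the quoted 80-100 MB/s range

print(large_cluster_mbytes)                             # 72.875
print(large_cluster_mbytes < small_cluster_low_mbytes)  # True
```

So if the 583 figure really is in megabits, the large cluster comes out slower than the small one, which supports the suspicion that one of the two units in the article is wrong.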