Talk:Folding@home/Archive 2

From Wikipedia, the free encyclopedia

Results (and better summarisation of them)

I believe it's important that results be better summarized in lay language or in such a manner that they explain better what the results are all about and what they may lead to. -Mardus (talk) 15:15, 28 April 2008 (UTC)

Agreeing with the above. I came here to see what had been achieved by the project, but this long string of scientific manuscripts with names of authors etc isn't really helping. Presumably someone associated with the project could provide a brief synopsis of the most significant developments in research that this project has helped to bring about. Ubertoaster (talk) 08:24, 19 April 2011 (UTC)

See this http://folding.stanford.edu/English/FAQ and particularly this http://folding.stanford.edu/English/FAQ-Diseases. I'm participating in the project, and am pretty passionate about it. There are many others out there, but I can certainly help you. On the suggestion, I'm considering moving the list of publications to a separate article. Also, the cores need to be dealt with, perhaps in a similar manner. Thoughts? Jessemv (talk) 00:17, 21 August 2011 (UTC)

Second biggest

Folding@home has now been surpassed in terms of computer power by bitcoin. According to bitcoinwatch.com, bitcoin has a total power of about 19 Petaflop/s. The article's claim of being the biggest distributed computing cluster is thus false. 213.66.122.5 (talk) 10:07, 12 May 2011 (UTC)

Is it even a distributed computing project? According to the website, it's simply a currency transaction/trading application.
194.80.64.113 (talk) 15:53, 12 July 2011 (UTC)
It's a glorified pyramid scheme from where I see it 122.57.186.20 (talk) 09:25, 3 August 2011 (UTC)
I don't think BitCoin is a distributed-computing project. It's just a money exchanger thing, and the petaFLOPS don't really mean anything other than that stuff is moving around. If I recall right, you have to do some brief calculations to perform a trade, or something like that. So its purpose is trading, not computing. The computing is a side effect. Besides, the article has been changed by someone else to no longer claim that it is the largest. IMHO, it is though. Jessemv (talk) 00:25, 21 August 2011 (UTC)

Suggestions for GA

I noticed this as a GA candidate, and after glancing at the article, if I were the reviewer, I'd fail it due to the extremely long embedded lists (part of criteria 1a of good article criteria). Quite frankly, the reader is probably not interested in the details about every single Active/Inactive Core, nor is the reader interested in every single scholarly publication that has used Folding@home data. You might want to split these lists out into their own sub-articles, and write a paragraph or two of prose to summarize the data instead. ...comments? ~BFizz 16:40, 16 August 2010 (UTC)

Fixed a while back. Better now? GA process is likely to start up again. The last attempts at GA and FA were quite frankly pitiful and embarrassing. Unlike those attempts, I am honestly looking to improve the article to GA/FA status, and it is ALMOST there. Standby. Jessemv (talk) 20:06, 24 September 2011 (UTC)

GA Review

This review is transcluded from Talk:Folding@home/GA1. The edit link for this section can be used to add comments to the review.

Reviewer: Wizardman Operation Big Bear 15:02, 17 August 2010 (UTC)

After a cursory read, I see that this article meets quick-fail criteria, as a result of much of the article being embedded lists, the references being bare URLs, and other issues. As the nominator is indef-blocked, there's no one to fix it, so I'm failing the article. Wizardman Operation Big Bear 15:02, 17 August 2010 (UTC)

Why isn't there an update on the Petaflops achievement?

I realize that there are two measurements of FLOPS: native FLOPS and x86 FLOPS. But I was wondering why the achievement of sustaining 6.2 x86 FLOPS in April 2010 isn't mentioned, or why there isn't some sort of graph comparing the two. It is credited as one of the records on the FLOPS page but not here. —Preceding unsigned comment added by 71.59.218.245 (talk) 17:34, 1 March 2011 (UTC)

That issue should be fixed now, but not exactly by a graph. See, the native FLOPS are used because those records go back further. We simply have no information as to the x86 FLOPS way back in some of the early milestones. The article mentions the conversions, and if the reader wants more info they can click on the citation. Let me know if you have any further suggestions. Jessemv (talk) 20:09, 24 September 2011 (UTC)

Wikipedia team

Just wanted to point out that the wikipedia team # is 42223 - CompuTerror™ 13:39, 5 August 2011 (UTC)

Thanks I guess. There's a link to the team under External Links. Jessemv (talk) 20:11, 24 September 2011 (UTC)

Folding@home cores

I have recently moved the "Cores" section of the article to its own page. After reviewing policies regarding what can and cannot be in an article, I felt that the Cores section should be its own article. Here are some of my reasons:
1) As it was in the F@H article, the list was very long, and annoying to scroll through.
2) I don't see the list getting shorter. To the contrary, as new cores are added and old ones retire, the list will grow. Thus it should be split off somehow.
3) For those coming to the F@h article, it is likely irrelevant information. They end up having to scroll through this big monotonous list to get to the rest of the article. Most people probably look up F@h on Wikipedia not for its cores, but for general information about the project itself, what it has done, and where it is going. This is covered very well in the article minus the Cores section.
4) There is a lot more information specifically about cores than what was currently there. What was there is little notes and things. I believe that people didn't expand on it for reasons including that it would worsen the length of the list. Also it's a bit technical, but some people have the know-how to add things, and there's a lot of general information that can be added as well.
5) The cores have enough references and enough to talk about to warrant their own article. Certainly information specifically about GROMACS or TINKER could be brought in. Information on specific cores is found in other places. It's not a main talking point here on the forums, but if one hunts around some good information can be found. There's probably a lot more information about them than can be found for some other Wikipedia articles, like the list of all obscure fictional Jedi who ever lived.
6) I or someone else can turn the list into paragraphs that flow well and are easier to read. This would also make it more encyclopedic.
7) Wikipedia's Be Bold policy.
8) There was a topic on this talk page about the length of the Core section and how they didn't think it belonged. It is important information, so I didn't want to get rid of it altogether.

I hope I referenced the article all right, but of course feel free to change it. The Folding@home cores article is fairly new, and needs some cleanup. Also, I don't think I'm allowed to remove the "New Article" tag, so perhaps someone else could eventually replace it with a Stub tag or something. Finally, if you can, please edit and expand it! Thanks. Jessemv (talk) 00:18, 22 August 2011 (UTC)

New Article flag has been removed. It would be nice if the article had some more information on each core though. Jessemv (talk) 15:04, 20 September 2011 (UTC)

Publications

Surely this article should have a list of publications or summary of results which have come from the Folding@Home project? Publication is an important part of any scientific research and arguably the reason for doing it (in that the whole purpose of research is to present new results). It would be really neat if F@H participants could read the Wikipedia article, see the results and think "Hey, I helped do that". Non-participants would no doubt naturally ask the question "why?" as they read the article. Results are part of the reason why F@H does what it does. John Dalton 23:36, 4 November 2007 (UTC)

The third paragraph mentions it. Are you saying that it should have a more prominent discussion on the papers? (like a full section) If there have been any major breakthroughs that have resulted, then I would tend to agree with you. Cardsplayer4life 23:53, 4 November 2007 (UTC)
I agree. I'm quite curious what specific scientific knowledge has been gleaned from these simulations. Also, I think a mention of how the volunteers are credited for their work in these scientific papers should also be there. --seav (talk) 11:08, 21 November 2007 (UTC)

Although it needs a fair amount of work, I have added a list of published results to the article. Johnnaylor (talk) 19:36, 29 February 2008 (UTC)

I'm starting work on this summary. You are clearly right. Johnnaylor's list of publications was nice for its sheer numbers, but summary style is much better. It's going to be a lot of work, and may take me a while, but I just wanted to let everyone know that I'm currently working on it. My goal is to get this article to a Good Article or near it by January, when the v7 client will become the recommended client and the F@h website gets overhauled. Jessemv (talk) 20:45, 9 October 2011 (UTC)

Estimated energy use and efficiency

I'd just like to point out that I kind of don't like this section, but I'm not positive that we should get rid of it. I really see two options here. One, I could delete the section entirely. Two, I could copy the section onto my own website, reword to make it look better, and then cite that page in this article. Pros for option one: no other distributed-computing page has this analysis, it's difficult to estimate anyway, and it's probably irrelevant. Cons for option one: some people would like to know. I have seen several people on the F@h forums referencing that section, so I know that it does matter to those people. Pros for option two: keeps the apparently desired content, and rids the article of Original Research. Cons for option two: that's kind of a weird thing to do, plus the estimate is still going to be a shot in the dark, and it may likely end up not sounding like the rest of the article. So those are my thoughts. What do you think? Jessemv (talk) 15:02, 20 September 2011 (UTC)

And by the way, would someone please fix those URLs in the Results section. I'm not sure how to do it properly, but it doesn't look good having some of those recent papers just be external links. Thanks. Jessemv (talk) 15:02, 20 September 2011 (UTC)
Correction: URLs fixed. I tried again and I guess it was easier to get all that information than I thought. I still couldn't get as many citation details as I would have liked, but I tried my best there. Please feel free to add more! Jessemv (talk) 02:01, 22 September 2011 (UTC)
By the way, just discovered that that section dates back to February 24, 2008. Is it really important? I feel like we should make a decision on that, and then I'd be happy to learn more of Wikipedia's workings by nominating this as a Good Article, since it seems to me that other than that section it is worthy, or is a few minor edits away from it. Hopefully we can get it to Featured. If Rosetta@home's article is Featured, ours certainly should be! :) Jessemv (talk) 02:01, 22 September 2011 (UTC)
I am in the process of trying to find external analysis of this. None of PrimeGrid, Rosetta@home, SETI@home, or World Community Grid has this section in its article. Nevertheless I'll see if I can get some non-OR content in there. If it becomes reasonable to assume that nothing can be found, I move to delete the section. I am happy to do so if no one has any objections. Thoughts? Jessemv (talk) 17:19, 22 September 2011 (UTC)
I have not yet found legitimate analysis. I have private-messaged several high-ranking users (one a Site Admin) on foldingforum.org, and one said, among other things, "At this point I'd probably recommend that you leave out that section, for the reasons you suggest." and reminded me that "The resources that FAH uses generally don't include every possible type of computing component". The other user replied with "It can go away as far as I am concerned. Without citations, it really doesn't belong." So, the section currently has three delete votes (counting mine at this point unless I can remove the OR) and external analysis has yet to be found. In about a week, I will be posting my intentions, waiting a week or two for responses, and then proceeding with deletion or whatever. Just FYI everyone. Jessemv (talk) 06:51, 24 September 2011 (UTC)
Alright. Well apparently Dr. Vijay Pande has not gotten back to me with any energy consumption references. I guess he's too busy studying all our WUs. :) So, it sits with two votes against from high-ranking forum users, and my down vote as well. But I've decided not to completely delete the section. Just so no one freaks out because they cared about it and finds it's suddenly gone, it's right here, listed below. Hopefully now we can go improve the rest of the article, and then I'll see about getting on with the Good Article nomination process. *chorus* Still, if anyone finds any legitimate, comprehensive, and appropriate external analysis feel free to apply it and put the section back in the article. In the meantime, here it is, but without the September 2011 Original Research flag. Jessemv (talk) 17:42, 28 September 2011 (UTC)
Folding@home is a diverse network, utilizing many different kinds of CPUs, GPUs, and PS3s,[1] and in some cases is not the exclusive usage of the system. This wide range of hardware and the nature of distributed computing make calculating energy use difficult, and require a number of assumptions. However, some estimates can still be made.
Starting with the assumption that the average desktop computer draws about 65 to 250 watts,[2][3] and the average laptop draws at most around 50 watts,[2][3] then we have a CPU energy usage of 50 to 250 watts. As of August 9, 2011 there were 403,858 active CPU clients providing 643 teraFLOPS (643,000,000 megaFLOPS) of computing power.[1] In the worst-case scenario that there is only one CPU client per computer, we have a total between 20,192,900 and 100,964,500 watts of power for an efficiency of between 6 and 32 megaFLOPS/watt.
The GPU Folding@home client can be run alongside the CPU client, or alone. We can assume that the power draw of the computer doubles when the graphics card is fully utilized.[4][5][6] Note that this assumption also includes the possibility of the GPU client being run alone. As of August 9, 2011 there were 18,604 ATI and nVidia GPUs contributing 4,446 x86 teraFLOPS (4,446,000,000 x86 megaFLOPS) of computing power.[1] Thus assuming a power draw between 50 and 250 watts each, this amounts to between 930,200 and 4,651,000 watts total, translating to between 956 and 4,780 megaFLOPS/watt. At worst, this would put Folding@home's GPUs 5th in the June 2011 Green500 List.[7]
Official Folding@home estimates that each PS3 will draw 200 watts while running Folding@home.[8] As of August 9, 2011, 19,803 PS3s were providing 1,177 x86 teraFLOPS (1,177,000,000 x86 megaFLOPS) of computing power.[1] However, the newer PS3s draw about 90 watts while folding.[9][10] Assuming 90 watts each, it comes to a total of 1,782,270 watts for an efficiency of 660 megaFLOPS/watt. This would place Folding@home's PS3s 12th in the June 2011 Green500 List.[7]
With these power-usage assumptions, this amounts to a total between 22,905,370 and 109,576,100 watts, and with 6,266,000,000 megaFLOPS overall, this is an average of between 57 and 273 megaFLOPS/watt, placing Folding@home between 87th and 421st in the June 2011 Green500 List.[11] This only takes into consideration running the computing system itself, and does not factor in heat production, normal system load, or dynamic frequency scaling.
  1. ^ a b c d Pande Group (updated automatically). "Client Statistics by OS". Stanford University. Retrieved 2011-09-12. 
  2. ^ a b Michael Bluejay. "How much electricity do computers use?". Retrieved 2011-08-09. 
  3. ^ a b "How Much Electricity Does a Computer Use?". Retrieved 2011-08-09. 
  4. ^ "The Real Power Consumption of 73 Graphics Cards". Retrieved 2011-08-09. 
  5. ^ Tino Kreiss (January 21, 2009). "Actual Power Consumption And Current Requirements". Retrieved 2011-08-09. 
  6. ^ "The Truth About Graphics Power Requirements V2". Retrieved 2011-08-09. 
  7. ^ a b "The Green500 List 1-100 - June 2011". Retrieved 2011-08-09. 
  8. ^ "PS3 FAQ" (FAQ). Retrieved 2011-08-09. 
  9. ^ "PS3 Model Differences". Retrieved 2011-08-09. 
  10. ^ Zagen30. "Re: ATTENTION: All PS3 folders - Let's get Sony's attention!". Retrieved 2011-08-12. 
  11. ^ "The Green500 List 401-500 - June 2011". Retrieved 2011-08-09. 
Jessemv (talk) 17:42, 28 September 2011 (UTC)
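As a cross-check on the arithmetic in the section quoted above, here is a minimal sketch that recomputes the totals from the section's own inputs. The function and variable names are made up for illustration; only the client counts, teraFLOPS figures, and wattage assumptions come from the quoted text. One thing it makes visible is that the 109,576,100-watt upper bound only works out if the PS3s are counted at the official 200 watts each rather than 90. The overall upper efficiency also comes out at roughly 273.6 megaFLOPS/watt, which the quoted text gives as 273.

```python
# Recomputes the quoted energy estimate from its own inputs (9 August 2011 figures).
# All names are illustrative; only the numbers come from the quoted section.

MFLOPS_PER_TFLOPS = 1_000_000


def efficiency_range(units, tflops, watts_low, watts_high):
    """Return (total_watts_low, total_watts_high, mflops_per_watt_low, mflops_per_watt_high)."""
    mflops = tflops * MFLOPS_PER_TFLOPS
    total_low = units * watts_low
    total_high = units * watts_high
    # The lowest efficiency corresponds to the highest power draw, and vice versa.
    return total_low, total_high, mflops / total_high, mflops / total_low


cpu = efficiency_range(403_858, 643, 50, 250)
gpu = efficiency_range(18_604, 4_446, 50, 250)
# 90 W is the newer-model figure; 200 W is the official estimate.
ps3 = efficiency_range(19_803, 1_177, 90, 200)

total_watts_low = cpu[0] + gpu[0] + ps3[0]
total_watts_high = cpu[1] + gpu[1] + ps3[1]
total_mflops = (643 + 4_446 + 1_177) * MFLOPS_PER_TFLOPS

# Overall efficiency range in megaFLOPS per watt (worst case, best case).
overall = (total_mflops / total_watts_high, total_mflops / total_watts_low)

if __name__ == "__main__":
    for name, row in (("CPU", cpu), ("GPU", gpu), ("PS3", ps3)):
        print(f"{name}: {row[0]:,} to {row[1]:,} W, {row[2]:.1f} to {row[3]:.1f} MFLOPS/W")
    print(f"Total: {total_watts_low:,} to {total_watts_high:,} W, "
          f"{overall[0]:.1f} to {overall[1]:.1f} MFLOPS/W")
```

Changing any of the wattage assumptions in the three calls shows how strongly the final range depends on them, which is the main weakness of the estimate discussed in this thread.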

Images

I am aware that this article needs some more images. I'm working on it, but the biggest thing is getting permissions and whatnot. I've got some really great illustrations picked out. Recently I have commented out that one image of F@h using 99% of a CPU. IMO, it wasn't very helpful (as it's been explained in words) and conveyed the wrong message. I'm working on finding replacements; there are some good ones out there. Just so everyone knows. Jessemv (talk) 03:37, 15 October 2011 (UTC)

I'm making good progress gathering the images and getting permissions. Jessemv (talk) 00:57, 17 October 2011 (UTC)
Permission has been granted by Vijay Pande. My next step is to choose the best image and get it onto the article! Jessemv (talk) 01:51, 17 October 2011 (UTC)
Most of the images I was looking at have been added. There may be better ones out there, but at this point I'm going to go back to focusing on improving the text. Jessemv (talk) 23:52, 19 October 2011 (UTC)

Nonsensical sentence in the introduction?

Can someone clarify what this means?

"The Folding@home project has also successfully simulated protein folding in the 1.5 millisecond range — which is a simulation thousands of times longer than it was previously thought possible to model, and millions of times longer than ever previously achieved."

The source says that they successfully replicated a folding situation that takes a long time in a short period of time.

But the sentence on the page says the simulation of 1.5 milliseconds is 1000x longer than was previously thought possible? So they are talking about 1.5ms/1000? This makes no sense. The "millions of times longer" part makes even less sense.

Jabberwockgee (talk) 02:57, 20 October 2011 (UTC)

Hi Jabberwockgee, the source says that their first goal was to break the microsecond barrier, and that the millisecond barrier is 1000x harder. I'm pretty sure there are 1000 microseconds in a millisecond, so that makes sense. The "thousands of times longer than it was previously thought possible to model, and millions of times longer than ever previously achieved" statement comes partly from the main site http://folding.stanford.edu/ and http://folding.stanford.edu/English/FAQ-Press#ntoc8 but neither of them says exactly that, which is odd since I was sure it was covered by a source, but now I can't seem to find it. Hmm. Well, it does seem a bit braggish and perhaps untrue now, so I've replaced it with something better that is fully backed up. Thanks for pointing it out. Jessemv (talk) 04:24, 20 October 2011 (UTC)

Requesting fact-checker

I'd like to have some expert look over the Biomedical Significance and let me know if all the facts are right. I'm not trying to insert false statements or anything, but rather the difficulty I'm primarily having is the interpretation of technical scientific publications. Several times now I have run into situations where it appears that two modern scientific papers contradict each other, so then I have to do my best to figure out what each one is saying and try to put down the most truth into the article. Currently the Alzheimer's and Huntington's subsections may have errors resulting from my ignorance during this process, so if someone (like some bored Biochem major) could look it over, I'd be very grateful and I'd be willing to make the changes myself if need be. The changes I have made recently to those sections have undoubtedly been an improvement, but given the number of people looking at the article I want to make sure that I have the truth down and not some wrong compilation thing. So any help would be appreciated. Thanks. Jessemv (talk) 18:02, 21 October 2011 (UTC)

I did some more research, and this is not as much of a problem as it once was. Still, I'd like to make sure I have things right. Jessemv (talk) 03:19, 22 October 2011 (UTC)

Grammar Issues

Not sure how much these are holding back the article, but I just pasted the entire article into Microsoft Word and ran a grammar check. I've taken the snippets of what it had issues with, and put asterisks (*) between the text that it highlighted. For snippets that have no asterisks, almost the entire thing was underlined in green. If someone could fix these I would much appreciate it. I have difficulty with such things even though many of the issues are my fault. Jessemv (talk) 06:14, 30 October 2011 (UTC)

Passive Voice:

  • misfolding, and related diseases that *would never be seen* experimentally
  • simulations of protein molecular dynamics have *been severely limited* by computational power.
  • This *was demonstrated* in 2004 by Folding@home
  • upon request, and some *can be accessed* from the
  • Abeta studies using Folding@home *could be used* as a starting point for a new Alzheimer's therapy
  • That paper *was called* the "tip of the iceberg"
  • Folding@home *is also being used* to study Aβ fragments of different sizes
  • Abeta peptides *are produced,* while the action
  • involving the common SH3 protein *are also being studied,* as it has implications
  • to study Huntington's *are also being used* for Alzheimer's research
  • This strategy *could be used* to bring the results from
  • If p53 *becomes mutated,* breaks down,
  • They *are needed* for these purposes by rapidly growing
  • several other *proteins which* have mutations tied to cancer
  • for the immune system, *have been used* as immunotherapy
  • The disease is caused by mutations in the Type-1 collagen protein, the most common form of collagen and found abundantly throughout the body.
  • Folding@home *is utilized* to find prime binding
  • The project *was officially concluded* on March 8, 2004,
  • which *will be used* for important scientific purposes
  • The work *that was started* by the
  • computing projects *are often driven* by a sense of collegiate
  • from a project *are benchmarked* on that machine
  • ojects, and *are rewarded* with additional
  • Passkeys *are generated* from a case-sensitive hash function
  • By default, each client *is configured* to donate under Team 0,
  • Teams *can be used* for troubleshooting or recruitment purposes,
  • These clients *are designed* to run FAH's calculations at an extremely low priority,
  • which *are most commonly found* in video games.
  • single WUs *are completed* much faster across
  • distributed computing software, as it *had previously been reserved* only for supercomputers
  • generation of the SMP client *was released* as an open beta
  • users who run these *are rewarded* with a 20% increase
  • FAHViewer *is modeled* after the PS3 viewer
  • protein animation *can be disabled* to further minimize this
  • that the client *is being asked* to process
  • Work Units *are normally processed* only once
  • the same core *can be used* by various versions of the client

Split infinitive:

  • Those with the disease are unable *to successfully make* functional connective
  • have the potential *to significantly lower* the development cost of new drugs
  • GPUs have the possibility *to significantly out-perform* CPUs

Sentence structure:

  • prepare to study more complex biomedical problems.

Verb use:

  • extra points if they use a passkey and maintained an 80% successful return of Work Units.[158]

Jessemv (talk) 06:14, 30 October 2011 (UTC)

I would take the flaggings from Microsoft Word's grammar checker with a handful of salt. Most of the uses of passive voice and all of the instances of split infinitives seem fine. Some resources: When to use passive and active voice, Are split infinitives grammatically incorrect, or are they valid constructs?. At a glance, the 'verb use' flagging seems legitimate -- there seems to be inconsistent tense in verbs. Emw (talk) 01:35, 31 October 2011 (UTC)
All right. Thanks for the interesting links. So I guess these issues are a low priority. I don't know, perhaps active voice just sounds better. I'll do some experimentation, but as long as the sentence isn't confusing I guess these aren't holding the article back too much. Thanks. Jessemv (talk) 02:08, 31 October 2011 (UTC)

F@H-on-BOINC

The article says that Folding@BOINC is currently under development. What is the official status of that venture? It seems to the very casual observer that progress has stagnated at the very least. Billy the Impaler 13:18, 18 July 2006 (UTC)

BOINC requires that the project be made open source to be able to use the BOINC system. The PandeGroup does not want to do that because of the [uncited] reasons in the article, so development has been stopped until further notice. 86.130.96.190 (talk) 15:28, 3 January 2008 (UTC)


Where did you read that BOINC mandates the use of OSS? Are all BOINC clients OSS?

194.204.35.117 (talk) 17:55, 3 January 2008 (UTC)

There was (some years ago) a report about one distributed folding application (don't remember which one; it wasn't Rosetta) which used the user's internet connection for purposes which had nothing to do with protein folding at all. It appears that this function was sort of hidden from the unsuspecting user. So yes, it makes sense that such a massive grid computing application should be verifiable. --80.134.52.222 (talk) 15:08, 3 December 2008 (UTC)

BOINC is a set of tools which enables the development of distributed computing projects and is useful for new distributed computing developments. Folding@home predates BOINC and had already developed its own set of tools (more or less in parallel with the development of SETI@home, which also predated BOINC). There is no significant advantage to migrating Folding@home from a custom set of tools to a general purpose set of tools, and there are two important disadvantages. First, (re-)development costs. Second, loss of performance. BOINC was originally developed for true parallel processing (e.g., SETI@home) where processing delays can generally be ignored, whereas the Folding@home infrastructure was developed with an explicit goal of minimizing delays, since its work assignments are a mixture of serial and parallel and turn-around delays are much more important.

BOINC and Folding@home both continue to develop the capabilities of their middleware. BOINC's goal continues to be a general purpose application; Folding@home continues to optimize for their specific needs. Their approach toward OSS also continues to be different. — Preceding unsigned comment added by 68.183.134.245 (talk) 16:09, 2 November 2011 (UTC)

Yes indeed. Very true. Jessemv (talk) 16:30, 2 November 2011 (UTC)
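To illustrate the point above about why turn-around delays matter more for serial work than for purely parallel work, here is a toy model. It is only a sketch under made-up assumptions: the function names and every number are hypothetical, and it does not describe how either BOINC or Folding@home actually schedules work.

```python
# Toy model: wall-clock time for independent (parallel) work units versus a
# dependent (serial) chain, where each unit costs compute time plus a
# turn-around delay. All names and numbers are hypothetical.


def wall_time_parallel(n_units, compute_hours, turnaround_hours, n_clients):
    """Independent units: clients work concurrently, so the delay is paid once per batch."""
    batches = -(-n_units // n_clients)  # ceiling division
    return batches * (compute_hours + turnaround_hours)


def wall_time_serial(n_units, compute_hours, turnaround_hours):
    """Dependent chain: each unit needs the previous result, so the delay is paid every step."""
    return n_units * (compute_hours + turnaround_hours)


if __name__ == "__main__":
    n_units, compute, clients = 1000, 10, 1000
    for delay in (0, 2):
        par = wall_time_parallel(n_units, compute, delay, clients)
        ser = wall_time_serial(n_units, compute, delay)
        print(f"delay={delay}h  parallel={par}h  serial chain={ser}h")
```

In this model a 2-hour delay adds 2 hours to the parallel run but 2,000 hours to the 1,000-step chain, which is the asymmetry the comment describes.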

Graphical processing units

"However, it should be noted that this exaggerates the performance increase of the GPU client over the CPU client: the CPUs that contribute to Folding@Home vary widely from new to old, high performance to low, whereas the GPU client runs on only the very latest GPUs from ATI Technologies. A comparison to the latest Opteron processor showed more modest gains, both in terms of performance in points and points-per-watt.[1]"

This statement is just wrong because a TFLOPS count and WU points are not comparable. WU points are tied to WU completion time, not to system performance running the WU.


—Preceding unsigned comment added by 194.204.35.117 (talkcontribs) 27 November 2006

  • To my reading, this section seems to include reference to the same recent public beta test twice. Could someone familiar with the beta test clarify/consolidate this please?

Stanford has recently cited further advances with the high performance client and stated they will be releasing a public, beta trial at the end of September 2006. ... As of October 2, 2006, the FAH GPU client has been released into a public beta test.

Keesiewonder 01:26, 13 December 2006 (UTC)

  • Let me be very specific. A CPU is a general purpose computing device. A GPU is a special purpose computing device designed to deal with data (usually visual) in three dimensions. Adding a graphics card to any system will allow GPU hardware to take over graphics tasks which would normally be emulated by CPU software. Adding a graphics card to any system will almost always result in a speedup depending upon the software. What the Folding@home people have done is to use the GPU to analyze protein (in three dimensions) rather than render a 3-d display. This is why the PS3-Client produces better results than the ATI-Client which produces better results than any CPU-client. --Neilrieck 11:28, 23 October 2007 (UTC)
    • The Pande Group objects to the use of the word "better" when pertaining to results, because it is not true. The only major difference between the three systems (CPU, GPU, PS3) is the speed... for accuracy it is single or double precision which is more important, and on that front, the CPU DGromacs (and its variants) and SMP cores, which use Double Precision, win out over the single precision GPU and PS3 for absolute accuracy and therefore theoretical quality of results. However, single precision seems to be accurate enough for most purposes, hence its usage on most of the other cores. Johnnaylor (talk) 18:47, 30 April 2008 (UTC)

I don't believe that the SMP cores are exclusively double precision. The bulk of the SMP computations are single precision for the same reasons you've given, although key values may be computed in double precision. DGromacs was based on a compiler-based upgrade from SP to DP and used DP exclusively, but it was measurably slower for the same work. It has been deprecated by the incorporation of a mixed approach based on selected key values, providing both optimized speed and optimized accuracy as needed. — Preceding unsigned comment added by 68.183.134.245 (talk) 16:19, 2 November 2011 (UTC)
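The single versus double precision trade-off discussed in this thread can be seen in a small, generic experiment. This is a minimal sketch and not Folding@home or GROMACS code: it adds 0.1 repeatedly using a single-precision accumulator, a double-precision accumulator, and a "mixed" variant that stores the step in single precision but accumulates in double. The helper names are hypothetical, and single precision is emulated by rounding through Python's `struct` module.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, n, single_accumulator):
    """Add `step` to a running total n times, optionally rounding the total to single precision."""
    total = 0.0
    for _ in range(n):
        total += step
        if single_accumulator:
            total = to_float32(total)
    return total


N = 100_000
EXACT = 10_000.0  # N * 0.1 in exact arithmetic

err_single = abs(accumulate(to_float32(0.1), N, True) - EXACT)   # all single precision
err_mixed = abs(accumulate(to_float32(0.1), N, False) - EXACT)   # single step, double total
err_double = abs(accumulate(0.1, N, False) - EXACT)              # all double precision

if __name__ == "__main__":
    print(f"single: {err_single:.3e}")
    print(f"mixed:  {err_mixed:.3e}")
    print(f"double: {err_double:.3e}")
```

The all-single run drifts the most, the all-double run the least, and the mixed run sits in between, which is loosely the idea behind computing only selected key values in double precision.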

Suggestions for improvement

Some general suggestions for getting this article ready for the Good Article nomination process:

  • Heavily prune the number of citations in the lead. The lead should contain high-level information that is covered in more detail in the body of the article. It's typically better to reserve citations for those more detailed sentences in the body. A lead with few citations also allows reading to flow better, since there's less visual distraction with dense bracketed number citations.
Fixed. Jessemv (talk) 05:37, 5 October 2011 (UTC)
Fixed. Jessemv (talk) 15:29, 8 October 2011 (UTC)
  • Convert the bullet-point list format used in 'Software' to regular paragraph-based prose.
I believe this is now fixed. Hopefully it's now very close to what you wanted. Jessemv (talk) 19:43, 8 October 2011 (UTC)
  • Consolidate the list of publications in the 'Results' section to ten or fewer key publications. The current list of 95 publications is hefty article bloat. Perhaps link to the full list of Folding@home publications in the 'External links' section.
You're right, and I'm making the changes. The list is currently commented out, as I am working on summarizing each paper, which will take me a while. Then I will organize it and whatnot. I will be looking at Rosetta@home for guidelines on how it should end up. Jessemv (talk) 19:33, 8 October 2011 (UTC)
Please look again. I have some good summary-style stuff down now, although I'm not done. You should be able to see what I'm doing and where I'm going here. Jessemv (talk) 04:00, 10 October 2011 (UTC)
  • Remove line breaks in citations in the wiki-markup source. This is more of a maintenance issue, but the current in-source citation style makes it a nightmare to edit articles through the standard Mediawiki editing interface. Compare the way citations are used in the source of the lead in this article to how they're used in a high-quality article like, say, Virus.
Finished fixing just now. Jessemv (talk) 04:41, 5 October 2011 (UTC)

If those items above were addressed, I could possibly do a more detailed assessment. The article needs significant work before it would have a decent shot at passing GA, but I think it can get there. Emw (talk) 18:14, 2 October 2011 (UTC)

Thank you very much for the tips. I've read them over and I'll get to work as soon as I have time. I intend to turn the Results section into summary style, perhaps summarizing each paper if I can, or at least the ones that are the most notable. Being Wikipedia, I get suspicious if some statement in an article isn't covered in a citation, so I didn't want anyone to feel that way for the lead. But you're right, I'll see what I can do. On the last one, you want a citation to just be in one contiguous thing instead of line-breaked? Like "text text text,< ref >blah blah blah< / ref > text text text" kind of a thing? I was breaking it all up like that because it seemed to me that when you edit it, it's easy to tell article content apart from a citation if there are line breaks. Can I remove the line breaks from inside the citation, but still put the following text on a new line? That might strike a good balance. And I'll get back to you when I finish the changes. Thanks for reading it over. Jessemv (talk) 23:15, 3 October 2011 (UTC)
Regarding markup style for citations, IMO it would be best to closely follow the convention among featured articles in the sciences. Take a look at the source of Caffeine, ROT13, DNA, etc. Neither the line-break style nor conventional style is particularly elegant, but going with convention will decrease frustration among experienced editors who will inevitably want to improve the article as it's assessed during GA and beyond.
I just looked more at the article's 'References' section, and I'm concerned about the heavy reliance on forum posts by seemingly non-authoritative people. Unless a forum post is made by a project scientist (e.g., Vijay Pande), I don't think it would pass muster as a reliable source. The sparseness of citations to academic journal articles is also concerning. If you don't have access to academic journals, then you might be able to get it by applying for one of the Wikipedia:Credo_accounts. Emw (talk) 02:14, 4 October 2011 (UTC)
All right. I will take a look at those pages and adjust all of the citations accordingly. Thank you for taking the time to point out examples like that; I appreciate it. Ah I didn't think of editors during and after the GA process. It just was clear to me. Will be fixed.
Regarding the forum posts, I thought your answer in the GA review for Rosetta@home is very appropriate: "While forums are typically unacceptable sources, I've decided to include posts from Rosetta@home scientists and moderators because they offer reliable information that doesn't seem to be available elsewhere" This is the case for F@h. I understand your position. I made sure that I cited posts only by people who have proven to be reliable. For example, 7im (his name can be found in a few places on the F@h website) is a very experienced editor who has been with F@h since its early years. From what I have seen, he continues to provide solid, reliable answers that are not questioned or changed by site moderators. It looks like he joined in 2007 but that was when the forum was actually made. I will try to find better sources, but the "seemingly non-authoritative" people actually do know what they are talking about. The reason it relies so heavily on them at the moment is because I'll read something on the forum that sounds like something that fits well into the article, and I'll go add it. The posts from Dr. Pande's blog are all there from me doing a "[whatever I'm searching for] site:folding.typepad.com" Google search. I was not aware that academic journals had to be cited, and since I'm at Utah State University I do have access to those things. They are just a little harder to search for relevant statements IMO, so I didn't bother. Perhaps I'll look in the F@h publications and use those if I can. See I'd much rather have Dr. Pande do some amazing science than post on the forums all the time. I have cited a few of his posts, but mostly it's from other high-ranking users, but they aren't marked as such which can skew first impressions.
It seems this article needs a lot more work than I thought. I very much appreciate all the suggestions. If it's not nearly ready to be a GA, at least it's a lot better than it was back in early August. I just need to be more familiar with Wikipedia policy. Thanks for helping me get a bit up to speed. Jessemv (talk) 03:40, 4 October 2011 (UTC)

More suggestions

The article is better now. Compare the versions before and after the suggestions above were implemented. (Note that the 'after' version also contains substantial work in addition to the suggestions.) Those suggestions addressed some basic layout and styling issues. Now for some more suggestions:

  • Condense and hone the article. Proportional to its scope and scientific nature, this article seems over-sized. The article's readable prose is about 8800 words. I suspect this puts Folding@home in the upper 1% of science articles in terms of size of readable prose. High quality science articles in that cohort tend to be top-importance articles, e.g. DNA or Evolution, that are root-level topics with more than a half century of broad, intense academic research and a vast tree of sub-topics beneath them. This article is roughly mid-importance in a niche category (computational biology). While the extra content makes it very comprehensive, I think the balance between comprehensiveness and summary style in the article's current state is too lopsided to the former. This point relates to WP:FACR 4 and WP:GACR 3b: that the article "...stays focused on the topic without going into unnecessary detail (see summary style)."
Like you said, this is a pretty general request. While it does seem to contain a lot of material, the issue should be better now, since I've solved most, if not all of the suggestions below. I don't know how you got those numbers, but I just feel that as long as the text stays on topic that the information should be included. Perhaps there are a few places where the focus drifts off or something. You may be aware of this already, but the article's byte count also includes citations and behind-the-scenes formatting, so it may not be a good measure of the size of the article. If you could please point out some more specific things that you think should go away I'll try to deal with those. The article was just pretty tiny when I started, and I've invested a lot of time and effort adding things to it since then. Perhaps I added too much, but it's a bit difficult for me to let things go unless necessary. I also feel that Folding@home is a very powerful and important tool for computational biology, and I just wanted to explain that and how it works. There's a lot of people out there who want to know more about the project, so I want the article to explain pretty much everything about it in normal language to them so that I don't have to. :) I guess I'm just an inclusionist. Jessemv (talk) 23:26, 26 October 2011 (UTC)
I have done some additional honing in various areas of the article, so things should be much better. However, in the process, I also added in some details to better connect ideas. I believe I'm done adding material to the page, so it's pretty much refinement from now on. Also, it should be noted that the byte count of this article seriously overestimates its size, as there are tons of handy commented-out journal citations near the end of the article. Jessemv (talk) 01:42, 30 October 2011 (UTC)
Specific suggestions toward this end:
  • I think the 'Results' section should go away. The content delves into lots of unnecessary detail, and is very redundant with content included above that section (which is also presented much better). Given that, merge any select pieces of the subsections of 'Results' into similar top-level sections in the body. The 'Disease research' subsection seems like it could relatively cleanly merge into the top-level 'Biomedical significance' section. The same applies for 'Protein folding theory'. The 'Scientific computing' subsection (which probably makes more sense to call 'Distributed computing') belongs in a section that is now nebulously comprised of the current top-level sections 'Software' and 'High performance platforms'.
Done, but I couldn't figure out where to put the comparison to Anton so I just changed the section to "Comparison to Other Molecular Systems" pending further suggestions or ideas. That was a huge change for the article, but I'm glad it's done and it looks much better. Jessemv (talk) 05:34, 26 October 2011 (UTC)
  • During that merge of the 'Results' section, remove explicit references to academic papers themselves. Including the name of a paper in the body of an article is unnecessary detail for virtually all readers. Paper names should almost always be reserved for a citation. Instead, summarize some of the remaining papers' findings, and refer to the papers along the lines of the following: 'in (year), Folding@home was used to...'. An important aspect of this merge is triage: select which papers are most important to devote findings-summaries to, and integrate mention of other papers into phrases, or perhaps transitional sentences.
I have removed publication titles, but have yet to accomplish the other items. Jessemv (talk) 23:10, 23 October 2011 (UTC)
I believe I have fixed this now, but perhaps there are still notable papers left to use, we'll see. Jessemv (talk) 05:34, 26 October 2011 (UTC)
  • While consolidating the 'Software' and 'High performance platforms' sections, I would suggest renaming 'High performance platforms' to something more straightforward and general, like 'Computing platforms'. 'High performance platforms' sounds like marketing-speak, and is susceptible to seem dated in a few years. With a name like 'Computing platforms', the humble single-core CPU platform could also be appropriately included.
Consolidation complete. I still refer to them as "high performance clients" a couple times in the text, and most of the information resides under Software=>Client although a small amount spilled over into "Points" or "Work Units". I tried to treat all the clients the same, but I also left some original text highlighting the importance of those high-performance clients. In the process, I removed some extra generally-unnecessary details about them. Jessemv (talk) 06:03, 24 October 2011 (UTC)
The F@h logo is now taken care of I believe. I just don't know how to fix the others, but I'd be happy to do it if I knew what to do. The F@h logo no longer has prose-based summaries; it now uses a summary template with areas for all the information. However, I don't know where that came from, and it isn't appropriate for the other screenshots. So if you can set me up with one, or show me where I can get one, I'll follow through. Dr. Pande's permission was given on this thread, so I'll be including that link in the summary template. I really don't understand the complexity of the licences, and I would appreciate any tips. Thanks. Jessemv (talk) 01:00, 31 October 2011 (UTC)
  • Use templates to structure data in the 'Summary' section of image files, instead of simply prose.
Fixed for the F@h logo, not yet for the other images. I will finish that off when I get summary templates in there. Jessemv (talk) 01:00, 31 October 2011 (UTC)

Once these concerns are addressed and a few minor touch-ups done, I think it'd be reasonable to nominate this as a good article. Emw (talk) 22:08, 23 October 2011 (UTC)

Thanks very much. Those are some interesting suggestions, and thank you for providing them and links to Wikipedia's guidelines/policies. I'll set to work on these issues. However, the Results are important, and there are just so many major-impact papers that have come out of the project that it's a challenge to select one over the other. But I'll do my best to follow through with what you say. Perhaps I can find a way to include as many of them as I can, what do you think? As for the "High Performance Platforms" I never thought of it as marketing-speak but I guess that makes sense now that I think about it. It's called that on the F@h website anyway, but you're right that it ought to be changed to something more inclusive. Anyway, I'll continue my work and try to address all those suggestions. Thanks again, Jessemv (talk) 23:10, 23 October 2011 (UTC)
I should have said more regarding 'Results'. Here are more specific comments:
In the section lead:
  • 'To date' should give a specific date, since it will soon be out-of-date. Be wary of phrases using 'currently' throughout the article, too -- these are likely to be wrong some time in the future.
Fixed. Jessemv (talk) 00:44, 26 October 2011 (UTC)
  • Since the article is about Folding@home and not the Pande group, I would suggest removing the note about the group's 193 papers, but keep the count of paper directly related to Folding@home.
Fixed. Jessemv (talk) 00:44, 26 October 2011 (UTC)
  • The second and third paragraphs seem fluffy:
  • The Pande Group is a nonprofit institution dedicated to science research and education.
I don't think the Pande group meets the conventional understanding of the terms "science education non-profit" or "institution". It's a research lab at Stanford that studies protein folding dynamics with the aid of distributed computing.
Definition of "institute": "An association organized to promote art, science or education". Definition of "institution": "An organization founded and united for a specific purpose". The source, which they wrote, uses these words. I don't think I should change it. Jessemv (talk) 00:44, 26 October 2011 (UTC)
  • They do not sell the results or make any money off of it, and in fact make the data available for others to use....
I wasn't able to find anything to support the assertions made in this and the proceeding two sentences in the citation (the page linked to by http://folding.stanford.edu/). If the papers are available, link to the list of papers. If the raw data is available, link to the data.
Fixed. Jessemv (talk) 00:44, 26 October 2011 (UTC)
The second and third sentences in this paragraph now read: The results and data from the project are made freely available for others to use. Moreover, all scientific journals resulting from the project are posted on the Folding@home website after publication. However, looking at http://folding.stanford.edu/English/Papers, it seems only abstracts and summaries are available for journal articles. That is, there is nothing especially "freely available" about those results -- they still require expensive subscriptions to closed-access journals. Also, I notice at http://folding.stanford.edu/English/FAQ-main#ntoc4 that the group says "Next, after publication of these scientific articles that analyze the data, the raw data of the folding runs will be available for everyone, including other researchers, here on this web site." I wasn't able to find the data while super-quickly glancing through the site. If it exists on the site, could you link to the actual raw data? Emw (talk) 11:57, 25 October 2011 (UTC)
Hmm. I can access pretty much all the journals fine. While I am at Utah State University so my IP Address may have some influence there, I was pretty sure that the journals really were open otherwise. A few of the journals decide to keep the publication for a year before releasing it, but that's pretty rare I thought. What ones didn't work for you? I'll also be talking to the appropriate people to get that raw-data statement confirmed or changed, since I haven't been able to find it either, other than the data from the 1.5-millisecond simulation. I'm pretty sure it does exist, because I know that they are very open about their results, but it'd be good to find it. I was just going off of what the website said, but perhaps the website's statements are obsolete or something. I'll check up on that. Jessemv (talk) 14:55, 25 October 2011 (UTC)
Well it looks like Vijay Pande stepped in and responded to my post, which is far more than I expected. I'll be using it as a citation a bit later, but it pretty much answers your questions so here's the link. Jessemv (talk) 15:56, 25 October 2011 (UTC)
Even with today's changes, I still think the second paragraph is problematic.
  • The Pande Group is a nonprofit institution dedicated to scientific research and education, and do not make any money off of any of Folding@home's results.
The word-choice issues I noted in my previous comment about this sentence remain. First, considering the usage of the term "non-profit": Stanford University is the 501(c)(3) non-profit entity (see http://folding.stanford.edu/English/Donate), of which the Pande group is a part. So one could perhaps say "The Pande group is a part of a non-profit institution, Stanford University", but I think saying "the Pande group is a non-profit institution" is incorrect. Further, using the word "institution" is both unconventional and somewhat misleading. The conventional phrasing is "non-profit organization" or, more formally, "non-profit entity". Saying the Pande group is an "institution" brings to mind the Stanford Research Institute and the Institute for Advanced Study, which the Pande group is not comparable to. I think it is most accurate to say that the Pande group is an academic research laboratory. It is headed by a principal investigator employed as a faculty member at a sponsoring university, staffed mostly by graduate students and post-doctoral researchers sponsored by the university and academic fellowships, and part of a typical department of the university.
7im pointed out to me that "This is not generally assumed. Many projects sell their results." I'd have to agree with this. If you read that Donate page it talks about those donations going to things to continue their research. They're not just walking away with it. Yes they are paid by the university, but then even non-profit people have to pay for groceries somehow. Anyway, with the change 7im made, and the definition of "institution" above, this should be fixed now. Jessemv (talk) 21:32, 26 October 2011 (UTC)
This edit by 7im improves the accuracy and precision of what I think was meant. It addresses most of the issues I had with that particular sentence. Part of the wording introduces a new problem:
"The full publications are available online or from a local municipal or academic library."
This is often incorrect. Most municipal libraries do not have subscriptions to scientific journals, and the few that do tend to have a very limited set. "Academic libraries" is vague; most high schools are in the same boat as municipal libraries. There is even a large distribution in the range of subscriptions available among colleges and universities. Most liberal arts colleges tend to not subscribe to a lot of scientific journals, and even many bigger universities have surprisingly small collections. Emw (talk) 16:51, 27 October 2011 (UTC)
All right. Is there a better word choice then? But we are discussing journals that are a year or less old, and even then many of those are publicly available anyway, so it seems to me that it's a really small matter. To detail it like that might be unnecessary detail. I'm assuming that you're not at a university or any place like that, so unlike me you have the ability to check to see what journals you can't access. I like the phrasing now, and it seems correct. You haven't yet specified examples of what number of journals are involved in this issue, but since all publications become free after a year, it is AT MOST 1/11 = 9% of all the publications, and over time this value will continue to decrease further. The statement should stand. Jessemv (talk) 17:07, 27 October 2011 (UTC)
Apologies, I should have been more diligent about trying to access journal articles over a year old. I can access almost all of those articles while not on a university network. However, this doesn't imply that municipal or academic libraries would carry print versions of the journal articles. I assume that is the suggestion made by including these places in the sentence. In that case the points in my previous post would still apply. Also, having tried to access articles less than a year old while not on a university network, it seems that the overwhelming majority of those articles are not publicly available. (Only those published in PLoS are publicly available; articles published in other journals are not.) Considering these things, I think it would make sense to rephrase this sentence to "Articles published over a year ago are freely available online." Emw (talk) 16:33, 30 October 2011 (UTC)
Not a problem. I appreciate you taking the time to check the list. I believe the sentence (and the source from Dr. Pande) were referring to libraries having online access, rather than print. I've noticed that there are many different hosts of the same publication. While again I'm not in the position to check this, perhaps if you Googled the title of some of those otherwise inaccessible publications you might find a site where there is a free copy. The Results section of the website links to but one. So that's something to consider. As I believe the source points out, those who are going to benefit from (and understand!) the publications are most likely going to be in areas where they have access to them, while even the general public may not. May I suggest "The full publications are freely available online from local municipal or academic libraries, and become fully publicly available after a year's time" or something along those lines. Jessemv (talk) 18:24, 30 October 2011 (UTC)
The statement "(the Pande group) do not make any money off of any of Folding@home's results" deserves some consideration. I think it is fair to say that the group makes money off of Folding@home's results, just not in the direct, commercialized fashion the way pharmaceutical companies and other for-profit entities do. For example, it would be very unusual if members of the Pande group were directly paid per purchase of their published journal articles, or for the group to sell their raw data to a pharmaceutical company. Their funding derives (i.e., they make money) primarily through grants, mostly from the US federal government. Grants are competitive, and prior results (in the form of highly-cited journal publications) are a big factor in determining who gets money and who does not. So in this way -- again, not like the familiar revenue models of for-profit entities -- the Pande group does make money off of Folding@home's results. Emw (talk) 05:13, 26 October 2011 (UTC)
Again, 7im's selling of the results point. When I hear "profit" I think of cash in the pocket, not striving for monetary grants which again are just turned around for further research. This is phrased much better now. Jessemv (talk) 21:32, 26 October 2011 (UTC)
  • "All of data sets generated from the project are freely available for others to use upon request, and some can be accessed from the Folding@home website"
The fact that a scientific research group makes its data sets available upon request is generally assumed, and unremarkable. I think this is another "implied fact" typically not noted in high-quality WP science articles (like the fact that Folding@home's results have been published in top journals). However the freely and readily available raw Folding@home data on the villin headpiece is remarkable in that it is unusually transparent (a good thing), and probably warrants specific mentioning. I think it would make most sense to integrate the note about the readily available villin headpiece data set into the coverage of villin in 'Protein Folding Theory'. Emw (talk) 05:13, 26 October 2011 (UTC)
Hmm. Well I don't know if it is assumed. The Pande Group specifically put several sections in their FAQ about how they deal with the results, so they feel that is indeed remarkable. While I did get rid of the top journals statement because that definitely was assumed and unnecessary, this one does deserve mentioning. I'll see about how to get a statement about villin in there without it sounding like some "unnecessary detail" that article policies advise against. Jessemv (talk) 21:32, 26 October 2011 (UTC)
  • "Additionally, all of the scientific publications resulting from Folding@home are posted on the Folding@home website after publication."
Again, one would assume this. Pretty much all research labs list their lab's publications on their website. Similar to my reasoning with other implied facts, I think this sentence should be removed. An external link to the publication list on the lab's website should suffice. (Furthermore, links to the publications are posted on the Folding@home website; the publications themselves are still restricted to subscribers.) Emw (talk) 05:13, 26 October 2011 (UTC)
I'll be leaving this for now, as it's tied into 7im's and the citation's statements about the accessibility of these publications, which is itself noteworthy. I'm not sure that everyone would have the know-how to make these assumptions. A link to the publications is already in External Links. Jessemv (talk) 21:32, 26 October 2011 (UTC)
  • "While many of them are free to begin with, due to policies from the NIH some of these papers are free only to scientists and universities, but after a year these papers also become freely available to the general public."
There seem to be several misconceptions here. While "many" publications are free (e.g. those published in PLoS journals), the vast majority of journal papers are not free to begin with; they are restricted to subscribers who indirectly pay at the institutional level for access. I haven't looked to see if the Pande lab publishes most of its articles in open-access journals, but I wouldn't imagine they do (subscription-based journals like Science and Nature are considered more prestigious). The second sentence is misleading -- it makes it sound like the NIH is the one restricting access to research articles for the first year after publication, when it is in fact the journal publishers that restrict access. Read this article for more information on the NIH's (relatively new) public access policy. Even if we put this aside for a moment, recently published journal articles are not free to scientists and universities. The subscriptions that grant access to these otherwise-restricted articles are a significant expense at the institutional level for universities -- they often cost on the order of $10,000 per journal per year (see here). These articles might appear free when accessed online from within the university's network because of seamless ease-of-access, but this is due to special network-level configuration of subscription rights. Emw (talk) 05:13, 26 October 2011 (UTC)
I agree with most of what you say Emw, except that you have more knowledge about this topic that is not common to most wiki readers. Not everything is as implied as you imply. ;) In any case, I streamlined this last paragraph based on your comments and hope that most of it meets your approval. I'll ask Jesse to clean up any broken or missing references, as I'm just a hack, not a wikipedian.7im (talk) 19:23, 26 October 2011 (UTC)
The statement holds true. If indeed you look at the list, and Monte Carlo test it by clicking publications at random, I'm sure you'll find that most of them are free. I thank you for that article, that is indeed most enlightening. 7im has enhanced this statement, and while it still holds true to the citation, he explained that "[anyone who cares] about the paper's content is already a researcher, and already has a free subscription at their own college. Nobody that needs the info is blocked from getting it." Jessemv (talk) 21:32, 26 October 2011 (UTC)
Given all this talk about NIH's public access policy, I think it would be awesome for our readers to put links to the open access copies of all the journal articles over 1 year old that were funded by the NIH. (I imagine this would apply to most of the referenced publications.) This isn't imperative, but probably something worth looking into in the future. Emw (talk) 05:13, 26 October 2011 (UTC)
I don't see this as an issue. With the trimming of the NIH statement, this no longer really applies. Plus all of the citations should already be accessible. Somehow Wikipedia generates several external links to various places that hold the article. Please let me know if there's a problem here. Jessemv (talk) 21:32, 26 October 2011 (UTC)
  • Folding@home has produced papers which have been published in top journals such as Science, Nature, PNAS, Nature Structural Biology, and the Journal of Molecular Biology.
One would assume this. I don't think other high-quality science articles note this sort of implied fact. (Also, Folding@home hasn't produced papers. Journal articles have been written about it, and on subjects it has facilitated the research of.)
Fixed. Jessemv (talk) 00:44, 26 October 2011 (UTC)
  • The Pande Group has noted that it can take quite a while (often as much as a year) to go from a result to a published article.
This is also not really worth mentioning. It is par for the course in science.
Fixed. Jessemv (talk) 00:44, 26 October 2011 (UTC)
  • 'Awards' seems somewhat out-of-focus for this article. Pande's awards would probably be more appropriate to reserve for his article. Bowman's award seems just above the borderline for notability to include in Wikipedia, but I think the award would only make sense to mention if his specific research could be more tightly integrated into the article's content. The Guinness record is already mentioned in 'Participation'. The subject of the final paragraph in this subsection -- miscellaneous fellowships and awards for other researchers in the Pande group -- seems like a clear example of the type of unnecessary detail that WP:GACR 3b discourages.
You're right. Fixed. Jessemv (talk) 00:44, 26 October 2011 (UTC)
  • 'Disease research' revisits how Folding@home has been used in research on Alzheimer's, Huntington's, and cancer -- each of which has a sub-section devoted to it in the 'Biomedical significance' section. I think any material in 'Disease research' that isn't closely redundant with previous sections should be merged with those previous sections. Remaining redundant stuff here should be removed.
With the scattering out of the Results section I just did, this is fixed. Jessemv (talk) 05:34, 26 October 2011 (UTC)
Hope this helps! Emw (talk) 00:50, 25 October 2011 (UTC)

Update

I think great progress has been made on the overarching task in my previous set of suggestions -- that is, to "condense and hone" the article. When I made those suggestions the article was 8800 words of readable prose as counted by this tool, which includes in its count only the readable text in an article and not the references. The same tool measures the readable prose size of the article in its current state at about 6800 words -- 23% less than its previous 8800-word state. More than simply reducing word count, I think the cuts removed lots of unnecessary detail and redundancy. Although in terms of size of readable prose the article still seems on the large end among high-quality science articles of similar importance, I no longer consider the size of this article to be a blocking issue in itself.

With most of the necessary sweeping changes hopefully finished, I'd like to review the article from top to bottom in more detail. As is convention in reviews, I will strike through the bulleted issues below as I consider them resolved.

Lead

The lead is overall in pretty good shape. It's succinct and has good very high-level coverage of the main points of the article. I think the lead sentence is especially well-written. Some notes:

Thanks. User:7im offered some great suggestions regarding that lead and particularly the first sentence. I too really like the lead. Jessemv (talk) 01:34, 6 November 2011 (UTC)
  • The lead refers to 'Dr. Vijay Pande'; and he is most frequently referred to as 'Dr. Pande' throughout the article. Wikipedia articles typically do not prefix the names of scientists (or even medical doctors) with 'Dr.' even though they possess PhDs or MDs, or sometimes both. See, for example, how DNA refers to Watson and Crick: "As first discovered by James D. Watson and Francis Crick, the structure of DNA...", or how RNA interference refers to Fire and Mello: "In 2006, Andrew Fire and Craig C. Mello shared the Nobel Prize in Physiology or Medicine for their work on RNA interference...". I suggest removing all instances of the prefix "Dr." from the article, and leaving simply the last name (e.g. Dr. Pande -> Pande). Emw (talk) 05:24, 4 November 2011 (UTC)
Thanks for the examples. I just wanted to fully illustrate my respect, but you are quite right. Fixed. Jessemv (talk) 01:34, 6 November 2011 (UTC)
  • "Folding@home is run by the Pande Group, a non-profit organization within Stanford University's chemistry department, under the supervision of Dr. Vijay Pande."
As mentioned above, I think referring to the Pande group itself as an NPO fails to accurately convey the research entity's purpose and misapplies the conventional understanding of the term NPO. I think the misconception derives from the fact that the entity behind Folding@home -- which has all of the characteristics of a research lab -- is referred to as the 'Pande Group' instead of the 'Pande lab'. At superficial layers, the Folding@home site sometimes refers to the entity as the 'Pande Group' -- however, in deeper content they refer to themselves as 'Pande lab'. For example, the site's left navigation menu has a link titled 'Pande Group', but the link goes to the researchers' main site which refers to itself as the 'Pande lab'. The researchers also style themselves as the 'Pande lab' when referring to themselves in the context of academics they coordinate with on Folding@home, as seen at http://folding.stanford.edu/English/About. I think all instances of 'Pande Group' should be replaced with 'Pande lab' (note lower-case 'lab', as is used in http://folding.stanford.edu/English/About). I think something more accurate and terse would be better, like "Folding@home is developed and administered by the Pande lab at Stanford University, under the direction of Vijay Pande." Emw (talk) 05:24, 4 November 2011 (UTC)
They label themselves as a "non-profit institution". I believe we have discussed "institution" and decided that "organization" was better, albeit more generic. Nonprofit organization says "generally refers to an organization that uses surplus revenues to achieve its goals, rather than distributing them as profit or dividends" which is exactly what they do. However, you are correct that "Pande lab" is more accurate. I've made the replacement you suggested and hopefully it doesn't mess up readability too much. I don't understand exactly why NPO is such an inaccurate term so I left that there for now. I did however use "operated" instead of "administered" since "administered" sounds like someone is handing out something, which although technically this is true (Work Units) "operated" just seems a better word, no? Jessemv (talk) 01:34, 6 November 2011 (UTC)
My main concerns here are that 1) referring to a university research lab as a non-profit (or NPO, or non-profit institution) is a very unconventional use of the term "non-profit", and relatedly 2) Stanford University, not the Pande lab, is the 501(c)(3) non-profit entity. Almost all private US universities are non-profit entities. Research labs within those universities are virtually never referred to as non-profits/NPO's/non-profit institutions. This would be like calling labs within MIT's CSAIL or Caltech's JPL (or X research lab within Y university) "non-profits", which would be very much outside the norm. Further, unless the Pande lab is itself incorporated and registered as a non-profit entity (which is indicated against by http://folding.stanford.edu/English/Donate), I think calling the Pande lab a non-profit/NPO/non-profit institution would also be technically incorrect. It seems like the statement at http://folding.stanford.edu/English/FAQ-main#ntoc4 was probably written with the intent of communicating in an easy, non-technical manner the fact that Folding@home is not a shell group for a pharmaceutical company. But I think most experienced editors would agree that referring to the lab as a "non-profit", even though it is labeled as such in that FAQ, is non-idiomatic and not sufficiently precise and thus inappropriate to do in a high-quality article here. With that said, though, I think it would be alright to say something like "Being part of Stanford University, a non-profit organization, the Pande lab does not sell the data or results generated by Folding@home". I think this does a fair job navigating between something that is precise and human-readable. Emw (talk) 19:37, 6 November 2011 (UTC)
That makes sense. I've removed "non-profit" from the lead, and added your phrase in. I did have to modify it a tiny bit for better grammar and to not say the same word too many times in adjacent sentences. But it is an excellent way of saying it. Thank you very much. Jessemv (talk) 20:13, 6 November 2011 (UTC)
Looks good! Emw (talk) 20:27, 6 November 2011 (UTC)
  • "In addition to producing ninety-five scientific research papers, more than all other major distributed computing projects combined, Folding@home..."
This assertion is made at http://folding.stanford.edu/English/FAQ-main#ntoc7, but I think it probably lacks credibility. Rosetta@home published about 20 scientific papers in 2010 (see http://boinc.bakerlab.org/rah_publications.php). Assuming conservatively that about 10 of these were derived from research from Rosetta@home, and that that project has kept at about that publishing rate since its formal launch in October 2005, that gives about 60 papers. Surely those 60 papers, combined with all the papers deriving from other major distributed computing projects (SETI@home, Einstein@home, World Community Grid, ClimatePrediction.net, etc.), substantially exceed the 95 papers published so far based on research from Folding@home. Emw (talk) 05:24, 4 November 2011 (UTC)
The keyword is "major". There was an IP that made this edit, which removed that statement, saying that all of BOINC has produced 104 publications. I have been searching around trying to find the 104 number, but if you go to this BOINC wiki and total up all the papers you get 104, but it's a wiki so there goes that ethos. Now, 104 is clearly greater than 95, but some of these are from relatively tiny or inactive projects. Is there a more reliable list that gives a total count of BOINC publications? Jessemv (talk) 01:45, 6 November 2011 (UTC)
I don't know if there's an authoritative centralized list of scientific publications from BOINC projects, but most individual projects' websites maintain an authoritative list of their own publications. I would consider any distributed computing project to be "major" if it has over 10,000 active volunteer members or currently sustains over 250 x86 TFLOPS (as reported by http://boincstats.com/). Here are some such projects, and the number of publications they list as deriving from their research:
The above counts sum up to 173 papers. There are over 60 other active distributed computing projects on BOINC, but these don't meet the criteria for being "major" as defined above. Now, I don't think there is any standard definition of what qualifies a distributed computing project as "major", and each of the individual projects listed above is several orders of magnitude smaller than Folding@home in terms of active users and x86 TFLOPS, but I think the criteria I've used here are a reasonable way to distinguish among all distributed computing projects those ones that have major interest among the general public and/or exceptionally high computational power.
In light of that, an accurate version of the statement in question might be: "In addition to producing ninety-five scientific research papers, more than any other individual distributed computing project, Folding@home...". However, this would need to be reliably sourced. Emw (talk) 16:06, 6 November 2011 (UTC)
Wow, thank you for all that research. I made a post on foldingforum.org, and Vijay Pande confirmed that statement to no longer be accurate. He has removed it from the website, which I then confirmed and it is gone. Accordingly, I have entirely removed the clause from the lead. I have also reworded the statement, as it has been pointed out that Folding@home does not produce papers, it generates data which the Pande lab produces papers from. :) Jessemv (talk) 01:39, 7 November 2011 (UTC)
  • "Folding@home has caused paradigm shifts in protein folding theory."
The term 'paradigm shift' has a well-known tendency for being used as marketing speak. As such using the term in the introduction raises a red flag to a fair proportion of readers that the content they will be reading is overly promotional of its subject. To allay this concern I suggest covering precisely what these paradigm shifts actually were in the body of the article, with specific reference to the 2010 Kuhn Paradigm Shift Award given to Bowman. (I think it's fine to mention the paradigm shifts in the introduction, too.) The summary of the award at http://folding.stanford.edu/English/Awards is vague, and seems borderline promotional and thus somewhat unreliable. The summary from the NIH Simbios site seems more detailed, less invested and more reliable: http://simbios.stanford.edu/news.htm#item44. A citation to an actual summary from the ACS on the 2010 Kuhn Paradigm Shift Award would be ideal, but I wasn't able to find it after a quick search. (P.S.: This article and the series it's in by Errol Morris offer a cogent, trenchant critique of the term 'paradigm shift'. I won't fuss about the validity of the term and thus its usage in this article, but Morris gives some great perspective.) Emw (talk) 01:54, 6 November 2011 (UTC)
I was not aware that the term was being abused in that manner. I thought it still retained its powerful original meaning. I spent a good deal of time yesterday searching for Bowman's award, and although I did find what appears to be the definition of the award I was unable to find the original description. The closest I found is here but you're right something from ACS is superior. I have emailed Bowman, explaining the context and humbly requesting a link to the site if he knows where it is. I have yet to receive a reply, so in the meantime I'll continue my search. Jessemv (talk) 20:13, 6 November 2011 (UTC)
Dr. Bowman emailed me back today, saying the following: "Thanks for supporting Folding@home and for your generous email. Unfortunately, there is no announcement from the ACS that I can point you too. Do you have any specific questions I can answer for you? Best wishes and thanks again, Greg" What should I ask him? Should I forward his reply to those questions to that one email-collection place in Wikipedia? Thanks for any help you can provide. Jessemv (talk) 01:50, 8 November 2011 (UTC)
I don't know what you mean by "that one email-collection place in Wikipedia". I would ask Bowman if he would be willing to post online -- either at a blog or some other permanent web resource -- the talk/presentation he made that won him the award. This would be a useful reference (and given that, note that an email unfortunately wouldn't suffice). More specifically, I would be interested in knowing whether the paradigm shift was more a result of his MSMBuilder itself than the wider Folding@home project, vice versa, or where the balance lies in relative contribution between those two projects to the paradigm shift. Hopefully this would resolve the slight inconsistency in what the paradigm shift(s) were attributed to between the SimTK source and the Folding@home website's source. If you're really interested in the material, then I would also read up on MSMBuilder (e.g. https://simtk.org/home/msmbuilder and any associated papers), and ask for explanation and/or more information on anything that isn't clear. Cheers, Emw (talk) 03:00, 9 November 2011 (UTC)
Sorry for the confusion. I guess I was referring to Wikipedia:Volunteer Response Team but I ended up asking Dr. Bowman himself if he could provide any external links, and he replied with "No luck on getting a copy of the talk sadly. They seem to remove older content from the web pretty quickly. Sorry I couldn't be of more help." However, I was able to pull in a good reference that explained his award, and it turns out that MSMBuilder was a major factor, since it was instrumental in not only helping the move away from more traditional molecular dynamic simulation methods, but the MSMs it built also attained a quantitative agreement between theory and experiment. Dr. Bowman is the lead developer on MSMBuilder. I've emailed the ACS group responsible for his award, but haven't heard back yet. However, since I used the most reliable and detailed second-hand source that I've found in the article, I'm considering this issue addressed. If I can eventually access the first-hand explanation, we can always substitute it in. Jessemv (talk) 06:34, 1 December 2011 (UTC)
  • "...as well as publicly releasing all of Folding@home's results"

I think this is an overly-positive mischaracterization of the lab's publishing policy. Submitting most research articles to subscription-only journals that -- by government mandate -- make those articles publicly available after a year is not "publicly releasing all results". The current phrasing portrays the lab as more magnanimous than it actually is. If the lab posted PDF copies of its 12-month-embargoed articles (as is done, for example, at http://boinc.bakerlab.org/rah_publications.php), then I think the phrasing would be justifiable. Even then, though, I do not think the fact that the Pande lab publishes research and makes results available to researchers upon request is significant enough to mention in the introduction. This is not really notable, and at best would possibly warrant mentioning in the body of the article. Emw (talk) 01:54, 6 November 2011 (UTC)

Quite right. Our above discussions and new citations that have been pulled in no longer make that statement entirely true. I believe the best solution is to remove the statement, which I have done. Jessemv (talk) 05:57, 6 November 2011 (UTC)

Comparisons to Anton Super Computer

The comparisons to Anton have serious NPOV problems. — Preceding unsigned comment added by 74.73.228.91 (talk) 04:46, 4 November 2011 (UTC)

Which specific phrases do you think are inappropriate? Let's discuss it. The statement above is too aggressive and accusatory -- let's try to address things in a civil and cool-headed way (context Emw (talk) 20:23, 5 November 2011 (UTC)). Emw 05:41, 4 November 2011 (UTC)
I've trimmed my criticism and I'll make an effort to point out specific examples.
These statements are opinions but are presented as facts:
  • "Folding@home's results compare well with other molecular dynamics systems, such as the Anton supercomputer."
  • "Folding@home's statistical assembly of shorter simulations reproduce Anton's long simulations very well"
  • "and the Markov State Models can find important new features missing in Anton's traditional analysis."
  • "Although noting that Anton was in many ways several years behind FAH"
All of these statements are drawn directly or indirectly from a single individual (Vijay Pande)--the chief proponent of Folding at Home, a competing approach to Anton.
The entire section comparing FAH with Anton (and with Rosetta@Home, as far as that goes) lacks a neutral tone. I would suggest rephrasing it to something along these lines:
There are a number of physics-based approaches to molecular simulation, all of which are an approximation to Quantum Mechanics and all of which make different tradeoffs between computational effort and various kinds of accuracy. There is no scientific consensus on the "best" method. Folding@Home and Anton both fall into the category of explicit solvent molecular dynamics. Through a combination of specialized hardware and software, Anton is optimized to the traditional approach of simulating a single long trajectory at a time and is able to do so orders of magnitude faster than other systems. In FAH terminology, Anton is optimized to run a single Work Unit very fast. Folding@Home is targeted at platforms which run a Work Unit much slower, but uses various statistical techniques to compose many parallel Work Units to stitch together long trajectories. The statistical techniques come with various caveats and tradeoffs; there is no scientific consensus on whether the long trajectories extrapolated from Folding@Home are as valid as the long trajectories created by traditional molecular dynamics. Obviously, the Folding@Home team believes in their approach just as the Anton team believes in their approach.
Both approaches can sample ensembles by composing many Work Units in various ways. 74.73.228.91 (talk) 16:50, 5 November 2011 (UTC)
For reference, the second paragraph here is the content being discussed. In general, I agree that a forum post by a researcher that sheds better light on their own project than a competing research project is not a sufficiently reliable source, because it lends itself to violating WP:NPOV and is not peer-reviewed. The editor who wrote the content seems to agree here about the reliability of that source and that the paragraph is "a little-one sided". They intend to improve that coverage within a week. Maybe it would be worthwhile to review whatever the resulting changes are. Emw (talk) 17:48, 5 November 2011 (UTC)
Yes indeed. *Edit Conflict*: Thank you for clarifying. I'm a bit busy at the moment but very soon I will be conducting further research into both sides of this issue. I will be reading some of Anton's papers as well as a more thorough read of F@h's paper which I cited. You are correct, the comparison is a bit one-sided and clearly further research is required. I appreciate you pointing out your particular issues. Expansion and cleanup of that section are at the top of my to-do list. I will examine your suggestions and try to use them in an encyclopedic fashion. If you like, I can make a note here when I make those major changes. Stand by. Jessemv (talk) 17:52, 5 November 2011 (UTC)
I've added the Neutral Point of View flag to the section until this issue is taken care of. Jessemv (talk) 18:03, 5 November 2011 (UTC)
I would like to apologize for the lack of neutrality in that section, and on the Anton page. I have improved that section, so it should be much better. Please let me know what you think. It will likely be further improved, but the issue should be taken care of. I've read several publications and it has become apparent that Anton is an amazing machine that is very important. I have made sure to note this in the section, replacing those one-sided forum posts. It was wrong of me to include that content when I had only really explored one side of the fence. I understand your position now, and once again I apologize and hope that we can move on in a positive direction. My knowledge of Anton has grown significantly, and I have learned from my mistake. Thank you for your help, and have a nice day. Humbled, Jessemv (talk) 01:20, 9 November 2011 (UTC)
Looks good, thanks! 74.73.228.91 (talk) 04:59, 15 November 2011 (UTC)

Since I'm still apologetic about originally offending you, I feel that I should mention here that I just made some changes to the Anton paragraph, in order to present the F@h-Anton relationship in a more professional light. For the record, I am referring to this version of the paragraph, but of course you can always check the current version of it as refinement goes on. Best, Jessemv (talk) 06:57, 26 February 2012 (UTC)

File:F@h v7 novice shot.png Nominated for speedy Deletion

An image used in this article, File:F@h v7 novice shot.png, has been nominated for speedy deletion for the following reason: Wikipedia files with no non-free use rationale as of 24 November 2011

What should I do?

Don't panic; you should have time to contest the deletion (although please review deletion guidelines before doing so). The best way to contest this form of deletion is by posting on the image talk page.

  • If the image is non-free then you may need to provide a fair use rationale
  • If the image isn't freely licensed and there is no fair use rationale, then it cannot be uploaded or used.
  • If the image has already been deleted you may want to try Deletion Review

This notification is provided by a Bot --CommonsNotificationBot (talk) 16:11, 24 November 2011 (UTC)

I defended the image in its Talk page. Hopefully we can get this resolved. Jessemv (talk) 17:10, 24 November 2011 (UTC)
I recommend reading through the links provided by CommonsNotificationBot. Unfortunately, the fact that Pande gave permission to use some images is not enough to meet Wikipedia's licensing or fair-use rationale requirements. I would suggest adding a fair-use rationale template, as was done with File:LifeWithPlayStation_Folding.jpg. The template used in that image should be fine to use for the image in question here. A list of other non-free use rationale templates can be found here.
Even better than using one of those templates, you could translate Pande's blanket permission into an actual free license like CC-BY-SA 3.0 by processing the images through OTRS. This would allow the images to be used anywhere, for example on other Wikipedia articles or Wikipedia's main page. If you want to pursue getting a free license on those images and have any specific questions about OTRS after reading through this article, let me know. All this licensing stuff can be frustrating and seem like a bit much, but it's important to prevent contributors from ripping off content creators and the Wikimedia Foundation from being sued.
At the talk page for the image in question, you mention that the software underlying the screenshot program is freely licensed under GPL. The information page for the V7 client here shows a screenshot with a standard non-free copyright license. The download page for other clients here links to a licensing page here, and that license is also quite restrictive and definitely not GPL. Where did you see that the software was licensed under GPL? Emw (talk) 16:08, 25 November 2011 (UTC)
Thanks for the suggestions, I'll take a look and see what can be done here. I found the GPL License under the About feature of the v7 client. For proof, I took a screenshot, which I put online here. Thank you very much for the template information, it is indeed a bit confusing but I fully understand why they have it in place. The v7 client is currently in beta testing, and is not on the Download page at this time. It will be once a number of bugs with it are fixed, but right now they don't consider it ready for a public client just yet, and it will be replacing all of the v6 clients. And I'm not sure how GPL conflicts with their other licenses, which they have in place so that people don't try to cheat or install F@h in an unsupported way. Jessemv (talk) 16:38, 25 November 2011 (UTC)
Since the software underlying the GUI you've taken a screenshot of is under GPL, it would probably be fine to use Template:GPL screenshot. Emw (talk) 17:15, 25 November 2011 (UTC)
Thanks very much! I'll go and apply it. I think I understand where all the licenses fit into F@h: v7 is under GPL, but the underlying FAHClient which it can control (I say "can" because third-parties can make a different GUI and FAHClient can be run by the command-line) is not, for data integrity purposes, and the cores themselves have a different license as well, and many of them are open. Confusing! Thanks for helping me navigate the legality here. I very much appreciate it. Jessemv (talk) 19:22, 25 November 2011 (UTC)
Made some changes to the file, including using the Information template and the GPL license. Do you consider this issue solved? Jessemv (talk) 19:44, 25 November 2011 (UTC)
Yes. Emw (talk) 21:05, 25 November 2011 (UTC)