Talk:AlphaFold

More resources

Parking some of these here:

  • AlphaFold's presentation at CASP14 was at 15:50 - 16:20 Tuesday (1 December; UK time). The session closed at 18:30, so reports may emerge soon.
    "John Jumper's bracing recap of AF2's pipeline at #CASP14 revealed major tweaks, adding structure-aware physics & geometry to the iterative NN design in an end-to-end learning of woven chain through a coalescing 'gas of rigid 3D bodies" [1] (3 Dec)
  • All CASP sessions are being video-recorded, but it is not confirmed whether all will be released. As of Thursday 03, the first-day sessions are available via the CASP website. As of Friday 04, a video of the talks by the Baker lab and the Zhang group is now up, but not AlphaFold's, which followed them.
  • The AlphaFold 2018 presentations were uploaded, at [2] and [3]
  • "How Neural Networks took over CASP" gif -> pics NNs eat the world :-)
  • AF2 team *did* show some pics on how predicted contact maps evolved during AF2 convergence. (tweet)

reactions

(tweet: "Is protein folding 'solved'? Not quite. I spoke to a range of researchers about what #AlphaFold can and can't do - and why existing experimental techniques to understand proteins will still be needed")
  • CC-BY resource on protein sidechain rotational statistics and tweet
  • Twitter thread [9] including comment on T1040-D1, where assessors said there was a "pre-evaluation split". Tweeter says "the ground truth may contain experimental artifacts - e.g., many hydrophobic res are exposed". Also finds it extraordinary that AF2 could predict disulphide bonds. Cautions that "overtraining on static structures doesn't automatically solve the real-world problems that primarily needs *flexibility* prediction, such as drug discovery, protein misfolding/aggregation, antigen-antibody binding, etc."
OK. So the issue with T1040-D1 is that it comes from a very big protein (doi:10.1038/s41586-020-2921-5), which naturally splits into parts that then stick together. The sequence of only one of these parts was given for prediction. This is why, as labelled in the picture in the tweet, part of the structure contains an alpha helix with a run of exposed, very hydrophobic residues. Normally, in aqueous solution, this would be very unphysical. In fact it is the burying of hydrophobic residues into the interior of the protein that, in free-energy terms, drives the whole folding process. The explanation here is that these hydrophobic residues are actually where this part of the protein sticks to its neighbour. Jheald (talk) 11:02, 6 December 2020 (UTC)[reply]

some discussion

Unfortunately, I do not have access to Business Insider, but speaking about the original tweet by Mike Thompson, yes, he is right: without solving the problem of high-precision refinement, one cannot talk about solving the protein folding problem. There is a good reason why such refinement is treated as a separate category at CASP. An additional problem here is the existence of multiple structures of flexible loops and the overall problem of protein flexibility. People can determine a number of highly divergent structures of the same protein, especially if the protein is transmembrane - can AlphaFold reproduce the whole spectrum? High precision plus reproducing the flexibility is a must for ligand docking/drug design. This is also the reason one cannot consider any experimental structure as "the truth" like Mike does, and his comment about "10-20 times worse" is puzzling. My very best wishes (talk) 19:42, 3 December 2020 (UTC)[reply]
@My very best wishes: You can read the Business Insider piece by clicking "view source" & scrolling down past all the javascript. But there's not a huge amount there. Jheald (talk) 19:58, 3 December 2020 (UTC)[reply]
Thank you! It does look like a huge step forward. As soon as AlphaFold-2 is available to the community as source code (and/or as a working public web server), it can be independently re-tested and used in a variety of ways by others. Just for starters, I would expect AlphaFold to fail at predicting protein structures with intertwined association of subunits [10], but it would be interesting to see what it produces in such cases. I hope that the journal where they publish will make the availability of the software a condition of publication. This is needed: (a) to make sure that the results are truly reproducible, and (b) to make this method widely used, exactly as it deserves. In some way, a method does not exist until it is publicly available. My very best wishes (talk) 20:53, 3 December 2020 (UTC)[reply]
Yes, this is all exciting. According to this, the group is now focusing on calculations of protein complexes (which should also cover the multi-domain proteins). Indeed, that is exactly what they should do, and it is straightforward using the very same approach. That would immensely expand the applicability of AlphaFold. I still wonder what they would get for proteins with multiple structures, like transmembrane transporters. My very best wishes (talk) 16:36, 4 December 2020 (UTC)[reply]
@Jheald. Most links above are just opinion pieces. This could be an important achievement, but the following is needed: (a) a peer-reviewed paper about AlphaFold-2 should be published; (b) the CASP14 assessment should be published; and (c) most importantly, the method should be made available to the wider scientific community (as a public web server or source code), so that other people can check how the method actually works in real life (CASP is not the last word here) and be able to use it.
  • Just to be completely clear, if AlphaFold-2 were publicly available and easy to use, people (including me) could run a large number of specific tests where this method would fail. Such tests would include linear peptides, proteins with multiple alternative structures (there are hundreds of such cases in the PDB), certain types of protein complexes, single sequences and small or questionable alignments as input, "intrinsically unfolded" proteins, etc. This might be one of the reasons why the authors of AlphaFold said they are not going to make it publicly available any time soon. My very best wishes (talk) 19:58, 5 December 2020 (UTC)[reply]
@My very best wishes: I agree. For a 'solution' to have been reached scientifically, other teams need to be able to replicate what AF2 has done. And we need qualitative, not just functional, understanding of how AF2 is doing so well -- is there 'extra' information that AF2 is finding, or holding on to better? What different qualitative things (and representations) is it managing to isolate in different parts of its attention units? Are some of these novel? The solution is not really a 'solution' until we understand it. And when it is understood enough, then that may give us clues to techniques that may be able to capture the essence of those new qualitative things in ways that may be faster, more efficient, less compute-intensive.
Now we know publicly what is possible, DeepMind might well want to try to protect its lead. (Or it may not: Demis Hassabis is a bit more complicated than that.) So information may or may not be quick to emerge. It's a difficult horse for DeepMind to ride, because I suspect information like that above will emerge, and sooner rather than later, if not from DeepMind then from somebody else. (E.g., per the links above, Facebook's team in New York also look to be all over this space; and others won't be far behind.) I imagine the lawyers have been all over this, and DeepMind's paymasters (Google). But ultimately I suspect that this will end up being a win for all mankind, that no one player will be able to corner or monopolise (though many may do very well). So, how important are the scientific bragging rights to DeepMind, of being able to explain, qualitatively, how their system achieves what it does, before somebody else does? And how long, do they calculate, before somebody else will be able to do that, regardless of whatever they do? Those are the questions that will condition what DeepMind does next. Last time I think most of their method was in the presentations they gave at CASP. (Not 100% sure, haven't checked against the published papers.) This time apparently their presentation was pretty sketchy. Personally I would expect that a more detailed description of their system will emerge, though maybe not in the CASP proceedings; plus information about what some of the internal states that the model finds may look like. My guess is that that might appear within a year, possibly sooner, possibly once DeepMind have already worked out where they want to go next. But who knows?
Yes, what I have collected above are mostly opinion pieces. Where the article still needs work (IMO) is in the lead and the reactions section. In particular: what DeepMind has achieved; how significant is it; how was it reacted to (including counter-reactions); what are the limitations, opportunities, and next steps in the field. All of this needs work. It's for this that I have been trying to gather reactions; also to try to be aware if there are angles or relevant things I may have missed; also for any more info that may be out there about what it does and what makes that work. Jheald (talk) 12:37, 6 December 2020 (UTC)[reply]
You said: "Now we know publicly what is possible" [to accomplish using AlphaFold-2]. No, we do not know it even after CASP-14, that's the point. The method must be publicly available to other people who would run a lot of additional tests (like I noticed above) and use it. Can the results by the AlphaFold team on CASP-14 be reproduced by other researchers who were not developers of this method? We do not know even that. John Moult said: "In some sense, the problem is solved". In what sense, and was it really solved? We do not know it until the authors (or someone else) make a public web server or software which is actually available to community (of course it is NOT solved, as anyone can see even from the presentations on CASP-14, see slides 16-18 here, for example). Also, people always suppose to publish their research in such details that others can easily reproduce their results; this includes software if needed. My very best wishes (talk) 22:16, 6 December 2020 (UTC)[reply]
The point is that even without AF2's code, or even their detailed algorithm, we know now that this sustained level of detail is *possible* to achieve by code somehow, at least for proteins of the kind that get into the PDB. That, plus the fact that other major AI groups are already deeply into research of this space, will just in itself be a huge spur to people to try things. Even if the information were not forthcoming from DeepMind (and I think it will be), it is only a matter of time before more groups achieve this capability, and it becomes common knowledge how, and why such results can be obtained. Jheald (talk) 23:15, 6 December 2020 (UTC)[reply]
  • I am saying that (a) AlphaFold still has a lower precision than X-ray crystallography, and probably than solution NMR spectroscopy, based on the results of the CASP-14 assessment, and (b) it probably will NOT be able to provide the "sustained level of detail" (as you say) for a lot of PDB entries, such as linear peptides (e.g. studied by solution NMR under various conditions), proteins with multiple alternative structures (there are hundreds of them in the PDB), many protein complexes, proteins represented by single or a few sequences in a genome, and "intrinsically unfolded" proteins. It so happened that such cases were not present in the CASP-14 protein set, or they were actually present in significant numbers [11] but were not taken for testing of AlphaFold-2. My very best wishes (talk) 00:07, 7 December 2020 (UTC)[reply]
@My very best wishes: I've made a first stab at expanding the "Responses" section, which I hope now supports a corresponding section in the lead. Let me know what you think! If anything, perhaps the article (lead in particular) now perhaps undersells what AF2 achieved. If there are bits that aren't referenced enough, I'd be grateful if you could tag them with {{cn}} or {{refimprove}}, rather than outright removing them, so I can see if I can find anything better to support them. Alternatively, if you know some good additional refs for any of the points, do please add them! Best regards, Jheald (talk) 00:35, 8 December 2020 (UTC)[reply]
At the end of "Responses" section one of well known researchers tells that "Their predictions of structures are very good, as good as those produced experimentally." This is obviously a false statement, as follows from CASP-14 presentations (consider slides 16-18 here as an example). This is very bad to include misinformation to WP without explicitly saying it is misinformation, even if it can be sourced. One must simply follow WP:MEDRS for extraordinary claims, such as that one. This is not even close to anything WP:MEDRS. I suggest to remove any extraordinary claims that "the problem was solved" sourced to non WP:MEDRS publications.My very best wishes (talk) 01:20, 8 December 2020 (UTC)[reply]

A few questions

This is an extraordinary success! Some questions:
  1. What distance cutoff did they use to calculate GDT_TS in the second figure on this wiki page? Is it "the average result of cutoffs at 1, 2, 4, and 8 Å", as our Global distance test page says? Was it always calculated only for model #1, or for "best of five"? This should be a single cutoff of ~1.5 Å and only for structure #1; "best of five" is manipulation of the data.
  2. What do the circles in the second figure on this page show? I can only guess these are the best GDT_TS values of predictions for individual proteins, obtained using all methods by all groups? What would a similar chart look like specifically for AlphaFold? That would be something more relevant to the subject of this page.
  3. Speaking about the procedure, and especially the energy functions: was everything described here, or are there other publications on the methodology?
  4. Do I understand correctly that inputs (in particular for the current AlphaFold version) are sequence alignments, and the results would be different for individual sequences?
  5. Did they say that a source code will soon be available or perhaps even a public web server to use AlphaFold? My very best wishes (talk) 02:57, 3 December 2020 (UTC)[reply]
  6. Why did they not apply their method to protein refinement and multi-subunit proteins at CASP-14? If it cannot be applied to such tasks, then the claim of solving the protein folding problem is at best premature.
  7. Do I understand correctly that it is not the prediction/output of the program, but predictions by a large group of people who used the program? There is a big difference.
  8. How do they assess the difficulty of targets? In theory, for a method like that, one should run the targets against the entire PDB using the SSM server (https://www.ebi.ac.uk/msd-srv/ssm/) to determine the % of structural coverage/overlap with known PDB structures. This coverage should then be compared with the GDT_TS produced by prediction. This is something trivial, but did anyone do just that?
  9. What were the results of testing such method not in CASP setting, but simply for known structures in the PDB? For example, can it reproduce both inward- and outward-facing structures of TM transporters? Or it could be tested for new protein folds, a few of which are released by the PDB every week. My very best wishes (talk) 16:36, 3 December 2020 (UTC)[reply]
@My very best wishes: Thank you so much for your scrutiny of the article yesterday, and for your eyes on this. In answer to your Qs, as far as I can:
1. Yes. [12] : "GDT_TS - GlobalDistanceTest_TotalScore : GDT_TS = (GDT_P1 + GDT_P2 + GDT_P4 + GDT_P8)/4, where GDT_Pn denotes percent of residues under distance cutoff <= nÅ".
Most of the tables on their system allow you to choose whether to plot model #1 or best-of-the-five. Not sure (atm) which was plotted (it maybe says in the video of the presentation; and no doubt it would be stated in the discussion of the corresponding pics in the Proceedings volumes for previous CASPs).
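For concreteness, here is a minimal sketch of the quoted GDT_TS formula (Python with numpy). It assumes the model and experimental structure have already been matched residue-for-residue and optimally superimposed, whereas the real GDT calculation additionally searches over many alternative superpositions to maximise the count at each cutoff:

    import numpy as np

    def gdt_ts(ca_distances):
        # ca_distances: one model-to-experiment CA-CA distance (angstroms)
        # per aligned residue, measured after superposition
        d = np.asarray(ca_distances, dtype=float)
        # GDT_Pn = percent of residues under distance cutoff <= n angstroms
        p1, p2, p4, p8 = (100.0 * np.mean(d <= c) for c in (1.0, 2.0, 4.0, 8.0))
        return (p1 + p2 + p4 + p8) / 4.0

    # toy example, 10 residues: GDT_P = 20, 40, 60, 90 -> GDT_TS = 52.5
    print(gdt_ts([0.4, 0.7, 1.2, 1.6, 2.5, 3.0, 4.5, 5.0, 7.9, 9.0]))

Note how a model with no residue better than 2 Å would still score well above zero here; this is the property complained about elsewhere on this page.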
2. I think so. Detailed data is available from the site under Results Home -> Table Browser, where you can select just for AlphaFold; also similarly for 2018. It's all also available as raw data to download, in an R-friendly format. For 2020, I think AlphaFold placed best for all but a handful of proteins, so the big circles are almost all its own. For 2018, it placed first for about half. I think the small circles on the right of the chart are mostly AlphaFold's; as for those to the left, I don't know how far off it was.
3 For AlphaFold 1, the Nature paper I think is the most detailed. There is also a write-up in the special issue of Proteins about the conference (link on article page), and the two original presentations they gave at the 2018 conf (links above). The YouTuber linked above does quite a nice talk-through of the Nature paper.
4 I think what they do with AlphaFold 2 is take the sequence they are given, and use standard methods to pull matching sequences from genetic databases (as given in their conference abstract). They then appear to use a transformer network to assess how individually significant each one of these sequences is to each residue position, in a context-dependent way that depends on everything else the system currently thinks it knows (in particular, but not only, its latest best guess of the structure). This information, plus the internal latent variables found to determine it, is one of the feeds into the second transformer network, which estimates which residues have relationships (near and far) to which other residues. All of this feeds into the prediction module. I don't know whether they take the alignment between sequences as a given, but it wouldn't surprise me ('single differentiable end-to-end system') if it was also up for grabs during the convergence process for a particular structure.
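To make the shape of that description concrete, here is a toy numpy cartoon of the data flow as just described. Every array, update rule, and module below is a hypothetical placeholder, not DeepMind's actual architecture, which has not been published:

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attend(x):
        # toy self-attention: rows re-weight each other by dot-product
        return softmax(x @ x.T / np.sqrt(x.shape[-1])) @ x

    S, R, D = 8, 50, 16                      # sequences, residues, features
    rng = np.random.default_rng(0)
    msa = rng.normal(size=(S, R, D))         # embedded sequence alignment
    pair = rng.normal(size=(R, R, D))        # residue-residue representation

    for _ in range(3):                       # iterative refinement
        for r in range(R):                   # 1. weigh how significant each
            msa[:, r] = attend(msa[:, r])    #    sequence is at each position
        consensus = msa.mean(axis=0)                      # 2. feed MSA
        pair += consensus[:, None] * consensus[None, :]   #    features to pairs
        for i in range(R):                   # 3. which residues relate to
            pair[i] = attend(pair[i])        #    which other residues
        coords = pair.mean(axis=1)[:, :3]    # 4. stand-in "structure module":
                                             #    project pair features to 3-D
    print(coords.shape)                      # (50, 3): toy coordinate guess

The only point of the cartoon is the loop structure: attention over the alignment, attention over residue pairs, a structure guess, then iterate.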
5 It's interesting, isn't it. Will they try to monetise it? Being an end-to-end system, it may well be easier to release than AlphaFold 1, where they released about half, but said the other half depended on third-party services, the internal configuration of their hardware, etc. I imagine at the very least a service would be available, though it might be by subscription. But I imagine that now people know this is possible, there will be 101 groups trying to clone it, and trying their own different tweaks or preferred sorts of networks. The extent to which DeepMind can 'own' this will be determined by what they can patent, how abstract and high-level a design the patent offices let them patent, and whether those patents survive objections or challenge. (Also, how determined DeepMind are even to try to own it.) I *hope* that they won't be allowed to own anything so fundamental that it can't be worked around. In which case DeepMind probably won't be worth $100s bns. But who knows.
6 They were quite busy! A completely new blank-sheet design, to find ways to make work - no small task. I think they probably set themselves a defined project, then focused on doing the very best they could on that defined task, and will now think about where to go next. Would the same approach extend to multiple sub-units? It might well. There could still be co-evolution data. And it could be that their residue-residue attention system and their predictor might have got a bit better than conventional physics models. And there will no doubt be more new ideas to come. While the team said they thought they'd taken AlphaFold 1 about as far as it could go, I suspect they're right that what they've managed this week may only be the start of this road.
7 According to the abstract, there was very little human intervention. Sometimes they chose older runs for the #4 and #5 models, if otherwise the predictions would have been too similar.
8 Not sure. There obviously is a real-valued measure, from the charts, but I haven't found where the calculation is.
This is how they assessed difficulty in CASP 8 (2008) [13]. But the calculation may have developed. Jheald (talk) 22:02, 3 December 2020 (UTC)[reply]
These guys suggested a particular difficulty metric in 2016, [14], but I don't know if CASP adopted it. Jheald (talk) 22:18, 3 December 2020 (UTC)[reply]
Here's the overview of CASP 11 (ie 2014) [15], with a similar distance graph. Discussion of difficulty measure in Supp Materials [16], appendix 2, which cites this review of progress up to CASP 10: "we consider the difficulty in terms of two factors: first, the extent to which the most similar existing experimental structure may be superimposed on a modeling target, providing a template for comparative modeling; and second, the sequence identity between the target and the best template over the superimposed residues." This was still the reference cited in the review of CASP 13 (2018) [17], so would seem to be the scale used. Jheald (talk) 22:23, 3 December 2020 (UTC)[reply]
So: for the actual calculation see "Target Difficulty" in the Methods section at the end of [18] (rather than the earlier sections on Target Difficulty, where it is critiqued). Jheald (talk) 22:44, 3 December 2020 (UTC)[reply]
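Schematically, then, the scale combines two numbers per target. Here is a rough reading of that two-factor definition in code; the equal weighting below is purely an illustrative assumption, not CASP's actual formula, which is in the Methods reference just cited:

    def target_difficulty(template_coverage, seq_identity):
        # template_coverage: fraction of target residues superimposable on
        #   the closest existing experimental structure (the best template)
        # seq_identity: sequence identity between target and that template,
        #   over the superimposed residues
        # Both make a target easier as they rise, so difficulty falls with
        # them; the 50/50 weighting here is illustrative only.
        ease = 0.5 * template_coverage + 0.5 * seq_identity
        return 1.0 - ease  # 0 = trivial template copy, 1 = hardest

So a target with no usable template (coverage near 0) sits near maximum difficulty, whatever the sequence identity.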
9 They trained the algorithm on the entire PDB, so DeepMind could maybe tell you :-)
Jheald (talk) 21:27, 3 December 2020 (UTC)[reply]
Thank you!
1. Well, I am afraid this makes the CASP assessment less certain. This way even very poor predictions would get a modest score. I think the assessors should keep it simple and use a cutoff of ~1.5 Å if anyone, including the organizers, wants to make the claim of solving the protein folding problem to some degree. That cutoff can be for CA atoms (that is what they usually do) or all atoms.
4. No, they used sequence alignments, and said (last slide in the first pdf of their presentation): "With few or no alignments accuracy is much worse". This is exactly what experts would expect. The question is: how much worse? This is really important because there are numerous "singleton" sequences in genomes. This alone is a reason they cannot claim to have solved the protein folding problem.
5. This is sad. Sure, this is exactly what people will do, and a lot more. But that's the purpose. As Shota Rustaveli said, "What you hid has been lost, [but] what you gave [to others] is yours". And remember, what they created is not science (what have they learned about proteins by doing this?) but merely a "black box" which seems to be useful only as a predictor. I assume, however, that other people will be able to create their own codes based on the information they will have.
6. The refinement. Answered here. There is a way to go.
7. That can make a huge difference, based on the results of previous CASPs. That means they were not able to completely automate the procedure, which is fine, but it is a different prediction category.
8. Thank you. So yes, that is exactly what the assessors did [19]. I simply have not read CASP papers in a long time.
9. Yes, testing the method on the training set is not a good idea, especially in machine learning, because unlike in simple QSAR methods (for example), one does not even know how many adjustable/fitting parameters they have. This could be a million. Well, maybe less. The developers should know. But still, this is a 100% legitimate question: what was the performance of the method on the training set (e.g. the transporters I mentioned, etc.)? I think the reviewers of their next publications should ask them to provide such results somewhere in a readable form, which would not be a problem for the authors of the method. The reviewers should also ask them to recalculate GDT_TS as above. My very best wishes (talk) 23:22, 3 December 2020 (UTC)[reply]
But here is the bottom line. Until they make a public web server, I cannot even independently assess how their method actually works for certain cases I would be interested in. I have seen the http://www.predmp.com/#/home server [20], which is based on a similar methodology. Did it exceed my expectations? Yes. Was it really so great and useful for my work? No. My very best wishes (talk) 04:06, 4 December 2020 (UTC)[reply]

DYK submission

Since the article has fallen off the bottom of the page at WP:ITNC (final state of discussion), I have made a submission to DYK with the following hook:

Did you know...

(I'm hoping they may be able to overlook that the submission is one day past the specified deadline for DYK. They sometimes do).

cc: @Ktin, Bender235, Gotitbro, Alexcalamaro, GreatCaesarsGhost, Glencoe2004, and Keilana: who supported it at WP:ITNC. Jheald (talk) 19:05, 8 December 2020 (UTC)[reply]

Jheald, Thanks much for this one! I am pinging Yoninah who is an expert on these DYK topics. I agree that this article is good for DYK. Lots of good work has happened in the build out of this article. Ktin (talk) 19:15, 8 December 2020 (UTC)[reply]
In the spirit of MOS:PEACOCK, it's probably best to simply state the facts: that AlphaFold has been the first competitor to reach over 90% prediction accuracy in the 26-year history of CASP. And then we could add an expert evaluation of that achievement. I guess what I'm trying to say is I'd prefer a blurb like: ... that AlphaFold 2 won the 14th biennial CASP competition achieving 92% accuracy, essentially solving the decades-old protein folding problem. I still can't believe this didn't "qualify" for ITN. --bender235 (talk) 19:40, 8 December 2020 (UTC)[reply]
One day late is fine. I agree that a simpler-worded hook is best. Yoninah (talk) 19:52, 8 December 2020 (UTC)[reply]
If you've got a better hook, feel free to submit it. I've done what I can. Fair point about mos:peacock, something more substantive and crunchy is better. But note section below, re claims like "has solved... " Jheald (talk) 21:19, 8 December 2020 (UTC)[reply]
(Added: For what it's worth, my original thought was DYK ... that the results of DeepMind's AlphaFold 2 program in the CASP 14 protein structure prediction competition have been called "astounding", transformational, and "a solution to a 50-year-old scientific challenge", but then I thought to drop the last bit, if it was no longer going to be in the article lead) Jheald (talk) 11:00, 9 December 2020 (UTC)[reply]
  • "AlphaFold has been the first competitor to reach over 90% prediction accuracy in the 26-year history of CASP" - even that is slightly problematic because what does it mean "accuracy" of 90%? Global distance test includes distance cutoffs like 4 and 8 A for CA atoms in the best of five computational models (this is a manipulation with numbers!). Make it single cutoff of 1.5 A for all atoms (roughly the precision of solution NMR) in the model #1 - and what it will be? 70%? 50%? I do not know. This must be calculated and published. "essentially solving the decades-old protein folding problem. No, this is certainly not true - see my comments below. My very best wishes (talk) 23:55, 8 December 2020 (UTC)[reply]
AlphaFold's detailed GDT-TS scores are available from CASP at [21] (select group='427 AlphaFold 2', model = '1', and targets = 'TBM' and 'FM'). From it one gets a median score (#46) of 91.72 (not quite sure why that doesn't exactly match what's been quoted elsewhere, which maybe was for all models, but it's very similar). A GDT-TS score of 92.5 implies a lower bound on GDT-1A of 70 for that structure, and on GDT-2A of 85; though as those are lower bounds, median GDT-1A and GDT-2A scores for AF2 will in fact be higher than that.
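(For anyone checking that arithmetic: with $P_n$ denoting GDT_Pn, the four cutoff scores are nested, so $P_1 \le P_2 \le P_4 \le P_8 \le 100$, and

\[ \mathrm{GDT\_TS} = \tfrac{1}{4}(P_1 + P_2 + P_4 + P_8) = 92.5 \;\Rightarrow\; P_1 + P_2 + P_4 + P_8 = 370, \]
\[ P_1 \ge 370 - 3 \times 100 = 70, \qquad 2P_2 \ge P_1 + P_2 \ge 370 - 2 \times 100 = 170 \;\Rightarrow\; P_2 \ge 85, \]

which are exactly the bounds quoted above.)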
AlQuraishi has a nice chart comparing AF2's scores to those of the group that performed 2nd best overall (a reasonable proxy for the state of the art excluding AF2). As the assessor said in the introduction to his presentation on high accuracy assessment "this year all structures have become high accuracy".
As to why so many of CASP's measures (eg the graph on the article page) show the best results out of all the groups, or the best result of all the models of a group: it may be that in the past, running X-ray experiments and interpreting their results was long, expensive and slow, whereas running models was cheap and quick, so one could afford to run many; and if any of them could help phase your X-ray data, that was a win. Given the astronomical number of possible structures, "best of 5" was not really "manipulation with numbers". Statistically, all but a tiny, tiny fraction of random guesses are nothing like the true structure. The best of 5 random guesses will still be rubbish. For any of the guesses to be any good demonstrates skill. This year, however, AF2 has demonstrated extraordinary skill. Jheald (talk) 11:48, 9 December 2020 (UTC)[reply]
  • Speaking about GDT as the measure of success at CASP, this is a poor one. For example, having a GDT of 100% with a crystal/experimental structure does NOT mean the computational model is "as good as" the corresponding experimental structure. It only means it is close in terms of CA atom coordinates. Why did they not simply use the rmsd of the coordinates of all atoms for model #1, as commonly accepted in the field? The assessors introduced GDT a long time ago to be able to evaluate very poor quality models which are not even close to experimental structures. But GDT artificially inflates the success of prediction. I think this is a good time for the CASP assessors to get rid of GDT and simply use the rmsd of the coordinates of all atoms for model #1 as the measure of success, as experimentalists would do. My very best wishes (talk) 15:24, 9 December 2020 (UTC)[reply]
  • I am not sure how to phrase this better. First, this certainly can be described as a "highly significant breakthrough" in the field of protein structure prediction using an AI-based approach (stunning, astounding, transformational, whatever). There is no dispute about it. But we cannot say that "AF2's predictions of structures are as good as those produced experimentally" (because they are not as good, according to the assessment at CASP-14) or that "AF2 has solved the protein structure prediction problem" - see below. My very best wishes (talk) 23:08, 8 December 2020 (UTC)[reply]

Did you know nomination (transclusion of live discussion page)

The following is an archived discussion of the DYK nomination of the article below. Please do not modify this page. Subsequent comments should be made on the appropriate discussion page (such as this nomination's talk page, the article's talk page or Wikipedia talk:Did you know), unless there is consensus to re-open the discussion at this page. No further edits should be made to this page.

The result was: promoted by SL93 (talk) 00:54, 5 February 2021 (UTC)[reply]

  • Comment: Note: I am one day over in submitting this, because it was previously up for consideration at WP:ITNC (discussion), and only fell off the page there at midnight this morning. So any leeway you could give it would be appreciated.
  • Reviewed: The Adults Are Talking

Converted from a redirect by Ktin (talk), Jheald (talk), and My very best wishes (talk). Nominated by Jheald (talk) at 18:36, 8 December 2020 (UTC).[reply]

  • @Bender235: While claims such as "In a serious sense the single protein single domain [prediction] problem is largely solved" have been widely made (that quote is from conference chair John Moult's closing presentation to the conference), were very widely featured as a top line in media coverage, and have also been supported in thoughtful commentary by eg Mohammed AlQuraishi [22], they have also met with opposition; and so we are not currently running them on the article. (Though this could be changed). See article talk page for extended discussions. That is why I submitted the DYK text as above.
Note also that while AF2 has made a very significant advance in the protein structure prediction problem, this is a different question to the question of how protein folding develops in nature, so caution should be taken not to confuse the two. — Preceding unsigned comment added by Jheald (talkcontribs) 09:43, 10 December 2020 (UTC)[reply]
  • Folks @Jheald, My very best wishes, Alexcalamaro, and Bender235:, this one has been open for some time now; let's go ahead and drive this one to closure. I think the below text is the best that someone on the homepage would be able to follow; anything more and we run the risk that folks find it too wordy or too complex. Let's move ahead, if you are good. Also @Yoninah: I do not want to presuppose your background, but can you read the below two hooks as a layperson and let me know if you a) find it interesting, and b) generally get the gist of this one? If you are not a layperson for this topic, I am happy to go chase down some laypersons for this topic. Cheers. Ktin (talk) 22:42, 14 December 2020 (UTC)[reply]
ALT 3.0 ... that DeepMind's protein-folding AI AlphaFold 2 has solved a 50-year-old grand challenge of biology? (source: MIT Technology Review).
OR
ALT 4.0 ... that DeepMind's AI AlphaFold 2 can predict the shape of proteins to within a width of an atom? (source: MIT Technology Review).
  • Wonderful. Thanks, both of you, @Alexcalamaro and Bender235:. Please can one of you review the hooks per our guidelines and approve both of them; we can choose one of the two after that, or empower the posting admin to make a choice. But, first step, let's approve the hooks. Cheers. Thanks again, folks. Ktin (talk) 23:14, 14 December 2020 (UTC)[reply]
General: Article is new enough and long enough
Policy: Article is sourced, neutral, and free of copyright problems
Hook: Hook has been verified by provided inline citation
Image: Image is freely licensed, used in the article, and clear at 100px.
QPQ: Done.

Overall: Both hooks ALT 3.0 and ALT 4.0 meet our guidelines Alexcalamaro (talk) 18:08, 15 December 2020 (UTC)[reply]

Passing the baton over to you, Yoninah, to take it from here. I am good with either of the hooks (ALT3 or ALT4). I know you had preferred ALT3 and Bender235 had preferred ALT4. Alexcalamaro -- do you want to cast the tie-breaker vote? ;) Ktin (talk) 18:54, 15 December 2020 (UTC)[reply]
I vote for the ALT 3.0 option (after all, we are talking about folding proteins). Alexcalamaro (talk) 19:24, 15 December 2020 (UTC)[reply]
Thanks much Alexcalamaro. Passing the baton to Yoninah. Over to you now for next steps :) Thanks everyone. I want to specially thank @Jheald and My very best wishes: who have done and continue to do lots of good work on the article. Genuinely thank you folks. Ktin (talk) 19:27, 15 December 2020 (UTC)[reply]

I think all these versions of the hooks, including ALT3 and ALT4, misinform the reader. No, the "50-year-old grand challenge of biology" has not been solved. There will be many future CASP meetings to assess further progress in this direction. Just saying that it "predicts the shape of proteins to within a width of an atom" is also wrong. No, it does not. AlphaFold-2 makes sufficiently precise predictions only for 2/3 of proteins, according to the CASP assessors. But even in these good cases it does NOT predict protein structure with such precision for all atoms, as a reader would assume. Actually, such a claim is simply ridiculous, because there is protein dynamics, and there is no such thing as the width of an atom. There are only atomic radii, but this is not a single number; they are very different for different types of atoms. Also, this is not "shape", but a three-dimensional structure. The reference is to a misleading opinion piece. The author does make the claim that AlphaFold can predict the shape of proteins to within the width of an atom, but he apparently does not have the slightest idea what he is talking about. Let's not multiply the misinformation in Wikipedia. Please see the hook I suggested above (it can be shortened if needed). My very best wishes (talk) 19:51, 15 December 2020 (UTC)[reply]

  • Yes, there are indeed WP:News sources about it (some of which claim nonsense like predicting "the shape of proteins to within the width of an atom"). However, this is an extraordinary and exceptional claim about solving a fundamental scientific problem, and not everyone agrees (some similar WP:News-type sources claim the opposite). I think we do need some WP:MEDRS-quality sources here, such as serious independent scientific reviews. There are none. The method (AlphaFold-2) has not been published. The official CASP assessment has not been published in any peer-reviewed journal.
For example, as this article says, "DeepMind’s press release trumpeted “a solution to a 50-year-old grand challenge in biology” based on its breakthrough performance on a biennial competition dubbed the Critical Assessment of Protein Structure Prediction (CASP). But the company drew criticism from academics because it made its claim without publishing its results in a peer-reviewed paper. ... “Frankly the hype serves no one,”" and so on. I just do not think we should multiply this "hype" in WP. My very best wishes (talk) 20:29, 15 December 2020 (UTC)[reply]
  • @My very best wishes: in a literal sense the protein folding problem is not "solved," since we can obviously always move the goalposts regarding the necessary precision (≥90% accuracy? ≥99%? ≥99.99%?). The jump in precision at this year's CASP certainly deserves to be called a "breakthrough." I agree that the catchy "width of an atom" is not a precisely determined length (just as the even more popular "width of a human hair" is not); the press release said less than two angstroms, which we could use, too. --bender235 (talk) 21:31, 15 December 2020 (UTC)[reply]
Yes, one can say a "breakthrough" (I agree), but one cannot say that "the problem was solved", for a number of reasons: (a) the protein set at CASP is absolutely not a representative set of proteins (it included only one membrane protein, and the group was ranked #43 for this target; it did not include any "intertwined" protein structures, or any linear peptides, or any proteins with a unique sequence in genomes, and so on); (b) the method has not even been published and is not publicly available for independent evaluation; (c) AF2 failed for a single multi-domain protein in the CASP14 data set, while such proteins represent a majority in eukaryotes; (d) the method was not tested for protein complexes. This is not at all about the percentage. We simply do not know that percentage. We do not even know the percentage at CASP until the assessment has been officially published. My very best wishes (talk) 18:52, 16 December 2020 (UTC)[reply]
  • I would oppose most of these hooks. OK, let's keep it simple. We do have the page AlphaFold. I think this is a fair page. However, any hook above (except my suggestion) simply contradicts this page. Does it follow from our AlphaFold page that it "has solved a 50-year-old grand challenge of biology"? No, it does not. Does it follow that AF2 "can predict the shape of proteins to within a width of an atom"? No, it does not. Not at all. Take the lead of this page and summarize it in the hook, please. That is what I was trying to do. My very best wishes (talk) 15:03, 16 December 2020 (UTC)[reply]
Now, let's consider the first hook at the top: that the results of DeepMind's AlphaFold 2 program in the CASP 14 protein structure prediction competition have been called "astounding" and transformational? Well, this is actually much better than the later versions. Yes, this is advertisement (just like the others), but at least it is not explicit misinformation. Some people did say that, and most importantly, yes, the results were very good. My very best wishes (talk) 15:23, 16 December 2020 (UTC)[reply]
  • In the spirit of serving our homepage readers, I will still recommend that we go with either ALT3 or ALT4. There is sufficient backing from WP:RS to move ahead. Ktin (talk) 00:00, 16 December 2020 (UTC)[reply]
  • Maybe we could change the "problematic" word solve to crack (also used in the MIT review), so we keep the catchy hook for the "layperson" without multiplying the "hype". What do you think of this one?
ALT 3.1 ... that DeepMind's protein-folding AI AlphaFold 2 has cracked a 50-year-old grand challenge of biology? (source: MIT Technology Review).

Alexcalamaro (talk) 04:06, 16 December 2020 (UTC)[reply]

@Alexcalamaro: I am good with this hook (i.e. ALT 3.1). Ktin (talk) 06:30, 16 December 2020 (UTC)[reply]
OK. I am an uninvolved reviewer, because I have not been at CASPs for a long time and I do not have connections to the CASP organizers or any participants. I only helped with editing the page about AF2 in WP. Here is my independent assessment. Yes, there was great progress in protein structure prediction at CASP14. True. However, the "protein folding problem" was NOT solved by AF2 (yet). This is hype. Here is why:
  1. There was only one transmembrane protein in the CASP14 dataset, and the AF2 team was ranked #43 for this target; the prediction for this target by AF2 or other groups is a far cry from solving the structure. Transmembrane proteins constitute at least 25-30% of proteins in the human genome [23] (more by other estimates).
  2. The performance by AF2 was not great for multidomain proteins, as could be expected because AF2 was not tested for predicting protein complexes. The subunits in complexes are similar to domains. Up to 80% of all eukaryotic proteins are multidomain proteins [24].
Was it solved by AF2 at least for single domains of water-soluble proteins? There is no proof of that because
  1. Many proteins are represented by just a single sequence, or by a few related sequences, in sequence databases, in which case one cannot make a large sequence alignment. However, the AF2 method is actually based on using large, high-quality sequence alignments. We do not know if AF2 was tested for such cases, and how it performed.
  2. As follows from the presentations at CASP14 (for example, [25]), AF2 did NOT achieve the accuracy of experimental methods. Moreover, looking at the distance cutoff-sequence coverage graphs here for specific CASP14 targets (T1024, T1027, T1028, T1029, T1030, T1032, T1040, T1047, T1052, T1061, T1070, T1091, T1092, T1099, T1100), one can see they are not even very close. For example, T1024 has only 50% of residues covered by the best models for a distance cutoff of 2 Å. Yes, they correctly predicted the protein "fold", even the family where it belongs (which is great!), but this is a far cry from solving the "protein folding" problem.
  3. AF2 is not publicly available for independent evaluation.
  4. AF2, and the assessment of AF2, have not been published in any peer-reviewed sources, let alone WP:MEDRS sources.
  5. The GDT measure used by the CASP assessors is a poor (insensitive) measure of performance for high-precision modeling. Having a GDT of 90 or 60 (e.g. [26]) does not mean that 90% or 60% of the structure was predicted with the same accuracy as provided by X-ray crystallography, for example.
My conclusion: Hook ALT 3.1 is misinformation. Do not do it. My very best wishes (talk) 14:31, 18 December 2020 (UTC)[reply]
  • Following the comments above by My very best wishes, and aiming to reach a wide consensus, I propose the following alternative hook:
ALT 3.2 ... that DeepMind's protein-folding AI AlphaFold 2 has made great progress towards a decades-old grand challenge of biology? (source: Nature).

Alexcalamaro (talk) 08:05, 19 December 2020 (UTC)[reply]

Yes, I think that's OK, with one correction: if you need a ref, it should be this [27]. That MIT writer makes too many incorrect claims, such as that AF2 used 170,000 PDB structures for training (they used fewer), etc. My very best wishes (talk) 21:19, 19 December 2020 (UTC)[reply]
  • Comment I have changed the source of ALT3.2 to Nature, and added the hook text to the "Responses" section of the article (to meet the Hooks criteria). We need more reviewers to validate the proposal. Alexcalamaro (talk) 06:31, 21 December 2020 (UTC)[reply]
Hey @Yoninah and Ktin: I think we have a consensus here with ALT3.2. I am not very familiar with these matters. What is the next step in the DYK process? Thank you. Alexcalamaro (talk) 21:40, 26 December 2020 (UTC)[reply]
Missed this one. @Yoninah: as an uninvolved editor, please can you help review this one? I know this has been waiting for a long time, but, worth wrapping this one imo. Appreciate your helping hand in the review. Ktin (talk) 22:56, 2 January 2021 (UTC)[reply]
  • OK, ALT3.2 looks good but there is a bit of run-on blue linking in the beginning of the hook. What words don't need to be linked? I also would like to know why the two images from a CASP presentation are licensed as fair use. It seems to me that OTRS permission should be obtained from the author. Alternately, can't someone draw up a similar graph that would be freely licensed? Yoninah (talk) 18:04, 9 January 2021 (UTC)[reply]
The last point is addressed in the fair-use rationales. Regarding OTRS permission, before Christmas I emailed the DeepMind press account for the block-diagram image, and both the CASP account and John Moult for the graph, and didn't get back a reply from any of them. Jheald (talk) 19:12, 22 January 2021 (UTC)[reply]
As for the hook, I would suggest unlinking "AI", as that should be pretty obvious and is a term known to most people. Jheald, still no update on the OTRS? Yoninah seems to be on a short break atm, but I think this DYK should be finished at some point. It's the only one remaining from November. --LordPeterII (talk) 12:03, 3 February 2021 (UTC)[reply]
@LordPeterII: Thanks much. This has been waiting for quite some time. Thanks again for picking this up. Let's go without the image. I have written ALT 3.3 with AI removed. Appreciate your approval. Thanks. Ktin (talk) 03:52, 4 February 2021 (UTC)[reply]
ALT 3.3 ... that DeepMind's protein-folding program AlphaFold 2 has made great progress towards a decades-old grand challenge of biology?
@Ktin: Oh, I'm afraid I don't feel confident enough to approve this nomination myself :/ I have never reviewed anything, and this article's topic is rather complex. I was merely pinging to inquire about the progress, to get the discussion going again. Maybe some experienced editor or admin can help out... maybe @Cwmhiraeth would you have time for this? (sorry, I'm not really sure whom to ask) --LordPeterII (talk) 09:35, 4 February 2021 (UTC)[reply]

Claim: "AF2 has solved the protein structure prediction problem"

This claim is not in the article (removal diffs), for reasons set out by User:My very best wishes above.

At first I thought that missing it out was 'serving the steak without the sizzle', given how much the claim was repeated in the media (and by the CASP organisers). But, actually, I think the article reads all right without it. And what does "solving the protein structure prediction problem" mean anyway? Jheald (talk) 19:10, 8 December 2020 (UTC)[reply]

Yes, we cannot say that "AF2 has solved the protein structure prediction problem" - for several reasons:
  1. Speaking about "protein structures" in nature, they are typically structures of protein complexes (proteins never work alone), but the AF2 team did not even try to predict structures of protein complexes;
  2. According to the presentation by the AF2 people, the results were much worse if the sequence alignment used as input for AlphaFold included only one or a few sequences. Well, maybe 40% of proteins in genomes would belong to that category, although I do not know the exact number (it can probably be found somewhere);
  3. The testing at CASP is grossly insufficient to make such a claim. Basically, the method was tested on only a few dozen proteins (those in CASP-14), while there are 120,000 protein structures in the PDB. Of course one does not need to check all these thousands, but only a limited number of cases which are known in advance to be the most challenging for AlphaFold (linear peptides, proteins with multiple alternative structures (there are hundreds of such cases in the PDB), proteins which form intertwined complexes [28], single sequences and small or questionable alignments as input, "intrinsically unfolded" proteins). That is how such methods should be tested.
  4. Such a claim was not made in any sources which would qualify as WP:MEDRS. My very best wishes (talk) 23:07, 8 December 2020 (UTC)[reply]
@My very best wishes: Can you clarify where (2) above was said? Was it a presentation about AF2 or AF1? Jheald (talk) 08:56, 9 December 2020 (UTC)[reply]
For general reference, here are some of the statements that have been made:
  • John Moult: "This is a big deal. In some sense, the problem is solved." (Nature, 30 November)
  • CASP press release ([29]) "Artificial intelligence solution to a 50-year-old science challenge" (30 November)
  • John Moult: "But actually two-thirds of these points... are above 90; and in that sense, this is a real, atomic-level, competitive-with-experiment solution to the protein folding problem. This is a really extraordinary set of data. I wasn't sure I was going to live long enough to see this, but here it is. Again, with a few caveats, but the caveats are relatively minor." (CASP 14 introductory presentation, 30 November. CASP stream day 1 part 1, at 0:30:30)
  • John Moult: "In a serious sense the single protein single domain problem is largely solved" (CASP 14 closing presentation, 4 December. CASP stream day 5 part 3, at 0:04:40), cf tweet
but later "there was some talk that we should abandon this in CASP, as it is a solved problem. Clearly that's not true. It is solved by one group, and it is one solution; and I think what we heard in the nice discussion earlier today is there are probably many ways of solving this problem, and I think it is very important that we continue to monitor those, and see which actually work." (same, from 0:05.20)
Some pushback (most of which we note in the article):
See also discussion in AlQuraishi's new essay (he thinks it has been solved; summary of points to follow & comments).
User:My very best wishes's point about WP:MEDRS is a strong one (and "exceptional claims demand exceptional sourcing" -- WP:EXTRAORDINARY). Also that it may be better to work around a dubious or false claim, rather than to introduce it even with a rebuttal. Jheald (talk) 09:27, 9 December 2020 (UTC)[reply]
The poor performance for single sequences. Yes, this is something they said in 2018. What did they say about it this year? BTW, testing the performance of a method for single sequences versus sequence alignments is more or less standard. The authors know it. Did the authors do such a test for predicting structures from the PDB, and what results did they get for AlphaFold-2? That could be a typical question from a reviewer of their future paper. My very best wishes (talk) 16:15, 9 December 2020 (UTC)[reply]
OK, I checked the concluding talk by John Moult accessible on Google [30], and I agree with him about everything except one point: that the problem of 3D prediction was largely solved by one group for single protein domains. Actually, we do not know that, because of insufficient testing at CASP-14 (see my point #3 above). I would like to ask the developers of AlphaFold2 (AF2) what the performance of their method was for the following categories of proteins:
  1. Transmembrane alpha-helical proteins. There was only one such example in the CASP14 dataset, with a simplest 4-helical fold. The server http://www.predmp.com/#/home makes pretty good predictions for such simple cases, but it fails for more difficult TM folds. Why should AF2 be better? Perhaps it is, but there is no proof.
  2. Linear peptides with environment-dependent conformations studied by solution NMR. There were no such cases in the CASP14 dataset.
  3. Proteins with multiple very different structures, like transmembrane transporters. There were no such cases in the CASP14 dataset.
  4. Proteins which form intertwined complexes, like CFTR. There were probably such cases in the CAPRI dataset [31], but it seems that AF2 was not tested for such cases, probably because they are beyond the capability of the method.
  5. Proteins represented by a single sequence/small alignment in genomes. This is easy to check by simply using one sequence from the sequence alignment, but the developers of AF2 did not say anything about it at the meeting (did they?). In fact, AF2 is inherently based on using large sequence alignments, hence it is not expected to work (at all?!) for single sequences, which is the case for a lot of proteins.
  6. The measure of success (GDT) at CASP artificially inflates the success rate. They are supposed to use the percentage of coverage with a single distance cutoff of ~1.0 Å for all atoms, calculated only for model #1, rather than a set of distance cutoffs of up to 8 Å calculated for "the best of five" predictions (a rough sketch of such a measure is given below). My very best wishes (talk) 15:26, 12 December 2020 (UTC)[reply]
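A rough sketch of what that stricter measure would look like in code (this is one reading of the proposal above, not an existing CASP metric; it assumes the model and experimental structure are matched atom-for-atom and already optimally superimposed, e.g. by the Kabsch algorithm):

    import numpy as np

    def strict_coverage(model_xyz, expt_xyz, cutoff=1.0):
        # model_xyz, expt_xyz: (N, 3) coordinates of ALL atoms of model #1
        # and of the experimental structure, after superposition.
        # Returns the percentage of atoms within `cutoff` angstroms.
        d = np.linalg.norm(np.asarray(model_xyz, float)
                           - np.asarray(expt_xyz, float), axis=1)
        return 100.0 * float(np.mean(d <= cutoff))

Unlike GDT_TS, a model that places no atom within the single cutoff scores exactly zero here.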
Of course anyone could check this themselves if AlphaFold were publicly available and easy to use, but it is not available. The last figure on this page shows that the performance of automated web servers at CASP14 has reached the level of the best predictions made by AF-1 two years ago at CASP13. I hope the same will happen over the next two years. My very best wishes (talk) 21:40, 9 December 2020 (UTC)[reply]

Claim: "AF2's predictions of structures are as good as those produced experimentally."

The claim should be treated with caution, per User:My very best wishes above (same removal diffs). It is not clear exactly on what basis the organisers have been making this claim, which has been contested. According to mvbw it is "obviously a false statement". There may be some nuance, but at the very least we should be cautious until we can clarify exactly on what basis it was being made. (Even though AF2's results were spectacularly good.) Jheald (talk) 19:19, 8 December 2020 (UTC)[reply]

Yes, the predicted structures are "very close" for most (not all) CASP-14 targets, but they are not "as good". We cannot say that "AF2's predictions of structures are as good as those produced experimentally" simply because they are not as good - according to the CASP-14 assessment (such as this; see plots on slides 16-18, one of many excellent illustrations by the assessors). These assessments are currently available only as PowerPoint presentations. This will become even clearer when the official CASP-14 assessment is actually published somewhere. My very best wishes (talk) 22:55, 8 December 2020 (UTC)[reply]