Talk:GISAID

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 2603:6080:6502:e900:b7f:3d05:7e:3e7c (talk) at 04:46, 18 August 2021 (Additional discussion). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Template:WikiProject Genetics

WikiProject COVID-19 (C-class, Low-importance)
This article is within the scope of WikiProject COVID-19, a project to coordinate efforts to improve all COVID-19-related articles. If you would like to help, you are invited to join and to participate in project discussions.
C: This article has been rated as C-class on Wikipedia's content assessment scale.
Low: This article has been rated as Low-importance on the project's importance scale.

All?

An anonymous editor changed the "Avian" in the GISAID name to "All". While this makes sense, I can't find any evidence that the name was actually changed, so have reverted. Pol098 (talk) 12:59, 19 June 2009 (UTC)

If you look in the Nature article from Aug 2006 (https://www.nature.com/articles/442981a), the "A" originally stood for "Avian". However, by 2010, the "A" seems to be considered to stand for "All", according to "Influenza pathogen database of global significance set up in Bonn". BMEL Homepage. — Preceding unsigned comment added by 2620:0:691:4:0:0:0:58 (talk) 22:55, 9 July 2020 (UTC)

Reply

GISAID's homepage http://gisaid.org/ clearly states under the GISAID Foundation tab that it is called Global Initiative on Sharing All Influenza Data. Perhaps this was updated after Pol098's entry on 19 June 2009. —Preceding unsigned comment added by 114.251.14.2 (talk) 02:56, 24 March 2010 (UTC)

Logo update?

The GISAID logo seems to be updated with a gradient fill. I don't have experience loading new images, and I hope that's not a copyright issue, but I'll try to figure out how to do it on Commons. - AppleBsTime (talk) 03:36, 5 June 2020 (UTC)

A concern

Hello, I have been taking some of my time on Wikipedia to improve this article, quite relevant during the COVID-19 pandemic, with newer sources and a more readable intro. I am noticing multiple edits made (and re-inserted) by IP addresses appearing to have the single purpose of editing Wikipedia exclusively about GISAID, yet no other subjects. The basis of these edits repeatedly seeks to convey the perception that GISAID's terms of access are "restrictive". Ironically, GISAID terms of use are not at all dissimilar to those of Wikipedia itself. Participating scientists are free to contribute or read from the database, just as long as they agree to appropriately acknowledge the contributors of the information they use. Contributors of data can freely choose whether they don’t care about any of their rights and deposit in public-domain archives, or whether they share in a transparent manner preserving some of their rights, and thus share with the public via GISAID. It's obviously a model that works -- Wikipedia has millions of articles under the Creative Commons Attribution license, and GISAID has over a million flu and about 50k genomic sequences of the virus causing COVID-19, contributed by thousands of laboratories under its usage license. Calling this a "restriction" is far less accurate than calling it "terms of use" or "regulating" how data are shared. I would like other contributors to consider this and respond, as I fear that this IP editor (or editors) is pushing an agenda and may be unlikely to form consensus. - AppleBsTime (talk) 04:53, 20 June 2020 (UTC)

No response to this note, nor anything heard from the IP editors (who were notified). Given that, I am going to revert the single-purpose IP edits at this time. Happy to engage in more discussion (anything is better than zero), if that's seen as problematic. - AppleBsTime (talk) 15:07, 27 June 2020 (UTC)
Response to concern: First, GISAID's terms of use do not only require that scientists cite the data that they use. The terms also require that scientists "agree to make best efforts to collaborate with representatives of the Originating Laboratory responsible for obtaining the specimen(s) and involve them in such analyses and further research using such Data." This is now hidden further in the submission process, but you still have to sign it to get access. Second, these terms are quite dissimilar to those of Wikipedia: Wikipedia does not restrict people from reading it, but GISAID does. Third, I think that GISAID is most naturally compared to other DNA sequence databases, not Wikipedia. Since those databases do not impose terms forbidding users from sharing data or reading data, I think that saying that GISAID "restricts" the use of data helpfully clarifies how GISAID differs from other similar databases. However, I do not think this is a hill worth dying on. If you want to say "govern" I don't care that much. Fourth, the original article sought to suppress debate and discussion about whether restrictive access agreements promote data sharing by simply saying that GISAID promotes data sharing. In theory, this could be true. Perhaps scientists are more willing to share their data when they know they will have more control over it after they share it. But the fact that GISAID has taken this path should not be hidden, nor should disagreement be suppressed about whether this approach leads to science that is more open, or more closed. On a slightly different note... GISAID's divergence from its initial goal seems somewhat puzzling. It seems like the initial goal was to allow scientists to share data before first publication, and allow public domain usage after first publication. This is the general model of scientific data sharing, and it makes sense that scientists would be hesitant to share avian flu data before they had gotten any credit.
However, at some point, this all changed to GISAID's current model of public-domain-never, and it seems very unclear who made this decision, when, and why. It is also unclear (though I see no conspiracy here) when the "A" in GISAID changed from "Avian" to "All". Wouldn't you like to know? — Preceding unsigned comment added by 2620:0:691:4:0:0:0:58 (talk) 22:45, 9 July 2020 (UTC)

An inverse concern

Much of this page appears (i) overly positive, to the extent that it serves as an advertisement for GISAID and (ii) uses sources that simply quote GISAID's positive description of itself. For example, GISAID's web page says that it overcomes "disincentive hurdles and restrictions". The claim about disincentive hurdles is interesting, though vague. However, no example of "restrictions" is given. And yet, the current page repeats the claim that data sharing was "restricted". Additionally, the History section contains a list of "endorsements", which sounds like an advertisement, not a balanced description. Here is a list of other concerns:

  • what are "submitters rights"? It is neither clear what specific "rights" are being claimed, nor what makes these things a "right".
  • what does it mean that WHO member states were concerned about sharing data? As far as I know, countries and states do not share data: individual scientists do. If I am wrong, that would be interesting. However, if I am right, this makes no sense.
  • what, exactly, does GISAID do to prevent sharing researchers being scooped pre-publication? And how does this differ from post-publication?
  • why exactly is "verification of users" supposed to be a positive thing that public-domain databases do not offer?

Finally, user AppleBsTime has removed interesting facts, seemingly because they reflect negatively on GISAID. For example, the original signed letter calls for shared sequences to be deposited in public databases eventually, which would then allow scientists to share pre-publication data without being scooped while still not restricting post-publication data. But AppleBsTime removed this comment, even though it had a citation.

— Preceding unsigned comment added by 2620:0:691:4:0:0:0:1B (talk) 23:28, 15 August 2020 (UTC)

A response from an experienced editor

I want to thank the Duke University IP address(es) for this opportunity to re-examine the Wikipedia article about GISAID from his/her perspective. It is reassuring and a healthy process to mutually share a common goal to make this article as informative and accurate as it can be, especially within the policies and guidelines of Wikipedia. The IP editor may not be familiar with all of Wikipedia’s practices, such as registering an account to build trust and gain access to more functionality, or such as signing comments with four tildes. I have been an editor for a number of years, having made over 500 edits to hundreds of different articles, and even created a handful of new articles. This is a nice opportunity to share with you some of what I’ve learned on Wikipedia, since you seem to have experience only with this one article about GISAID.
The first concern of the Duke University IP editor is that the page "appears overly positive". Given the significant amount of coverage of the role of GISAID from reliable sources over a considerable time period, versus the rather limited edits to this article during that same time period, no evidence is given to support your concern that "it serves as an advertisement for GISAID and uses sources that simply quote GISAID's positive description of itself."
Frankly, I suspect that we’re seeing an outcome of Wikipedia’s reliance on independent sources to reliably document how a subject topic should be characterized. This Wikipedia article has been built over the past 13 years and at that, very sporadically. If one looks up "problems with GISAID" or "trouble with GISAID" in a search engine, one will not find anything. Try "criticism of GISAID" on any search engine, and you will find about 2-3 results, which appear to be blog entries in the vein of rants, or Reddit posts, rather than the journalistic or peer-reviewed concerns for which Wikipedians search. Sources like websites operated by the originator of a complaint about a subject or a Reddit conversation about a subject are generally not allowed as reference sources in Wikipedia -- unless they become newsworthy themselves (e.g., if Dr. Ghebreyesus or Dr. Fauci were to start participating in the Reddit conversation, and this got picked up by the BBC or Associated Press).
A second concern presented by the Duke University IP editor is that some of the sources in the article are simply citations of GISAID's own material. While editors must ensure articles follow Wikipedia's content guidelines, the guideline on self-published sources used as sources of information technically allows a limited amount of self-sourcing (but never in an unduly self-serving way). The suggestion that these edits are driven by GISAID itself seems far-fetched, as it is not substantiated. Nonetheless, it is clear that improvements to this article (in particular in sections that have not been vetted) can and should be made, and that finding independent, third-party sources to replace some GISAID.org sources would improve the article's quality, so it will not be merely categorized as a 'Start-Class Genetics' / 'Low-importance Genetics' / 'C-Class COVID-19' / or 'Low-importance COVID-19' article.
Currently, I count 3 references to GISAID materials out of 35 total references in this article. That doesn't seem undue or self-serving, compared to other Wikipedia articles about organizations.
With regard to the point-by-point "other concerns" itemized by the Duke University IP editor, allow me to address these as follows.
  • What are "submitters' rights"?
GISAID's Terms of Use, aka the Database Access Agreement, states in section 2a: "This Agreement does not transfer any other rights or ownership interests in the Data" and further in section 2c, the rights of the "Originating Laboratory where the clinical specimen or virus isolate was first obtained and the Submitting Laboratory where sequence data have been generated and submitted" are acknowledged.
In addition to a significant number of published reliable sources, please take note of the written Statement to the World Health Organization, given by the Federal Republic of Germany in 2015, which makes it clear GISAID employs "a unique sharing mechanism which ensures that inherent rights (e.g. IPR) of contributors of GSD are not forfeit."
  • What does it mean that WHO member states were concerned about sharing data? As far as I know, countries and states do not share data: individual scientists do. If I am wrong, that would be interesting. However, if I am right, this makes no sense.
As a matter of fact, all countries and states decide how data are shared when it comes to pathogens, as is evident from governments regulating the safety levels of handling pathogens in the first place (see BSL biosafety levels, for example). The headlines we see read "Indonesia hands over bird flu data to new database", rather than "An individual scientist in Indonesia hands over bird flu data". It's also why we see wording in this article like, "China, Russia and other nations that have withheld virus samples...", rather than "Individual scientists in China, Russia and other nations…", or "… sequences for the novel coronavirus (2019-nCoV) … submitted by Chinese authorities to the GISAID platform", rather than "submitted by a[n] [individual] Chinese scientist".
  • What, exactly, does GISAID do to prevent sharing researchers being scooped pre-publication?
The article states "GISAID sought to address medical researchers' reticence about sharing." Reading GISAID's Terms of Use makes it crystal clear: "Your rights and privileges under this Agreement will terminate automatically and without need for written notice upon any breach by You of any term of this Agreement." Given that a username/password procedure is in place, GISAID can very well enforce/sanction violators who scoop, irrespective of a paper having been peer-reviewed or not. The sheer volume of emerging coronavirus genetic data and metadata in GISAID, as well as influenza data, when compared to public-domain archives, is evidence that GISAID has somehow addressed researchers' reticence about sharing. Though I should also say, it's not Wikipedia's responsibility to document a process, when the claim is merely that the organization sought to address a problem.
  • Why exactly is "verification of users" supposed to be a positive thing that public-domain databases do not offer?
I'm doubtful that this is Wikipedia's responsibility to prove. It's our job as editors to find if reliable sources say that it is the case that GISAID provides "verification of users" which, for example, public-domain archives that permit anonymous access do not. The Catherine Saez article in Intellectual Property Watch (a publication utilized in dozens of Wikipedia articles) says that verification of users is something that GISAID provides that other public-domain databases do not. We don't speculate on why that's a positive thing, because that would be original research, which is forbidden by firm policy on Wikipedia.
Per the complaint that I removed "interesting facts, seemingly because they reflect negatively on GISAID", sorry to say, I removed some content because it pertained to a correspondence letter that conceived of an idea prior to the formation of the actual organization that the Wikipedia article is about.
I considered it misleading to suggest to readers that the correspondence letter in Nature called for sequences to be "deposited in the three publicly available databases participating in the International Sequence Database Collaboration" while omitting the preceding text, i.e., the proposal "to expand and complement existing efforts with the creation of a global consortium".
A consortium is an association of two or more individuals, companies, organizations, or governments and will by default not be open to the public. Even a Nature editorial understood at the time that an "Agreement on the principles of GISAID is only a beginning, however. Prompt progress in establishing the ground rules for sharing will be essential to build confidence and momentum." The peer-reviewed Elbe et al (2017) also addressed this correspondence letter: "However, … notwithstanding its good intentions, the brief letter still lacked much practical detail, and that the core issues of transparency and equity of data sharing would likely remain unresolved if data archives with anonymous access to data (like Genbank) were used."
Eighteen months after the correspondence letter appeared in Nature, GISAID did provide ground rules for sharing, by providing immediate access to the public and not merely to a consortium.
It would be like saying that Thomas Edison should be extensively criticized in Wikipedia for not sticking to his initial idea that platinum should be the filament in an incandescent light bulb, when he later found that carbonized bamboo was a much more practical, inexpensive, and longer-lasting solution. So, please, I’d ask that you not assail my removal of some content in the interest of making an article less confusing. It wasn’t about something "reflecting negatively" on the subject.
The lede historically had been too promotional, but after it was cut back, it was rather confusing and didn't even mention GISAID's contemporary work on the coronavirus pandemic. That's when I stepped in to edit the article. I'm not trying to paint a rosy picture, but the independent sources say things that simply recognize the success of GISAID.
I'll close with an interesting article from Duke University's publication Duke Today, where a professor of immunology, pathology, pediatrics, molecular genetics and microbiology is asked what she trusts for information about COVID-19… and without any complaint about restrictions, user verification, or submitters' rights, she says "For the latest on viral sequence dynamics, I check gisaid.org." - AppleBsTime (talk) 15:17, 5 September 2020 (UTC)

A balanced view

There seems to be something of a disagreement in the 2 sections above. I just heard an excellent radio show/podcast on NPR, that gives a lot of info on this topic, but I wanted to check it out here. My general impression is that it considers the same disagreement as above with views from both sides.

  • "On the Media - Not a Perfect Science" (podcast). National Public Radio. 28 May 2021. Retrieved 30 May 2021.

Smallbones(smalltalk) 22:21, 30 May 2021 (UTC)

Journalist Meredith Wadman has kind of flipped back and forth on the issue, herself. - 97.64.141.154 (talk) 13:24, 18 June 2021 (UTC)

Additional discussion

From the Duke researcher mentioned above: these articles from Nature and Science also talk about this disagreement:

I also note that http://www.nextstrain.org has begun offering two coronavirus tree reconstructions, one labeled "Latest Global Analysis - GISAID data", and one labeled "Latest Global Analysis - open data". I am still confused about whether GISAID is trying to restrict PRE-publication data sharing or POST-publication data sharing. It sounds to me like it restricts both equally. 2603:6080:6502:E900:B7F:3D05:7E:3E7C (talk) 04:46, 18 August 2021 (UTC)