Talk:GenBank

This article is of interest to the following WikiProjects:

WikiProject United States / National Institutes of Health (Rated Start-class, Low-importance)
This article is within the scope of WikiProject United States, a collaborative effort to improve the coverage of topics relating to the United States of America on Wikipedia. If you would like to participate, please visit the project page, where you can join the ongoing discussions.
This article has been rated as Start-Class on the project's quality scale.
This article has been rated as Low-importance on the project's importance scale.
This article is supported by WikiProject National Institutes of Health.

WikiProject Molecular and Cellular Biology (Rated Start-class, Mid-importance)
This article is within the scope of the WikiProject Molecular and Cellular Biology. To participate, visit the WikiProject for more information.
This article has been rated as Start-Class on the project's quality scale.
This article has been rated as Mid-importance on the project's importance scale.

WikiProject National Institutes of Health
This article is within the scope of WikiProject National Institutes of Health, a collaborative effort to improve the coverage of the National Institutes of Health on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

WikiProject Computational Biology (Rated Start-class, High-importance)
This article is within the scope of WikiProject Computational Biology, a collaborative effort to improve the coverage of Computational Biology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-Class on the quality scale.
This article has been rated as High-importance on the importance scale.

WikiProject Unique Identifiers
GenBank is part of, or of interest to, WikiProject Unique Identifiers, which encourages the use of unique identifiers in Wikipedia, and documents them in the article space. If you would like to participate, visit the project page.


Bias?

The way this article is written reads as a press release for GenBank; it doesn't read as an objective article written by disinterested parties. I have nothing against GenBank (quite the opposite, in fact), but even so, I think articles should sound 'encyclopaedic'. This one does not. I don't have the expertise to rewrite it, so I can only make the suggestion. — Preceding unsigned comment added by 81.227.236.252 (talk) 07:32, 12 October 2012 (UTC)

  • I don't agree with you at all. The article can always be improved, but it is in good shape so far. --Thorwald (talk) 02:09, 13 October 2012 (UTC)

Sequin

I really don't think that the "Sequin" link is the one that corresponds to this article. --Piotrr 17:46, 15 December 2005 (UTC)

Doubling every 10 months?

That doesn't seem right to me. If we take the very first release and the very last release and assume a uniform rate of increase, the database is doubling every 18 months. (Actually 17.66 months.) Looking at the graph, it is quite clear that the rate is not uniform, but even taking just the longest period of approximately uniform increase (from February 1987, n=10961380, to October 2000, n=10335692655) the doubling time is a bit more than 16 months. Since February 2003, the doubling time is $\frac{56\ \text{mon}}{\log_2 \frac{81563399765}{29358082791}} \approx 38\ \text{mon}$. 121a0012 21:00, 10 November 2007 (UTC)
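
For anyone who wants to check the arithmetic in the comment above, here is a minimal sketch. It assumes constant exponential growth between the two endpoints of each interval and counts elapsed time in whole calendar months. The helper names `months_between` and `doubling_time` are illustrative only and are not taken from the release notes; the base counts are the ones quoted in the comment.

```python
import math


def months_between(start, end):
    """Whole calendar months between two (year, month) tuples."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])


def doubling_time(months, n_start, n_end):
    """Doubling time in months, assuming constant exponential growth
    from n_start to n_end over the given number of months."""
    return months / math.log2(n_end / n_start)


# Base counts as quoted in the comment above (assumed to be copied
# correctly from section 2.2.8 of the release notes).
t_1987_2000 = doubling_time(
    months_between((1987, 2), (2000, 10)), 10961380, 10335692655
)
t_2003_2007 = doubling_time(
    months_between((2003, 2), (2007, 10)), 29358082791, 81563399765
)

print(f"Feb 1987 to Oct 2000: {t_1987_2000:.1f} months per doubling")
print(f"Feb 2003 to Oct 2007: {t_2003_2007:.1f} months per doubling")
```

Under those assumptions the first interval comes out at a bit more than 16 months and the second at roughly 38 months, which matches the figures in the comment. A longer doubling time in the later interval means slower growth, not faster.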

  • Actually, the rate of increase is nowhere near uniform. The amount of new data added to GenBank every month is increasing super-exponentially. There has even been some theorizing that it may eventually start doubling every day! --Thorwald 23:00, 10 November 2007 (UTC)
There's no evidence of that in the data shown in the current (October 2007) release notes, your alleged source. Section 2.2.8 ("Growth of GenBank") says pretty much exactly what I said: "From 1982 to the present, the number of bases in GenBank has doubled approximately every 18 months." So please provide a source that actually supports your claim. 121a0012 02:30, 11 November 2007 (UTC)
GenBank is not my "alleged source", it is my source. The data are very clear: it is super-exponential growth. It does not follow Moore's Law (or doubling every 18 months), as you suggest (that is from older data; it is doubling faster and faster every month). Anyone in my field, bioinformatics, knows, understands, and deals with this tremendous growth on a daily basis. Disc space and computational power are nowhere near keeping up with our growth rates. If you need any more proof, a simple query of PubMed will return hundreds of results describing just this. From one university course description on just this problem: "That data repositories such as the GenBank sequence database are growing exponentially is commonly understood. There are now research tools that are generating data at much faster rates, well beyond Moore's Law."[1] (PDF). I don't know how you can look at that growth plot and not see super-exponential growth. --Thorwald 02:55, 11 November 2007 (UTC)
I don't look at the growth plot; I look at the numbers published in section 2.2.8 of the GenBank release notes. The actual numbers don't support your claim. Plotting this on a semi-log scale, as I did, makes it clear that the growth, while still exponential, is significantly slower than it was several years ago. (And if you compare the data points on your plot to mine, it is self-evident that you are using the same figures, so where you get this idea that the growth is greater than exponential is a mystery to me.) 121a0012 03:15, 11 November 2007 (UTC)
PS: By the way, I was not the one who wrote that it doubles every 10 months. NCBI claims: "Over the past decade, the growth of GenBank has followed an exponential curve with a doubling time of between 12 and 15 months" (not 18 months, as it was before) [2]. However, what I do claim, and what I am arguing here, is that its doubling time is quickly decreasing, and some have theorized that it might eventually reach a daily doubling rate. You wrote that the rate of increase is "uniform"; I am arguing that is not the case. That is all. --Thorwald 03:05, 11 November 2007 (UTC)
And that claim, from 2004, was actually correct for a brief period before the time it was published. The rate of doubling was, in fact, approximately uniform during the period I named (1987 to 2000), as is evident from the plot. 121a0012 03:15, 11 November 2007 (UTC)

Image

Just looking at the deleted image... this was removed from the main page, but kept here for safety.

Is it just me, or does using Wikipedia get harder every day? /end grumble ;-) --Dan|(talk) 08:15, 27 February 2008 (UTC)