Wikipedia:Size of Wikipedia

From Wikipedia, the free encyclopedia

An image hypothesising the size of a printed version of Wikipedia as of August 2007.
Wikipedia size & users
English articles: 2,935,174
Average revisions: 108
Total wiki pages: 17,320,812
Total admins: 1,666
Total users: 10,029,607
UTC time: 20:02 on 2009-Jul-6


This Wikipedia:Statistics page measures the size of the English-language edition of Wikipedia, mostly by page and article count. There are currently 2,935,174 articles in the English Wikipedia.

Most of the earlier entries were extracted from Wikipedia:Announcements. Later entries are taken from observations of the new software's built-in article count features.


Wikipedia growth

Wikipedia's growth approximately follows a logistic growth model. This model is based on:

  • more content leads to more traffic, which in turn leads to more new content
  • however, more content also leads to less potential content, and hence less new content
  • the limit is the combined expertise of the possible participants.
Number of articles on en.wikipedia.org, with logistic extrapolations to a maximum of 3, 3.5 and 4 million articles. Note: the thick line (dark blue) represents the actual article count, smoothed to match the thin model lines.
Article growth per month (6-month average, smoothed at Oct 2002), with extrapolations to a maximum of 3, 3.5 or 4 million articles. Note: the thick line (dark blue) represents the actual growth; in every year since 2002 it has jumped higher (steeper line) in the 3rd quarter (after July) and then slowed later.


Some characteristics of this model are:

  • there is a maximum number of articles. On Wikipedia this is hard to imagine, as there will always be new events and people to describe. Compared with the large number of existing articles, however, this effect is very small, and new articles will be offset by deletions or merges of older articles.
  • at the end, growth is zero.
  • at the pivot point (halfway to the maximum), growth is at its peak. For en.wikipedia.org this might have been in August 2006, with 60,000 new articles a month.
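
The characteristics above can be checked numerically with a minimal sketch of a logistic curve. The maximum of 3.5 million articles and the peak of 60,000 new articles a month are taken from this page; the function names, the choice of measuring time in months relative to the pivot point, and the way the rate constant is derived from the peak are assumptions made for illustration, not a fit to the actual data.

```python
import math

def logistic(t, k, r, t0):
    """Article count at time t for a logistic curve with maximum k,
    rate constant r and pivot point t0."""
    return k / (1.0 + math.exp(-r * (t - t0)))

def growth_rate(t, k, r, t0):
    """New articles per unit time: the derivative r * N * (1 - N / k)."""
    n = logistic(t, k, r, t0)
    return r * n * (1.0 - n / k)

# Illustrative parameters (assumptions, not fitted values).
K = 3_500_000      # one of the three maxima used in the extrapolations
PEAK = 60_000      # peak growth in articles per month (August 2006 estimate)
R = 4 * PEAK / K   # at the pivot the growth is r * k / 4, so r = 4 * peak / k
T0 = 0.0           # time is measured in months relative to the pivot point

if __name__ == "__main__":
    for months in (-48, -24, 0, 24, 48, 96):
        print(months,
              round(logistic(months, K, R, T0)),
              round(growth_rate(months, K, R, T0)))
```

In this sketch the count at the pivot is exactly half the maximum, the growth there equals the chosen peak, and the growth falls towards zero on either side, which is the bell shape of the extrapolation curves.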

This model describes only the quantity (number of articles). The quality might still increase independently.

Graphs of size and growth rate

The two graphs show the projected total number of articles (first graph) and the monthly growth rate (second graph), which has slowed (line sloping downward) since late 2006.

In the second graph, the three extrapolation lines (the thin, light blue bell curves) show projected growth to a maximum of 3, 3.5 or 4 million articles. The three curves continue, outside the graph, into the year 2014 or 2015.

Note that in every year since 2002 the actual article growth rate (thick, dark blue line) has jumped higher (steeper line) during the 3rd quarter (after July) and then slowed later. The historic trend has been that new articles are added faster after July than in months such as December or February.

New articles have still been added every month, but the rate of increase was typically somewhat lower in each December.

Annual growth rate

For the English Wikipedia.

Date        Article count  Increase during  % increase during  Average increase per day
                           preceding year   preceding year     during preceding year
2002-01-01  19,700         19,700           -                  54
2003-01-01  96,500         76,800           390%               210
2004-01-01  188,800        92,300           96%                253
2005-01-01  438,500        249,700          132%               682
2006-01-01  895,000        456,500          104%               1251
2007-01-01  1,560,000      665,000          74%                1822
2008-01-01  2,153,000      593,000          38%                1625
2009-01-01  2,679,000      526,000          24%                1437
2009-07-06  2,935,174      256,174 [a]      --                 ~1377 [a]
Notes: [a] Calculated live for the year so far, as this is only a partial year.
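
The derived columns can be recomputed from the article counts alone. The following is a minimal sketch, assuming the percentage is the increase divided by the count at the start of the period and the daily average is the increase divided by the number of calendar days in the period (so leap years count 366 days); the function and variable names are illustrative.

```python
from datetime import date

# (date, article count) pairs copied from the annual growth table.
COUNTS = [
    (date(2002, 1, 1), 19_700),
    (date(2003, 1, 1), 96_500),
    (date(2004, 1, 1), 188_800),
    (date(2005, 1, 1), 438_500),
    (date(2006, 1, 1), 895_000),
    (date(2007, 1, 1), 1_560_000),
    (date(2008, 1, 1), 2_153_000),
    (date(2009, 1, 1), 2_679_000),
    (date(2009, 7, 6), 2_935_174),  # partial year
]

def growth_rows(counts):
    """Return (date, increase, percent increase, average increase per day)
    for each entry, relative to the entry before it."""
    rows = []
    for (d0, n0), (d1, n1) in zip(counts, counts[1:]):
        increase = n1 - n0
        days = (d1 - d0).days
        rows.append((d1, increase,
                     round(100 * increase / n0),
                     round(increase / days)))
    return rows

if __name__ == "__main__":
    for day, increase, percent, per_day in growth_rows(COUNTS):
        print(day.isoformat(), increase, f"{percent}%", per_day)
```

Under these assumptions the sketch reproduces the increase, percentage and per-day figures in the table for 2003 to 2009, including the partial-year daily average of about 1377.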

Comparisons with other Wikipedias

Codes: en - English, de - German, fr - French, pl - Polish, ja - Japanese, it - Italian, nl - Dutch, pt - Portuguese, es - Spanish, sv - Swedish

This graph is based on data from http://stats.wikimedia.org/EN/TablesArticlesTotal.htm as of June 2, 2007, with recent values for the English Wikipedia taken from the data below. The sum includes all 240+ Wikipedia languages. See the front page at http://www.wikipedia.org for a recent article count for the 10 largest Wikipedias.

The English edition remains the largest Wikipedia, over three times as large as the second largest edition, the German Wikipedia. Many other editions shared the quasi-exponential growth of the English edition, though lagging one to three years behind. As these other Wikipedias have grown, the overall percentage of articles in English has been steadily decreasing, and it fell below 25% in March 2007. The percentage of articles in the ten largest Wikipedias has also been decreasing, although these top ten still account for about 67% of all Wikipedia articles as of June 2007.

Chronology of software versions

  • Phase I UseMod Wiki-based software: January 10, 2001 – January 25, 2002
  • Phase II PHP-based software: January 25, 2002 – July 20, 2002
  • Phase III PHP-based software: July 20, 2002 – present

The figures in this data set are drawn from multiple data sources and different estimates (see the key below for details), and are presented as a spreadsheet-ready table for graphing. The original data sets are archived: see the links below. Note also that the figures were sampled at random times of day.


Hard copy size

The following illustrates how big the English-language Wikipedia would be if printed and bound in book form. Each volume is 25 cm tall and 5 cm thick, and contains 1,600,000 words or 8,000,000 characters. The figure uses the live article count.

958 volumes
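
The arithmetic behind the volume count can be sketched as follows. The per-volume figures, the 958 volumes and the article count are taken from this page; the average article length is not stated here, so the sketch derives the value implied by those numbers (about 2,611 characters per article), which should be read as a back-calculation rather than a measured statistic.

```python
CHARS_PER_VOLUME = 8_000_000
WORDS_PER_VOLUME = 1_600_000
VOLUME_THICKNESS_CM = 5
VOLUMES = 958
ARTICLES = 2_935_174

def volumes_needed(article_count, avg_chars_per_article):
    """Whole volumes needed, rounding up; both arguments are integers."""
    total_chars = article_count * avg_chars_per_article
    return -(-total_chars // CHARS_PER_VOLUME)  # integer ceiling division

total_chars = VOLUMES * CHARS_PER_VOLUME
total_words = VOLUMES * WORDS_PER_VOLUME
shelf_length_m = VOLUMES * VOLUME_THICKNESS_CM / 100

# Back-derived assumption: average characters per article implied by 958 volumes.
implied_chars_per_article = total_chars / ARTICLES

if __name__ == "__main__":
    print("total characters:", total_chars)
    print("total words:", total_words)
    print("shelf length (m):", shelf_length_m)
    print("implied characters per article:", round(implied_chars_per_article))
    print("volumes at 2,611 characters per article:", volumes_needed(ARTICLES, 2611))
```

At 5 cm per volume, 958 volumes would occupy 47.9 m of shelf.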



The data set

Key to the data below:

  • approx: this figure is an approximation
  • lowerbound: there were at least this many pages
  • mpac3.1: main page article count from the Phase III software since May 25, 2003: pages in the article namespace that are not redirects and contain at least one internal wiki link
  • mpacIII: main page article count from the Phase III software up to May 22, 2003: pages in the article namespace that contain a comma and are not redirects
  • mpacII: main page article count from the Phase II software
  • spII: stats page article count from the Phase II software
  • all: total of all pages of any sort
  • commapp: pages which include a comma, a crude way of finding "real" articles
  • conscnt: "conservative count" taken by removing the count of various types of non-article from the comma page count
  • MF: Malcolm Farmer
  • LMS: Larry Sanger
  • WA: Wikipedia:Announcements
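
The two Phase III counting rules in the key can be sketched as simple predicates. This is an illustration only and not the software's own code: the Page type is hypothetical, and the sketch assumes that the article namespace is numbered 0, that a redirect page begins with #REDIRECT, and that an internal wiki link is written in double square brackets.

```python
import re
from dataclasses import dataclass

@dataclass
class Page:
    """Hypothetical page record used only for this illustration."""
    namespace: int  # assumed: 0 is the article namespace
    text: str       # wikitext of the page

# Assumed syntax for an internal wiki link: [[Target]] or [[Target|label]].
INTERNAL_LINK = re.compile(r"\[\[[^\]]+\]\]")

def is_redirect(page):
    """Assumed rule: a redirect page begins with #REDIRECT."""
    return page.text.lstrip().upper().startswith("#REDIRECT")

def counts_as_mpac_iii(page):
    """mpacIII rule: article namespace, contains a comma, not a redirect."""
    return page.namespace == 0 and "," in page.text and not is_redirect(page)

def counts_as_mpac31(page):
    """mpac3.1 rule: article namespace, not a redirect, at least one internal link."""
    return (page.namespace == 0
            and not is_redirect(page)
            and INTERNAL_LINK.search(page.text) is not None)

if __name__ == "__main__":
    stub = Page(0, "A short stub, with a comma but no links.")
    linked = Page(0, "Paris is the capital of [[France]].")
    print(counts_as_mpac_iii(stub), counts_as_mpac31(stub))
    print(counts_as_mpac_iii(linked), counts_as_mpac31(linked))
```

The two rules can disagree on the same page, which is one reason the counts before and after the change of algorithm in May 2003 are not directly comparable.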

The data set is now extended and annotated with (somewhat gnomic) source information. Note that sampling dates are recorded only to the day given by the user recording the entry, and that there is no clear time-zone information for that day.

Note: The current mpac3.1 article count for the English-language Wikipedia is 2,935,174 articles.

2009-04-17, 2843391,     mpac3.1
2009-03-07, 2780320,     mpac3.1
2009-02-17, 2746003,     mpac3.1
2009-02-01, 2722366,     mpac3.1
2009-01-01, 2679785,     mpac3.1
2008-12-10, 2652896,     mpac3.1
2008-11-21, 2631334,     mpac3.1
2008-11-04, 2612599,     mpac3.1
2008-10-20, 2591574,     mpac3.1
2008-10-02, 2569004,     mpac3.1
2008-09-14, 2553609,     mpac3.1
2008-08-16, 2510197,     mpac3.1
2008-08-11, 2500000,     mpac3.1 (inserted retroactively from data in WA)
2008-07-30, 2479280,     mpac3.1
2008-07-15, 2456599,     mpac3.1
2008-07-02, 2436830,     mpac3.1
2008-06-21, 2421535,     mpac3.1
2008-06-04, 2399617,     mpac3.1
2008-05-20, 2379635,     mpac3.1
2008-05-13, 2370238,     mpac3.1
2008-04-30, 2353068,     mpac3.1
2008-04-06, 2320245,     mpac3.1
2008-03-21, 2291433,     mpac3.1
2008-03-08, 2269796,     mpac3.1 
2008-02-23, 2246333,     mpac3.1
2008-02-07, 2214256,     mpac3.1
2008-01-30, 2200000,     mpac3.1
2008-01-12, 2170716,     mpac3.1
2007-12-16, 2129551,     mpac3.1
2007-12-08, 2118053,     mpac3.1
2007-11-27, 2104052,     mpac3.1
2007-11-15, 2090929,     mpac3.1
2007-10-20, 2055408,     mpac3.1
2007-10-06, 2037246,     mpac3.1
2007-09-09, 2000000,     mpac3.1 (inserted retroactively from data in WA)
2007-09-05, 1991220,     mpac3.1
2007-08-28, 1978895,     mpac3.1 
2007-08-09, 1941186,     mpac3.1 
2007-08-01, 1925624,     mpac3.1
2007-07-22, 1900000,     mpac3.1 (inserted retroactively from data in WA)
2007-07-12, 1879826,     mpac3.1
2007-06-23, 1847297,     mpac3.1
2007-06-10, 1828002,     mpac3.1
2007-06-02, 1813007,     mpac3.1
2007-05-25, 1800000,     mpac3.1
2007-05-16, 1785384,     mpac3.1
2007-05-01, 1763270,     mpac3.1
2007-04-16, 1740243,     mpac3.1
2007-03-22, 1700000,     mpac3.1 (inserted retroactively from data in WA)
2007-03-07, 1674588,     mpac3.1
2007-02-14, 1638583,     mpac3.1
2007-02-08, 1626802,     mpac3.1
2007-01-24, 1598645,     mpac3.1
2006-12-19, 1539908,     mpac3.1
2006-11-24, 1500000,     mpac3.1 (inserted retroactively from data in WA)
2006-11-08, 1473418,     mpac3.1
2006-10-21, 1444717,     mpac3.1
2006-10-05, 1418517,     mpac3.1
2006-09-22, 1396727,     mpac3.1
2006-08-26, 1344771,     mpac3.1
2006-08-15, 1317963,     mpac3.1
2006-08-06, 1300000,     mpac3.1 (inserted retroactively from data in WA)
2006-07-04, 1234741,     mpac3.1
2006-06-19, 1200000,     mpac3.1 (inserted retroactively from data in WA)
2006-06-05, 1173156,     mpac3.1 
2006-05-12, 1129680,     mpac3.1
2006-04-27, 1100000,     mpac3.1 (inserted retroactively from data in WA)
2006-03-24, 1040919,     mpac3.1
2006-03-17, 1027914,     mpac3.1
2006-03-01, 1000000,     mpac3.1 (inserted retroactively from data in WA)
2006-02-27,  995933,     mpac3.1
2006-02-14,  971518,     mpac3.1
2006-01-30,  945642,     mpac3.1
2006-01-16,  921824,     mpac3.1
2006-01-03,  897700,     mpac3.1
2005-12-20,  874359,     mpac3.1 
2005-12-05,  851295,     mpac3.1
2005-11-11,  815460,     mpac3.1
2005-10-27,  791967,     mpac3.1
2005-09-12,  726894,     mpac3.1 (another long gap since last manual update)
2005-08-25,  700000,     mpac3.1 (inserted retroactively from data in WA)
2005-06-19,  600000,     mpac3.1 (inserted retroactively from data in WA)
2005-04-29,  544514,     mpac3.1 (more than one month without data...)
2005-03-18,  500296,     mpac3.1 (after another more minor server glitch and recovery)
2005-03-13,  496881,     mpac3.1
2005-03-01,  486042,     mpac3.1 (after server crash and DB recovery)
2005-02-10,  471659,     mpac3.1
2005-01-27,  458779,     mpac3.1
2005-01-01,  438533,     mpac3.1
2004-12-04,  414023,     mpac3.1
2004-11-18,  398159,     mpac3.1
2004-11-16,  396408,     mpac3.1
2004-11-08,  389614,     mpac3.1
2004-11-01,  384336,     mpac3.1
2004-09-01,  337911,     mpac3.1
2004-08-25,  333015,     mpac3.1
2004-08-23,  331191,     mpac3.1 
2004-08-13,  324551,     mpac3.1 
2004-08-07,  320532,     mpac3.1
2004-07-25,  312524,     mpac3.1
2004-07-10,  302333,     mpac3.1
2004-05-13,  264854,     mpac3.1
2004-05-07,  261964,     mpac3.1
2004-04-28,  255596,     mpac3.1
2004-02-27,  215140,     mpac3.1
2004-02-15,  206665,     mpac3.1
2004-02-07,  202410,     mpac3.1
2004-02-01,  199738,     mpac3.1
2004-01-28,  198011,     mpac3.1
2004-01-22,  196070,     mpac3.1
2003-12-31,  188538,     mpac3.1
2003-12-23,  186539,     mpac3.1
2003-12-16,  184273,     mpac3.1
2003-12-08,  180008,     mpac3.1
2003-12-03,  178093,     mpac3.1, after database rebuild and recalculation of count
2003-11-25,  176091,     mpac3.1
2003-11-21,  174904,     mpac3.1
2003-11-19,  174376,     mpac3.1
2003-11-12,  172387,     mpac3.1
2003-10-30,  168762,     mpac3.1
2003-10-14,  164565,     mpac3.1
2003-10-04,  162636,     mpac3.1
2003-09-27,  160361,     mpac3.1
2003-09-12,  156047,     mpac3.1
2003-09-09,  155316,     mpac3.1
2003-09-04,  154061,     mpac3.1
2003-08-31,  153047,     mpac3.1
2003-08-15,  148859,     mpac3.1
2003-07-27,  144467,     mpac3.1
2003-07-15,  141607,     mpac3.1
2003-07-07,  139896,     mpac3.1
2003-07-04,  139031,     mpac3.1
2003-07-01,  136737,     mpac3.1
2003-06-27,  135963,     mpac3.1
2003-06-25,  135504,     mpac3.1
2003-06-22,  134869,     mpac3.1
2003-06-19,  134256,     mpac3.1
2003-06-16,  133411,     mpac3.1
2003-06-11,  132267,     mpac3.1
2003-06-09,  131900,     mpac3.1
2003-06-06,  131249,     mpac3.1
2003-06-04,  130797,     mpac3.1
2003-05-31,  129884,     mpac3.1
2003-05-28,  129240,     mpac3.1
2003-05-25,  128475,     mpac3.1  (correction of broken counter, new algorithm)
2003-05-22,  120704,     mpacIII  (installation of new webserver on 05-14 fixed the count at 120704)
2003-05-12,  120173,     mpacIII
2003-04-29,  117658,     mpacIII
2003-04-23,  116408,     mpacIII
2003-04-22,  116259,     mpacIII
2003-04-21,  116066,     mpacIII
2003-04-20,  115879,     mpacIII
2003-04-18,  115361,     mpacIII
2003-04-15,  114861,     mpacIII
2003-04-12,  114469,     mpacIII
2003-04-11,  114274,     mpacIII
2003-04-01,  112564,     mpacIII
2003-03-28,  112035,     mpacIII
2003-03-15,  109933,     mpacIII
2003-03-07,  108593,     mpacIII
2003-03-05,  108298,     mpacIII
2003-02-27,  107243,     mpacIII
2003-02-26,  107027,     mpacIII
2003-02-24,  106700,     mpacIII
2003-01-22,  100321,     mpacIII
2003-01-12,   98475,     mpacIII
2003-01-08,   97913,     mpacIII
2003-01-04,   97137,     mpacIII
2003-01-02,   96664,     mpacIII
2002-12-26,   95735,     mpacIII
2002-12-23,   95452,     mpacIII
2002-12-16,   94497,     mpacIII (so; is the counter fixed now?)
2002-12-13,   93834,     mpacIII
2002-12-10,   93301,     mpacIII, Corrected counter drift
2002-12-10,  100046,     mpacIII, RAMBOT causing Error in counter.
2002-12-04,   94165,     mpacIII
2002-12-01,   93880,     mpacIII
2002-11-27,   93263,     mpacIII
2002-11-25,   92777,     mpacIII, ~775 bot generated articles added.
2002-11-22,   91580,     mpacIII, some recent performance problems
2002-11-18,   90905,     mpacIII, article counter is back, after being switched off
2002-11-09,   90266,     mpacIII
2002-11-07,   90003,     mpacIII
2002-11-06,   89375,     mpacIII
2002-11-01,   88597,     mpacIII
2002-10-30,   88292,     mpacIII
2002-10-27,   87285,     mpacIII
2002-10-26,   87206,     mpacIII
2002-10-25,   87037,     mpacIII, rambot in operation
2002-10-24,   80887,     mpacIII, rambot in operation
2002-10-23,   76958,     mpacIII, rambot in operation
2002-10-22,   74005,     mpacIII, rambot in operation
2002-10-21,   66738,     mpacIII, rambot in operation
2002-10-20,   66372,     mpacIII, rambot in operation
2002-10-19,   61128,     mpacIII, rambot in operation
2002-10-17,   54339,     mpacIII
2002-10-14,   53174,     mpacIII
2002-10-11,   52571,     mpacIII
2002-10-10,   52435,     mpacIII
2002-10-08,   52092,     mpacIII
2002-10-04,   50953,     mpacIII
2002-10-03,   50804,     mpacIII
2002-09-30,   49724,     mpacIII
2002-09-27,   47448,     mpacIII
2002-09-26,   47152,     mpacIII
2002-09-25,   46133,     mpacIII
2002-09-24,   45707,     mpacIII
2002-09-23,   45462,     mpacIII
2002-09-22,   45159,     mpacIII
2002-09-21,   44920,     mpacIII
2002-09-17,   43962,     mpacIII
2002-09-16,   43762,     mpacIII
2002-09-10,   42268,     mpacIII
2002-09-09,   42021,     mpacIII
2002-09-07,   41559,     mpacIII
2002-09-05,   41141,     mpacIII
2002-09-04,   40934,     mpacIII
2002-09-03,   40718,     mpacIII
2002-08-31,   40093,     mpacIII
2002-08-22,   38780,     mpacIII
2002-08-14,   37508,     mpacIII
2002-08-12,   37259,     mpacIII, upgraded to phase III software after major performance problems
2002-05-17,   33333,     spII approx (WA)
2002-04-16,   32000,     spII approx (WA)
2002-03-28,   30000,     spII approx (WA)
2002-02-04,   23000,     mpacII approx (WA)
2002-01-09,   20000,     estimate approx (LMS WA)
2001-12-14,   19000,     conscnt approx
2001-12-07,   18000,     conscnt approx
2001-11-06,   16000,     conscnt lowerbound
2001-10-25,   15053,     conscnt (LMS WA)
2001-10-19,   14000,     commacnt lowerbound 
2001-10-04,   13182,     conscnt (MF WA)
2001-09-19,   12502,     conscnt (LMS)
2001-09-09,   11208,     conscnt (MF)
2001-09-07,   10000,     conscnt lowerbound
2001-08-22,    9043,     conscnt
2001-08-07,    8000,     conscnt
2001-07-27,    7243,     conscnt
2001-07-26,    6947,     conscnt
2001-07-08,    6000,     conscnt lowerbound
2001-05-20,    4985,     commapp
2001-05-10,    3969,     commapp
2001-04-27,    3281,     commapp    
2001-03-30,    2221,     commapp
2001-03-24,    1910,     commapp
2001-03-07,    1323,     commapp
2001-02-12,    1000,     all lowerbound
2001-02-08,     900,     all lowerbound
2001-01-31,     617,     all
2001-01-25,     270,     all
2001-01-10,       0,     all
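
For graphing, the lines above can be read as comma-separated records of date, count and source note. The following is a minimal sketch of such a reader; the Sample type and the function names are illustrative, and the sketch assumes that only the first two commas separate fields, since some notes contain commas themselves.

```python
from datetime import date
from typing import NamedTuple

class Sample(NamedTuple):
    day: date
    count: int
    note: str

def parse_line(line):
    """Parse one 'YYYY-MM-DD, count, note' line; the note may be missing."""
    parts = [part.strip() for part in line.split(",", 2)]
    note = parts[2] if len(parts) > 2 else ""
    return Sample(date.fromisoformat(parts[0]), int(parts[1]), note)

def average_daily_growth(earlier, later):
    """Average number of new articles per day between two samples."""
    days = (later.day - earlier.day).days
    return (later.count - earlier.count) / days

# A few lines copied from the data set above.
LINES = """\
2009-01-01, 2679785,     mpac3.1
2008-01-12, 2170716,     mpac3.1
2003-12-03,  178093,     mpac3.1, after database rebuild and recalculation of count
2001-01-10,       0,     all
"""

if __name__ == "__main__":
    samples = [parse_line(line) for line in LINES.splitlines() if line.strip()]
    for sample in samples:
        print(sample)
    print(average_daily_growth(samples[1], samples[0]))
```

Because sampling dates are recorded only to the day, growth rates computed between closely spaced samples are correspondingly rough.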




These pages hold the earlier source data in its original ad-hoc tabular format:


External links

  • Wikipedia Statistics, auto-generated from database dumps, but badly out of date as regards the English and French Wikipedias