Talk:Languages used on the Internet

WikiProject Internet (Rated Start-class, High-importance)
This article is within the scope of WikiProject Internet, a collaborative effort to improve the coverage of the Internet on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-Class on the project's quality scale.
This article has been rated as High-importance on the project's importance scale.

Question about quality and reliability of sources

I agree that Internet World Stats does not look like reliable data. Also the first chart that appears in the entry, with the footnote [1] leading to Internet World Stats... the page does not even support that chart. Is the chart from a different source, that got misattributed somewhere along the way? — Preceding unsigned comment added by (talk) 13:20, 15 September 2014 (UTC)

The data comes from a website called Internet World Stats. This site is published by the Miniwatts Marketing Group in Colombia. The website looks very unfinished. It contains, quite literally, a lot of "Lorem ipsum". It looks like someone did a very quick job of putting up the website and did not fill in all the fields in the template.

Can someone please say something more about the source? I am interested in using the data, but I feel uncertain about it at this stage. —Preceding unsigned comment added by Dnordfors (talkcontribs) 19:49, 28 November 2008 (UTC)

Internet World Stats looks like a marketing company, reads like a marketing company, and feels like a marketing company. I think it's irresponsible to be using them as a source. What would better serve people, however, is a quick look at the Open Net Initiative. More specifically, the vastly more reputable ITU statistics which are reviewed and actually legitimate: [1] —Preceding unsigned comment added by (talk) 17:23, 19 January 2010 (UTC)

I have the exact same feeling about this "Miniwatts Marketing Group". There is no way to contact them except by mail, and no information about how they derive the numbers they present... I will remove them and the source. If new evidence of their reliability emerges, we could then suggest a rollback. G.Dupont (talk) 14:51, 30 July 2010 (UTC)

After some more searching, it appears that even the other sources (one of which is now down) do not state clearly how they compiled the numbers. This seems strange to me. How can we claim this on Wikipedia without clear validation of the data? Counting the number of Internet users in a country is, in my view, a very complex problem that perhaps governments and/or Internet providers (if they all worked together) could solve. Compiling such numbers for the whole world would be an enormous amount of work... or a very nice fake. What do you think? Without answers in a few days, I will propose this article for deletion. G.Dupont (talk) 15:22, 30 July 2010 (UTC)


When I saw the title, I thought it would be about a government-sponsored plan to give everyone on the planet Internet access, even if at low bandwidth. However, it's a list of what languages people on the Internet speak. Move this to Internet user statistics or Internet user attributes or something similar. HereToHelp (talk) 02:03, 25 October 2005 (UTC)

In response to the above comment, the page was moved from Global internet access to Global internet usage.

Weird number

According to this organization, in 2006 there are 28 million French-speaking users. However, French stat companies state that in 2005 there were 26 million French Internet users and it's growing fast. [2] Since there are a number of other French-speaking countries or subnational entities out there, it seems to me that the number cited in this article is underestimated. David.Monniaux 06:57, 28 April 2006 (UTC)

We should split North and South Korea. The majority of the Internet users are from the South, but the population is combined in the table; the South alone is less than 50M. —Preceding unsigned comment added by Dean2026 (talkcontribs) 19:39, 19 August 2009 (UTC)

Duplicate articles?

It seems that this article and Languages on the Internet may be duplicates. They should probably be merged, preferably Languages on the Internet into this article (Global internet usage). Also, shouldn't Internet be capitalized? It is the network as a whole, after all. - Rudykog 13:37, 17 June 2006 (UTC)

Merge OK

I think the Languages on the Internet article should stay where it is. It was because of that title that I got here. This article had what I was searching for, so I think the title is pretty descriptive. But I have no objections if it is merged with the other article, Global internet usage. That second article's title is a bit misleading, since I think it should cover more than just which languages are spoken: more statistics on how many people in different countries use the Internet, plus browser and other system statistics.

Average number of users on a single day

I suppose that with "Total number of Internet users" one means the total number of people that use internet with a given frequency, or that have used it at least once (which one, by the way?). Does anybody know if there are sources on the average number of people who connect to internet on a single day? And does "Total number of Internet users" as used in this article mean the same as "Internet users" in

Languages table

I think it should be deleted or fully rewritten, because it doesn't make any sense or contradicts other wikipedia pages. e.g.

1) this table says that there are 874 Chinese speakers, wikipedia page on Chinese says that Mandarin Chinese has at least 850 speakers[3].
2) languages don't have GDP and even if it shows GDP of countries, China doesn't have GDP per capita of $7,200. DVoit 14:18, 25 September 2007 (UTC)
What are you talking about? The numbers are in millions. 874 million or 850 million, who knows that exactly? And about GDP, I don't know what the page looked like in 2007, but it doesn't matter. However, one thing is clear. You are a sinophobe and this is racist! This is not exaggerated. You don't agree with 874 million, so you looked for another source (Wikipedia can't be a good source for Wikipedia) showing a smaller number. Then you are not happy with China's high GDP, so you argue with per capita. -- (talk) 10:08, 6 May 2015 (UTC)

Web or Internet?

Is this article about web pages or total Internet usage? In other words, does it include figures for what languages are used in e-mail messages, IRC chat, instant messaging conversations etc., or is it just about web pages published and web pages accessed? I think it is the latter, and so it should be renamed appropriately, as soon as possible. --Nigelj (talk) 19:42, 10 August 2008 (UTC)

Websites or users?

I wonder whether the 'estimated online population' adequately reflects the proportions of languages actually found on the internet. I would surmise that, although China has a substantial online population, there are far fewer websites written in Chinese than in German. Does anybody know of statistical data which could corroborate or refute this?-- (talk) 18:34, 12 November 2008 (UTC)


The total population of Dutch speakers is way too low, and I think the number of Dutch users on the Internet is also too low. —Preceding unsigned comment added by (talk) 11:45, 23 November 2008 (UTC)

23 m speakers of Dutch. see:

You are right. If you look at this page you see that there are almost 14 million Internet users in the Netherlands and 7 million in Belgium. If we take 60% of the users in Belgium (the share is probably higher, because Flanders has a higher Internet usage rate than Wallonia), I get 4.8 million + 13.8 million = 18.6 million (excluding Suriname, the Netherlands Antilles and Aruba). And the speaker total should be 27 million (first- plus second-language speakers), as is done for the other languages. —Preceding unsigned comment added by MaxvJ (talkcontribs) 14:50, 20 October 2009 (UTC)
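The estimate in the comment above can be reproduced as a rough sketch. It assumes only the rounded figures quoted there (about 13.8 million users in the Netherlands, 7 million in Belgium, and a 60% Dutch-speaking share of Belgian users); the function name is hypothetical. With exactly 7 million Belgian users, 60% comes to 4.2 million rather than 4.8 million, so the quoted 4.8 million would correspond to roughly 8 million Belgian users. Which input was intended is not clear from the comment.

```python
def dutch_speaking_users(netherlands_users: int, belgium_users: int,
                         belgium_share_percent: int) -> int:
    """Estimate Dutch-speaking Internet users in the Netherlands plus Belgium.

    All inputs are the rounded figures quoted in the discussion, not verified data.
    Integer arithmetic is used to avoid floating-point rounding noise.
    """
    belgium_dutch_users = belgium_users * belgium_share_percent // 100
    return netherlands_users + belgium_dutch_users


# Figures as quoted: ~13.8 million (Netherlands), 7 million (Belgium), 60% share.
estimate = dutch_speaking_users(13_800_000, 7_000_000, 60)
print(f"Estimate with 7 million Belgian users: {estimate:,}")

# Belgian user count that would be needed for the quoted 4.8 million figure.
implied_belgium_users = 4_800_000 * 100 // 60
print(f"Belgian users implied by 4.8 million: {implied_belgium_users:,}")
```

Either way, the result stays in the range of 18 to 19 million, well above the figure being questioned in this thread.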

Stats outdated

The stats of Internet users are outdated, if someone wants to take the time to update them : —Preceding unsigned comment added by GRAND OUTCAST (talkcontribs) 07:19, 29 May 2009 (UTC)

As stated earlier, numbers from this source are open to question. It's a rather obscure source that does not present its methodology well. I suggest that we neither use it nor update the article based on it. G.Dupont (talk) 21:08, 17 January 2011 (UTC)

Number of Pages

I would like to see some statistics on the number of web pages in different languages.

Also, I wonder how much these statistics account for non-native speakers of a language. German is a very popular second language in a number of countries, especially in Central Europe and the major English-speaking countries. That would almost double its numbers. Bostoner (talk) 02:46, 13 July 2009 (UTC)

Languages used on the Internet

This article used terms like 'languages used on the Internet' without specifying what that means. The 'Internet' is just a network. If it means the content published in various languages, it should state so. But most likely it refers to the number of users that speak various languages, and I rephrased some of the article's statements accordingly, as the sources seem to indicate this is what was measured. Kbrose (talk) 20:28, 28 September 2009 (UTC)

The NiteCo Survey is here to stay!


I've noticed that the Average Age of Internet Users survey conducted by NiteCo has been wrongly removed, with bogus and incoherent reasons cited.

For those of you who are unfamiliar with it, here is a summary: it's a user-participation-based survey that collects input from surfers who visit the link. Upon doing so, the average age is displayed, followed by a question regarding their age.

Some people have suggested that this survey is inaccurate by citing the following irrelevant claims:

1) the survey is not random

I guess that's the most absurd claim ever. Obviously the fact that it's a public survey open to any surfer implicitly means that IT IS random by nature.

2) the survey is self-selected

Well, that's true, but the point is that this is totally irrelevant, as there is no evidence that self-selection is harmful in the sense that it may corrupt the results.

For instance, there is no data that may support the hypothesis that older or younger people have a tendency to report their age more or less often than the other age group.

In a nutshell, self-selection does not necessarily lead to a selection bias, at least not in this case. In other words, the burden of proof lies on the critics! Cowmadness (talk)

I'm afraid you are wrong on every count here. I refer to this reversion of yours. First, you need to look at things like Sampling (statistics) to see that there is far more to it than you seem to realise. Then look at the survey page in question[4] to see that it is just a webpage buried on a single, non-notable website that is only going to be seen by people visiting that site, or whatever links exist to it (like this one). Next, look at the heading, and the build-up you give it: 'Average age of Internet users' and 'the average Internet user is 28.3037 years old', then check Internet and WWW to see that this is a web site, not representative of the whole web, let alone the whole internet, and that there is a difference. Finally, have a look at WP:BURDEN where it says, "The burden of evidence lies with the editor who adds or restores material". The second reference you give, apart from the one that proves the survey web-form exists, is to a digg page that no longer mentions the site or the survey. What we need is a third-party WP:RS reliable source that says that this survey is accepted by serious academics (or other people of similar standing) as being a recognised statistical survey that is widely regarded as representative. In the meantime, what you have there looks mostly like WP:LINKSPAM and should be removed in toto without further warning. --Nigelj (talk) 17:13, 22 February 2010 (UTC)

Usage per capita

(Usage percentage : number of speakers) would make an illustrative extra statistic. (It's trivial to calculate, but including this in the table would make sorting possible.) --Trɔpʏliʊmblah 16:38, 12 February 2010 (UTC)
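As a minimal sketch of the statistic suggested above, the ratio of Internet users to speakers can be computed per language and used as a sort key. The row type, field names, and all numbers below are hypothetical placeholders for illustration, not figures from the article or its sources.

```python
from dataclasses import dataclass


@dataclass
class LanguageRow:
    """One table row; all values used below are made-up placeholders."""
    language: str
    internet_users_millions: float
    speakers_millions: float

    @property
    def usage_per_speaker(self) -> float:
        # Share of a language's speakers who are counted as Internet users.
        return self.internet_users_millions / self.speakers_millions


def sort_by_usage(rows):
    """Return rows ordered from highest to lowest usage per speaker."""
    return sorted(rows, key=lambda row: row.usage_per_speaker, reverse=True)


# Hypothetical example data, chosen only to show the sorting.
rows = [
    LanguageRow("Language A", 50.0, 100.0),
    LanguageRow("Language B", 30.0, 40.0),
    LanguageRow("Language C", 10.0, 200.0),
]

for row in sort_by_usage(rows):
    print(f"{row.language}: {row.usage_per_speaker:.1%}")
```

Storing the ratio as its own column is what makes sorting possible in a table, since readers cannot sort on a value they have to compute themselves.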

Outdated statistics

The stats are out of date again. Can someone update them?

Ouyuecheng (talk) 09:44, 18 February 2010 (UTC)

Outdated figure

The study done by W3Techs mentioned in the article is from December 2011. There is already a newer version as of 2012, which states English is used by 54.9% of websites and not 56%. Many Thanks, Zalunardo8 (talk) 15:08, 12 December 2012 (UTC)

Outdated information

Hello, I believe the last paragraph of the section 'Languages Used' should be updated with more current information. The research shown dates from 2007. The same information about the percentage of content per language is shown below in a graph, so I think we should either remove the paragraph or use the same numbers. Cheers, Zalunardo8 (talk) 16:13, 17 December 2012 (UTC)

Should the article "Foreign language internet" be merged into this article?

A merge template was added in April 2013 suggesting that the article Foreign language internet should be merged into this article, but I can find no discussion of the proposal anywhere. So, here is a place to hold that discussion. Would such a merge be a good idea? --Jeff Ogden (W163) (talk) 12:26, 2 July 2013 (UTC)

Yes, definitely. Please. --Atlasowa (talk) 14:35, 2 July 2013 (UTC)

On 13 August 2013 the article Foreign language internet was nominated for deletion. The result of the deletion discussion was to replace that article with a redirect to this article. That was done on 20 August 2013. The consensus was that there was very little or nothing of value to be merged from the Foreign language internet article. --Jeff Ogden (W163) (talk) 00:40, 21 August 2013 (UTC)

Web sites with most languages

I'm adding the text from the subsection that was deleted by and is the subject of Jeffro's comment below:

Most languages on One Web Site
Wikipedia has more languages than any other site on the internet. There are presently 285 languages which have at least one article.[1] Jehovah's Witnesses official website follows close behind with articles in 274 languages. [2]

--Jeff Ogden (W163) (talk) 01:03, 21 August 2013 (UTC)

I have removed the subsection about websites with the 'most languages' for the following reasons:

  1. Though it may be true, there is no indication that Wikipedia has "more languages than any other site". It only indicates that Wikipedia has articles in a lot of languages.
  2. There is no indication that the JW site has the second highest number of languages.

Please do not restore unless there is a reliable source indicating that these are the highest.--Jeffro77 (talk) 02:12, 10 August 2013 (UTC)

Please see the article "Global Recordings Network".
Wavelength (talk) 16:41, 10 August 2013 (UTC)

Pie charts vs. bar graphs?

I've copied this discussion over from my talk page on Wikipedia Commons since more folks with an interest in this are likely to see it here. --Jeff Ogden (W163) (talk) 18:26, 15 March 2014 (UTC)

Just wanted to let you know I've replaced the pie charts on Languages used on the Internet with bar charts as part of a group effort to introduce more perceptually accurate charts. See Save the Pies for Dessert, among others. The first table in particular is ill-suited for a pie chart because some sites use multiple languages and so the percentages sum to more than 100%. I also updated the data in the first table. I will update other occurrences of those charts if you have no objections. Daggerbox (talk) 02:29, 13 March 2014 (UTC)

I'm fine with switching from pie charts to bar graphs.
I'll note that both pie charts showed percentages, but with the new bar graphs, one shows percentages and one shows millions of users. Is there a reason for the switch? Shouldn't they both be based on the same thing? Or perhaps we need four graphs?
I've thought for some time that it would be good if this article could be based on figures from the first of a month, since there is some history associated with the first-of-the-month figures and there is no history to use for verification for the other days of the month. Of course, while I thought about this, I never did anything about it.
In the upper bar graph a percentage figure is given for English, but no similar figures are given for other languages. Why is that? Seems like we should give the percentage for all languages or none.
Captions and other labels in English are included as part of the graphs. It might be better to minimize the use of English in the graphs to facilitate the use of the graphs in other language versions of Wikipedia.
--Jeff Ogden, W163 (talk) 03:29, 13 March 2014 (UTC)
Good to see your comments, Jeff. I'll address each one.
I prefer to include actual numbers with real life units where possible, relying on the graphic elements to provide the feel for the relative values. The first data source only provides percentages, of course. The graphs do not have to agree, especially since they don't necessarily appear together, but I can see value both in using a percent scale and in having the charts agree. I'll update if you have a preference since you're closer to the subject matter. Maybe a count axis with percentage labels would work best if both are useful to the message.
Good idea on first-of-the-month data. I was a little concerned that since the data is updated daily, the graph and table will never really keep up. Would 1-Jan be even better? Looks like it's present in the monthly historical trend page at W3Techs.
In general, I don't like to label every graph element. Somehow it feels like the graph is not doing its job if you have to repeat the table text in the graph. I labeled the English bar to highlight the strongest point from the bar chart, which is that English is used by over half of the sites, and because that bar is pretty far removed from the axes. The other bars are roughly labeled by the reference line.
I agree with your sentiment about localization, but I'm not sure what the remedy might be. Is there a way to support localization in SVG, for instance? Looks like it is by using the systemLanguage attribute -- is that what you meant? Or maybe you mean to leave out the axis titles and leave the language names in English. "Language" is certainly not adding anything.
Daggerbox (talk) 23:27, 13 March 2014 (UTC)
I would go with the percentage based chart for both. That is what the pie charts that are being replaced were doing. I think mixing percentages and counts in one graph is likely to be confusing.
Using January 1st data would be fine. Or July 1st.
I'd omit the percentage label. The fact that English dominates comes across OK without it.
I'd omit as many of the labels from the chart itself as you can. I understand that it isn't possible to omit the language names. Much or even all of the other stuff can be left to the caption that is added by the articles themselves.
--Jeff Ogden, W163 (talk) 02:28, 14 March 2014 (UTC).
OK, I'll make another pass this week-end. Daggerbox (talk) 02:20, 15 March 2014 (UTC)
Done, except I forgot to update the data. Not a bad idea to start a new topic for that, anyway. Daggerbox (talk) 21:42, 15 March 2014 (UTC)
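For the localization question raised in this thread, here is a minimal sketch of how the SVG `systemLanguage` attribute is typically combined with a `<switch>` element: the renderer shows the first child whose language matches the viewer's, and a final child without the attribute acts as the fallback. The helper name, the coordinates, and the sample translations are hypothetical, and whether this suits the charts discussed here is an open question rather than a settled choice.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def localized_label(translations, fallback, x, y):
    """Build an SVG <switch> holding one <text> per language plus a fallback.

    `translations` maps language codes to label text. The fallback <text>
    has no systemLanguage attribute and must come last in the <switch>.
    """
    switch = ET.Element(f"{{{SVG_NS}}}switch")
    for lang, text in translations.items():
        node = ET.SubElement(
            switch,
            f"{{{SVG_NS}}}text",
            {"x": str(x), "y": str(y), "systemLanguage": lang},
        )
        node.text = text
    default = ET.SubElement(switch, f"{{{SVG_NS}}}text", {"x": str(x), "y": str(y)})
    default.text = fallback
    return switch


# Serialize with SVG as the default namespace (no "ns0:" prefixes).
ET.register_namespace("", SVG_NS)
svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "200", "height": "40"})
svg.append(
    localized_label(
        {"de": "Anteil der Websites", "fr": "Part des sites web"},  # sample labels
        "Share of websites",
        10,
        25,
    )
)
markup = ET.tostring(svg, encoding="unicode")
print(markup)
```

Leaving titles and axis labels out of the image and putting them in each article's caption, as suggested above, avoids the need for this entirely; the `<switch>` approach only matters for text that has to stay inside the graphic.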

Frequency of data updates

The source data used by the table and graph for "Content languages for websites" is updated daily. How often should this page's reflection of that data be updated? Lately it's been valid as of the day of the most recent update. In the discussion above (pie charts vs. bar graphs), W163 and I thought a more regular date would be best, such as the most recent January 1 or July 1. — Preceding unsigned comment added by Daggerbox (talkcontribs) 21:48, 15 March 2014 (UTC)
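The "most recent January 1 or July 1" rule proposed above can be written down as a short sketch. The function name is hypothetical, and the choice to treat July 1 itself as a valid snapshot date is an assumption, since the discussion does not say how the boundary day should be handled.

```python
from datetime import date


def most_recent_snapshot_date(today: date) -> date:
    """Return the latest January 1 or July 1 that is not after `today`."""
    july_first = date(today.year, 7, 1)
    if today >= july_first:
        return july_first
    # Before July 1, the latest snapshot is January 1 of the same year.
    return date(today.year, 1, 1)


# Example: the date of the comment above.
print(most_recent_snapshot_date(date(2014, 3, 15)))
```

With this rule the article's figures would change at most twice a year, which makes each update easy to verify against the source's historical data.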