Wikipedia talk:WikiProject Languages/Archive 6



Template:Quebec English

FYI, {{Quebec English}} has been nominated for deletion. It is a WP:ENGVAR template. (talk) 04:28, 29 April 2010 (UTC)


FYI, zh-tw has come up on WP:RFD for retargeting. (talk) 04:58, 29 April 2010 (UTC)

Cantonese (Yue)

FYI, there is, once again, an RFC on the naming of the article; see Talk:Cantonese (Yue). (talk) 08:37, 30 April 2010 (UTC)

English words with uncommon properties

I am not sure which project's jurisdiction this falls under, but there seems to be a problem with English words with uncommon properties, a page with few watchers but 1,000+ views a day, which lists words with a high percentage of vowels and other orthographic outliers.
It is interesting and worth keeping, but it is not very scientific. The majority of words come from IP users who simply thought of that word. Consequently, I put together some code to count such entries in Wiktionary and automatically produce a list of the top scorers, so that people do not add words after simply chewing on their pencils. The code needs adjusting to remove acronyms, and I have not added all pages. I have not heard a single comment about the idea and, entertaining as it is, I really do not have the time to do something no one likes.
I am not asking for a barnstar/pat on the back, I just would like to hear someone's honest opinion if it is a good idea or not (talk page of article). --Squidonius (talk) 13:10, 5 May 2010 (UTC)
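A minimal sketch of the kind of counting script described above, not Squidonius's actual code. It assumes a plain list of candidate words is already available (for example, extracted from a Wiktionary dump); the function names, the acronym heuristic, and the minimum length are illustrative assumptions.

```python
VOWELS = set("aeiou")  # assumption: "y" is not counted as a vowel


def vowel_ratio(word):
    """Return the fraction of letters in `word` that are vowels."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c in VOWELS for c in letters) / len(letters)


def looks_like_acronym(word):
    """Heuristic (assumption): all-uppercase entries are treated as acronyms."""
    return len(word) > 1 and word.isupper()


def top_by_vowel_ratio(words, n=10, min_length=4):
    """Rank words by vowel ratio, skipping acronyms and very short entries."""
    candidates = [
        w for w in words
        if w.isalpha() and len(w) >= min_length and not looks_like_acronym(w)
    ]
    # Highest ratio first; ties are broken alphabetically for a stable list.
    return sorted(candidates, key=lambda w: (-vowel_ratio(w), w))[:n]
```

Generating the list from a fixed word source in this way would make the entries reproducible rather than dependent on whichever words editors happen to think of.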

US English Dialect Page Titles

Should there be a common form for pages dealing with the varieties of English spoken in a US region? There's currently a debate about the name of the variety (or varieties) spoken in Baltimore at Talk:Baltimore_dialect#Recent_page_move_--_should_have_waited_for_consensus. Note that the two predominant forms are PLACENAME Dialect, as in Baltimore dialect, and PLACENAME English, as in Pittsburgh English. My own impression is that the latter is tending to predominate in the dialectological literature. mnewmanqc (talk) 19:30, 5 May 2010 (UTC)

As List of dialects of the English language demonstrates, there is nowhere near consensus about this issue. "X dialect", "X English", "X accent", and other, less normalized naming formats (e.g. Scouse, Cockney) are all used for some regional variations of English. Unless we want to propose moving all of the articles to a single naming format, an unwieldy task to undertake (not to mention the decision on what name to use), it might be best to just address each name on a case-by-case basis. I mean, I wouldn't necessarily be opposed to using a single naming format, but I would imagine a lot of people would be, and there are compelling arguments for both sides of a "dialect" vs. "English" debate. (Upon a second look at that list, there does appear to be a preponderance of "English"; however, some of the names of the articles differ from the text used to link to them.) Gordon P. Hemsley 04:06, 28 May 2010 (UTC)
Thank you, Gordon, for your response. To clarify, our issue on the "Baltimore dialect" page, outside of the "Dialect" vs. "English" argument, is whether the article title should be one of those titles OR should be its local nickname of "Baltimorese". What are your thoughts? Why doesn't the "Pittsburgh English" page use "Pittsburghese" as its title?--oldlinestate 01:46, 31 May 2010 (UTC) —Preceding unsigned comment added by Oldlinestate (talkcontribs)
I intended to cover that part of the argument in my original response, I think, but I kind of abstracted away from it. I would like to refrain from forming an opinion on the issue beyond what I've already said, as I would prefer every article name follow the same format. However, as is common with language (see, for example, the debate going on and on about Yue Chinese), these things are often based on people's opinions, rather than any hard fact. If we can't agree on one set format for all language-related articles (and I'm pretty sure we can't), then I'd be more inclined to go with whatever is the dominating term in the academic literature, as those (for me, anyway) hold more weight than a layperson's opinion on the name of what they speak (that's what redirects are for); as I said, I'd prefer things stay systematic, and linguists are more likely to do that than laypeople. With all that being said, though, I can tell you that I do not like the phenomenon of adding "-ese" to some place in order to make it the name of a language—and, I think, in this case, they're not even languages, they're dialects, so I'd be even more opposed to it. So that goes for Baltimorese, Pittsburghese, New Yorkese, etc. (Apologies for the rambling.) Gordon P. Hemsley 19:55, 1 June 2010 (UTC)


There's now a claim that Tartessian language is Celtic. It seems suspicious to me that something as obvious as the way it is presented should have gone unrecognized for so long. I toned down the claim, but it should be reviewed and probably reworded. (And of course, if this was a breakthrough, my edits should be reverted.) — kwami (talk) 00:40, 8 May 2010 (UTC)

Tartessian will very likely be reclassified as the earliest Celtic language. Refer to "Tartessian: Celtic from the South-west at the Dawn of History" (2009) by John Koch. In addition, the first segment of the University of Wales "Celtic in the South-west" study will be released in August 2010. The project's material adds further confirmation to Tartessian being the first Celtic language. —Preceding unsigned comment added by London Hawk (talkcontribs) 12:54, 24 June 2010 (UTC)

Given that no one except John Koch believes this, I really doubt that there will be any change to the communis opinio about the Celtic family tree any time soon. +Angr 19:09, 24 June 2010 (UTC)

Excuse me, "no one except John Koch believes this"? I don't think so. Do some research. —Preceding unsigned comment added by London Hawk (talkcontribs) 20:24, 25 July 2010 (UTC)

Creating language stubs

User:Jasy jatere has requested bot-assisted creation of the missing ISO 639 language redirects, which seems uncontroversial because of the existing template:R from ISO 639 and category:Redirects from ISO 639. The user has also asked whether my bot could create the missing language stubs in the process (see User talk:Anypodetos#Bot request). Is this desirable? --ἀνυπόδητος (talk) 09:57, 12 May 2010 (UTC)

What would a bot say about a language for which there is no article yet? +Angr 10:58, 12 May 2010 (UTC)
The proposal is a one-sentence stub based on the Ethnologue data, on the lines of
  $language is a language spoken in [[$country]] by $population people. It is also known under the names @alternatenames.{{lang-stub}}

--ἀνυπόδητος (talk) 11:53, 12 May 2010 (UTC)
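As a rough illustration of the one-sentence stub proposed above, the following sketch fills the template from a single record. It is not the bot's code: the function name, the field names, and the choice to omit the second sentence when no alternate names are known are all assumptions, and the sample values in the usage line are made up.

```python
def render_stub(language, country, population, alternate_names=()):
    """Build the proposed one-sentence stub as wikitext."""
    text = (
        f"{language} is a language spoken in [[{country}]] "
        f"by {population} people."
    )
    # Assumption: leave out the second sentence when there are no
    # alternate names, rather than emitting an empty list.
    if alternate_names:
        text += " It is also known under the names " + ", ".join(alternate_names) + "."
    return text + "{{lang-stub}}"


# Hypothetical record, for illustration only.
print(render_stub("Examplish", "Exampleland", 1200, ["Exa", "Xmpl"]))
```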

And, I assume, a link back to the Ethnologue article? I'm actually a little skeptical about letting a bot start an article on a language just because the language has an Ethnologue entry. See Wikipedia:WikiProject Languages/Template#External links. +Angr 12:07, 12 May 2010 (UTC)
Thanks and never mind. I just wanted a second opinion. ἀνυπόδητος (talk) 13:42, 12 May 2010 (UTC)
I think it is a great idea. Go for it! Landroving Linguist (talk) 18:07, 16 May 2010 (UTC)
The Ethnologue is a highly unreliable source. There should be no connection between it and Wikipedia. G Purevdorj (talk) 18:16, 16 May 2010 (UTC)
I would call it only a moderately unreliable source; it's like the curate's egg. I do think it gets more right than wrong, but it gets enough wrong that the whole thing has to be viewed with a healthy dose of skepticism. Nevertheless, it has no competition as a compendium of the world's languages. For better or for worse, there is nothing that can be used in its place. +Angr 18:43, 16 May 2010 (UTC)
If there is already a language article that has to be put into some historical context, the Ethnologue is one way. If you work on any more substantial language article, looking into the literature on the history of this language is the better choice. But creating articles only because there are entries in the Ethnologue is not a good idea at all. The existence of a number of articles like China Buriat language is unjustified, and in the case of Oirat-Kalmyk-Darkhat languages I am still wondering where the Ethnologue got this from! Let's not conjure any other ghosts like these. G Purevdorj (talk) 20:00, 16 May 2010 (UTC)
While one can question some portions of the Ethnologue, SIL (the maintainer of Ethnologue) is the official registrar for the ISO 639-3 standard [1]. The opinion of an ISO registration authority has enough weight to be used as a source on wikipedia, even if one can disagree with it from time to time. The Ethnologue information will only be used to create stubs. If more information is/becomes available, these stubs can be expanded accordingly and the information from the Ethnologue can be complemented with alternative views. Jasy jatere (talk) 14:36, 17 May 2010 (UTC)

Three relevant questions to ask about the creation of a new article are: 1) is the topic relevant, 2) is there enough content to contextualize the article, and 3) is the content sourced to a reliable source? As for 1), all languages which have an ISO 639-3 code are surely relevant. As for 2), the one-liner given above seems to be sufficient to contextualize the entry. As for 3), I would argue that SIL/the Ethnologue passes the criteria for WP:RS:

Wikipedia articles should be based on reliable, published sources, making sure that all majority and significant minority views that have appeared in reliable, published sources are covered;

Even if Ethnologue might not always represent the majority, it is surely one of the sources that should be given as a significant minority view. Jasy jatere (talk) 14:48, 17 May 2010 (UTC)

Any linguistic variety that is either sufficiently distinct or treated in sufficient detail is significant. But it does not make sense to create hundreds of stubs for linguistically similar varieties that lack detailed descriptions. A reasonable human editor would probably prefer to treat some varieties in one article that would also be more informative to the reader. Secondly, while there is no (or only little) harm in merely citing the Ethnologue, creating hundreds of articles conforming to the opinion of the Ethnologue (which barely conforms to scientific standards with its sparingly used citations and completely arbitrary reliance on the available literature) would enforce a highly dubious classificational scheme in Wikipedia. The Ethnologue is readily available, and anyone who wants can look it up. Wikipedia should therefore best constitute a source of independent information, not disproportionately reliant on that source. G Purevdorj (talk) 22:49, 17 May 2010 (UTC)
The reason why I proposed this is actually that I would like to see a full coverage of ISO 639-3 on wikipedia. The ISO 639:xxx namespace exists, and people will expect that this will cover any code contained in ISO 639-3. They will be surprised if coverage of some codes is left out because wp disagrees with ISO.
I for one want to create software that uses the URL template. I suppose that there will be other people who would want to use that URL as well. We are creating a reference catalogue of all the world's languages with around 130000 entries at my institute now. For these entries, we list the languages covered and link to other resources, like OLAC, Ethnologue, or wikipedia. A uniform way of accessing the ISO-languages contained in wikipedia makes this a lot easier (which is probably why the ISO 639:xxx namespace was created in the first place). Now we should try to fill that namespace. And if everything we can say about a particular ISO 639 code is what Ethnologue reports, then we have to accept this. It cannot be the business of wikipedia to find fault with ISO for selecting SIL/Ethnologue as registration authority. I personally think that a dubious organization like SIL should not be allowed to be registration authority, but I have to accept the fact that ISO has decided otherwise. The problems linguists have with SIL/Ethnologue cannot be solved on wp, but must be solved with ISO. Jasy jatere (talk) 08:31, 19 May 2010 (UTC)
I second this. As a member of this dubious organization, I'd like to add that ISO 639 information can be changed. The trick is to write to the editor of the Ethnologue (see contact information here), give evidence that is superior to the one on which current information is based, and then wait (at the most a year) for the changes to be implemented. No need to grudge over bad info in the ISO 639 if you have good information to update the code. Landroving Linguist (talk) 10:03, 19 May 2010 (UTC)
I'm not so sure about the plans to use the ISO 639:xxx link format, but I do agree that every language with an ISO 639 code should have an article on Wikipedia. So I'd say it's a good plan to auto-create stubs for those that do not currently exist. The reliability of Ethnologue should have no bearing on this matter. Gordon P. Hemsley 11:55, 23 May 2010 (UTC)

Just for your information: The bot has finished creating the ISO 639 redirects (about 2540). A list of existing redirects the bot couldn't verify is at User:PotatoBot/Lists/ISO 639 log. These are mostly redirects to ISO 639 macrolanguage instead of individual language articles. A list of languages that probably don't have Wikipedia articles (about 5000) is at User:PotatoBot/Lists/ISO 639 language articles missing. --ἀνυπόδητος (talk) 15:05, 3 June 2010 (UTC)

On the proposal to create stubs for every iso-639-3 language, I think that many rare languages might better be covered in articles on their language families or branches, while some dialects are notable enough for separate articles. Some languages with separate iso-639-3 codes are best covered in joint articles (e.g. punjabi pan/pnb), although the separate codes are important to cue script encoding or other features. Bcharles (talk) 21:10, 1 July 2010 (UTC)

I disagree. Any language spoken on this planet (including extinct ones) is noteworthy enough to warrant a Wikipedia entry. Admittedly, not much is known about many languages, but even this very fact is noteworthy, and it is conveyed by having a stub article on the language. Landroving Linguist (talk) 09:09, 2 July 2010 (UTC)

Searching language infoboxes

That's an impressive list. Would it be possible for your bot to go through that list and compare codes with the |iso3= parameter in existing Language Infoboxes and create redirects in case of a match? I've just done that for Sudanese Arabic language (which was a redlink until a few minutes ago and now redirects to Sudanese Arabic), but it might be easier for a bot to do that. +Angr 18:17, 3 June 2010 (UTC)
Interesting idea. I can't promise anything for the next few days, but I will set my bot on that task as soon as I've got the time. Hope you aren't in a hurry. --ἀνυπόδητος (talk) 18:51, 3 June 2010 (UTC)
If it's not too hard, it should also check the |lc1=, |lc2=, |lc3=, etc. parameters. +Angr 21:52, 3 June 2010 (UTC)
Looks like the bot will have to collect data from the ISO code tables, its own list, and the language infoboxes, and cross-check everything. Many cases (like two articles having the same ISO code) obviously cannot be fixed automatically, but would be logged for inspection by a human editor (i. e. Angr, most likely). The coding will take some time, though. --ἀνυπόδητος (talk) 08:11, 4 June 2010 (UTC)
Angr, searching lc1 etc. will produce lots of duplicate matches. An example: xcl is the value of lc2 on Armenian language and the value of iso3 on Classical Armenian. What if an iso3 value "overrides" an lcnn value, so that the bot concludes Classical Armenian to be the correct article? In cases where there is no matching iso3 (which is probably the case for lc1=hye on Armenian language), the page with the lcnn value would be used. Does that make sense? --ἀνυπόδητος (talk) 18:15, 4 June 2010 (UTC)
Yep, that sounds good. +Angr 18:19, 4 June 2010 (UTC)
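The precedence rule agreed above (an iso3 match overrides an lcN match, and ambiguous cases are logged for a human) could be sketched roughly as follows. This is not PotatoBot's actual code; the data layout (a dict of article titles mapping to infobox parameters) and the function name are assumptions.

```python
def resolve_articles(infoboxes):
    """Map each ISO 639-3 code to one article title; iso3 beats lc1, lc2, ...

    `infoboxes` maps an article title to a dict of its infobox parameters.
    Returns (resolved, conflicts): codes with exactly one candidate article,
    and codes with several candidates that need human inspection.
    """
    by_iso3, by_lc = {}, {}
    for title, params in infoboxes.items():
        code = params.get("iso3")
        if code:
            by_iso3.setdefault(code, []).append(title)
        for key, value in params.items():
            # Collect lc1, lc2, lc3, ... parameters.
            if key.startswith("lc") and key[2:].isdigit() and value:
                by_lc.setdefault(value, []).append(title)

    resolved, conflicts = {}, {}
    for code in set(by_iso3) | set(by_lc):
        # Prefer articles that carry the code in iso3; fall back to lcN.
        titles = by_iso3.get(code) or by_lc[code]
        if len(titles) == 1:
            resolved[code] = titles[0]
        else:
            conflicts[code] = sorted(titles)  # logged, not fixed automatically
    return resolved, conflicts
```

With the Armenian example from the discussion, `xcl` would resolve to Classical Armenian (iso3 wins over lc2 on Armenian language), while `hye` would fall back to Armenian language.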
Another one: Is it a problem if both redlinks the bot finds to be likely titles are redirected to the language article, e. g. Uzbeki Arabic language and Arabic language (Uzbeki) to Uzbeki Arabic? There seems to be no good way of detecting the "better" one, and redirects are cheap. --ἀνυπόδητος (talk) 12:51, 7 June 2010 (UTC)
Yeah, it's probably better to have an unnecessary redirect than to omit a useful one. +Angr 12:55, 7 June 2010 (UTC)
The bot has finished the first run with its new code. User:PotatoBot/Lists/ISO 639 language articles missing has been reduced to 4300 items, and User:PotatoBot/Lists/ISO 639 log has been blown up to 800, now also listing discrepancies between ISO codes in language boxes and redirects and/or the ISO 639-3 lists. Please leave me a note if you have any suggestions (if the bot could fix any of these issues by itself, if the log should be rephrased, etc.) Also, please add any language articles that you have checked manually to User:PotatoBot/Excludes/Language articles, so the bot will not log them in its next run. Regards, ἀνυπόδητος (talk) 05:44, 18 June 2010 (UTC)

Adding language infoboxes to articles

Jasy jatere has kindly sent me the SIL and Ethnologue data as text files. I intend to check these against the ISO 639 lists and update them; but the data could also be used to add missing {{infobox language}} transclusions to language articles. The following parameters could be filled in (though not all params for all languages): name, iso3, familycolor, famn, script, speakers/signers. region could also be filled in, but I'm not quite sure whether this field borders the threshold of originality and thus might be copyrighted by SIL.

Since it will need some work to implement this into my bot's code, I wanted to ask whether this procedure would be supported by WP:LANG. Thanks --ἀνυπόδητος (talk) 19:59, 26 June 2010 (UTC)
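A minimal sketch of how such an infobox might be generated from one data record, assuming the record is a simple dict. The parameter names follow the list above (with famn expanded to fam1, fam2, ...); the record layout, the parameter order, and the function name are assumptions, and region is deliberately left out given the copyright question raised.

```python
# Assumed output order of the simple parameters; empty ones are skipped,
# since not all parameters are available for all languages.
PARAM_ORDER = ("name", "familycolor", "speakers", "script", "iso3")


def render_infobox(record):
    """Render an {{infobox language}} transclusion from one record."""
    lines = ["{{infobox language"]
    for key in PARAM_ORDER:
        value = record.get(key)
        if value:
            lines.append(f"|{key}={value}")
    # The classification chain becomes fam1, fam2, ... in order.
    for n, family in enumerate(record.get("family", []), start=1):
        lines.append(f"|fam{n}={family}")
    lines.append("}}")
    return "\n".join(lines)


# Hypothetical record, for illustration only ("qaa" is a local-use code).
print(render_infobox({"name": "Examplish", "iso3": "qaa", "family": ["Example family"]}))
```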

IANAL, but stating the code and name of a language does not look very original to me. As for script, speakers, region, sometimes these are taken over from older sources, but sometimes it is SIL's own team who collected the information. However, it is clear that we are dealing with science here, and it is good scientific practice to refer to the work of other scientists, giving due credit in the form of attribution. For an individual page, this should not be a problem, but what we are doing here is very much a replication of the Ethnologue information. That might be a legal problem, and we should get real legal advice from qualified persons on this. Jasy jatere (talk) 10:48, 9 July 2010 (UTC)
My first hunch would be that all this information will be found in each language article anyway, and is often also retrieved from the Ethnologue. Compiling it all in an info-box should not be a bigger infringement on SIL's intellectual property rights. In any case, it may be good to ask. This page contains an e-mail address where you can ask the Ethnologue editors for advice on how they would see it. My guess is, as long as SIL is credited, they will not object. Landroving Linguist (talk) 13:52, 9 July 2010 (UTC)

World Classical Tamil Conference 2010

Hope the members involved in improving the projects related to LANGUAGES can contribute a lot to this article about the upcoming World Tamil Conference. I need your help improving the article guys. --Ben (talk) 09:43, 18 May 2010 (UTC)

Slang categories

I recently listed a number of slang-related categories for deletion, because they are of an unencyclopedic nature: CfD for Mimzy1990's slang. The result of that discussion, based on only a small handful of opinions from users outside the field of linguistics, was keep. (I actually think it was no consensus, but that's another story.) So, I come here to ask the opinions of the larger WP Languages editorbase. What do you guys think about these categories? And if you agree with my rationale (I went into more detail on the actual CfD), please feel free to relist them for deletion. Gordon P. Hemsley 11:46, 23 May 2010 (UTC)

template:Languages portal

FYI, {{Languages portal}} has been nominated for deletion. (talk) 05:16, 24 May 2010 (UTC)

Now replaced by {{portal}}. Thanks! Plastikspork ―Œ(talk) 22:47, 14 June 2010 (UTC)

Yue Chinese / Cantonese

FYI, Yue Chinese is up for renaming; this is the article on the Cantonese language. See Talk:Yue Chinese. (talk) 03:43, 29 May 2010 (UTC)

Just keep it at Yue Chinese and move on with your life. Seriously. How many times is this debate going to go on? Enough already. Gordon P. Hemsley 19:35, 1 June 2010 (UTC)

List of languages by name

FYI, List of languages by name and related articles have been prodded for deletion on 30 May, see Category:Proposed deletion as of 30 May 2010 (talk) 04:51, 31 May 2010 (UTC)

For the record, I agree with this PROD. Gordon P. Hemsley 19:27, 1 June 2010 (UTC)
I proposed the del. because the lists strike me as hopelessly arbitrary, with no fix in sight. But User:Wavelength finds them potentially useful, as he discusses on my talk page. — kwami (talk) 19:34, 1 June 2010 (UTC)
I agree with you and your rationale, as I've mentioned on your talk page. It seems to me like it would be a wasted effort. Gordon P. Hemsley 19:45, 1 June 2010 (UTC)

List of languages by name: A through M have been deleted by prod, but List of languages by name: N through List of languages by name: Z exist. Is this a deletion process in progress? or one that has been reconsidered? The initial-letter lists contain redlinks not in List of languages by name, but like Gordon P. Hemsley and kwami I'm not convinced of the value of having such an open-ended list. At least, I don't see the need for the main list and the sublists by initial letter. Cnilep (talk) 13:28, 11 June 2010 (UTC)

FYI, a bunch of list of languages by name have been prodded for deletion, see Category:Proposed deletion as of 21 June 2010. (talk) 05:01, 22 June 2010 (UTC)

I tagged A thru M, then got bored and neglected the rest. However, A thru M were deleted two weeks ago and no-one has bothered to comment on that, so I tagged N thru Z today. (P has only a couple languages, far fewer than the main list!) I untagged the main list because there were objections, but it would seem that no-one's tried to use the A-M sublists in the past two weeks. — kwami (talk) 09:06, 22 June 2010 (UTC)

Virgin Islands Creole

I have a problem with Virgin Islands Creole. Ethnologue considers "Netherlands Antilles Creole" to be a synonym, and assigns them the same iso3 code, something which is consistent with, but not actually confirmed by, other refs that I have found. However, when I rewrote (rather sloppily, I'm afraid) the Virgin Islands Creole article to reflect this, and redirected Saint Martin Creole stub to it, others objected that the Netherlands Antilles are not the Virgin Islands, and that they are distinct dialects. Dialects, yes (Ethnologue implies that St Martin is more distinct from the other Windward NA than they are from the VI); my question is whether they are distinct languages, and if language distinctions is what we should go by when writing the article. Also, the old St Martin Creole article had classified it as being Antiguan, which has a different iso code. Of course, all three may be the same "language". Any opinions on how to proceed here? — kwami (talk) 06:42, 5 June 2010 (UTC)

default naming

The consensus for naming language articles seems to be "X language" or "X dialect", apart from Chinese and Arabic, in which mutual-intelligibility languages are "X Chinese" etc. However, there are large numbers of articles which follow the Chinese/Arabic format: "X German", "X Quechua", "X Zapotec", etc. Is this formalized anywhere? Is it what we want, and, if it is, should we be using it for cases where it's unclear whether we have a language or a dialect, or is it simply a shortcut for "X Y language", where Y is Arabic, German, etc.? If the latter, would it be preferable to retain the word "language" if the Y is obscure enough that few readers are likely to recognize it as a language? Or is it acceptable also for obscure things like Franconian?

Related formats are "Standard X", and "Old/Middle/Modern X", but "Proto-X language", not just "Proto-X".

There doesn't appear to be a similar convention at WP ethnic groups, though there does seem to be a passive consensus on "X people", at least for ethno-linguistic groups. (Though one editor has opined that since language is dependent on people, plain "X" should be used for an ethnicity, and "X language" for their language, as opposed to the common existing practice of using "X" for a dab page directing the reader to "X people" and "X language".) Might it be useful for us to work with that other branch of anthropology and come up with a common philosophy for both language and ethnicity articles, so we don't have future disagreements on what title "X" should cover? — kwami (talk) 22:19, 7 June 2010 (UTC)

For dialects, I don't think the default is "X dialect", nor should it be. The default is "X Y" where X is the dialect name and Y is the language name, e.g. American English, Ulster Irish (where "American dialect" and "Ulster dialect" really wouldn't work since those names could also refer to American Spanish or Ulster English) - and that's for all languages with dialects described, not just for Chinese and Arabic. I do think the bare names should remain dab pages, though. When used as a noun, any term "Fooian" is likely to mean the language when used in the singular and the people when used in the plural: German is... refers to the language, while Germans are... refers to the people; and sometimes singular and plural are the same, as in French is... (language) vs. The French are... (people). +Angr 05:40, 8 June 2010 (UTC)
What do you do in cases where it is questionable which language the dialect belongs to? E.g. Darkhad dialect might become "Darkhad Mongolian", "Darkhad Oirat" or (in a minority opinion) "Darkhad Buryat". We have quite a few such cases of regiolects changing under the influence of a contact variety. G Purevdorj (talk) 03:25, 10 July 2010 (UTC)
Well, in cases like that it probably is best to just call it "Foo dialect", but surely cases where it isn't clear what language a certain linguistic entity is a dialect of are a small minority. In the case of Darkhad, since it has its own ISO 639-3 code I'd probably just call it Darkhad language anyway. +Angr 10:22, 10 July 2010 (UTC)

Pacific Northwest English at Regional vocabularies of American English

Three different users have objected to content in the subsection "Pacific Northwest" on the page Regional vocabularies of American English based on personal experience. Three editors from the region have claimed that some or all of the forms are either nonexistent, rare, or "somewhat derogatory". Two of these editors have removed the items; the third expressed incredulity on the talk page. I returned the items to the page, as they are sourced to the Oxford English Dictionary and the Dictionary of American Regional English. Does anyone have an opinion on whether it would be better to a) seek sources calling these items rare, b) remove the items or the subsection, or c) leave the items in place with current sourcing? Cnilep (talk) 21:49, 10 June 2010 (UTC)

Serbo-Croatian grammar merger

I'm planning on merging Croatian grammar into Serbo-Croatian grammar soon, in case there are any comments / objections. So far the only worry has been the reaction of Croat nationalists; no linguistic or procedural objections have been raised. Everything at C. grammar is duplicated at SC. I just merged Serbian grammar, despite the fact that not everything is yet duplicated, as it was entirely in Cyrillic. — kwami (talk) 01:53, 4 July 2010 (UTC)

semi-automated conversion of American spellings to British spellings

For anyone who is interested, a maintenance script is available to convert the entire contents of a page from American spelling to British spelling, see the documentation here. If you have any queries or feel that the script needs modifying in any way, you know where to find me ;-) Ohconfucius ¡digame! 09:33, 9 July 2010 (UTC)
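To illustrate the general idea of such a conversion (this is not Ohconfucius's script, whose implementation is not described here), a whole-word substitution over a small spelling table might look like the sketch below. The three-entry table and the function name are assumptions; a real tool would need a far larger list and would have to leave quotations, proper names, and titles untouched.

```python
import re

# Tiny illustrative table; an assumption, not the script's actual word list.
US_TO_UK = {"color": "colour", "center": "centre", "traveled": "travelled"}

# Match listed words only as whole words, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, US_TO_UK)) + r")\b", re.IGNORECASE
)


def to_british(text):
    """Replace listed American spellings with their British forms."""
    def swap(match):
        word = match.group(0)
        replacement = US_TO_UK[word.lower()]
        # Preserve an initial capital; other casing is not handled here.
        if word[0].isupper():
            replacement = replacement.capitalize()
        return replacement

    return _PATTERN.sub(swap, text)
```

Because matching is whole-word only, inflected forms such as "colors" would need their own table entries, and a word such as "colorado" is left unchanged.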

Paul the Octopus or Paul the octopus

Could somebody please advise on this matter, please, double please? Anna Frodesiak (talk) 18:15, 15 July 2010 (UTC)


Have an editor who insists on using a BBC figure of 490M speakers in the info box, to the exclusion of any other refs, even though he admits it seems "inflated". He then added a table of pop. in various countries, which total 65-66M (the figures seem reasonable), but fudges the total at the bottom to 88-172M, for which he cites Ethnologue, though Ethnologue actually has a figure of 61M. Seems to be more interested in propaganda than the actual language; could use a few more pairs of eyes. — kwami (talk) 21:26, 19 July 2010 (UTC)

Bantu language names

Should we follow the example set by Swahili language for other Bantu languages and avoid the prefix in the title, for example at Ganda language? There's a request to move the article to Luganda, but I'm finding plenty of references to "Ganda" both in linguistic and non-linguistic works (such as art, history, and evangelism), some published in Uganda by Ganda authors. Also, in general, since many Bantu languages are obscure, should we try to follow the native or anglicized form? For example, some journals request that names in articles be in a specific format (such as "Swahili" and "the Swahili" rather than Kiswahili and Waswahili), and I can see advantages to consistent usage in an encyclopedia as well. — kwami (talk) 19:05, 20 July 2010 (UTC)

Rosetta Stone

I guess this may not be directly relevant to the Languages project, but it still might interest some participants. The Rosetta Stone article, a FA candidate, has a fairly long section on languages and decipherment. If anyone would like to read and comment at Wikipedia:Featured article candidates/Rosetta Stone/archive1, all such comments would be welcome! Thanks -- Andrew Dalby 14:51, 21 July 2010 (UTC)


There are articles at Langues d'oïl, Langues d'oc, and both articles mention a Langues de si. Is there an article on this "si" linguistic group? (talk) 23:35, 21 July 2010 (UTC)

Bergish (language)

I have some concerns about the validity of this article, but insufficient expertise to comment on it. Could someone with more experience check it over? Catfish Jim and the soapdish (talk) 11:01, 24 July 2010 (UTC)

FAR notice for Gbe languages

I have nominated Gbe languages for a featured article review here. Please join the discussion on whether this article meets featured article criteria. Articles are typically reviewed for two weeks. If substantial concerns are not addressed during the review period, the article will be moved to the Featured Article Removal Candidates list for a further period, where editors may declare "Keep" or "Remove" the article's featured status. The instructions for the review process are here.-- Cirt (talk) 16:08, 31 July 2010 (UTC)

Third opinion requested at Written Cantonese

Please can someone look at the recent changes and the thread on the talk page, to do with the various translations added by User:Danielsms. --JohnBlackburnewordsdeeds 23:30, 2 August 2010 (UTC)

Sources requested for designation of the language described in Standard Mandarin

The current article Standard Mandarin makes clear enough what language it is talking about, and does a great job of describing the several different Chinese names for that language and where those names are used. But I get the impression that "Standard Mandarin" is not the most common English name for that language, nor is it the name that would be found as an encyclopedia entry or dictionary entry in most English-language reference works. I note that the Chinese Wikipedia version of that article is titled 現代標準漢語, which would suggest a different English title for the article.

The Wikipedia rules on article titles say that while "not always possible, the ideal title is: * Recognizable – Using names and terms commonly used in reliable sources, and so likely to be recognized, for the topic of the article."

From the point of view of you editors who read a lot about various languages in English, what do you think is the most commonly used name in reliable sources for the language about which the article Standard Mandarin is written? I appreciate your thoughtful suggestions. -- WeijiBaikeBianji (talk) 20:32, 12 August 2010 (UTC)

I would say the most common name for it is "Mandarin Chinese" or even just "Chinese", so if I for example say "I am learning Chinese" most people would understand or assume I am learning standard Mandarin. But Mandarin is more than this, in that many other dialects are also classified as Mandarin. This is a lot like other languages: if I say "I am learning/teaching English", although there's no formal standard, you can assume it's something close to BBC English or its US equivalent, not a dialect like Scouse or Geordie.
It's covered in the introduction of Mandarin Chinese pretty well, so anyone searching for that term can find Standard Mandarin from there, which itself describes its connection to the broader dialect group. I can't think of any other name for it myself, and can't think of a better way of doing it.--JohnBlackburnewordsdeeds 00:42, 15 August 2010 (UTC)
But the standard language isn't called "Standard Mandarin", it's called "(Modern) (Standard) Chinese" in the vast majority of sources including popular spoken usage, the media, popular literature, specialist literature, and reference literature. --Taivo (talk) 01:02, 15 August 2010 (UTC)
John, thank you, I'm also still pondering your comments on the talk page of Written Cantonese. What I would say in response to your thoughtful reply is that if Wikipedia had but a single article titled Mandarin Chinese, there would be no issue about naming that article, nor would there be if there were just a sole article Mandarin (language). But what we have now (checking as I type this) is an article Mandarin Chinese, a broader article (linked to from the lede of that) called Spoken Chinese, and a narrower article, the one I am asking about, called now Standard Mandarin, not to mention Mandarin dialects, Beijing dialect, and a variety of other articles that might include information about the current standard language in Chinese-speaking countries, which I know by the name (in English) "Modern Standard Chinese." I can tell from the English of some of the articles that many of the articles are being edited by persons who are not native speakers of English, and who perhaps in addition are not completely sure how the English language draws some of those distinctions. That's why I'd like to check what current reference books (especially other encyclopedias) and educated usage in English do about distinguishing the standard form of speech and writing officially promoted now in China (and elsewhere) from the broader language/dialect group of which that language is a part. Thanks for any further thoughts you have on this issue, with those considerations of article-naming in mind. -- WeijiBaikeBianji (talk) 01:04, 15 August 2010 (UTC)
The reason for not calling it "Chinese" is that "Chinese" is more than just "Mandarin". Spoken Chinese also includes Cantonese, Hakka, Shanghainese, or more broadly the dialect groups they are in such as Yue Chinese and Wu Chinese. Written Chinese is not tied to Mandarin either: it is used by speakers of all varieties of Chinese, although with regional variations such as in Written Cantonese. I should clarify what I wrote above: if I say "I am learning Chinese" followed by something in either Cantonese or Putonghua to demonstrate, 90% of British people would not notice the difference, and would be unaware that there is a difference at all and that there are multiple mutually unintelligible varieties of Chinese. I suspect most of the minority of English speakers who can tell them apart know also to add e.g. "Mandarin" to disambiguate it, as is done here.
I can see a need for all of those articles, as they all describe something different though there is some overlap. Mandarin dialects is the main article for something summarised in Mandarin Chinese, while Beijing dialect is one of those dialects, one very close to Putonghua but not exactly the same. We can't have "Mandarin (language)" as that causes problems with NPOV; the main varieties of Chinese such as Mandarin and Cantonese are described as separate languages by some people but dialects of Chinese by others. Anyone asserting one or the other viewpoint, especially in an article title, is likely to be vociferously resisted by some editors. So phrases like "Mandarin language" or "Mandarin dialect" are best avoided when considering names.--JohnBlackburnewordsdeeds 15:03, 15 August 2010 (UTC)
Taivo and kwami have both done a good job of finding sources on this issue, as is reflected on the article talk page and on the Naming conventions: Chinese talk page, and their research replicates my research with different searches of sources, showing that the most standard (and widely used) English name for the exact language described in that article is "Modern Standard Chinese," matching the Chinese title of the corresponding article on Chinese Wikipedia exactly. -- WeijiBaikeBianji (talk) 16:01, 16 September 2010 (UTC)

Per John's objection that it should be "Standard Mandarin" because it's Mandarin, I'd raise a few counterpoints:

  • It is the standard for all of Chinese, not just of Mandarin. Cantonese, Shanghainese, and Hakka children are all educated in it, and it is the sole register of national communication. "Standard Cantonese", on the other hand, is the standard for Cantonese alone.
  • Modern written Chinese is now at Vernacular Chinese; by John's argument, it should be moved to "Written Mandarin".
  • The standard language of Italy is based on Florentine (dialect or language, take your pick); by John's argument, we should call it "Standard Florentine" rather than "Standard Italian".
  • The word "Mandarin" originally referred specifically to the standard language, with the broader use for beifanghua being a secondary extension, so "Standard Mandarin" is a bit like saying "Standard RP".

None of these points make "Standard Mandarin" actually wrong, but they may explain why the ELL never uses the phrase, and instead restricts itself to "Mandarin" and "(Modern) Standard Chinese". — kwami (talk) 23:21, 16 September 2010 (UTC)

Admiralty Island languages

RfM based on the claim that there is a field of Admiralty Island linguistics with its own terminology, that's independent of the Loyalty Islands or Solomons or any other subgroup of Oceanists. — kwami (talk) 06:00, 14 August 2010 (UTC)

Kwami has made a rather biased statement of the issue reflecting his own POV. A request has been made to move the Admiralty Island languages article to Admiralty Islands languages. The arguments for and against are placed there. --Taivo (talk) 00:01, 15 August 2010 (UTC)

Consensus building for language templates when Limited Recognition States are involved

There is occasionally a discussion that goes on something like this:

  • Editor A inserts a limited recognition state name in a list of states where a language is spoken (see "Other States" list at List of sovereign states)
  • Editor B objects because that state is not listed in Ethnologue or some other source and removes it
  • Editor A points to effective consensuses that were built on compromises at Ossetic language (for South Ossetia), Abkhaz language (for Abkhazia), Turkish language (for Northern Cyprus), Armenian language (for Nagorno-Karabakh), Ukrainian language (for Transnistria), etc., where the disputed state's name is italicized with an explanatory footnote
  • Editor B digs in his/her heels and says if it's not in Ethnologue then it shouldn't be listed
  • Editor A points to a map, etc., etc.

An effective compromise has reached consensus at the language articles listed above (and others) where the disputed state's name is listed in italics with an explanatory note about its limited recognition in parens following. This compromise, in effect, says that completely ignoring the limited recognition state is POV, but listing it as a full partner with other states is also POV, so listing it in italics with a note is the NPOV route. It would be helpful to build a consensus here so that editors who object can be pointed to one place for the discussion. This consensus is construed to affect only language articles. --Taivo (talk) 23:58, 14 August 2010 (UTC)

Why not? Otherwise we cover the same issues time and time again. --Taivo (talk) 15:38, 15 August 2010 (UTC)
Because the situations are different. Abkhazia has a different status on the global stage from Somaliland, and Abkhaz has a different status in Abkhazia from that of Yemeni Arabic in Somaliland. +Angr 15:44, 15 August 2010 (UTC)
The problem of not having a general consensus, however, is that we sometimes encounter the nationalist (full equality) or the anti-nationalist (no mention at all) who are unwilling to reach any kind of compromise. Going the arbitration route for these occurrences seems to be overkill, so we are left with whoever survives the longest rather than a reasonable NPOV compromise based on consensus. --Taivo (talk) 15:53, 15 August 2010 (UTC)
  • Support the listing of disputed states in language articles in italics and with an explanatory footnote (illustrated, e.g., at Ossetic language, Abkhaz language, Turkish language, Armenian language). Taivo's suggestion is very reasonable, and the very fact that Angr points out that situations are different proves Taivo's point right. If we do not adopt this policy, we will continue to have endless pointless discussions, debating politics when really we want to discuss languages. Landroving Linguist (talk) —Preceding undated comment added 17:59, 15 August 2010 (UTC).

Loyalty Islands languages

A move request has been made to move Loyalty Island languages to Loyalty Islands languages. --Taivo (talk) 15:01, 15 August 2010 (UTC)

Adding a language to an article

Today I was about to add the Arabic language to the Gowalla article, and this is where the frustration set in: I didn't know how. I searched the web and read the help page and still didn't know how to do it. After I joined the Wikipedia channel on freenode, someone gave me tips on how to add one, and I found out that it's a complicated procedure. This is where it hit me...

How about adding a hyperlink option that is called "Add a language" under the Languages sidebar menu where simple Wikipedia contributors can easily and without a hitch add another Language for the article?

I love Wikipedia so much and I have a lot to give and learn, but this sort of frustration that I stumble upon on the website prevents me, and probably prevents others, from contributing to the wiki continuously and unconditionally.

Thank you for hearing me out; I can't wait to know what you think about it.

Comments from a simple and a newbie user. —Preceding unsigned comment added by CEnTR4L (talkcontribs) 13:10, 27 August 2010 (UTC)

If you mean a link to another language wiki such as the Arabic one, the name for these is interlanguage links, and the page for them is Help:Interlanguage links. You can simply add the link as described there. A link to the Arabic wiki would look like this: [[ar:...]]
with the dots replaced by the Arabic article name. They are usually added at the end of the article (I see there are already some at Gowalla). This is quick and easy to do, and few editors need to do it, so I don't see it ever being added to the toolbar. --JohnBlackburnewordsdeeds 13:18, 27 August 2010 (UTC)
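
For illustration only, the interlanguage link described in the reply above can be sketched in Python. This assumes the wikitext form [[code:Title]] placed at the end of the article, as the reply describes; the function names interlanguage_link and append_interlanguage_link are hypothetical helpers, not part of MediaWiki or of any bot framework.

```python
def interlanguage_link(lang_code: str, title: str) -> str:
    """Build the wikitext for one interlanguage link, e.g. [[ar:Title]]."""
    if not lang_code or not title:
        raise ValueError("both a language code and an article title are required")
    return f"[[{lang_code}:{title}]]"


def append_interlanguage_link(wikitext: str, lang_code: str, title: str) -> str:
    """Append the link on its own line at the end of the article text.

    If the exact same link is already present, the text is returned unchanged.
    """
    link = interlanguage_link(lang_code, title)
    if link in wikitext:
        return wikitext
    return wikitext.rstrip("\n") + "\n" + link + "\n"


if __name__ == "__main__":
    # "Example title" is a placeholder; the real Arabic article name goes here.
    article = "Article text.\n[[de:Gowalla]]\n"
    print(append_interlanguage_link(article, "ar", "Example title"))
```

The sketch only manipulates a string; saving the page is still done by hand in the edit box.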


There is a dispute at Talk:Maharashtra#Marathi_statement_dispute about the definition of the standard Marathi language. It affects the Marathi (lead), Maharashtra and Pune articles. Please help to form a consensus by giving your valuable comments. A request has also been posted at Wikipedia:Requests_for_comment/All#Language_and_linguistics. Thanks. --Redtigerxyz Talk 05:34, 28 August 2010 (UTC)

Language and thought

The New York Times published "Does Your Language Shape How You Think?" (Guy Deutscher - August 26, 2010) WhisperToMe (talk) 18:41, 28 August 2010 (UTC)

Articles naming

With regard to an unsuccessfully requested move - in Category:Language comparison there are the articles Differences between Malay and Indonesian, Differences between Norwegian Bokmål and Standard Danish, Differences between Spanish and Portuguese, Differences between Scottish Gaelic and Irish, Differences between standard Bosnian, Croatian and Serbian (all without the word language in their titles), but also Differences between Slovak and Czech languages and North-South differences in the Korean language (whose titles do contain language). Besides, this template links to History of French, History of Greek, History of Dutch, History of Danish, but also to History of the Russian language, History of the Korean language, History of the Welsh language, History of Slovak language. How shall we unify those titles? --Theurgist (talk) 07:24, 11 September 2010 (UTC)

Nomination of infobox templates for deletion

I've nominated template:English language and template:Norwegian language for deletion. Please see Wikipedia:Templates for discussion#Language infobox templates for deletion entry.

Peter Isotalo 08:36, 12 September 2010 (UTC)

Boston English

FYI, Boston English has been put up for renaming to Boston accent. (talk) 04:17, 14 September 2010 (UTC)

Linking to Wiktionary Swadesh lists — a "WikiVocab" project

I'd like to link all Wikipedia language articles with lists in Wiktionary's Swadesh lists appendix to their respective lists. Wiktionary currently has lists for around 200 languages, many of them in language-family rather than individual lists. I have personally created and finished around 20 different Swadesh lists, with more on the way. I'm wondering if it's possible to do so in the {{Infobox Language}} template, or to create a separate template for this purpose.

My dream is for there to be a 'big database' on the Internet where anyone can access the basic vocabulary words (in standardized topical lists) of all the world's languages. Wikipedia has information on the grammar and demographics of languages, but does not often include vocabulary, which is the core and essence of language. The closest things we have to a massive comparative database on world languages are the Austronesian Basic Vocabulary Database, Intercontinental Dictionary Series, and of course, Wiktionary's Swadesh lists. As a side note, even though this is basically the Rosetta Project's goal, the website is still quite unwieldy for ordinary users, has a very low Alexa site ranking, and does not allow wiki-style contributions. The Rosetta Project has also pulled off Swadesh lists that used to be on there, and does not have any searchable vocabulary databases as of now. And why do this? To help in language preservation, comparative linguistic studies, language learning, and more.

Or perhaps we can even create a separate "WikiVocab" website, similar in style to WikiSpecies! If we do create a big, unified, and searchable database for all the world's languages — all in one place — I believe it will be one of the greatest human achievements in modern times.

Thanks for your considerations! — Stevey7788 (talk) 10:45, 16 September 2010 (UTC)

I think your alternate suggestion, to create a separate WikiVocab site, is preferable. Wikipedia is an encyclopedia, not a resource for linguistic research. Your proposed open-access, cross-linked Swadesh list information may be useful to various people doing research in diachronic or comparative linguistics, but will probably not be of much use to encyclopedia users, who are probably by and large looking for more elementary information on the languages they research here.
Similarly, Infoboxes are meant to summarize information in the encyclopedia article to which they attach. Cross-linking to Wiktionary appendices seems to be beyond their intended purpose. (Though if I am in a clear minority on the former point, I think there can be some flexibility in this).
I wish you all success with the creation of WikiVocab, but think that the cross-linking you describe here is beyond the scope of Wikipedia. Best, Cnilep (talk) 13:16, 16 September 2010 (UTC)
Thanks for your feedback! Another alternative would be to expand this on Wiktionary and have them feature it on their main page if a separate wiki site cannot be created for the word lists.
Also, many ordinary encyclopedia users would want to have access to vocabulary lists, since learning vocabulary is one of the most important parts of language learning. We've already linked languages to Ethnologue, but just imagine an Ethnologue WITH vocabulary lists! — Stevey7788 (talk) 20:39, 16 September 2010 (UTC)


The article Aorist is in need of editors who can help develop it. One difficulty (as I see it) lies in trying to balance the need for technical accuracy as required by linguistics with WP:UCN, that is, explaining the aorist in a way that helps the readers most likely to come to it. Input from members of the Languages project would be most welcome and appreciated. Cynwolfe (talk) 20:59, 16 September 2010 (UTC)

Languages of <Country> templates

Just a query as to what people think of the new Languages of <Country> templates which are being added at the moment, mostly by User:Iketsi (talk) I think. For example, Template:Languages of Uganda, Template:Languages of Pakistan, Template:Languages of Benin etc. (see also Category:Africa_language_templates). These are being added to any language articles which fit the country concerned.

I don't have a very strong opinion on this, but it'd be good to get some consensus on the usefulness of such boxes. Some comments are:

  • Some of the languages mentioned (English and French being obvious examples, but also e.g. Swahili language) cover large numbers of countries so putting templates for each at the bottom of the article would take up a lot of space.
  • The information is typically duplicating what is already present in the Template:Infobox language in the language article, i.e. which countries it is spoken in.
  • The categories at the bottom of the language articles, e.g. Category:Languages of Uganda also provide a quick link to a list of relevant languages
  • Some countries have so many that it is not possible to list them all in such a template, for example Template:Languages of Cameroon states "55 Afro-Asiatic languages, two Nilo-Saharan languages, and 173 Niger-Congo languages" without listing them.
  • Not all the boxes have been generated yet, so Kinyarwanda now has a red-link template. This is presumably fixable in due course.

Thanks.  — Amakuru (talk) 11:10, 21 September 2010 (UTC)

Which languages qualify? In the UK, English, Welsh, Irish and Gaelic are obvious inclusions, but what about languages such as Cornish that have no native speakers, although there is an interested group keeping the language "alive"? There are also many "non-native" languages, from immigrant communities, which are widely spoken within groups or localities, some with significant roots in the UK; candidates would be Italian, Urdu, Polish, Cantonese, etc. Another odd case might be Esperanto. The scope of the box needs to be defined - this is relevant to many countries where colonial languages have either taken popular root or become a lingua franca. Folks at 137 (talk) 11:24, 21 September 2010 (UTC)
Although I agree that these boxes have questionable usefulness when affixed to major languages (the ones spoken in too many countries), I believe we should keep them because they allow users to sift through related languages with ease. It also adds a feeling of completion to smaller, rarer language articles which otherwise wouldn't be part of any "network" other than categories. There's nothing wrong with categories, but they're not very user friendly for the average wiki surfer and would require more clicks to find anything.
  • It's not really duplicate information because Template:Infobox language shows countries in which the language is spoken. The <Country> templates, on the other hand, show which languages are spoken in a defined country. It's done the other way around.
  • The Kinyarwanda template is no problem :) If we decide to keep the templates, I'll have all the rest done in a week or two. - Iketsi (talk) 11:47, 21 September 2010 (UTC)

Diacritics in Russian texts

Regarding a now-archived thread here, is there really a need to supply stress marks to Russian words on Wikipedia? The sole purpose of those diacritics is to indicate the correct pronunciation whenever it could be ambiguous, just as the IPA does when it accompanies English words. The diacritics are usually there in bilingual dictionaries or some children's books, and they were present in my Russian textbooks when I studied Russian at school, but in all other cases their use is plainly unnecessary, and a native Russian may take offense if addressed with a message showing the stress marks in his or her native language. The Russian text in the English Wikipedia is almost exclusively inserted inside the {{lang-ru}} template, and I daresay the vast majority of our readers are not familiar enough with the Russian spelling rules to avoid assuming that, like the acute accents in modern Greek orthography, the diacritics are obligatory when writing in Russian Cyrillic. Besides, we usually don't proceed in the same manner with words in other Cyrillic-based alphabets, for instance those that are inside {{lang-uk}} or {{lang-bg}}. By the way, there is a section above, which hasn't been replied to yet. --Theurgist (talk) 02:46, 26 September 2010 (UTC)

Certainly in cases where we also provide the IPA for the Russian the orthographic stress mark isn't necessary. In other cases I can see it might be useful but don't have a strong feeling one way or the other. —Angr (talk) 16:43, 26 September 2010 (UTC)
Just now I corrected to Cyrillic the stressed letters in a biographical article linked on the Main Page, because there were Latin ones. The IPA is indeed a good alternative, but it would be very hard to provide it for all Russia(n)-related articles, and short stubs would become too overburdened with an IPA in them. --Theurgist (talk) 23:40, 26 September 2010 (UTC)
The Cyrillic alphabet article had a Russian diacritic mark in the lead, which strikingly contrasted with the other native names - and at that the Ukrainian and the Belarusian were missing - so I removed it as it could very easily mislead our readers. --Theurgist (talk) 02:34, 29 September 2010 (UTC)
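
For anyone who wants to check a passage mechanically, here is a minimal Python sketch of removing stress marks from Russian text. It assumes the marks are encoded as the combining acute accent (U+0301), possibly precomposed with a Latin look-alike letter; the name strip_stress_marks is hypothetical and not an existing tool.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # assumption: stress is marked with this character


def strip_stress_marks(text: str) -> str:
    """Remove acute stress marks while leaving other diacritics in place."""
    # Decompose so that any precomposed accented letter becomes base + mark.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop only the acute accent; the breve of й and the diaeresis of ё stay.
    without_stress = decomposed.replace(COMBINING_ACUTE, "")
    # Recompose so that й and ё return to their usual single code points.
    return unicodedata.normalize("NFC", without_stress)


if __name__ == "__main__":
    stressed = "Москва\u0301"  # the word with a stress mark on the final vowel
    print(strip_stress_marks(stressed))
```

Note that this removes every acute accent, so it should be applied only to Russian strings: letters such as Macedonian ѓ and ќ, or accented Latin letters, would lose their accent as well. It also does not convert Latin look-alike letters back to Cyrillic.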

List of ISO 639-1 codes

FYI List of ISO 639-1 codes has been requested to be renamed into templatespace. (talk) 00:59, 27 September 2010 (UTC)

Advice for ledes of Serbian and Croatian

I'd like to ask for some advice, hopefully here in a place that isn't swamped by nationalists. Or if I'm out of line, tell me that too.

IMO, the ledes for Serbian language and Croatian language cater too much to nationalists who insist that the languages have nothing to do with each other. They insist at least on the wording "X is a South Slavic language" so they can pretend that they are just like any other pair of SS langs, such as Slovene and Bulgarian. The sociolinguistic reality of separate status is IMO adequately addressed by having separate articles, as opposed to the Encyclopedia of Language and Linguistics which subsumes both under a single article. These languages aren't even distinguished by dialect, the way, say, Bulgarian and Macedonian are, or Swedish and Norwegian, but by ethnicity: a Torlakian-speaking Croat speaks "Croatian", while a Torlakian-speaking Serb speaks "Serbian". I think we should make it clear from the outset that the distinguishing feature is ethnicity rather than anything inherent in the language; where there are differences, they are only associated with the languages because of their association with ethnicity. I suggest s.t. along the lines of,

Serbian is the name used for Serbo-Croatian as spoken by Serbs.[SC 1] It is official under this name in Serbia, Bosnia and Herzegovina, and Montenegro. Two principal dialects are spoken by Serbs, Shtokavian and Torlakian. The literary and standard language is based on Shtokavian, which is also the basis of Standard Croatian, Bosnian, and an emergent standard for Montenegrin.[SC 2] Indeed, in Bosnia and Herzegovina, "Croatian", "Bosnian", and "Serbian" are considered to be three names for the same official language.[SC 3]


Croatian is the name used for Serbo-Croatian as spoken by Croats. It is official under this name in Croatia and Bosnia and Herzegovina. Two dialects, Chakavian and Kajkavian, are spoken almost exclusively by Croats, and there are a few Croatian speakers of a third, Torlakian. However, the literary and standard language is based on the central dialect, Shtokavian, which also forms the basis of the official standards of Serbian, Bosnian, and an emergent standard for Montenegrin. These four dialects, and the four national standards based on one of them, are commonly subsumed under the term "Serbo-Croatian" in English, though this term is controversial for political reasons, and paraphrases such as "Bosnian-Croatian-Montenegrin-Serbian" are sometimes used instead, especially in diplomatic circles.
  1. ^ E.C. Hawkesworth, "Serbian-Croatian-Bosnian Linguistic Complex", also B Arsenijević, "Serbia and Montenegro: Language Situation". Both in the Encyclopedia of Language and Linguistics, 2nd edition, 2006.
  2. ^ Serbian, Croatian, Bosnian, Or Montenegrin? Or Just 'Our Language'?, Radio Free Europe, February 21, 2009
  3. ^ From the (1993) language law:
    In the Republic of Bosnia and Herzegovina, the Ijekavian standard literary language of the three constitutive nations is officially used, designated by one of the three terms: Bosnian, Serbian, Croatian.
    ("Language in the former Yugoslav lands" (2004) Ranko Bugarski, Celia Hawkesworth. p 142)

The most recent compromise version is the following:

Croatian is a South Slavic language spoken chiefly by Croats in Croatia, Bosnia and Herzegovina and neighbouring countries, as well as by the Croatian diaspora worldwide. Standard Croatian is a standardized form of the Shtokavian dialect. The same dialect is also the basis for the mutually intelligible standards of Croatian, Bosnian, and emerging Montenegrin. These are commonly subsumed under the term Serbo-Croatian, or a compound term like "Bosnian/Serbian/Croatian (BCS)".

This is better than previous versions, but has a couple factual problems and misleading wording which are there at least partly to placate the nationalists:

  • "Croatian is a South Slavic language". This is the type of wording found for other, independent, SS languages, and suggests that Croatian has the separate status that they do. The vociferous insistence of nationalists on keeping it ("Croatian is a South Slavic language!", though we clearly state that it is regardless) suggests that they're reading it that way too.
  • "Croatian is ... chiefly spoken by Croats". No, apart from immigrants and some acculturated Serbs, it is spoken only by Croats, because the defining feature of Croatian is that the speaker is an ethnic Croat.
  • "They are commonly subsumed under the term Serbo-Croatian". This suggests that only the standards are subsumed under SC, when in fact all four only partially intelligible dialects, including Chakavian and Kajkavian, have long been called SC. This is another point that (principally Croatian) nationalists insist on, that SC refers specifically to the defunct Yugoslav (bi)standard, which was forcibly imposed upon them, when in fact it was Croatian nationalists in the 1850s who chose and created a standard in common with their Serbian colleagues.

Any comments/preferences? (To be clear, I'm proposing s.t. like the first wording; the last one is what is currently in the article and which I find problematic.) — kwami (talk) 01:21, 2 October 2010 (UTC)

That all makes sense to me. It is a politically sensitive issue, but English Wikipedia ought to be written in English, and the lede paragraphs you suggest are better English than what came before them. -- WeijiBaikeBianji (talk) 05:07, 2 October 2010 (UTC)
Based on a comment on the talk page, I've modified it to
Croatian is a form of Serbo-Croatian spoken by Croats in Croatia, Bosnia ...
since several people objected to "is the name used for" and "as spoken by". — kwami (talk) 07:21, 3 October 2010 (UTC)


There is a request for comment at Talk:Croatian language. --Taivo (talk) 15:31, 6 October 2010 (UTC)

Language related AfD

Wikipedia:Articles for deletion/Göbekli Tepe script -- Joseph RoeTkCb, 06:53, 16 October 2010 (UTC)

Language articles have been selected for the Wikipedia 0.8 release

Version 0.8 is a collection of Wikipedia articles selected by the Wikipedia 1.0 team for offline release on USB key, DVD and mobile phone. Articles were selected based on their assessed importance and quality, then article versions (revisionIDs) were chosen for trustworthiness (freedom from vandalism) using an adaptation of the WikiTrust algorithm.

We would like to ask you to review the Language articles and revisionIDs we have chosen. Selected articles are marked with a diamond symbol (♦) to the right of each article, and this symbol links to the selected version of each article. If you believe we have included or excluded articles inappropriately, please contact us at Wikipedia talk:Version 0.8 with the details. You may wish to look at your WikiProject's articles with cleanup tags and try to improve any that need work; if you do, please give us the new revisionID at Wikipedia talk:Version 0.8. We would like to complete this consultation period by midnight UTC on Sunday, November 14th.

We have greatly streamlined the process since the Version 0.7 release, so we aim to have the collection ready for distribution by the end of November, 2010. As a result, we are planning to distribute the collection much more widely, while continuing to work with groups such as One Laptop per Child and Wikipedia for Schools to extend the reach of Wikipedia worldwide. Please help us, with your WikiProject's feedback!

If you have already provided feedback, we deeply appreciate it. For the Wikipedia 1.0 editorial team, SelectionBot 16:34, 6 November 2010 (UTC)

Abkhazian Che with descender

Hello. A group of us are trying to clear a backlog of unsourced articles. The page of the above name is one of the several thousand articles lacking sources that were tagged in October 2006. Can you help in finding good sources for the facts in the article? Sincerely, GeorgeLouis (talk) 08:20, 8 November 2010 (UTC)

new Gentium font

FYI, SIL's "Gentium Plus" has been released.[2] Combines the design of Gentium (mostly) with the coverage of Charis. class=IPA has been updated to choose it over plain Gentium, if you choose to install it. — kwami (talk) 21:57, 8 November 2010 (UTC)

Feedback on improving Samoan language article, please

Hi everyone. We would really appreciate any feedback, especially from linguistics experts, on how best to improve the Samoan language article, as it has been tagged (by User talk:GPHemsley) for cleanup. I've made edits there but I'm not an expert - it definitely needs a lot more work. Kahuroa has made some suggestions on the Talk:Samoan language page. Happy to carry out the leg work with any advice from you all. Thank you and much appreciated. teinesaVaii (talk) 07:08, 18 November 2010 (UTC)

Standard Mandarin

FYI, Standard Mandarin has been requested to be renamed. (talk) 05:36, 23 November 2010 (UTC)

Proposed merge of New England English, Boston accent, and Vermont English

There is a discussion here regarding a possible merger of New England English, Boston accent, and Vermont English. Cnilep (talk) 05:25, 1 December 2010 (UTC)

ISO 639-3 lists

ISO 639:a and subsequent lists contain language names in French, Spanish, Chinese, Russian and German, which should be transwikied to Wiktionary per WP:NOT#DICT. I am planning to remove these columns with my bot unless there is any opposition. In the same go, I'd like to rearrange columns and split the column "Family" (which currently seems to contain a mixture of macrolanguages and language families) into two. The columns of the resulting tables would be:

ISO 639-3 - 639-2B - 639-1 - English name - Native name - Scope/Type - Macrolanguage code (639-3) - Macrolanguage name - Family code (639-5) - Family name

Suggestions and comments welcome. --ἀνυπόδητος (talk) 13:59, 5 December 2010 (UTC)

Chinese tone correspondences

I expanded Four tones. Caught a few errors; someone here might want to verify that I didn't introduce any new ones. — kwami (talk) 16:09, 11 December 2010 (UTC)

Valencian origins controversy

FYI. The Valencian language article was tagged as disputed in November 2010. There was conflict because previous versions did not provide readers with neutral information on all sides of the discussion. Several edits later, the controversy not only remains unresolved, but users now seem to be engaged in an edit war over its classification. (talk) 20:52, 13 December 2010 (UTC)

The conflict is over denying that Valencian and Catalan are dialects of the same language / Valencian is a dialect of Catalan, pace ELL2, Ethnologue, and the like. I know nothing of the topic myself, but RSs seem to agree that they are dialects. Any comments would be welcome; this does look like it's going to be another politically motivated denial of reality, unless I've badly misread the sources and there actually is mainstream disagreement. — kwami (talk) 06:11, 15 December 2010 (UTC)

Article editors have been provided with other RSs that show other possible philological classifications. My opinion is that the classification, and indeed the whole article, must be redrafted following WP:NPOV and WP:TRUTH. Other users seem to disagree. IeXrivâ (talk) 00:52, 16 December 2010 (UTC)
Well, the truth is that 99% of linguists, as well as the AVL and IEC, assert that Valencian is a dialect of Catalan, specifically of Western Catalan. ELL classifies Valencian as a dialect of Western Catalan, including it with the Gallo-Romance and Occitano-Romance languages. WP:WEIGHT (neutrality weights viewpoints in proportion to their prominence) needs to be taken into consideration. Jaume87 (talk) 05:38, 16 December 2010 (UTC)

Books Ngram Viewer

I added a request at Village pump (technical)[3] to bring in the Books Ngram Viewer dataset to Wikipedia and to create a template to make use of it. -- Uzma Gamal (talk) 12:18, 20 December 2010 (UTC)

New article: Internet and Technology Law Desk Reference

New article created at Internet and Technology Law Desk Reference. Additional assistance with research would be appreciated; feel free to help out at the article's talk page. Cheers, -- Cirt (talk) 12:49, 20 December 2010 (UTC)

Languages of Slovenia: Prekmurian

Please, see Talk:Languages of Slovenia#Prekmurian dialect? and provide your comments. The issue is whether to include Prekmurian dialect in the article Languages of Slovenia. --Eleassar my talk 15:51, 27 December 2010 (UTC)

Additional question: is the Venetian language spoken in Slovenia? --Eleassar my talk 18:53, 27 December 2010 (UTC)


Elvish languages (Middle-earth) and Languages of Arda have been proposed to be renamed, see Talk:Languages of Arda. (talk) 12:31, 1 January 2011 (UTC)


I'm new to Wikipedia, so I'm unsure if this is the right place to post. I hope it is. The Sindarin article, on which I'm working right now, has not yet received a rating on your project's importance scale. Someone was asking questions about "Sindarin" in order to put a rating on it, but I'm unable to find him/her. :( If people need information about Sindarin, and if they say what sort of info is needed, I can provide it, so that the right rating can be put on it. (talk) 00:13, 2 January 2011 (UTC)

Atlantean language

The conlang Atlantean language has been nominated for deletion. (talk) 05:10, 7 January 2011 (UTC)

Dacian language and script

Hi all! I am wondering if there are any linguists interested in reviewing and improving the articles around the Dacian language. There are many holy wars (unfortunately including vandalism and revert wars) and theories around this interesting subject. WikiProject Dacia is proposing a collaboration on this. Of special interest are the controversial Dacian script, Sinaia lead plates and Rohonc Codex. Any specialist opinions and help are greatly appreciated. Best regards! --Codrin.B (talk) 05:42, 9 January 2011 (UTC)

Related AfD

Please see: Wikipedia:Articles for deletion/Chicano vendido. Jaque Hammer (talk) 10:14, 10 January 2011 (UTC)

linguasphere classification?

Is the complete linguasphere classification available anywhere online? — kwami (talk) 10:11, 12 January 2011 (UTC)

Regional differences in the Chinese language

Regional differences in the Chinese language has been nominated for deletion. (talk) 06:33, 14 January 2011 (UTC)

Simplifications to written Chinese in Hong Kong

Simplifications to written Chinese in Hong Kong has been nominated for deletion. (talk) 06:25, 17 January 2011 (UTC)

English phonetic alphabet (EPA)

Does anyone know anything about English phonetic alphabet (EPA) ? It's been nominated for deletion. (talk) 07:06, 23 January 2011 (UTC)

Wikipedia uses IPA for English; I don't know if the EPA deserves an article. —Preceding signed comment added by Nicky Nouse (talk · contribs · wikia) 01:26, 24 January 2011 (UTC)


Template:Germanic_vowel_development is unused. Is it still wanted? If not, it can be deleted, as it is currently serving no purpose. — This, that, and the other (talk) 02:12, 25 January 2011 (UTC)

I have no objection to it existing, but it really should be an article rather than a template, like IE sound laws. I have no idea why it was a template to start with. Munci (talk) 14:43, 27 January 2011 (UTC)

English articles up for deletion

Basic Roman spelling of English and Roman Phonetic Alphabet for English have been nominated for deletion. (talk) 06:56, 28 January 2011 (UTC)

Sorry, didn't notice it was already listed on the deletion list. (talk) 06:57, 28 January 2011 (UTC)

IPA for Hindi and Urdu

At Wikipedia talk:IPA for Hindi and Urdu, there's a call to change all [ɛ] in our HU articles to [æ], along with a few other tweaks. Input would be appreciated. — kwami (talk) 07:14, 4 February 2011 (UTC)

Proposal for move of Regional differences in the Chinese language

See Talk:Regional differences in the Chinese language. Munci (talk) 10:02, 8 February 2011 (UTC)

ǃKung / ǃXun / Ju / Zhu population

Anyone have an estimate for the number of ǃKung / ǃXun / Ju / Zhu speakers? (The branch / language complex.) Ethnologue is not reliable (it seems they've counted dialects multiple times under different names), ELL just copies Ethnologue, and Heine & Nurse don't give a figure. — kwami (talk) 05:50, 12 February 2011 (UTC)

Koro language (India)

This article looks like a stub but is not marked as one. Please look at it. Cliff (talk) 20:48, 18 February 2011 (UTC)

The presence of secondary sources means that it is not a stub but a start-class article in need of expansion and improvement. Cnilep (talk) 01:36, 19 February 2011 (UTC)
Thanks for the help. Cliff (talk) 19:25, 23 February 2011 (UTC)

Konkani language: spelling

Does anyone know which spelling is correct [4] [5] [6]? --ἀνυπόδητος (talk) 19:23, 25 February 2011 (UTC)

Konkani is spoken in three Indian states: Goa, Karnataka and Kerala. It is the official language of Goa and a minority language in Karnataka and Kerala. Konkani is popularly written in three different scripts, viz. Devanagari (also used for Hindi, Marathi, Nepalese and Romani), Kannada, and Malayalam. In popular and prevalent pronunciation, the name is कॊंकणि (koŋkaṇi). This is also how it has been written in the Kannada and Malayalam scripts.

Hence this issue only pertains to how the name of the language is spelt in the Devanagari script, which incidentally has been promulgated as the official script of the language.

In the Indian state of Goa, where Konkani is the official language it is spelt as कोंकणी (Kōṅkaṇī) in the Devanagari script but read as koŋkaṇi. This paradox is the root of deliberation amongst Konkani speakers. There are two opposite views on this:

  • A section of writers believes that one should stick to the official spelling even though it diverges from how it is pronounced.
  • Another section believes that, regulations notwithstanding, the name should be spelt as spoken. Imperium Caelestis 11:44, 6 March 2011 (UTC)

Telugu language

The Telugu language page is in need of a severe cleanup. Compare with the Tamil language article. I don't have the necessary knowledge or writing skills to clean it up myself, so I'm requesting help from here and the other relevant WikiProjects. cntrational (talk) 21:29, 25 February 2011 (UTC)

Please review seriousness v. proposed deletion as parody of new article Names of small numbers at Wikipedia:Articles for deletion/Names of small numbers

Languages WikiProject members, this is being discussed at:

Wikipedia:Articles for deletion/Names of small numbers

Please also consider what additional sections should be added to this topic if the article is kept: sections drawn from binary and other numbering systems, and from educationally, historically, linguistically and epistemologically significant concepts and works, including fractions and parts of wholes other than simple number-base exponential systems, and including terms from currencies, agriculture, art media, and pre-modern English-language names of small portions. Subtopics that may not be generally known by Wikipedian editors in other particular fields are especially relevant. Etymology for some SI and metric terms is included in their respective articles, to which this one is linked; please consider what portions and extents of etymological information from those sources, and what other sources, are appropriate to add to this article as well.

Thank you. Pandelver (talk) 04:06, 12 March 2011 (UTC)


The French have fr:Portail:Langues germaniques

Has anyone on EN wanted to do a "Germanic languages" portal? WhisperToMe (talk) 08:01, 12 March 2011 (UTC)

Konkani dialects

I request editors to take part in the discussion here. A consensus needs to be reached. Please comment there. Regards, Yes Michael?Talk 12:11, 12 March 2011 (UTC)


User: added dozens of "native names" to language articles today. There appears to have been a fair amount of thought put into them: many look rather convincing if you're unfamiliar with the languages in question. Several were reverted by the time I got to them, but the reverting editors failed to follow up on this user's other edits. I blocked the account, but since this is an anon IP, and the edits date only from today, it's quite possible they will show up tomorrow under a different IP. If you see info like this added, please check the editor's contributions to see if there's a pattern. — kwami (talk) 11:56, 17 March 2011 (UTC)

Blocked User: for adding a bunch of "incubator" links that didn't link to anything, including such unlikely suspects as Guanche. However, a couple of the edits appear to be okay. Can anyone verify if that actually is the native name of Omaha-Ponca? Was I hasty in blocking? — kwami (talk) 19:48, 19 March 2011 (UTC)

Adjectival phrase

This appears to follow the definition of an attributive phrase of any part of speech rather than a phrase where the head word is an adjective. I'm tempted to move it to attributive phrase (probably merging with attributive) and to write a new stub for adjective-head phrase in its place, but thought someone here might have a different idea. — kwami (talk) 22:56, 19 March 2011 (UTC)

Hong Kong English

There is a request to clean up Hong Kong English at Talk:Hong Kong English. (talk) 06:40, 28 March 2011 (UTC)

Kalix language

Could use review. For one thing, it seems odd that a language should use bold el as a distinct letter from roman el. — kwami (talk) 00:02, 10 April 2011 (UTC)

WikiProject Endangered languages

Language categories and films

We currently have a category:Films by language with subcategories for many languages. For some languages these are (direct or indirect) subcategories of the corresponding language category, which makes sense. However, for many African languages, this is not the case because the language category doesn't even exist. So we have the awkward situation that we have categories for one aspect of a language, but none for the language itself. What should be done?

For example, I just created category:Diola-language films as a subcategory of category:Films by language, because this is the way it's done for all other foreign language films. However, I could not assign this category to category:Diola language (or category:Jola language), since that category doesn't exist.

This is a problem because it denies our readers information that would be particularly useful. Someone reading an article about a lesser-known language will likely find it useful to know that there are some films in that language. So we should offer our readers a way to discover that film category. In the case of Jola languages, I added a link to category:Diola-language films in the "See also" section. That's a bit of a hack, but I think it may be a more lightweight solution than creating scores or even hundreds of language categories that only have one article (that about the language itself) and one subcategory (that for the films).

Alternatively, I had been thinking about using bigger categories, such as category:Atlantic languages films, but that would neither be discoverable for someone reading the Diola article, nor would it be likely to get populated by film aficionados unfamiliar with language categorization. What do others think? — Sebastian 15:34, 26 April 2011 (UTC)

Well, the obvious solution is to create Category:Jola languages (matching the spelling and number of the article Jola languages). I wouldn't create Category:Atlantic languages films, not only because it wouldn't get populated, but because Category:Films by language doesn't get divided up into language families. (Notice we don't have Category:Germanic languages films or Category:Germanic-language films either, although that category would certainly be well populated.) —Angr (talk) 15:55, 26 April 2011 (UTC)
Obviously, I thought of that. As I wrote above, it would mean "creating scores or even hundreds of language categories that only have one article (that about the language itself) and one subcategory (that for the films)", since we're not talking just about Diola here. Is that really a good solution? — Sebastian 16:30, 26 April 2011 (UTC)
MMmmaybe not. The other option is to make Category:Diola-language films a subcat of Category:Atlantic languages, but that's a bit weird since one would expect any subcats to be about languages directly. —Angr (talk) 16:36, 26 April 2011 (UTC)
I agree. It would be less weird if we had the corresponding cat, Atlantic languages films. But that, too, would be a big change, because we ultimately would have to mirror the whole system of language categories in a "... films" series. That's probably too much red tape, and I don't presume the film editors would much appreciate that. — Sebastian 23:09, 26 April 2011 (UTC)
I don't really see a problem with creating a category for a film-language when there isn't one of the language itself. I created Category:Northern Sotho-language films a while back, and just put it in the films by language parent. Lugnuts (talk) 18:06, 26 April 2011 (UTC)
Thank you. I convinced myself, too, in the course of this discussion. :-) So what do you think about the second half of my approach, adding a link to the film cat in the "See also" section? I'd be up for doing that in all applicable language articles, since I believe it provides useful connections. — Sebastian 23:09, 26 April 2011 (UTC)
Sounds good - anything to help aid navigation is always a step in the right direction IMO. Lugnuts (talk) 09:31, 27 April 2011 (UTC)