Wikipedia talk:WikiProject Languages/Archive 9


Languages of the World: An Introduction

Looking at the Sino-Tibetan section of the book

Pereltsvaig, Asya (2012). Languages of the World: An Introduction. Cambridge University Press. ISBN 978-1-107-00278-4. 

I was struck by some familiar wording. It seems that some of the text comes from older versions of Wikipedia articles, as seen in the following comparison:

Book text vs. Wikipedia articles
page 126:

There are also some significant linguistic differences between Sinitic and Tibeto-Burman languages, so that some scholars, such as Christopher Beckwith (1996) and Roy Andrew Miller, argued that these two families are not related at all. They point to what they consider an absence of regular sound correspondences, an absence of reconstructable shared morphology, and evidence that much shared lexical material has been borrowed from Chinese into Tibeto-Burman. In opposition to this view, scholars in favor of the Sino-Tibetan hypothesis, such as W. South Coblin, Graham Thurgood and James Matisoff, have argued that there are regular correspondences in sounds, as well as in grammar. One of the main reasons why it is so difficult to apply the comparative method that we are familiar with from the previous chapters to the Sino-Tibetan languages is the morphological paucity in many of these languages, including modern Chinese and Tibetan.

Sino-Tibetan languages, March 2010:

A few scholars, most prominently Christopher Beckwith and Roy Andrew Miller, argue that Chinese is not related to Tibeto-Burman. They point to what they consider an absence of regular sound correspondences, an absence of reconstructable shared morphology,[1] and evidence that much shared lexical material has been borrowed from Chinese into Tibeto-Burman. In opposition to this view, scholars in favor of the Sino-Tibetan hypothesis such as W. South Coblin, Graham Thurgood, James Matisoff, and Gong Hwang-cherng have argued that there are regular correspondences in sounds as well as in grammar.

One of the chief difficulties of applying the comparative method to the Sino-Tibetan languages is the morphological paucity in many of these languages, including modern Chinese and Tibetan.

  1. ^ Cf. Beckwith, Christopher I. 1996. "The Morphological Argument for the Existence of Sino-Tibetan." Pan-Asiatic Linguistics: Proceedings of the Fourth International Symposium on Languages and Linguistics, January 8-10, 1996. Vol. III, pp. 812-826. Bangkok: Mahidol University at Salaya.

pages 129, 130:

Numerous languages in eastern Asia are tonal, including all the Chinese languages (though some, such as Shanghainese, are only marginally tonal) and, as we will see below, Vietnamese (an Austro-Asiatic language; see Section 7.2) and Thai and Lao (both Tai-Kadai languages; see Section 7.3). Some eastern Asian languages, such as Burmese, Korean and Japanese have simpler tone systems, which are sometimes called 'register' or 'pitch accent' systems. However, some languages in the region are not tonal at all, including Mongolian (a Mongolic language; see Section 11.2 below), Khmer (an Austro-Asiatic language; see Section 7.2) and Malay (an Austronesian language; see Chapter 8). Of the Tibetan languages, Central Tibetan (including the dialect of the capital Lhasa) and Amdo Tibetan are tonal, while Khams Tibetan and Ladakhi (spoken mostly in the Jammu and Kashmir province of India) are not. Thus, tonal systems appear to be an areal rather than a genealogical feature of languages.

Tone (linguistics), May 2010:

There are numerous tonal languages in East Asia, including all the Chinese languages (though some such as Shanghainese are only marginally tonal), Vietnamese, Thai, and Lao. Some East Asian languages, such as Burmese, Korean, and Japanese have simpler tone systems, which are sometimes called 'register' or 'pitch accent' systems. However, some languages in the region are not tonal at all, including Mongolian, Khmer, and Malay. Of the Tibetan languages, Central Tibetan (including the dialect of the capital Lhasa) and Amdo Tibetan are tonal, while Khams Tibetan and Ladakhi are not.
Tone is frequently an areal rather than a genealogical feature.

page 130:

Typically, tonal systems arise as an after effect of the loss or merger of consonants (such trace effects of disappeared sounds have been nicknamed Cheshirisation, after the lingering smile of the disappearing Cheshire cat in Lewis Carroll's Alice in Wonderland). As it turns out, in non-tonal languages the pronunciation of consonants affects the pitch of preceding and/or following vowels; when such consonants disappear as a result of regular sound change (recall our discussion of lenition from Chapter 2), the distinction in pitch may be preserved and used to distinguish meanings formerly distinguished by the consonants. And voilà – a tonal system is created! For example, in the development of Chinese languages final consonants, which affected the pitch of preceding vowels, weakened to /h/ and finally disappeared completely, but the difference in pitch – now a true difference in tone, carried on instead of the disappeared consonants. Moreover, the nature of initial consonants also affected which tone a given vowel carried: when the consonants lost their voicing distinction, vowels preceding a voiceless consonant acquired a higher tone, while those preceding a voiced consonant acquired a lower tone. The same changes affected many other languages of eastern Asia, such as Thai, Vietnamese and the Lhasa dialect of Tibetan, and at around the same time (AD 1,000–1,500).

Tone (linguistics), May 2010:

Very often, tone arises as an effect of the loss or merger of consonants. (Such trace effects of disappeared tones or other sounds have been nicknamed Cheshirisation, after the lingering smile of the disappearing Cheshire cat in Alice in Wonderland.) In a non-tonal language, voiced consonants commonly cause following vowels to be pronounced at a lower pitch than other consonants do. This is usually a minor phonetic detail of voicing. However, if consonant voicing is subsequently lost, that incidental pitch difference may be left over to carry the distinction that the voicing had carried, and thus becomes meaningful (phonemic). [...]

Similarly, final fricatives or other consonants may phonetically affect the pitch of preceding vowels, and if they then weaken to /h/ and finally disappear completely, the difference in pitch, now a true difference in tone, carries on in their stead. This was the case with the Chinese languages: Two of the three tones of Middle Chinese, the "rising" and "leaving" tones, arose as the Old Chinese final consonants /ʔ/ and /s/ → /h/ disappeared, while syllables that ended with neither of these consonants were interpreted as carrying the third tone, "even". Most dialects descending from Middle Chinese were further affected by a tone split, where each tone split in two depending on whether the initial consonant was voiced: Vowels following an unvoiced consonant acquired a higher tone while those following a voiced consonant acquired a lower tone as the voiced consonants lost their distinctiveness.

The same changes affected many other languages in the same area, and at around the same time (AD 1000–1500). The tone split, for example, also occurred in Thai, Vietnamese, and the Lhasa dialect of Tibetan.

In each case, the Wikipedia text not only predates the book, but it also reached that state through incremental changes by different editors over some years. Spot checks of other parts of the book revealed snippets that closely resemble passages in our articles on Georgian scripts (p. 74), Inuktitut (p. 124) and Afroasiatic languages (p. 205).

I'm not claiming that all of this book comes from Wikipedia, or even most of it. But it seems clear that some of it does, and it's difficult to tell which. I conclude that even though this book is published by CUP and is used as a textbook in major universities, it cannot be safely used as a source for our articles. Kanguole 13:21, 26 March 2014 (UTC)

Why do you jump to this conclusion?
So we have an expert who apparently thought that a few sentences written here were really good. Given our anonymous editing environment, it's possible that these sentences were originally written by that same expert (could any of those people in the article history have been this author?). The expert made a few changes, which suggests some attention to its contents, rather than laziness or confusion.
Why do you conclude from this that the entire book is suspect? You could alternatively interpret it as an expert endorsement.
I've had some of my work copied by experts—in one case, by a world-renowned researcher, someone who is definitely one of the best of the best in his niche. I don't conclude from this that he no longer knows what he's talking about. I conclude from this that where our texts agreed, I had done well, and that where we disagreed, I needed to re-think whether my summary was fairly representing the state of the art. WhatamIdoing (talk) 15:37, 26 March 2014 (UTC)
I have no doubt that Asya Pereltsvaig is not User:TheLeopard who made this edit in 2008. This is clearly plagiarism, and if the passages noted by Kanguole are plagiarized then we can have no confidence in the rest of the book, and it should not be used as a reliable reference. BabelStone (talk) 17:56, 26 March 2014 (UTC)
Actually in that edit TheLeopard was restoring text that had been deleted; the earliest version of the paragraph was inserted by an IP in 2005 and expanded by cibeckwith in 2006, but by 2010 it had evolved considerably under many hands. Kanguole 18:13, 26 March 2014 (UTC)
Plagiarism is an ethical lapse, not a factual one. If you plagiarize accurate information, the information does not become inaccurate as a result of your failure to add a footnote onto the end of the (freely licensed) text. WhatamIdoing (talk) 19:23, 26 March 2014 (UTC)
Yes but... (1) A book that contains plagiarism might contain other errors. It might not, and any other book might also contain errors, but such a lapse would make me wary of the book. (2) If some of the content in a book comes from Wikipedia, using other information from the book as a source for Wikipedia might end up being circular or otherwise problematic. Again, it might not, but it's something else to be wary of. My conclusion is that such a book need not be blacklisted, but that I would prefer other sources. Cnilep (talk) 03:52, 27 March 2014 (UTC)
What if there are no other sources, or the other sources are of a type we frown on? Case in point: a citation of this book was added in support of a date of 430 AD for the oldest known text in Georgian. The other references were encyclopedias, which we usually avoid as tertiary sources. The citation refers to the following parenthetical remark on page 74:
"the oldest uncontested example of Georgian writing using the asomtavruli alphabet is an inscription from 430 CE in a church in Bethlehem"
That is uncomfortably close to a sentence (lacking any supporting reference) in the 2010 version of our article Georgian scripts:
"The oldest uncontested example of Georgian writing is an Asomtavruli inscription from 430 AD in a church in Bethlehem."
Can we be sure that this represents the author's independent judgement, or are we citing ourselves in an opaque way? Relying on this book seems too much of a risk to me. Kanguole 17:13, 28 March 2014 (UTC)

Non-notable languages?

A proposal to delete a language article for being non-notable is here: Wikipedia:Articles_for_deletion/Bura_Sign_Language#Bura_Sign_Language. Certainly we delete conlang articles that fail WP:N (or WP:making stuff up), as well as languages which fail verification like Asash above, but AFAIK never attested, distinct, natural languages just because they're recently discovered and don't have much written on them. — kwami (talk) 10:07, 27 March 2014 (UTC)

English Wiktionary has no notability criterion for natural languages, but it has an approval requirement for conlangs. Maybe a similar principle can be adopted here. I don't think notability should be a criterion for natural languages; it seems a bit POV-ish if we start rejecting some of them. In linguistics, all languages are considered equal and therefore equally notable and worthy of consideration. Of course, that changes if there is disagreement on whether something could be called a language (which is not clear-cut obviously), but I don't think that's the case here? CodeCat (talk) 18:07, 28 March 2014 (UTC)

Invitation to Participate in a User Study - Final Reminder

Would you be interested in participating in a user study of a new tool to support editor involvement in WikiProjects? We are a team at the University of Washington studying methods for finding collaborators within WikiProjects, and we are looking for volunteers to evaluate a new visual exploration tool for Wikipedia. Given your interest in this WikiProject, we would welcome your participation in our study. To participate, you will be given access to our new visualization tool and will interact with us via Google Hangout so that we can solicit your thoughts about the tool. To use Google Hangout, you will need a laptop/desktop, a web camera, and a speaker for video communication during the study. We will provide you with an Amazon gift card in appreciation of your time and participation. For more information about this study, please visit our wiki page. If you would like to participate in our user study, please send me a message at Wkmaster (talk) 00:20, 6 April 2014 (UTC).

Languages 'critical from the security point of view'?

The hell is this? — lfdder 02:13, 6 April 2014 (UTC)

The US State Department's "critical languages" program – which as I understand it is essentially a scholarship program to encourage American college students to study the languages of certain countries or regions – might satisfy GNG, as it receives occasional mention in news stories. More likely, though, the program would just bear mention in some other article on US government scholarship programs. Neither the definition of "critical languages" nor the list of languages so identified seems notable, though. Cnilep (talk) 02:41, 6 April 2014 (UTC)
Right, there's Critical Language Scholarship Program and National Security Language Initiative as well. There's no indication of any such programme existing in any other country, so I'm PROD'ing it. Thanks. — lfdder 02:49, 6 April 2014 (UTC)

List of English words of Dravidian origin

I have proposed that List of English words of Malayalam origin, List of English words of Tamil origin, and List of English words of Telugu origin be merged into a new article, List of English words of Dravidian origin. Discussion is at Talk:List of English words of Malayalam origin#Merge discussion. Comments from contributors to this WikiProject are welcome and would be most appreciated. Cnilep (talk) 06:15, 8 April 2014 (UTC)

Sebastián de Covarrubias and the Tesoro de la lengua castellana o española

I have proposed merging Sebastián de Covarrubias, a biography, to Tesoro de la lengua castellana o española, his major book. Discussion is at Talk:Tesoro de la lengua castellana o española. Cnilep (talk) 07:02, 10 April 2014 (UTC)

Glottolog bot request

Pending bot request to add Glottolog codes to our info boxes here. I don't expect it will be controversial, but comments are welcome. — kwami (talk) 02:06, 17 April 2014 (UTC)

Changes to naming guideline under discussion

The last consensus for naming was that neither the language nor the people should take the root name, but rather we should have Foo language/dialect etc. vs. Foo people / Foos, etc. The guideline was just changed to say that either the language or the people should be at Foo depending on which is the primary topic. I've reverted pending discussion. — kwami (talk) 21:18, 12 April 2014 (UTC)

The guideline's here, if anyone's wondering. — lfdder 21:23, 12 April 2014 (UTC)
There's also a debate (and a debate on whether there's a debate) on having separate naming guidelines depending on the nature of the people. Not sure if that's one guideline for Anglo-America and one for the rest of the world, or one for aboriginal nations and another for non-aboriginal nations, but we have a conflict between native American peoples such as Cherokee vs Cherokee (disambiguation), and European peoples such as Germans vs German. That is, should the root name be a dab page, or should it be the article on the nation, or should we argue whether each case should be the nation or the language per PRIMARYTOPIC? — kwami (talk) 02:35, 17 April 2014 (UTC)
There's not really a debate, but rather an effort to WP:Poll here to try and overturn the overwhelming consensus from dozens of RMs and compelling guideline citations as summarized by User:Cuchullain in his close/move of Tlingit:
The result of the move request was: Move. While support for this move was less clear than at other similar RMs recently, supporters were still more numerous, and had stronger arguments. The stronger oppose votes from JorisvS and In ictu oculi referred to the WP:NCL guideline, which has traditionally recommended disambiguating both ethnic groups and their languages. However, they did not address the WP:PRIMARYTOPIC concern, specifically the page view evidence and the fact that Tlingit already redirects to this article, and has for almost all of the three years since the page was moved to Tlingit people. As such, the invocations of the article titles policy (which trumps the guidelines) by several of the supporters become even more compelling. This, taken with what seems to be an emerging consensus that peoples are generally primary topics over their languages, leads me to find a consensus for this move.
He and others in related/moved RMs of this kind have said similar things elsewhere. TITLE is very clear about PRECISION and CONCISENESS, which NCL has never addressed (and therefore does need a rewrite); there is no "debate to debate", there is only an attempt to resist and overturn emergent consensus and, apparently, TITLE. Skookum1 (talk) 09:26, 17 April 2014 (UTC)
This paranoid conspiracy-theorizing is exactly why we need rational input.
And exactly: Do we accept that the people are the primary topic in all cases? What about when the people are defined by the language they speak? But several editors here have objected to the idea that we argue each case individually, by the weight of publications per PRIMARYTOPIC, as the moving admin suggests. But if the people are to be the primary topic, is that for all nations of the world, or just for aboriginal nations, or just for North America, etc? Where do we draw the line? Or do we move Germans to German, Russians to Russian, Japanese people to Japanese, etc.? — kwami (talk) 16:46, 17 April 2014 (UTC)

Should we link LLMap from our info boxes?

Wondering if we should link or footnote ref the maps at LLMap from the 'region' section of the language info boxes. — kwami (talk) 05:02, 19 April 2014 (UTC)

Language tags in templates restricted to ISO 639-1

Will anybody tell these individuals there's actually more languages than the 150-odd in part 1? Do note, this has been done in many more templates. — lfdder 12:06, 19 April 2014 (UTC)

Bokmål and Nynorsk

So what are these, exactly? They've ISO codes and Infobox lang, but their leads say they're just standard orthographies of Norwegian. — lfdder 01:01, 1 May 2014 (UTC)

They're not orthographies, but literary standards: They differ in vocabulary, not just spelling rules. Glottolog characterizes coding Bokmål as a distinct language "spurious", and doesn't bother even listing Nynorsk. — kwami (talk) 02:13, 1 May 2014 (UTC)

What qualifies a language as a "recognized minority language"?

For our info boxes. Just being listed on the census? How do we keep the list manageable? I'm asking because for Tamil someone keeps adding Canada, where they've declared Tamil heritage month, though not anything specific to the language. — kwami (talk) 22:59, 7 May 2014 (UTC)

I would think it would require some sort of legal protection in the country, like Saterland Frisian, North Frisian, Danish, Upper Sorbian, Lower Sorbian, etc. (but not widely used immigrant languages like Turkish, Polish, and Russian) have in Germany. Angr (talk) 12:17, 9 May 2014 (UTC)
I'll add wording to that effect on the template doc. — kwami (talk) 17:45, 9 May 2014 (UTC)

Drastic change in article naming, potentially moving thousands of articles

At the naming-guideline discussion, editors arguing for a change in wording at first said that it wouldn't "change anything drastically as far as these articles are concerned, it will just help head up future conflicts", though it doesn't address any conflicts we've had in naming language articles. However, the argument then became that we must move all language articles that do not have corresponding ethnic articles or other in-universe (WP) ambiguity. I asked if we would then need to move 'Indo-European languages' to 'Indo-European', and was told no, because "language families always use 'languages'," but there is of course no logical distinction – currently language articles use 'language' too, so we could just as easily start debates on moving a thousand family articles after we're done moving what are probably a good five thousand language articles. The argument is that we're not allowed a "walled garden" in how we name language articles, but of course practical application of our naming policies has always been WP practice. Input welcome. — kwami (talk) 18:00, 9 May 2014 (UTC)

The exaggeration "moving thousands of articles" is only one of the problems with this WP:CANVASSing of the NCL discussion, which is attempting to revise that guideline so it is in line with policy, namely TITLE. The notification here should be unbiased; as written, it is against discussion guidelines. As is much of what has transpired at that discussion, which of course WP:Languages editors are welcome to join. (as if they didn't already know) Skookum1 (talk) 07:06, 10 May 2014 (UTC)
What do you mean "As if they didn't already know"? How could we possibly have known until it was brought to this talk page? I certainly didn't. Angr (talk) 10:00, 10 May 2014 (UTC)
Striking my comment, I shouldn't have assumed that people in this WikiProject would have the related naming convention on their watchlist; on the other hand, very few have taken part in its creation, so why would they? The issue isn't whether languages are to have dabs or not, but whether people articles must; NCL has maintained since Feb 2011 that they must, while TITLE and other policies and guidelines say not. Reconciling NCL to be coherent with the rest of wikidom (including WP:NCET, which is a spinoff from WP:NCP) is what has been discussed...and filibustered. A proper notification here would not have editorialized or taken a tone of enlistment and crisis; its tone is clearly CANVASS and it should be removed and a neutral version put back in its place. As I observed there, I had to do exactly the same thing when notifying related WikiProjects of a series of CfDs (not ones I started); my own comments here are meant to be reportage, not lobbying; if the CANVASS is removed, mine can be also, of course. Skookum1 (talk) 14:51, 10 May 2014 (UTC)

Discussion is already going on at Wikipedia talk:Naming conventions (languages), among other places. I suggest that it is counter-productive to spread arguments all over the project. Cnilep (talk) 01:19, 12 May 2014 (UTC)

Which is why this whole section should have been replaced with a neutral notification by its author, and yes, those who have asked questions here such as Johnuniq and Angr should be asking them on the guideline discussion, not here. FORUMSHOPPING and CANVASSing are only part of what's going on, but as long as this section remains in the state it's in, it's CANVASS and "out of order". Skookum1 (talk) 01:32, 12 May 2014 (UTC)
@Angr:, Kwamikagami already made a more neutral post further up this page at "Changes to naming guideline under discussion". CBWeather, Talk, Seal meat for supper? 02:26, 12 May 2014 (UTC)
Yes, where he also editorialized and NPA'd/AGF'd re talking about TITLE as "paranoid conspiracy-theorizing". Skookum1 (talk) 03:09, 12 May 2014 (UTC)
That's how I described your response in that thread. You have often used words like "cabal" and "conspiracy" (and "racist") for those who disagree with you. I'm not the only one to have noted this. — kwami (talk) 08:36, 12 May 2014 (UTC)
Trying to make it about me again, instead of your own penchant for NPA/AGF and mis-stating what others say, when not actually making it up (as so often the case). Your filibustering at NCL and your ongoing CANVASS and now re-NPA'ing here is more than tiresome, it is a waste of everyone's time. In the passage where you railed "paranoid conspiracy theorizing" and other things, so very clearly NPA, I was pointing out TITLE and the other matters at hand which you still refuse to acknowledge and continue to edit-war over.
This post remains a CANVASS and out of order per discussion guidelines, you are unrepentant and now are trying to turn an NPA you made into yet another one in return, and went so far as to hassle me to get me out of the way to pursue an ANI for 3RR when a 3RR had NOT been committed. Both announcements here constitute CANVASS and are out of line. You pretend at NCL that people are agreeing with you when they are not, and continue to edit-war over the guideline itself. Trying to rally support for your position there by exaggeration and editorializing here is part of a too-familiar pattern in relation to NCL, which you do not WP:OWN, but are sure trying to. Consensus has spoken across seven or eight dozen RMs which you opposed and lost, yet you still rant about "thousands of articles" and now "five thousand"..... is that really how many you moved, without discussion? Perhaps you have the exact figure; I'm still finding more out there that have only redirects from the titles you summarily changed, using a naming convention for languages to move people-titles, and asserted time and again that "the language and the people are equally primary topics", a claim which you have yet to provide a citation for and which is clearly "original research" (a polite term, perhaps, for "pure fiction"). Skookum1 (talk) 09:06, 12 May 2014 (UTC)

Merging Filipino language to Tagalog language

Everyone is invited to join the discussion here. There is no clear consensus yet and discussion seems to have stalled. Your expertise can be of use for resolving the matter. Thanks. 舎利弗 (talk) 17:48, 19 May 2014 (UTC)

Help fleshing out an article?

I need some help with an article, How to Kill a Dragon. It's up for AfD and right now, deletion is not an issue. I've sourced it enough with reviews to where it will easily pass notability guidelines as a seminal work within its fields. However I am not familiar with the book, its author, or really much of anything to do with the topic in question beyond "Hey, myths are cool". It very sorely needs editors familiar with the work and editing to come in and fill the rest of it out. There is an interested editor attached, User:Jayakumar RG, but he's very new and it'd also be nice if someone could take him under their wing. (Tagging him in this so he can see I'm posting in places for help.) He's made a few errors, but he is willing to learn. Anyone want to edit this and finish turning it into another WP:HEY type article? I'm going to post this in WP:GREECE so we can get a few more people in on this. Tokyogirl79 (。◕‿◕。) 15:23, 24 May 2014 (UTC)

Audio recordings of languages and dialects

When I read Wiki articles about languages, I want to hear what they sound like. Is there any reason why almost no Wikipedia articles feature lengthy sound recordings of the respective languages and dialects? It could be very interesting if this was done on every language and dialect article where possible. I'm sure there would be plenty of people (Wikipedia users even) who would want to record their languages and upload the recordings to Commons for free. Maybe an easily translatable standard text could be read aloud, so it would be easy to compare the recordings? The Lord's prayer is often used in the texts, but it should of course be something a bit more neutral... It may seem like a huge task, but I'm sure if only a few examples were made for some major languages, say for English, French, Spanish, and such, it would start a wave, and many others would automatically follow. FunkMonk (talk) 03:38, 26 May 2014 (UTC)

Mongolian language is one article that has such a recording (scroll down to the "Grammar" heading). I'm sure there are at least a few reasons why it is difficult to do this, foremost of which would be finding recordings with free or otherwise acceptable licenses. Secondly, verifiability probably comes into play, especially for rarer or more "exotic" languages. For example how would the average reader/editor verify that a purported recording of the Suoy language is what it claims to be and not, in fact, a different language, or not a reading of the "standard text" but a string of expletives, or just total gibberish? Thirdly, for languages with more than one standard (like English), which representative recording would we include? Currently, inserting an audio file into a WP article produces a rather large and unsightly "media player bar". Including multiple recordings (RP English, Standard American English, New Zealand English, Scottish English, etc.) would be unwieldy at best and subject to endless edit warring at worst. In theory, it's a great idea. I too often want to "get the feel" for the sound of language, but in practice, it would require a lot of work and consensus building. Quite honestly, this Wikiproject is not all that active, in general, and just may not be up to the task. Sad, but true.--William Thweatt TalkContribs 05:08, 26 May 2014 (UTC)
[Audio: 1903 recording of the extinct Tasmanian language]
Yeah, I thought about obstacles like that, and we would most likely never get recordings of most languages and dialects, but even some would be nice. As for licences, they would be user created, so shouldn't be a problem (as long as what is being read is copyright free), apart from the same problem all other media on Commons has. And scrutinising more obscure languages before a recording of them is added would be worth it, if we end up having something good. As for which dialect to choose, many languages have a standard variant, English is of course a bit different, with British and American and all, but some variants of those are considered more standard than others, and could be chosen. As for the player bar, well, it is pretty small (the one in Mongolian is a video, which takes more space), take a look at this recording[1] (and to the right here) I added of the Tasmanian language, it takes less space than an image, and language articles have plenty of room, since few images are used, and often there is also a lot of white space next to various templates. Also, the video in Mongolian has the disadvantage of the reader not knowing what is being said, therefore the need for a standard text. But even then, I get more from the video in the Mongolian article than I get from most other language articles. I think any effort, however small the output would be, would be worth it. FunkMonk (talk) 06:29, 26 May 2014 (UTC)
Don't forget that Wikipedia is an encyclopedia of published knowledge; that implies that not any warm body can produce a recording of his or her language and upload it on Wikipedia. Any recording would have to be part of a corpus that is published elsewhere. This would take care of some of the problems that William mentioned, but it also means that for many languages nothing may be available for a long time. On the bright side, I believe that nowadays people working in language documentation make their recordings available under creative commons licenses, so using those on Wikipedia should not present a problem. Landroving Linguist (talk) 12:43, 26 May 2014 (UTC)
Well, Wikipedia's rules for self made content are much more lax than they are for text. Otherwise we would have very little media on Wikimedia Commons. Most of it hasn't been published elsewhere. Also, we have many recordings of word pronunciations and read throughs of articles created by random users. So it would be as "valid" as those, and they're quite widespread here already. FunkMonk (talk) 18:31, 26 May 2014 (UTC)
My Google search for language recordings found many results, with web pages hosting recordings in many languages. Each Wikipedia article about a language can link to one or more of those recordings.
Wavelength (talk) 18:55, 26 May 2014 (UTC)
Well, that leaves out the possibility of standardisation for comparison, and by the same logic, we could just link to off site images, instead of having them in articles, no? FunkMonk (talk) 19:00, 26 May 2014 (UTC)
It is possible that selections from any one of the most comprehensive of those websites would be more standardized than recordings by Wikipedians, because Category:Wikipedians by language indicates only 520 languages.
Wavelength (talk) 19:54, 26 May 2014 (UTC)
Hnn, what do you mean by standardization? I don't think it's a realistic goal to get e.g. a translated spoken sample or even only the same spoken genre. G Purevdorj (talk) 20:00, 26 May 2014 (UTC)
It could be just "Hello I speak X language".User:Maunus ·ʍaunus·snunɐw· 00:18, 28 May 2014 (UTC)
I like this idea. I think that a short, standard statement might be a desirable and simple starting place. Perhaps it could be something like, "Hello, <language> is spoken in <name of place>" (or the name of the culture that speaks it most, or something like that). For languages with significant regional variations, which includes not only English but also German, French and others, then you could say something like "Parisian French is the dialect of French spoken around Paris" or "French is spoken by French-speaking Quebecers in Canada". For long samples, I'd suggest getting a spoken Wikipedia version of the article's lead, translated into that language.
Audio recordings are handled under the same rules as photographs, which require that they look like (or, in this case, sound like) good-faith representations of the purported subject, not that they are provably real according to a published source (see WP:PERTINENCE). WhatamIdoing (talk) 23:52, 27 May 2014 (UTC)
(edit conflict)What about relatively broadly spoken constructed languages, such as Esperanto, which are not associated with a place or even a group of people who share a connection outside the language? What about minority languages such as Kiksht that are highly associated with a particular group of people even though most group members don't speak the language? What about languages spoken by a diffuse community, such as Chiwere languages? (And yes, I realize Chiwere is a poor example since we are unlikely to find speakers, but there are probably similar examples that I'm not remembering right now which some Wikipedians may speak.) Thinking only in terms of languages like major European ones could lead to some poor decisions in this regard. Cnilep (talk) 01:57, 28 May 2014 (UTC)
  • Yeah, by standardisation, I just mean that what's being said in the recordings is broadly the same/similar in structure. I like WhatamIdoing's suggestion as well. As for possibly few contributors, I think once we have a few examples up of the largest languages, the nature of humans is that anyone seeing those up, and not their own languages, would feel the urge to add a representation of their own. Representing one's language is a pretty strong urge for many people, and I imagine a scenario where people would sign up just to add a recording, people recording family members or friends who speak a language we're missing, and stuff like that. Especially if recordings were prominently featured in the articles, perhaps even with a slot in the infobox. Or that's the theory at least. FunkMonk (talk) 01:05, 28 May 2014 (UTC)

Dubious claims on Celtic articles

A user who seems to place a lot of importance in his knowledge of Celtic languages has been edit-warring for weeks over some dubious claims in our Celtic-language articles, specifically Gaulish and Lepontic, that are contradicted by our sources and by our other articles. They are that Continental Celtic is a valid genealogical clade, that Lepontic was a dialect of Gaulish (the proposal is the opposite, that "Cisalpine Gaulish" was not Gaulish at all, but Lepontic), and that the Insular Celtic and P-Celtic theories do not conflict. I've raised these points on the talk pages, but he prefers personal attacks to rational argument. Currently the articles are just tagged as "dubious". Any input would be appreciated. — kwami (talk) 00:30, 28 May 2014 (UTC)

That's a total misrepresentation of what's going on. In fact, it is YOU, kwami, who are making destructive edits to the article (notably labeling widely accepted facts as "dubious"), misrepresenting what scholars in the field have to say about the classification of the languages (I even checked with one of the scholars that you keep citing in support of your restructuring of the Celtic language trees and he described your edits as "confused"), and attempting to remove all references to controversial, yet still important new developments in Celtic linguistics, such as the potential Celticity of Tartessian. We have debated this endlessly on talk and user pages, but you flatly refuse to admit that you are in the wrong, much less that you are not a specialist in Celtic linguistics and have an imperfect understanding of the languages.Cagwinn (talk) 00:57, 28 May 2014 (UTC)

Talk pages of articles where kwami and Cagwinn appear to be in dispute:

Cnilep (talk) 02:11, 28 May 2014 (UTC). You missed a few.Cagwinn (talk) 02:57, 28 May 2014 (UTC)

There are a couple of other editors who specialize in Celtic linguistics like @Akerbeltz: and maybe @Angr:. They could perhaps play a role in settling the dispute.User:Maunus ·ʍaunus·snunɐw· 02:21, 28 May 2014 (UTC)
We have already gotten others involved - I believe Angr has participated in some of the discussions in the various talk pages and recently I brought Cuchullain in to mediate, but kwami just went right back to business as usual, forcing me to start reverting his edits again.Cagwinn (talk) 02:52, 28 May 2014 (UTC)
The problem is that you engage in personal attacks rather than discussion, you don't listen to other editors, and you do not provide sources for your POV – or, when you do provide sources, they disprove your POV, but you refuse to acknowledge it.
I'd be happy to admit I'm wrong if you care to demonstrate it: Provide sources for your claims that Continental Celtic is a genealogical family, that Lepontic is a dialect of Gaulish, and that the Insular/P-Celtic debate has been settled. If you can do that, I'll bow to your superior knowledge.
As for Tartessian, I haven't removed all coverage. However, it is a fringe theory, as other editors have noted, and should be treated as such, not plastered over every article even peripherally related to it. — kwami (talk) 03:27, 28 May 2014 (UTC)
I'm not aware of any paradigm changing shifts in the mainstream classification of Celtic and most of my sources (I think the most recent one is Routledge's The Celtic Languages from 2012 (and Routledge usually do excellent and solid languages books)) are quite happy to accept the 'traditional' classification of Cont/Ins etc. This also applies to the standard book in Irish on the subject, Stair na Gaeilge from 1994.
Most of the scraps around the classification I'm aware of are between historians & archaeologists on one side and linguists on the other when you get people who think the other side has nothing to contribute to their own field. In my experience, it's worse with archaeologists questioning the existence of Celtic fullstop.
So unless there are some very recent mainstream sources which disprove Ins/Cont I see no reason why to deviate. Has Cagwinn brought any references on the matter? Akerbeltz (talk) 09:06, 28 May 2014 (UTC)
@Akerbeltz: I'm not suggesting we abandon Insular, but that we reflect the ongoing debate, as the Routledge volume you mention has done. Insular is currently more popular than P-Celtic, but pro-Insular authors admit that it has not been demonstrated, and they continue to include P-Celtic in their coverage as well. The Routledge volume, for example, gives an Insular tree (fig. 2.2) immediately followed by a P-Celtic tree (fig. 2.3), despite the fact that the author favors the former. The intro says, "The internal structure of the family has been just as controversial. The principal proposals for divisions ... are the pseudo-geographic division into Insular and Continental Celtic and the more linguistically based division into P and Q Celtic languages." That author thus appears to favor the latter. My argument is that we similarly acknowledge the lack of consensus, but place Insular first: For Brittonic languages, we should either list both in the infobox tree ("Insular or P-Celtic"), or, if that's too busy, omit that level and go from fam2 = Celtic to fam3 = Brittonic, as we do in other language families when there's a long-standing debate like this.
None of the trees in the Routledge volume have a Continental clade. Normally we wouldn't use a grade in an info box, but since we have an article on Continental Celtic we wish to link to, it should be marked to show it's not a clade, such as by placing it in parentheses. This is what we do for other language families.
Again in the Routledge volume, there's a debate over whether Cisalpine Gaulish/Celtic is actually Gaulish, but they have Lepontic as one of the earliest branches of Celtic, not as a dialect of Gaulish. Cagwinn's sources say the same.
Cagwinn's sources also disprove his other claims. His objections go back to his insistence on emphasizing a possible Celtic classification of Tartessian, which Angr has characterized as "fringe". Since I also treat that thesis (Koch et al.) as a minor opinion per WEIGHT, he's decided that I'm a "rogue editor" who must be fought at every turn, and that appears to be where the broader dispute comes from. The fact that the passages he quotes generally prove my point suggests that this is not an entirely rational debate. — kwami (talk) 16:55, 28 May 2014 (UTC)
More obfuscation and misrepresentation from Kwami!! First of all, he is the one who continually leaves harassing/threatening comments on my personal Talk page, acting as if he is an admin, when he was stripped of such title a while back because of his misdeeds on Wikipedia. In fact, after having researched his chronic misbehavior on Wikipedia, I have seen that many other users here accuse him of abusive and obnoxious tactics, edit warring, POV pushing, rogue editing, and other shady behavior. I looked into a long running dispute he's had with a user called Skookum1 and his many complaints against Kwami are virtually the same as mine! We are dealing here with someone who is a professional Wikipedia bureaucrat and egomaniac, not someone who is genuinely interested in making Wikipedia articles the best they can be. Anyone who crosses him immediately faces an assault, with threats of administrative action, blocks, and bans, for daring to suggest that he is not right about a subject that he shows little actual expertise in (here he is making pronouncements about Celtic historical linguistics and he didn't even know who Eric Hamp was and implied that one of the leading Celtic and Indo-European linguists of the past century was some sort of fringe figure!). I have provided plenty of references in the past in support of my positions, whereas Kwami provides little to none - and when he does provide them (as he continually does with Eska [2010]), he doesn't even understand them or quote them properly (causing Eska himself to scratch his head and say that Kwami got it wrong). His treatment of the Tartessian matter is so obnoxious that Koch, recognizing my name in Talk page debates on it, reached out to me expressing his concern over the ridiculous debate that had ensued here; whether or not people agree with Koch's conclusions, the man is a highly respected scholar (and has been for many decades now!) and his work on Tartessian (which a number of other important scholars now accept, including Eric Hamp and Barry Cunliffe - who called it "unassailable" a few years back in a presentation on Celtic history) can by no means be considered "fringe". Cagwinn (talk) 17:58, 28 May 2014 (UTC)
Your opinion of me is irrelevant. You need to provide sources. If Eska has abandoned his position in the Routledge volume, provide a source where he says this. If Hamp has validated the Celtic classification of Tartessian, provide a source showing his work on it. If I'm confused in my reading of Eska or whoever, say what I got wrong and cite the passage to show that I'm wrong. I've been asking for such evidence for months. This is the most basic element of academic argument, one which you should have no trouble with. — kwami (talk) 19:09, 28 May 2014 (UTC)
Sources HAVE BEEN PROVIDED! Go back and re-read some of the debates we've had on the various talk pages!! Yes, it was Eska who said to me in personal communication that your summation of his ideas on the classification of the Celtic languages was confused. Your ridiculous dismissal of Hamp's paper in which he accepted Tartessian as Celtic proves both that you have some sort of bizarre agenda regarding this language as well as a disinterest in preserving a neutral point of view in Wikipedia articles. You want to reshape every Celtic language related article on Wikipedia so that they fit YOUR IDIOSYNCRATIC POINT OF VIEW on them and not the scholarly consensus. Meanwhile, you are attempting to damage my standing as a good faith editor on Wikipedia for daring to stand up to you, as well as trying to silence me with threats of administrative action. I guarantee you that as someone with 30 years experience in the study of Celtic historical linguistics, who runs several academic-oriented mailing lists whose members include many of the scholars whose work is being cited here, I have way more to offer Wikipedia on the subject of Celtic linguistics than you do.Cagwinn (talk) 20:35, 28 May 2014 (UTC)
If you want to convince others that you have the long end of the argument it would be helpful if you provided the arguments and the data instead of simply arguing against Kwami. This argument is not going to be resolved except through collaboration. It is not actually possible to read from your posts here what your point is or in what data it is based. That makes it very hard for third parties to make out who of the two are right.User:Maunus ·ʍaunus·snunɐw· 20:55, 28 May 2014 (UTC)
I am not going to re-litigate here all the various arguments that we have been having on numerous different talk pages.Cagwinn (talk) 03:19, 29 May 2014 (UTC)
It is going to be quite difficult to get other editors to see your perspective unless you are willing to argue in favor of it.User:Maunus ·ʍaunus·snunɐw· 18:53, 29 May 2014 (UTC)
My point is that I have already argued most of this to death in various Talk pages and don't feel it is necessary to repeat it all here - I don't have the time or patience for that. Cagwinn (talk) 22:03, 29 May 2014 (UTC)
It would appear then that Cagwinn is no more willing to provide evidence here than he was on the article talk pages – he hasn't provided anything coherent there either, except to insist that he's right despite being contradicted by his own sources. The Tartessian question, which seems to motivate his opposition, has been discussed ad nauseam elsewhere; I'm more interested in the easily debunked claims that Lepontic is Gaulish and that Continental Celtic is a valid clade. Settle those issues, and how to reflect the Insular/P-Celtic debate in the info boxes, and we'll have solved nearly everything. — kwami (talk) 04:59, 29 May 2014 (UTC)
LOL - I never claimed that Lepontic was the same as Gaulish!! My issue with Tartessian is that you repeatedly removed discussions of its potential Celticity; whether or not you agree with Koch, et al, you do not have the right to remove all references to this debate, which is being had not by fringe nutjobs, but by some of the most respected scholars in the field! You seem to be seriously confused - not only about the subjects we are debating here, but also about what I have and haven't said in various Talk pages!! 18:35, 29 May 2014 (UTC)
Here you are edit-warring over a supposed claim that Lepontic is a dialect of Gaulish. Here and here you're doing the same at another article. You keep bringing up Tartessian, as if that were the dispute, but it has no connection to the majority of your reverts. — kwami (talk) 00:17, 30 May 2014 (UTC)
LOL!! You have got to be KIDDING ME! No, that is me undoing your ridiculous removal of Lepontic from the Continental Celtic classification (in the second case, removing important information about Lepontic) and you adding "dubious" to the Continental Celtic family! Because you are not a specialist in Celtic history or linguistics, you seem completely oblivious to the fact that many scholars accept that Lepontic was introduced into northern Italy by Celtic speaking immigrants from the Transalpine Celtic region and that some scholars, like Kim McCone, regard Lepontic as not a separate language, but as simply an early dialect of Gaulish. Guess it's too much to ask that someone who edits Celtic language articles actually know something about the subject.Cagwinn (talk) 02:26, 30 May 2014 (UTC)
Pontificating is not helpful. Repeating that you've provided sources without ever providing any sources, except for a few which disprove your argument, is not helpful. You have made several easily debunked claims, as outlined above. Provide sources that those claims are correct, and we can reflect both POVs, or even just yours if that is judged appropriate per WEIGHT. If you do not provide sources for your POV, then we can only report the POVs that are sourced. If I am confused, don't keep repeating that I am confused, but point out *where* I am confused. Wikipedia is a cooperative enterprise, and you need to work with other editors if you hope to get anything accomplished. I'm happy to be proven wrong – I often am – but you need to actually prove that I'm wrong. — kwami (talk) 20:52, 28 May 2014 (UTC)

Cagwinn, you're being unhelpful - to yourself. Sure, I know how frustrating it is to deal with an issue that spans a dozen pages but if you're ever going to get a final answer to the issue, this page is probably it. So it's in your own interest to rouse yourself and provide the sources. Failing that, I can't see how that can be read by non-involved parties as anything but bluster about something that cannot actually be proven. So unless you want to conduct a running battle with kwami over the next weeks - which will undoubtedly cost you even more time - I suggest you bring the resources to the table and end the debate. Akerbeltz (talk) 22:25, 29 May 2014 (UTC)

I am not on Wikipedia to help myself!!!! I first started editing articles related to Celtic studies because I was disgusted by the myriad of errors that I found in nearly EVERY SINGLE ARTICLE! It was embarrassing and I decided to do something about it! I think I have made solid contributions here and the articles that I have focused on have improved greatly - but now we have someone who knows very little about the subject, but who knows an awful lot about Wikipedia bureaucracy and how to manipulate it, doing great damage to the articles and doing everything in his power to stop me from standing up to him. I can leave Wikipedia any time and let it continue to devolve into absolute garbage - it doesn't affect me one bit.Cagwinn (talk) 02:14, 30 May 2014 (UTC)
In showing that Eska believes in Insular Celtic, something which I have myself referenced in my edits, Cagwinn just summarized a source on the Gaulish talk page[2] that supports my argument for both of the disputed points in that thread. This is a very odd debate: How do you have a rational discussion with someone who repeatedly proves your point while claiming you're wrong? — kwami (talk) 00:29, 30 May 2014 (UTC)
This is the problem - you don't even understand what you are doing, or why people like myself have a problem with it! These articles were much better before you started manhandling them, now all of the Celtic language articles have been compromised by YOU and it is clear that you will never let them go. Wikipedia is certainly on a downward spiral because of users like yourself. Hope you're happy!Cagwinn (talk) 02:14, 30 May 2014 (UTC)
By the way, if people want to see what I am up against, look at today's edits on the Common Brittonic article - Kwami insists on adding a "dubious" tag to the classification of Insular Celtic, when it is nearly universally accepted by linguists as valid! Instead of taking the issue to the Talk page and engaging others in a discussion, he leaves threatening warnings on my personal Talk page! Kwami is POV pushing and is a Rogue Editor - none of what he does on the Celtic language articles can be considered "good faith" editing!Cagwinn (talk) 04:11, 30 May 2014 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────Ok, I was asked to weigh in by Cagwinn a few days ago (here), so I'll offer my take. It was my understanding that the majority view among linguists favors the Insular/Continental division. I didn't see much brought to the table that challenged that understanding. Kwami's point seems to be that some linguists favor the P-/Q- distinction, and that therefore we should include both or neither. However, it seems to me that this may be overstating how well accepted the P-/Q- formulation really is. Kwami did bring up some sources that mention it as a possibility, such as Eska, but Eska himself apparently doesn't give it much credence. On the other hand, there are numerous sources that support the Insular/Continental formulation and Kwami himself concedes that it's more popular.
In my opinion, this isn't a debate to hash out in infoboxes. The infoboxes should represent the majority view, while the specifics of the debate can be explained in full detail in the key articles (with summary and links in sub-articles where appropriate). I would much rather see everyone here try to improve our (often shoddy) article content on the Celtic languages rather than bicker over what's in the infoboxes, which should really just support the info in the articles.--Cúchullain t/c 04:38, 30 May 2014 (UTC)

Yes, this is a matter of opinion, how to reflect a long-standing dispute in the literature. There is no right answer, but I do feel showing both sides is a more balanced approach than showing just one.
However, the other points are not matters of opinion. When sources say that Lepontic is one of the most divergent Celtic languages, we can't present it as a dialect of Gaulish. And when our sources say that Continental is a geographic group, not a language family, we can't present it as a language family. I could understand Cagwinn differing from me on how best to reflect the Insular/P-Celtic debate, as you do, but I can't understand how he could say that Lepontic is not a dialect of Gaulish, provide sources that Lepontic is not a dialect of Gaulish, and yet edit-war to restore claims that it *is* a dialect of Gaulish. If there is anything rational in that approach, I really wish he would share it. — kwami (talk) 07:09, 30 May 2014 (UTC)
No one wants the dispute kept out of the articles, in fact, it really needs to be better presented. However, it doesn't need to be hashed out in the infoboxes. Those should reflect the majority view, with the details hashed out in the appropriate articles.--Cúchullain t/c 13:05, 30 May 2014 (UTC)
Additionally, the behavior of both parties in this discussion is pretty hard to defend. Give it a rest, guys.--Cúchullain t/c 04:41, 30 May 2014 (UTC)
You may disagree with my editing style, but I would hope that after having interacted with me on many Talk pages you know that I am knowledgeable on a wide array of Celtic matters - especially linguistics. Kwami, on the other hand, is not an expert on anything Celtic related and is a known trouble maker on Wikipedia. He is already re-attacking all of the Celtic language articles this evening. In addition to Common Brittonic, he has also just now assaulted Gaulish Language. I suspect by the time I wake up tomorrow, I will find all the above-listed articles will have been maliciously edited by him. He simply will not stop.Cagwinn (talk) 06:13, 30 May 2014 (UTC)
When the sources you present support my POV and contradict your own, it's hard to understand how you have anything rational to contribute. I'm happy to admit it when someone shows me that I'm wrong, but I can't see that you've even tried. — kwami (talk) 07:01, 30 May 2014 (UTC)
If we were to restrict our classification trees to divisions that are universally accepted as true genetic divisions (rather than areal classifications and so on), then for many language groups we would not be able to show any structure at all. To me, this is a symptom of infobox fixation. As far as I can tell, the insular/continental classification is pretty much universally used, if not as a truly genetic then at least as a structural classification, and that to me is plenty of justification to include it in the boxes. The debate over its precise historical interpretation vis-a-vis the p/q one can be handled in the text. Please don't overburden infoboxes with demands of absolute precision, which that format of presentation simply cannot fulfil in any case. (And please keep that squabble over Lepontic out of here. Also, Kwami, stop edit-warring.) Fut.Perf. 10:11, 30 May 2014 (UTC)
This is my feeling as well. In general, I don't see there's much if any support for Kwami's changes and a number of editors have expressed opposition. This makes it especially troubling that Kwami has continued making the changes after they were discussed and found no support. Perma-tagging the infoboxes as "dubious" also isn't helpful (there's nothing dubious about a language being labeled "Insular" or "Continental Celtic".) This is besides the revert warring from both editors, both of whom should know better.--Cúchullain t/c 13:05, 30 May 2014 (UTC)
Looking further through this, it also seems to me that the debate has somehow conflated at least two issues, which may be partly responsible for the way Kwami and Cagwinn have been talking past each other. One question is whether "Insular Celtic" is a valid genetic unit, as opposed to the possible alternative view that p-/q-Celtic is an older division. In this respect, my impression is that the sources I've seen cited here and those I quickly browsed through provide a rather clear answer: the strongly predominant view in scholarship appears to be in favour of "Insular Celtic". So, for languages in that branch, I really see no reason at all not to include "Insular Celtic" as a node in the infobox family tree. The other, more technical linguistic question, which Kwami appears to be rather hung up about but whose significance may not have become very obvious to Cagwinn, is what consequence this has for the status of "Continental". While most sources do evidently use "Continental Celtic" as a convenient cover term, it is technically true that it may not be a genetic unit in the narrow sense, even if "Insular Celtic" is one. According to classifications I've seen, including the one from Eska that Cagwinn cited on Talk:Gaulish language, the "Continental" languages may be divided among each other by splits that run deeper (are genetically older) than that between any one of them and the "Insular" ones. In that case, "Continental" would be merely a grab-bag category for "everything that isn't Insular", but it wouldn't itself be a node in the tree. So there may still be room for argument whether we should include "Continental" in the infoboxes for Gaulish etc. (The third question appears to be how to treat the relationships between Transalpine Gaulish, Cisalpine Gaulish and Lepontic and whether there is a linguistically valid sense in which Lepontic could be comprised within the scope of what "Gaulish" means; this is something I really have no view about yet.) Fut.Perf. 13:43, 30 May 2014 (UTC)
Interesting point; it means there's even less reason to add "P-/Q-Celtic" to the infoboxes (or to remove Insular Celtic).--Cúchullain t/c 16:55, 30 May 2014 (UTC)
I'm not asking that we only report universally accepted nodes. That would obviously be impractical. I am asking that we report an ongoing debate on classification that appears in nearly every source on Celtic languages. As some of our sources have reported, P-Celtic was dominant early on, then Insular became the dominant theory, and now people are revisiting P-Celtic. Both theories have problems, and neither appears able to explain the evidence explained by the other. — kwami (talk) 06:44, 31 May 2014 (UTC)
Treat these issues in the text, sure. But overburden the tree diagrams in the infoboxes with them: please don't. Trees are meant to be simplified, and it is quite common for family tree diagrams used for general linguistic reference purposes (rather than narrowly technical, specialist discussion of genealogical issues) to include nodes like "Insular" and "Continental", even in publications that otherwise acknowledge the problematic nature of these under a strict cladistic perspective. I've just quoted several examples on Talk:Gaulish language, please see there. Fut.Perf. 16:19, 31 May 2014 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── It's very little of a burden, considering that we're clarifying a contentious issue. I'm asking two simple and unobtrusive things: We mark Continental as not being a normal node, e.g. by using parentheses. That's unobtrusive. And if we list Insular or P-Celtic, we list both. That's hardly obtrusive either, and if it's deemed too much, we can simply omit that node.

Consider Stifter (2008).[3] Like us here at WP, Stifter is not pushing his own analysis but summarizing the literature. There is no Continental there, and both Insular and P-Celtic ("Gallo-Brittonic") are presented as possibilities. As an encyclopedia, we should also summarize the literature like this. That's little to ask for a significantly more informative classification. — kwami (talk) 18:47, 1 June 2014 (UTC)

It appears there's no agreement that your additions are significantly more informative. In fact, I find them confusing since they imply the P-/Q- construction is much better supported than it really is (i.e., sources like Eska that mention the construction specifically don't buy it). I think the consensus is to discuss the issues in the articles and leave the infoboxes reflecting the common view.--Cúchullain t/c 20:56, 1 June 2014 (UTC)
What about placing Continental in parentheses to indicate that it is not genealogical? — kwami (talk) 21:09, 1 June 2014 (UTC)
I don't think that's necessary, and has the potential to just further confuse readers.--Cúchullain t/c 18:30, 3 June 2014 (UTC)

Leaflet For Wikiproject Languages At Wikimania 2014

Hi all,

My name is Adi Khajuria and I am helping out with Wikimania 2014 in London.

One of our initiatives is to create leaflets to increase the discoverability of various wikimedia projects, and showcase the breadth of activity within wikimedia. Any kind of project can have a physical paper leaflet designed - for free - as a tool to help recruit new contributors. These leaflets will be printed at Wikimania 2014, and the designs can be re-used in the future at other events and locations.

This is particularly aimed at highlighting less discoverable but successful projects, e.g:

• Active Wikiprojects: Wikiproject Medicine, WikiProject Video Games, Wikiproject Film

• Tech projects/Tools, which may be looking for either users or developers.

• Less known major projects: Wikinews, Wikidata, Wikivoyage, etc.

• Wiki Loves Parliaments, Wiki Loves Monuments, Wiki Loves ____

• Wikimedia thematic organisations, Wikiwomen’s Collaborative, The Signpost

For more information or to sign up for one for your project, go to:
Project leaflets
Adikhajuria (talk) 14:38, 18 June 2014 (UTC)

African languages presentations at WikiIndaba 2014 are now online

The WikiIndaba 2014 sessions are now online at YouTube. This includes presentations on African Language Wikipedia and South African Language Wikipedia. Informative and recommended. -- Djembayz (talk) 18:23, 6 July 2014 (UTC)

Edit wars

Edit wars I don't have time to deal with properly:

  • Newari language and Classical Newari. Claim that "Nepal Bhasa" is also known as "Nepal Bhasa" [sic], and repeated moves to non-English names such as "Nepal language" and "Nepalbhasa". AFAICT, "Newari" is still the common name in English. No good ref that "Newari" is pejorative. Ethnologue says that, but they are not a RS in that regard. Most sources of both lang and culture continue to use "Newari". Needs an actual discussion rather than ad hominem attacks.
  • Iranian languages, maybe still Indo-Iranian languages. Dispute over alt names by anon. IP.
  • Azerbaijani language. Insistence on saying it's spoken in Eastern Europe and Western Asia rather than the Caucasus. Technically true (Daghestan is in Europe), but highly misleading. Also insisting on inferior ref (E17) for population. Not different enough that it should matter.
  • Maybe review the additions to Mishing language and see what's worth saving, if refs can be found.

kwami (talk) 18:29, 16 July 2014 (UTC)

Please discuss the issue with me in the talk page of the Azerbaijani language article, as I've asked you to do for the past month or so. Once again, you placed a dubious tag next to the term "Eastern Europe" even after I provided the sources to show you that Azerbaijani is spoken in parts of Eastern Europe, which I'll have you know not only includes Dagestan but also portions of Azerbaijan. Previously, the article only stated that Azerbaijani is spoken in Western Asia. It is in fact spoken in parts of Eastern Europe (the Caucasus, including Dagestan and Azerbaijan) and Western Asia (South Caucasus, northern Iran and Armenia). The fact you have a problem with this is baffling. Please discuss your issues in the article talk page. --Nadia (Kutsuit) (talk) 23:11, 16 July 2014 (UTC)

Move request at Newari language

AFAICT, the current name is the one used in the preponderance of the linguistic lit. — kwami (talk) 23:27, 23 July 2014 (UTC)

Language family color table

Hi there. I've raised the issue at Template talk:Infobox language/language family color table and can anyone answer? Jaqeli 23:48, 24 July 2014 (UTC)

Reliable sources

Hey, all, I mainly work on video game-related articles and have gotten a number of GAs and FAs in that area. I think it'd be interesting to take a language article on as a project, though, as I've been interested in languages for a long time and, compared to the Video games project, this one has surprisingly few recognized articles (no offense, of course - it just depends on what users want to work on). Coming up with reliable sources is by far what concerns me the most about this endeavor. Any tips or general guidelines? Tezero (talk) 17:37, 7 April 2014 (UTC)

Hi, you should probably look at the few GAs and FAs about languages to get an idea of what the standard is. You will find that it is quite different from articles on videogames in that we rely almost exclusively on academic publications such as articles from reviewed journals, published grammars and sometimes unpublished dissertations for some languages that have little published coverage. I think it will be very difficult to be able to bring an article to FA without having a sound knowledge of both linguistics in general and the linguistics of the language you plan to write on in general. That is the main reason there are few high quality articles on languages, they tend to be written by topic experts who specialize in one or two languages, but they generally don't venture to write about other ones. But you should definitely give it a try if you find the work interesting. User:Maunus ·ʍaunus·snunɐw· 17:44, 7 April 2014 (UTC)
Well, that's the thing. I know they have to be from academic publications; I'm just not sure how academic, but I suppose just looking might really be the best idea. For example, biology articles, particularly those on diseases, seem to have rather confusing standards for what constitutes an acceptable source. I'm not hugely well-versed in linguistics, but I know my share about certain individual languages and families. Tezero (talk) 18:23, 7 April 2014 (UTC)
As for "how academic", the current standard is pretty much as academic as they get. You would for example not be able to build a GA article on a phrasebook, and newspaper articles about the language. I think the standard here tends to be stricter than in some other projects, and for example I think it is commonly accepted in the project that certain languages will never make it to GA because there are not enough high quality sources about them. This is a different philosophy to those who consider that if the article duly reflects the literature then it should be a GA in spite of the deficiencies of the extant literature. If you have a particular article in mind I would be happy to suggest some adequate sources. User:Maunus ·ʍaunus·snunɐw· 19:06, 7 April 2014 (UTC)
I'm thinking of Czech language, which I expect there to be plenty about. I'm in college now, too, and my university library has a decent foreign language section. Tezero (talk) 23:55, 7 April 2014 (UTC)
How good is its linguistics section? The article on Czech would need academic sources discussing its syntax, morphology, and phonology rather than language-learning materials. And yes, for Czech we would expect high-quality sources, because Czech is a language about which a whole lot has been written in academic journals and monographs. Angr (talk) 07:08, 8 April 2014 (UTC)
I wonder if relying almost exclusively on academic sources for everything, even very basic information, is best for our readers. Certainly we would want everything to agree with the academic sources, but providing some "lay-accessible" sources for basic information, might be very helpful to our readers. Consider the case of a teenager trying to write a paper for school. Is that student better off with an impressive-sounding list of highly technical graduate-level sources, or is that student better off with a couple of non-academic sources—sources that the student could actually understand—being thrown in the mix? For example, I think you could use a book like this one, perhaps to support specific examples about the slight, but important, difference in pronunciation between hot chocolate and bitter chocolate, without damaging an article on the Czech language.
I lean towards providing an occasional accessible source myself, even for medicine-related articles. I think that the reference list for a well-developed article should normally include a couple of sources that typical readers can easily access and read. WhatamIdoing (talk) 22:05, 24 April 2014 (UTC)
That's actually a college text book. I used an earlier edition to teach a course for third-year anthropology majors. If that's the sort of material you have in mind, I think you'll be fine. It's work like this that is more of a problem, I think. Relying on Oppenheimer's general linguistic anthropology text for the grammar of specific languages is not likely to get you all the way to good article status, though. Cnilep (talk) 01:51, 25 April 2014 (UTC)
By itself, I wouldn't expect it to contain all of the relevant information. WhatamIdoing (talk) 03:13, 26 April 2014 (UTC)
I don't think you give teenagers enough credit. The student is better off with, and in fact deserves, the best possible sources we can provide. "...(N)on-academic source that the student could actually understand" makes it sound as if a high-school student is incapable of muddling through an academic text and I don't believe that is true. They may have to put down that iPhone or sacrifice some video game time, but with proper effort, I believe they will benefit from high-quality sourcing. If a particular student can't make heads nor tails of an academic source, perhaps they shouldn't be starting their research with an encyclopedia. In that case, a simple google search can provide them with plenty of "non-academic" sources. In order to make writing that school paper worthwhile, the student must put some work into it, which starts with tackling the fact, that's usually the whole point of such assignments: to prepare them for the type of work they will be doing in college. That being said, as Cnilep points out, the source you specified is fine for our encyclopedia but using "non-academic" sources would amount to dumbing down and that's never a good idea. We should expect and demand the best from our young students.--William Thweatt TalkContribs 03:45, 25 April 2014 (UTC)
  • If post-graduate-level academic papers can be understood by average fourteen year olds, then one wonders why their authors spent another dozen years in school, instead of just putting in "proper effort" themselves and skipping all that expense and bother with finishing not only high school but also several university degrees.
  • Less than half of our teenager readers are headed to university, so that's probably not the point behind their assignments. (Even if it were, Wikipedia is a point of entry for most university students when they're dealing with an unfamiliar subject.)
  • "Dumbing down" is when you oversimplify or omit material because you don't think it's possible for the reader to understand it. Appropriately using a variety of types of reliable sources to support material is not dumbing down. Using academic journal articles to support basic information might look impressive at a glance, but it doesn't make the article any more verifiable or any better written. As WP:RS puts it, the source needs to be strong enough to support the material. Basic material needs only a basically reliable source. WhatamIdoing (talk) 03:13, 26 April 2014 (UTC)
  • comment It says very clearly in TITLE that the interests of the general readership should be put before those of specialists. While that concerns titling the same principle is inherent re content. While in a highly technical field like linguistics or pharmacology there is a necessary emphasis on academic-type sources, and articles are often highly technical in content and flavour, one issue towards FA for language articles could be more general interest content - common phrases, unique words and concepts etc. Instead of just phonology tables and points of grammar in technical-speak which are obscure to laymen (teenagers or not).Skookum1 (talk) 04:19, 25 April 2014 (UTC)
  • Much later comment: For what it's worth, Czech language is at GAN now. I was worried about the use of things like Czech: An Essential Grammar (a widely known, commercially released Czech book that is nonetheless academic in tone and classed as a "grammar"), but now I've seen that Swedish language, an FA, uses a similar source. Everything else I think is suitably academic; I wouldn't have dreamed of anything like the Minnesotan book. Tezero (talk) 01:19, 28 July 2014 (UTC)

en:Languages in censuses

I invite you to help write this article.--Kaiyr (talk) 13:50, 4 June 2014 (UTC)

I wouldn't mind helping, but it's an extremely long article, and that's just with the bare headings. Why not group it by something like region or language collection style, or turn it into a prose-based article rather than a list (i.e. grouped by topics related to language collecting rather than by country)? Tezero (talk) 01:13, 28 July 2014 (UTC)
    • It is like Race and ethnicity in censuses--Kaiyr (talk) 13:22, 28 July 2014 (UTC)
      • Oh. Well, that's an extremely long page and barely goes into any detail. Are you okay with that being the case for this article? Tezero (talk) 21:20, 28 July 2014 (UTC)

Please help improve five articles that a few university students wrote about language use in Singapore

In 2012, a group of Nanyang Technological University students wrote five articles about language use in Singapore for an assignment. The articles (Languages of Singapore, Language education in Singapore, Language planning and policy in Singapore, Speak Good English Movement and Speak Mandarin Campaign) contain a wealth of well-referenced information, but need considerable cleanup. Would any members of WikiProject Languages be keen to collaborate with me to bring these articles to GA status? --Hildanknight (talk) 08:04, 21 June 2014 (UTC)

I've looked at the articles a couple of times thinking about reviewing them. But they are just too far outside of my area of immediate expertise, while not interesting enough for me to start delving into the literature. They've been GA candidates for a really long time and no one seems to be wanting to review them. This is sad.User:Maunus ·ʍaunus·snunɐw· 18:45, 6 July 2014 (UTC)
@Maunus: Instead of reviewing the articles, how about helping to copyedit them and clean them up (which would counter systemic bias)? You would learn more about a multilingual society and Asian cultures. Although I have made some headway into Language education in Singapore, the work is too much for a single editor to handle. --Hildanknight (talk) 11:13, 7 July 2014 (UTC)
Nah, that is not my thing. I am a researcher and content writer. For that you could try the copyeditors guild. Or listing it for peer review.User:Maunus ·ʍaunus·snunɐw· 22:00, 7 July 2014 (UTC)
I reviewed Speak Mandarin Campaign a few months ago. It was seriously deficient in citations, structure, and formatting. Not that it didn't represent good work in any way, but it wasn't GA material by any stretch. Tezero (talk) 21:43, 28 July 2014 (UTC)

Language templates

FYI, several lang templates are up for deletion at Wikipedia:Templates for discussion/Log/2014 August 13 -- (talk) 08:05, 14 August 2014 (UTC)

US English Dialect Page Titles (revived?)

Disclaimer: I'm not sure how to revive an archived page--which "template: archive" says is feasible, recommending that course of action, yet then gives no explanation on how to do--but here is what I'll assume the template means...

Users in the past were having discussions (and the exact same kind of discussion appears on several other pages) about perhaps coming up with a uniform way to title pages related to English dialects. mnewmanqc, for example, posed "Should there be a common form for pages dealing with the varieties of English spoken in a US region? [...] Note the two predominant forms are PLACENAME dialect as in Baltimore dialect and PLACENAME English as in Pittsburgh English. My own impression is that the latter is tending to predominate in the dialectological literature."

Users tended to agree (as on some specific dialects' talk pages, etc.) that we should use a standard format but that it would be an arduous task to attempt as people would have an array of opinions. Apparently, the topic was then soon forgotten. However, I think the point of the general agreement that we could try for a standard format is that we should continue having this conversation, and see if we can get to any productive conclusions. Let's see if we can make that standard form, even if it requires some more in-depth discussion.

I'd personally argue in favor of the use of the format “PLACENAME English.” The term “English” covers a broader scope than “dialect” (just as “dialect” covers a broader scope than “accent”). For example, there are instances (such as with New York City English) where it is not clear to linguists that a language variety or dialect category can be considered a single, clear-cut, uniform-throughout dialect when there is such a great deal of intra-local variation (sometimes known as sub-varieties, sub-dialects, etc.) in terms of class, ethnicity, and so on. According to mnewmanqc on the NYC English talk page (many of whose points I’m repeating here), “NYC English” is the primary term preferred by all recent research on the topic due to its ability to cover such an expansive relevant area. The term “English” neatly includes either or both “accent” and “dialect,” allowing a greater diversity of ideas on the page, and bringing seekers of the accent and the dialect all to one convenient article, without excluding either topic. Since “English” allows for more or less broadness, it can be used to characterize what may still be defined uncertainly or without consensus by linguists, considered by some linguists a single dialect and by others a whole broad class or category of dialects, such as Inland Northern American English. The term "English" is also the predominating term as it now stands.

The user who moved the article “New Jersey English” to “New Jersey English dialects” (in order to, in good faith, emphasize the plurality of dialects in the State of New Jersey) seemed to miss the point that the original title already allowed the article to encompass multiple dialects and sub-dialects. Other thoughts? I would love the idea that we could agree on a standard format, rather than seemingly arbitrarily having articles with inconsistent names like these: Philadelphia accent, Central Pennsylvania dialect, Pittsburgh English, Boston accent, New York City English, Tidewater accent, etc. Wolfdog (talk) 14:37, 1 August 2014 (UTC)

I agree and reiterate that the convention among specialists is increasingly PLACENAME English, which avoids confusion over dialect status vs. accent. mnewmanqc (talk) 15:08, 1 August 2014 (UTC)
The older discussion is here. Wolfdog's argument that "PLACENAME English" is a useful cover term for "PLACENAME dialect" and "PLACENAME accent" is well-taken. An objection raised in the 2010 discussion relates to cases such as Scouse, where the nickname is fairly standard both in local and scholarly usage. I have no objection to standardizing "PLACENAME accent/dialect/English/etc." to "PLACENAME English", but would find it hard to support moving pages such as Scouse to Merseyside English (or what have you; Liverpool English is currently a redirect) or Geordie to Tyneside English. I don't have the same scruple about Baltimorese, but on the third hand I think Pittsburghese is becoming more widely used (e.g. Johnstone 2013). Cnilep (talk) 04:47, 7 August 2014 (UTC)
I would absolutely agree that we could make exceptions for names such as Scouse, Geordie, Received Pronunciation, General American, etc. that virtually all the literature already recognizes by a standard name that has no need for a clarifying tag of "dialect/accent/English," etc. However, I would say that while terms like Pittsburghese may be becoming more common, again, it is certainly not the indisputable standard and so "Pittsburgh English" is better for covering all facets of that variety. Even if we do agree on this, however, it does of course cause some hassle. How could we standardize this as a guideline for Wikipedians to follow in the future? Even now, it will (apparently) take revived talk-page discussions to revert/move back, for example, "Boston accent" to "Boston English" or "New Jersey English dialects" back to "New Jersey English," etc. since such articles have already been moved from the "PLACENAME English" titles in the past. Anyone more well-versed than I am in the Wikipedia policy-making arena? I think we could bring this discussion to that level, if feasible. Wolfdog (talk) 15:07, 7 August 2014 (UTC)

Navajo language

By any chance, is anyone knowledgeable about this language? I've been working on it a little in hopes of GAN while Czech's article gets reviewed, but the Grammar section (and particularly its Verbs subsection) is absolutely enormous and I don't feel I'm educated enough to know what's vital and what can go. Tezero (talk) 22:01, 3 August 2014 (UTC)

Eh, never mind. I deleted the huge majority of it yesterday (well, migrated to a separate article) and I'll be building it from the ground up. Tezero (talk) 15:17, 7 August 2014 (UTC)

Linguistics vs. Languages

I've just realized something: Barring articles on individual languages and families, is there any rhyme or reason for what's in our scope as opposed to WikiProject Linguistics' scope - or both? For example, Fuck (film) is in both, while Chinese classifier is only in theirs. Tezero (talk) 16:31, 30 July 2014 (UTC)

I suppose the original intent was that "languages" would cover individual languages, while "linguistics" would cover the scientific field itself? There is a lot of overlap of course, so maybe they should be merged. CodeCat (talk) 17:44, 30 July 2014 (UTC)
CodeCat is right, but a suggestion to merge the two is misguided. Those interested in languages might not be interested at all in HPSG or chain shifts. If it's a suggestion to try and increase participation, trust me, merging won't work, that's a problem of WPs in general. ALTON .ıl 15:07, 5 August 2014 (UTC)
No, it's not a suggestion of that. But look at the GA lists for Languages and Linguistics - what hard-and-fast patterns do you see there? This is about the projects' seemingly arbitrary scopes; it has nothing to do with participation. Tezero (talk) 16:00, 5 August 2014 (UTC)
Since no one's given a suggestion, I propose that articles not explicitly related to individual languages or language families be removed from this project's scope, e.g. Fuck (film), and that articles related to individual languages or language families be added, e.g. Chinese classifier. Is that alright with everyone? CodeCat? Alton? Tezero (talk) 01:59, 7 August 2014 (UTC)
I agree with your change but I would still rather see the projects merged. CodeCat (talk) 11:50, 7 August 2014 (UTC)
I would, too, but that depends chiefly on whether they're okay with such a merger. Tezero (talk) 14:52, 7 August 2014 (UTC)
I like the change. As a long dead user I have no dog in the fight w.r.t. merging, just sharing my reaction. ALTON .ıl 16:40, 7 August 2014 (UTC)
Alright, I'll begin to switch project banners accordingly. If someone objects later on, we can always reopen the discussion. Tezero (talk) 17:16, 7 August 2014 (UTC)

Notification of a TFA nomination

In the past, there have been requests that discussions about potentially controversial TFAs are brought to the attention of more than just those who have WP:TFAR on their watchlist. With that in mind: Fuck: Word Taboo and Protecting Our First Amendment Liberties has been nominated for an appearance as Today's Featured Article. If you have any views, please comment at Wikipedia:Today's featured article/requests. — Cirt (talk) 02:37, 30 October 2014 (UTC)

Peer review for Czech language

Hey, everyone, I'm not sure if anyone here keeps up with these things, but the Czech language article, which passed GAN last month and which I'd like to take to FAC before the end of the year (I have one FAC going on now, and I'm not sure if this article or a different one will be my next), has been at peer review for a while now with no comments so far. Here's the review; please give a few comments if you have time as it's the first language article I've seriously worked on. Tezero (talk) 19:28, 9 October 2014 (UTC)

Could you maybe add the article milestones at the top of the talk page? Otherwise, it is very difficult to access the GA review, which will be interesting to look at for any prospective reviewer. G Purevdorj (talk) 09:15, 12 October 2014 (UTC)

Japoñol

Opinions or comments would be welcome at Wikipedia:Articles for deletion/Japoñol (2nd nomination). Only two people – one is me – have contributed to the discussion, and we disagree. Cnilep (talk) 00:57, 15 October 2014 (UTC)