Wikipedia talk:WikiProject Languages

The largest section in the Modern Hebrew article describes the language as "non-semitic"

The largest section in the Modern Hebrew article describes the language as "non-semitic". This view is WP:Fringe, as confirmed by reliable sources, yet this view currently occupies the largest space in the article, going into extreme detail including a table for individual opinions, while everything else is presented at a broad/high level. In my opinion, and in the opinion of the majority of editors on the talk page, this is WP:Undue. Over the past year, six editors have expressed their view that the section should be removed or minimized, while only two have supported it. Despite this consensus, the section remains in the article in its current state, likely due to the slow nature of the subject. Any editors wishing to contribute are welcome. Drsmoo (talk) 17:17, 17 May 2016 (UTC)

Accents in language titles

Hi folks. Is there guidance on whether or not the article title for a language should be the latinized name (e.g. Xaracuu language) vs. the accented name (Xârâcùù language)? I looked in Wikipedia:Naming conventions (languages) but did not find anything dispositive. I suspect (based on other examples) the practice is not to latinize the name but I'd like some confirmation one way or the other. Thanks! Adam (Wiki Ed) (talk) 17:10, 21 June 2016 (UTC)

I see some suggestion here that combining marks are to be avoided (though I don't know if the title includes them or they're on a keyboard somewhere). Adam (Wiki Ed) (talk) 17:39, 21 June 2016 (UTC)

Maybe someone more knowledgeable should be able to point us to some guidelines somewhere, but until that happens here are my two pennies. I don't think there can be universally applicable guidelines about non-ASCII symbols in language names (and accents are just a subset of these). I think it's best if consensus about the title of a given language is reached on a case-by-case basis, as attitudes vary between (and occasionally within) countries and broad cultural areas. I think in any case, the wikiproject for the country the language is spoken in might be the most relevant. I can think of a couple of linguistic/geographic areas where special symbols are commonplace. In the Salishan languages of the US Northwest (see 1 and 2), none of the special symbols from the language names seem to have made it into the article titles. Maybe for all of them there just happens to be a common name in English and it's predictably a simple one. In another area: Category:languages of Brazil, the acute accent is ubiquitous in language names, but this is the case because it's ordinarily used in the dominant Portuguese language, which has probably been the direct source of the established names in English for these languages.
Do you have any specific examples in mind? Uanfala (talk) 15:16, 22 June 2016 (UTC)
The specific question came from a student working on Xaracuu language, which has used the unaccented title since the article was created in 2011, though the text of the article has (near as I can tell) always used Xârâcùù. Looking at Template:Languages of New Caledonia it appears that most of the southern language titles use accents in titles where present in the language. Adam (Wiki Ed) (talk) 16:04, 22 June 2016 (UTC)
Xârâcùù seems to be overwhelmingly more prevalent in both French and English sources ([1] [2]). I've started a requested move, see Talk:Xaracuu language#Requested move 22 June 2016 (I'm prevented from moving it straight away by the extra edit in the history of Xârâcùù language). Uanfala (talk) 16:37, 22 June 2016 (UTC)
Thanks for gophering. I can move it on Protonk. Adam (Wiki Ed) (talk) 19:45, 22 June 2016 (UTC)

Linking of Open Access publications about linguistics

Open Access publications about a particular topic are a useful addition to articles as they are available to people outside of academia as well. I have held that conviction for a long time, but now I work for Language Science Press, which happens to produce Open Access monographs. This means on the one hand that I am very well informed about new open access books; on the other hand, it means I have a WP:COI.

I have added some of these monographs to articles where I was sure that it was relevant (Grammars of Yakkha, Mauwake, Pite Saami); for others I have suggested inclusion on the relevant talk pages. Most of the smaller languages receive few edits and might not even have anybody watching them to whom I could suggest inclusion.

I am not very happy about this state of affairs. Technically, I am violating policies about conflicts of interest and paid contribution. I still think that for the coverage of linguistics, the inclusion of these books is useful, so I ignored all rules.

I would appreciate discussion about this issue and would be happy if someone could suggest a good course of action.

For the record and FWIW, my former job was the creation of Glottolog. This might or might not lend me some credibility.

Jasy jatere (talk) 18:48, 23 June 2016 (UTC)

It looks to me like you are behaving properly: editing judiciously and being upfront about your potential conflict. If you plan to continue adding LSP citations or links, you might want to disclose the relationship on your user page. But I don't see any advantage in demanding strict adherence to the letter of the law. Cnilep (talk) 01:18, 27 June 2016 (UTC)

ISO 639 redirects & project tagging

I see there are ~8662 redirects (#Rs) of the form ISO 639:[a-z]{2,3} (like ISO 639:aa, etc.) which are missing a talk page, and so missing the {{WikiProject Languages}} banner and the corresponding talk-page #R (like Talk:ISO 639:aa to Talk:Afar language, etc.). Is there any desire by WP:LANG to tag these existing #Rs and to create talk-page #Rs? I did something very similar to this at WP:AST with our plethora of minor planet #Rs and can do the same here, if there's interest.   ~ Tom.Reding (talkdgaf)  17:52, 29 June 2016 (UTC)
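For anyone scoping the task, the title pattern quoted above can be sketched as a small script. This is only an illustration under stated assumptions: the function name `talk_redirect_plan` and the sample input are hypothetical, and the code does not touch the wiki or use any bot framework. It just shows which titles the pattern accepts and which talk-page redirect each would get.

```python
import re

# Pattern quoted in the post above: "ISO 639:" followed by 2 or 3 lowercase letters.
ISO_REDIRECT = re.compile(r"ISO 639:[a-z]{2,3}")


def talk_redirect_plan(redirects):
    """Map each matching redirect's talk page to its target's talk page.

    `redirects` is a dict of {redirect title: target article title}.
    Titles that do not fully match the pattern are skipped.
    """
    plan = {}
    for title, target in redirects.items():
        if ISO_REDIRECT.fullmatch(title):
            plan["Talk:" + title] = "Talk:" + target
    return plan


# Hypothetical sample input; only the first pair comes from the post above.
sample = {
    "ISO 639:aa": "Afar language",
    "ISO 639:abcd": "Not a valid code",  # four letters, skipped
    "ISO 639:AA": "Wrong case",          # uppercase, skipped
}
plan = talk_redirect_plan(sample)
```

Using `fullmatch` rather than `search` keeps longer or differently cased titles out of the plan, which matters if the same prefix is used by other pages.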

Please verify

Old Štokavian Xx236 (talk) 11:12, 30 June 2016 (UTC)

Thanks for pointing this out. I've redirected it to Shtokavian. – Uanfala (talk) 11:27, 30 June 2016 (UTC)

Using UNESCO open license text to create Wikipedia articles about endangered languages and language groups

Hi all

I'm currently working with UNESCO to help find ways to make their content more useful for Wikipedia. I'm developing a way for text from UNESCO publications to be easily usable on Wikipedia, please see here for more details and instructions.

I think a very useful publication for WikiProject Languages would be Atlas of the World's Languages in Danger, which provides an overview of endangered languages within each region. Perhaps the descriptions could be used to create Wikipedia articles for endangered languages within each area and/or endangered languages within language groups?

Please let me know what you think and if you need any more information, I'm also currently indexing all the languages listed in the world atlas into Wikidata which would provide an overview of what languages are not covered on Wikipedia already. I'm currently doing a project to create Wikipedia articles from official descriptions of Biosphere Reserves, here is a map of all the Biosphere Reserves in the world without English language Wikipedia articles generated live by Wikidata, something similar could possibly be created for languages.

Many thanks

John Cummings (talk) 20:22, 17 July 2016 (UTC)

That's a great project! You might want to also post at Wikipedia Talk:WikiProject Endangered languages (which has admittedly been rather quiet lately). Uanfala (talk) 22:34, 17 July 2016 (UTC)
Thanks very much @Uanfala:, I will do that now. --John Cummings (talk) 15:41, 18 July 2016 (UTC)

Establishments/disestablishment categories

Should languages be organized with establishments/disestablishment categories? It would likely be vague, by centuries. They aren't created but a page like Meroitic language would be included in something like Category:Languages attested in the 3rd-century BC, Category:3rd-century BC establishments in Africa, Languages extinct in the 4th-century and finally Category:4th-century disestablishments in Africa. -- Ricky81682 (talk) 00:52, 24 July 2016 (UTC)

If at all, we should use these categories only for constructed languages where the year of first publication can be verified, and for extinct languages where the death year of the last speaker has been recorded. De728631 (talk) 01:05, 24 July 2016 (UTC)
That's kind of limiting isn't it? It'd basically be 20th century with specific years, wouldn't it be? If reliably sourced linguists can give an estimate on both the start and end period (within a century), why not include it? -- Ricky81682 (talk) 20:47, 24 July 2016 (UTC)

That's a big, somewhat dubious if. It's difficult and more-or-less arbitrary to date the establishment of a natural language. For example, Middle English is generally thought of as having arisen in the 11th century, but that is because that's the date of the Norman Conquest. I think it was Ed Finegan who pointed out that there never was a moment when Middle English speaking children could not understand their Old English speaking grandparents and vice-versa, so even that vague dating is something of an abstraction. It's virtually impossible to pinpoint when almost any natural language diverged from an ancestral form. Given that fact, I'm not convinced of the utility of categorizing articles in terms that are necessarily vague and somewhat arbitrary. Cnilep (talk) 01:40, 25 July 2016 (UTC)

  • Absolutely agree with Cnilep: languages don't normally have beginnings that are pinpointable within any degree of vagueness. But there are small groups of exceptions: artificial languages, pidgins, maybe mixed languages and independently arisen sign languages (like Nicaraguan Sign Language). At any rate, I don't think this is what the OP's proposal is about. It's about having categories for languages that have been attested or extinct since a certain point in time. This would be a helpful category, wouldn't it? It's another matter if such categories will ultimately be placed within the subcategories of Category:Establishments by time, but it's worth pointing out that "establishment" seems to have a very broad meaning here: for example Category:5th-millennium BC establishments includes archaeological cultures. Uanfala (talk) 08:53, 25 July 2016 (UTC)
    • Yeah, I suggested "attested" and "extinct" (which probably isn't the best grammatically) as opposed to "established"/"disestablished" because those two only really work for things that were created. I think it would be interesting to have a category of all languages that were attested worldwide around say the 3rd century AD. Again, this is something that there are reliable sources about and since Template:Infobox language uses "era" and "extinct" it's not like the information isn't out there. If it's disputed, that's one issue but it's just a question of whether the categorization seems useful. -- Ricky81682 (talk) 20:53, 25 July 2016 (UTC)
List of languages by time of extinction might be helpful if you haven't come across it yet. Given the sheer number of languages that have been going extinct in the last century, and the relative specificity of recent dates, I think it might be a good idea to have Languages extinct in the 20th-century broken up by decades. Uanfala (talk) 21:33, 25 July 2016 (UTC)
Ok, that's good. The size of mergers and splitting is always a WP:SMALLCAT debate that can happen at WP:CFD in the future. It's an ebb and flow but it seems like the idea is at least understandable. -- Ricky81682 (talk) 04:35, 26 July 2016 (UTC)
@Ricky81682: I see that now Classical Arabic has both Category:Languages attested from the 4th century and Category:4th-century establishments. The latter already has the former as a subcategory, so I'm not really seeing the point of it here. Uanfala (talk) 21:58, 27 July 2016 (UTC)
@Uanfala: Normally, it's by continent or country but I didn't review it in detail (probably Asia I think). Yana language for example is both Category:Languages extinct in the 1910s and Category:1916 disestablishments in California. The second one, as you get into 1910s, etc., is an interesting way to look at it. The first not so much because it's empty right now. It's the same dual categorization that is done for banks, organizations, political parties and others. -- Ricky81682 (talk) 22:48, 27 July 2016 (UTC)

@Ricky81682: Language-specific categories like Category:Languages attested in the 3rd-century BC are squarely uncontroversial but I'm wondering if it isn't a good idea to seek input from more editors and wait for consensus to develop before applying the broader ones like Category:3rd-century BC establishments in Africa, because they might cause confusion as they only apply narrowly to the written tradition of the language and not any other aspect of it. Uanfala (talk) 07:24, 28 July 2016 (UTC)

Is there a specific language there I'm confusing or something? I see that Proto languages by definition do not actually have attestation and I think Meroitic language is implying that the languages' usage is the traditional attestation time period but I think I'll organize something more here to have maybe some guideline. Would it be better to develop something here and then incorporate it into say Wikipedia:WikiProject Languages/Template as a sort of local consensus on how the categories should apply? -- Ricky81682 (talk) 07:32, 28 July 2016 (UTC)
You aren't confusing anything, I just wanted to see some more input by other editors on the applicability of the broader "establishment" categories. Uanfala (talk) 08:26, 28 July 2016 (UTC)
Ok, I revised Wikipedia:WikiProject Languages/Template to suggest including those categories. It could probably use some revising. -- Ricky81682 (talk) 17:09, 29 July 2016 (UTC)

@Uanfala: In the alternative, Esperanto uses Category:1887 introductions but Category:Introductions by century only really goes to the 14th-century and is for television series debuts and products. -- Ricky81682 (talk) 20:31, 30 July 2016 (UTC)

That's interesting. The "introductions" categories seem to also contain inventions, so including Esperanto there makes sense. I think the establishment categories are more suitable for languages (an obvious, and orthogonal to the matter, exception would be including English in Category:17th-century introductions in America, but I don't see any geographical subdivisions there). Anyway, I remain a bit suspicious about seeing language articles placed directly in categories that contain the word "establishment" in their names. Uanfala (talk) 22:15, 30 July 2016 (UTC)
Uanfala, RFC is below. -- Ricky81682 (talk) 03:19, 3 August 2016 (UTC)
I agree with Uanfala, but my reservations go beyond being "a bit suspicious". I'm firmly against the use of "establishment" and "disestablishment" categories for natural languages. With the exception of the very few situations mentioned by Uanfala above, natural languages aren't ever "established", they evolve slowly over generations. The oft-quoted statement that Cnilep paraphrased above applies here: there is normally no point in any language where younger generations and older generations are unable to understand each other. A language isn't established, but rather it is defined. When we assign labels, such as Middle English to English after 1066, for example, it is then Middle English only by definition. So to then say that Middle English should be considered an 11th century establishment is circular reasoning. Also, by the time a language is written down (i.e. "attested") it has already evolved and been spoken for an unknown number of generations. To say that the date of attestation is the date the language was "established" is just wrong and misleading. As for the "disestablishment" categories, I'm not in favor of these even for extinct languages. "Disestablishment" implies an affirmative action (such as closing a business or dissolving a state). A language becomes extinct because the last speaker dies, not because of any action taken towards the language.--William Thweatt TalkContribs 05:47, 3 August 2016 (UTC)
Is that really any different than when countries sort of start? Besides, it's not like, other than constructed languages, you are going to get a definitive period of like "established in 1925 BC" rather than "established during the 2nd century BC". We have evidence from specific dates, that's obvious, but I'm presuming that, like archaeologists, there is a guess to when the actual language developed separate from when the writings exist. Still the language is extinct on that date either because the last speaker has died or because all speakers have converted to a newer language. I mean it is said that Middle English became modern English because someone who used to speak Middle English died and thus no one did any longer (I know, it's transitional so no literal person exists) which occurred vaguely during the 11th century. I don't expect this to be defined to the year (the 3rd-millennium BC ones are obviously going to be vague) but is it useful you think to know which languages and other things were developing together in the 3rd-millennium BC in Africa, so to speak? It seems like a useful and accurate categorization. -- Ricky81682 (talk) 19:26, 3 August 2016 (UTC)

How should languages be categorized

As discussed above, I think we categorize languages in a number of ways. I propose both categories for when the language is first attested and when it is extinct. If there are issues with the titling, that can be discussed later. I also propose that the same language articles be put into the general establishments/disestablishment categories based on locations. Please either support/oppose the concept of these categories and renaming can be done later. For example, Meroitic language is in Category:Languages attested from the 3rd century BC, Category:3rd-century BC establishments in Africa, Category:Languages extinct in the 4th century and Category:4th-century disestablishments in Africa. As noted above, the introductions category can be used as is done for Esperanto. I put separate discussion sections for each portion. -- 03:17, 3 August 2016 (UTC)

Attestation categories

As discussed above, there were some suggestions to limit this to constructed or artificial languages. If supported, please indicate whether you support for all languages or just for a subset of some sort. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)

Support (attestation)

  • Support for all languages. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)
  • Support attestation categories, but not sure where exactly these should go into the higher category structure. "Establishment" categories are problematic, as has become clear in the discussion right before this RfC. Uanfala (talk)
  • Support for attestation of a language. The first known record, or a somewhat accurate guess based on language comparison for proto-languages, could be used here. Landroving Linguist (talk) 11:28, 10 August 2016 (UTC)
  • Support for all; this is pretty basic encyclopedic information, and is not frequently overturned by new revelations or new hypotheses becoming widely accepted, so it will be stable.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  11:04, 13 August 2016 (UTC)

Oppose (attestation)

Discussion (attestation)

Attestation is a somewhat problematic category in itself, because there are sometimes widely diverging views about when a language is first attested. As an example, there is currently an academic debate whether the Arabic language was first attested shortly before Islamic times, coming from Yemen, or whether it was attested in much older inscriptions in the Western Syrian desert. I envision some endless edit-wars resulting from that. Landroving Linguist (talk) 11:33, 10 August 2016 (UTC)

  • I agree but that's a content issue related to the specific case. There are individual disputes about when countries were established, for example, but the actual concept isn't at issue. I think it's fair to say that for the most part (and a lot of our pages lack this) reliable sources exist about it (even if it's usually about the "Early/Middle/Modern" versions with very vague (3rd-century-type) sourcing). -- Ricky81682 (talk) 16:59, 10 August 2016 (UTC)

Extinction categories

I think this one is more obvious but just in case. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)

Support (extinction)

  • Support for all languages. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)
The new category is a neat idea, but it doesn't address the main issue and that is the vagueness and arbitrariness of drawing the line between one stage in the evolution of a language and another. Uanfala (talk) 17:55, 10 August 2016 (UTC)
I'm not sure I agree here. Certainly the lines are imprecise and arbitrary, but quite often still a matter of academic consensus. Most historical linguists sort of agree about the temporal limits of Middle English or many other out-of-use language varieties. When such a consensus exists, it is also usually published, and therefore a matter of careful categorization on Wikipedia. A lot about languages is fuzzy by nature, such as dialect and language definitions. Many of the language entities placed in Wikipedia are somewhat arbitrary and even a matter of dispute. Still we work with them, because enough people agree on them and have published their agreement. Landroving Linguist (talk) 18:28, 10 August 2016 (UTC)
My view is that if reasonable sources can definitely say that the languages are different, they should be able to identify when the languages differ. Now our actual information here is quite lacking but that just means there's much more to do here in terms of citations. These exist. Linguists and archaeologists write on these things. That's not the issue. For example a number of our South American language articles can definitely define basic extinction periods (not evolution but change upon colonialization) but don't have a starting point since (a) the archaeological record is sparse and (b) most of these are spoken only and not written. Nevertheless that doesn't mean if I pinned someone down, they couldn't at least say a millennium for when it started (likely based on the first people being there) and if that's our best guess, that's the best estimate for now. Remember, we are a wiki, most detail means this can be revised later. -- Ricky81682 (talk) 23:09, 10 August 2016 (UTC)
Hmm, Latin is indeed a bad example for an extinct language, because a) it developed into other languages and b) it is still used as a second language today. Other similar languages may be Ge'ez language or Coptic language. I wouldn't know what to do about them regarding the categories you suggest. The languages shown by the already existing categories you mentioned are accordingly all of a very different nature - languages that have never served as a language of any official standing, and that have usually never been written. They died out by not being transmitted by the last generation of speakers, and quite naturally it is often very difficult to pinpoint the language death in time. Actually, there are even different definitions of language death, as some linguists insist that a language is dead when exactly one speaker is still alive, as s/he cannot use it any longer for communication. There are some prominent cases when the death date of the last speaker of a language is known and announced, but in most cases language death looks a lot more messy. Be that as it may, I think for most extinct languages the point of death can be placed within a 50-year bracket, but then most authors don't bother to mention this, because compared to the span of a life time 50 years are rather imprecise. For ancient extinct languages, instead, 50 years would be pretty good. It may still be enough to build up a set of categories on Wikipedia. I wonder how much we would succeed in populating these categories based on the information at hand. We can certainly try and see how far it gets us. I find the two category pages you mention somewhat disheartening in this respect, as they seem to contain almost all known extinct languages. Maybe the information is just missing by neglect, and not by ignorance. I like your 'out-of-use'-idea, which might avoid some of the problems of 'extinct'. Middle English is certainly not extinct, but decidedly out of use. Latin is not even out of use, however.
Landroving Linguist (talk) 18:21, 10 August 2016 (UTC)
Latin isn't just Latin though. As Template:Latin periods notes, there's Old Latin, Classical Latin, Late Latin, Medieval Latin, Renaissance Latin, New Latin and finally Contemporary Latin as the major separates I'm certain. Medieval Latin for example is in Category:Languages extinct in the 15th century (or better yet renamed as Category:Languages out of use by the 15th century) which includes Anglo-Norman language, Greenlandic Norse and other "Medieval" languages (largely a lack of citations with more specifics). On that basis, I think the specific dialects and variations are a useful categorization. My concern isn't the major languages, anyone can see those, but categorizing all the minor languages and dialects like found in something like Category:Languages attested from the 14th century or all the Alaska and Native American languages with some information. Again, I've touched maybe 1% of all languages by hand so imagine a fully-fleshed out categorization scheme and people will see how languages are evolving simultaneously at the same times. -- Ricky81682 (talk) 22:59, 10 August 2016 (UTC)
  • Prefer the "out of use by [era]" approach. But as second choice, support extinction categories for languages that have become literally extinct, without support for including languages that have evolved into later languages.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  11:07, 13 August 2016 (UTC)
    • Can we at least call this a "support the concept" category and then perhaps do another RFC on the actual naming? I'm aware I didn't fully think this through on the first round but I'm not moving forward until there's some clarity here. If people want to separate "extinct" from "evolved out" (I find that unnecessary), we can. I think there's a possible name we can agree on that covers both. -- Ricky81682 (talk) 22:30, 13 August 2016 (UTC)

Oppose (extinction)

Discussion (extinction)

Establishment and disestablishment categories

This of course is separate from the main categories but these would put the language in both a "subject by time" category and a "time by location" category. Rather than a separate section, if people prefer the "introductions" structure, that can be suggested in the oppose or something. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)

Support (establishment/disestablishment)

  • Support for all languages, if that's debated. -- Ricky81682 (talk) 03:17, 3 August 2016 (UTC)

Oppose (establishment/disestablishment)

  • Oppose natural languages to be included directly under a category with "(dis)establishment" in its name. Uanfala (talk) 08:02, 9 August 2016 (UTC)
  • Oppose natural languages to be included directly under a category with "(dis)establishment" in its name. Uanfala has made a good case why these categories cannot be reasonably applied to natural languages. Landroving Linguist (talk) 11:36, 10 August 2016 (UTC)
  • @Uanfala and Landroving Linguist: Do you support then artificial or constructed languages in these categories? Or a blanket objection to all languages? -- Ricky81682 (talk) 20:55, 10 August 2016 (UTC)
I think the "introduction" categories (like Category:20th-century introductions), which you suggested earlier, might be more suitable as they contain inventions and products – things that were brought into existence by a deliberate action, like constructed languages. Uanfala (talk) 21:29, 10 August 2016 (UTC)
  • Oppose - with the exception of artificial languages, we can almost never have a provable time of establishment; for one thing, the question of when it actually became that language, as opposed to a predecessor language, is unclear; for another thing, it may have been spoken for a long time before it was written down. And it's quite possible that only a sub-population of the speakers of the language actually used the writing system, so even a time of its "disestablishment" is unclear. I would support placing any language which is provably artificial/constructed (e.g. Esperanto) in an establishment category, though. עוד מישהו Od Mishehu 13:51, 12 August 2016 (UTC)
  • Oppose except for artificial languages, per all the above. It doesn't make sense as applied to natural ones.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  11:08, 13 August 2016 (UTC)
  • Strong Oppose with the exception of artificial languages, as mentioned above. Because languages are constantly evolving it is incredibly difficult (and controversial) to establish exactly when a language was established and disestablished. Take Sumerian for example, it was probably spoken long before it was written down, and even after it ceased being spoken it was still used liturgically and had a strong influence on Akkadian for hundreds of years. It is nearly impossible to say when it was established or disestablished, (not to mention that many extinct languages are going under revitalization efforts) so my vote is a strong oppose. Inter&anthro (talk) 16:29, 18 August 2016 (UTC)

Discussion (establishment/disestablishment)

  • Couldn't the "subject by time" and "time by location" cross-categorisation for language extinction happen under a new category (say, Category:Obsolescence) that would also include archaeological cultures, obsolete scientific theories and cultural traditions, as well as maybe extinct species? I imagine this could go under the rather broad Category:Former entities (where Category:Disestablishments is located too). Uanfala (talk) 09:15, 3 August 2016 (UTC)
    • Yeah I guess but seems unnecessary to have a whole new structure for just languages. -- Ricky81682 (talk) 19:17, 3 August 2016 (UTC)
Not just for languages, but also archaeological cultures, obsolete scientific theories etc. Uanfala (talk) 08:02, 9 August 2016 (UTC)
  • I think that these categories clearly do not apply to languages and should simply be removed. They are meant for states and organizations. ·maunus · snunɐɯ· 08:52, 9 August 2016 (UTC)
  • The very idea of establishment and disestablishment seems to view natural languages only from the perspective of developed languages or even official languages. For most languages in the world, this is not applicable. They exist often alongside languages of wider communication and serve their communities in restricted environments, even if this community is fully bilingual in the other language. Would you call a language disestablished, just because it is only used at home? Probably not. But then we have a huge boundary problem with this category, and again lots of edit wars, because some people's disestablished language is another person's established language. The criterion of extinction is much more clear-cut and useful when it comes to languages. I would also not hesitate to call an artificial language extinct when no-one is using it any more. Landroving Linguist (talk) 11:48, 10 August 2016 (UTC)

Based Upon Status

Have all languages be in a category for their status, such as "extinct", meaning no one knows how to speak it. Perhaps "obsolete" or "outdated" for Latin or Old English, and "endangered", which, if added, would require agreement on the number of speakers needed for a language to qualify. One of the categories would be "Active", to include all languages declared an official language by at least one country (or subdivision of a country). Each status category would contain all of its languages directly and also within its subcategories. The subcategories would be organised by the date the language acquired that status; for instance, the extinct category would have "Languages that became extinct in the ___ century". Tiers of categories:

  • 1. Status of language (Example: Extinct)
  • 2. Timeframe of that status (Example: Languages that became extinct in the 10th century)

The languages themselves would be in both categories, for example Saka would be in both the extinct category and the Languages that became extinct in the 10th century category.


  • These statuses were made based upon EGIDS, with some being modified and some added, and also some removed that I thought were superfluous, and all are up for debate/change. Names changed will be marked with bold, statuses I made will have the number bolded.
  • 1. "International" The language is widely used between nations in trade, knowledge exchange, and international policy.
  • 2. *"National" The language is used in education, work, mass media, and government at the national level.
  • 3. "Provincial" The language is used in education, work, mass media, and government within major administrative subdivisions of a nation.
  • 4. "Lingua Franca" Used in work and mass media without official status to transcend language differences across a region.
  • 5. "Educational" The language is in vigorous use, with standardization and literature being sustained through a widespread system of institutionally supported education.
  • 6. "Vigorous" The language is used for face-to-face communication by all generations and the situation is sustainable.
  • 7. "Threatened" The language is used for face-to-face communication within all generations, but it is losing users.
  • 8. "Shifting" The child-bearing generation use the language among themselves, but it is not being transmitted to children.
  • 9. "Moribund" The only remaining active users of the language are members of the grandparent generation and older.
  • 10. "Near Extinct" The only remaining users of the language are members of the grandparent generation or older who have little opportunity to use the language.
  • 11. "Obsolete" A language that has evolved into something else and is itself rarely used, for example Old English or Old Latin.
  • 12. "Dead" No one speaks it outside of translating it academically (i.e. it is only spoken by archaeologists who learned it to translate it).
  • 13. "Extinct" No one speaks it.
  • 14. "Lost" No one knows how to speak it, it cannot be learned.

Comment your thoughts. Iazyges (talk) 00:05, 16 August 2016 (UTC)


  1. User:Iazyges (OP)


  • Tentative oppose. I'm confused about whether we are relying exclusively on Ethnologue as the source here or whether it's kind of WP:OR which language belongs where. I don't see the evidence that Ethnologue is on the same level of consensus in terms of classification as something like the IUCN Red List is for species. The article itself cites some opposition from the editor at Glottolog. This seems more like a discussion for Template talk:Infobox language first, and then from there to create categories, as I'd expect us to include that information in the infoboxes as standard practice. If there's evidence that actual linguists use these criteria and there's at least some ability for us to find reliable sources to classify each language here, then I'm fine with it. -- Ricky81682 (talk) 08:36, 21 August 2016 (UTC)
@Ricky81682: I agree, it needs work, I put it here as a form of peer review of sorts, to get feedback on how to improve it. Iazyges (talk) 13:14, 21 August 2016 (UTC)


  • If something like this is adopted, it had better be based on a classification system that is already in place. One such system is the EGIDS scale. Data (probably not always entirely reliable) about each living language's position on that scale is published by Ethnologue [3]. I'm wondering if a somewhat less fine-grained version can be adopted here.
Whatever system is adopted, it's worth pointing out that we can't ever hope to achieve total categorisation, as for the majority of languages there isn't (and there can't be) any historical data about when their status changed.
There are two elements in the OP's proposal that don't appear in the EGIDS. One is the distinction between extinct languages and ones that have evolved (like Old English), and we run again into the question (which I remain agnostic about) of the applicability of chronological categories to the latter: there is discussion about that earlier in the RfC. Another distinction is within the category of extinct languages: between ones that are extinct but studied academically ("extinct by speakers") and ones that no-one knows how to speak ("extinct"). If this is retained then I'd suggest the more transparent labels "extinct" and "unattested". Uanfala (talk) 09:56, 16 August 2016 (UTC)
@Uanfala: I think that while we cannot fit every language into a time, it can sit in the status folder until we know when it acquired that status. We could perhaps even have a folder like "Unknown date of status" for any language where we don't know when that happened. Iazyges (talk) 12:36, 16 August 2016 (UTC)
@Uanfala: I have checked out the link you gave and changed the names to it. Iazyges (talk) 12:47, 16 August 2016 (UTC)
Iazyges, you base your updated descriptions on speaker numbers. While the absolute number of speakers is sometimes a good indication of a language's vitality (and it is vitality that we want to indicate with this new categorisation, right?), in many, many instances it is not. A community of 500 speakers can be a vigorous one, using the language in all domains and passing it to the next generation, while there are communities of hundreds of thousands of speakers that have stopped teaching their children how to speak the language. Both the EGIDS and UNESCO's scales take account of speaker numbers, but they also register other relevant factors. Uanfala (talk) 15:44, 16 August 2016 (UTC)
@Uanfala: Hm, how would you propose we fix that? Should we mix it, or perhaps have two separate categories, one for the number of speakers, and one for whether or not it is actively taught? Iazyges (talk) 15:53, 16 August 2016 (UTC)
It depends on what we want to categorise for. Your original proposal suggested that it was language vitality – i.e. whether a language is endangered or how actively it is used or supported. We can base such a categorisation on one of the two vitality scales in use. If we choose to also categorise by number of speakers, that's a completely different matter. I don't know if such categorisation would be helpful (but other people might disagree), and if we go that way, we'll have to use precise category names ("Languages spoken by more than 100,000 people" ....) rather than fuzzy labels like "uncommon". Uanfala (talk) 16:41, 16 August 2016 (UTC)
@Uanfala: I suppose you're right, I'll try to find the right blend of UNESCO and EGIDS.
  • UNESCO also has a language vitality scale which is also used by Glottolog. I would suggest using this instead of EGIDS.·maunus · snunɐɯ· 12:58, 16 August 2016 (UTC)
@Maunus: Yes, but the UNESCO one is mostly about generational levels of speaking, i.e. whether one generation speaks it but their children don't. The system I propose works off either a nation or state recognizing the language officially or the number of speakers, while the UNESCO system appears to be based upon generations. Iazyges (talk) 15:14, 16 August 2016 (UTC)
Your suggested categories move us into OR territory for some of the categories. It is much better to strictly follow one of the established systems, either EGIDS or UNESCO (or both); that way we don't have to do interpretation from sources ourselves. The UNESCO system, by the way, is not based only on generations but on vitality, and it includes aspects of language policy such as official status. Pure speaker numbers don't really give any useful information about vitality.·maunus · snunɐɯ· 18:45, 16 August 2016 (UTC)
@Maunus: I will go with EGIDS for now, as I find it better.

Maunus (talk · contribs), Uanfala (talk · contribs): I have edited the statuses so that most fit the EGIDS scale. I changed the name of the "Wider Communication" category to "Lingua Franca" because I thought it was more encyclopedic. I also kept the Obsolete, Academic only, and Extinct categories, and I just added the "Lost" category, for languages no one knows how to translate into any other language. Iazyges (talk) 20:01, 16 August 2016 (UTC)

Is Gyani Maiya notable?

Is Gyani Maiya, one of the last speakers of the Kusunda language, notable? We have a lot of articles on the last speakers of a language (see Category:Last known speakers of a language). See more information here. Thanks :) Inter&anthro (talk) 03:51, 10 August 2016 (UTC)

Inter&anthro I'd review Wikipedia:Articles for deletion/Boa Sr. and how Wikipedia:Articles for deletion/Roscinda Nolasquez is going. -- Ricky81682 (talk) 05:10, 10 August 2016 (UTC)
They both closed as keep, and there's a suggestion that last speakers of a language, covered as such in RS, are inherently notable or at least notable by default absent a strong showing to the contrary. Some extended debate in one of them is interesting for reminding us (rather strenuously) that even if WP:GNG were not met, the information in a stub on such a person should be merged into the article on the language.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  11:02, 13 August 2016 (UTC)

Ozark English → Appalachian English?

I know that this discussion (Talk:Southern_American_English#Merger from "Ozark English"?) has been open for a very long time, but it seems to me that very little actual discussion is occurring. Can we please have more voices/opinions on whether the measly stub Ozark English could be incorporated as a section of Appalachian English, given that it is a subset of this variety? So far, two have opposed, but when I try to continue discussions with them or counter their arguments... silence. I also recently found some new information to bolster my argument, but there have been no responses. Obviously I'm in favor of the merger, but I'm fine if it's shot down so long as people actually carry on a true dialogue. Please agree/disagree/comment/etc. Thanks! Wolfdog (talk) 15:11, 19 August 2016 (UTC)