Wikipedia:WikiProject Languages

"Wikipedia:Languages" redirects here. You may be looking for Wikipedia:Language or a list of Wikipedias in different languages.

This WikiProject aims primarily to provide a consistent treatment of each human language on Wikipedia. Many languages already have extensive pages, and the systematic information on those pages is not presented in a consistent way. The purpose of this WikiProject is to present that information consistently, and to ensure that each of the major areas is covered at least briefly for each language.

These are only suggestions, things to give you focus and to get you going, and you shouldn't feel obligated in the least to follow them. However, try to stick to the format for the Infobox for each language. See the template for an example Infobox.

The easiest way to get started writing for a language that doesn't already have an article or to convert an article to the WikiProject format is to start with the template.


Categories for discussion
Featured article candidates
Good article nominees
Requested moves

Quality articles

Featured articles marked in bold have appeared on the Main Page.

Article assessment

Place the {{WikiProject Languages}} project banner template on the talk pages of any language-related articles. To rate the article on the quality scale, add one of the following parameters:

  • class=FA for featured articles
  • class=A for A-class articles
  • class=GA for good articles
  • class=B for B-class articles
  • class=start for Start-class articles
  • class=stub for Stub-class articles (which may not necessarily have a "stub" message on them!)
  • class=NA for non-articles (templates, images, etc.)

See WP:GRADES for pointers on classification.
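
As a rough illustration of the banner syntax described above, the following Python sketch builds the wikitext for a talk page. The helper name project_banner and the dictionary QUALITY_CLASSES are hypothetical, introduced only for this example; the class values are the ones listed above, and the pipe-separated "|class=..." form is assumed from standard template parameter syntax.

```python
# Quality classes copied from the list above; descriptions are informal.
QUALITY_CLASSES = {
    "FA": "featured article",
    "A": "A-class article",
    "GA": "good article",
    "B": "B-class article",
    "start": "Start-class article",
    "stub": "Stub-class article",
    "NA": "non-article (template, image, etc.)",
}


def project_banner(quality_class=None):
    """Return banner wikitext for a talk page (hypothetical helper)."""
    if quality_class is None:
        # Unrated: the bare banner only.
        return "{{WikiProject Languages}}"
    if quality_class not in QUALITY_CLASSES:
        raise ValueError(f"unknown quality class: {quality_class!r}")
    return "{{WikiProject Languages|class=" + quality_class + "}}"


print(project_banner("B"))
```

Rejecting unknown values early is a design choice for the sketch; the template itself may handle unrecognised values differently.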


Index · Statistics · Log

Article names

Most language articles should be on a page titled XXX language. Reasons for this recommendation:

  1. Ambiguity. While some languages have special forms that refer unambiguously to the language, English is inherently ambiguous about language names. Having a standard of "XXX language" ensures that the title is always unambiguous. There is always the possibility of "XXX literature" or "XXX grammar", but these cannot be referred to simply as "XXX", and so are not a reason for disambiguation.
  2. Precedent. This is how Encyclopædia Britannica and many other English-language encyclopedias name their articles.

Please note that when there is nothing to disambiguate a language name from, as with Hindi, Esperanto or Inuktitut, there is no need for the word "language" in the title. See Wikipedia:Naming conventions#Languages, both spoken and programming and Wikipedia:Naming conventions (languages) for the relevant naming policy.

Whether the varieties of Arabic and Chinese should be called "languages" or "dialects" continues to be a highly controversial issue. The current convention is: use NAME + Arabic for Arabic varieties (e.g. Egyptian Arabic) and NAME + Chinese for Chinese varieties (e.g. Mandarin Chinese). Infoboxes are put at both Arabic language and Chinese language and at their first-level subdivisions.
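
The naming convention above can be summarised as a small decision rule. The Python sketch below is only an illustration: the function article_title and its parameters are hypothetical, and whether a bare name is ambiguous is a judgment the editor must supply, not something the code can determine.

```python
# Languages whose varieties are titled "NAME + Arabic" / "NAME + Chinese".
VARIETY_PARENTS = {"Arabic", "Chinese"}


def article_title(name, ambiguous=True, variety_of=None):
    """Suggest an article title for a language (hypothetical helper).

    name       -- bare language or variety name, e.g. "Hindi" or "Egyptian"
    ambiguous  -- True if the bare name could refer to something else
    variety_of -- "Arabic" or "Chinese" for varieties of those languages
    """
    if variety_of in VARIETY_PARENTS:
        return f"{name} {variety_of}"
    if not ambiguous:
        return name
    return f"{name} language"


for args in [("Arabic",), ("Esperanto", False), ("Egyptian", True, "Arabic")]:
    print(article_title(*args))
```

The three calls reproduce titles mentioned in the text: Arabic language, Esperanto, and Egyptian Arabic.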

Even where there is consensus that the varieties of a language have dialect status, the number of such dialects and the divisions between them are often vaguely defined, and dialectologists disagree over whether certain varieties should be treated in a unified way or are best understood as separate though related varieties. Separate articles should only be written on varieties (e.g., Estuary English) or related groups of varieties (e.g., Hispanic English) that have been studied well enough by linguists that at least a minimal body of literature exists about that variety or group of varieties as a distinct dialect or group of dialects. Phonological, morphosyntactic, or lexical variation that may be considered subdialectal should be noted as "differences within X dialect", where X is a dialect as discussed in the relevant literature. Controversies over dialect status can be noted in articles as such, but should also be based on citable work. In article titles, names used in the literature to refer to the dialect should be preferred over folk-linguistic terms (e.g., Inland North versus Midwestern Accent).

Article structure

There are templates for the structure of articles about spoken (oral) languages at /Template and for signed languages at /Template (sign language).

Open tasks


Population data has been mostly updated from Ethnologue 16 to 17. However, an unknown number of articles which did not have the ref field set to "e16" slipped through the cracks; an example is Cumanagoto, which did not have a ref'd population figure because E16 had mistakenly listed it as extinct. Articles which are not ref'd to Ethnologue could be checked in case E17 has a more recent figure.

User:PotatoBot helps keep ISO redirects in sync with changing WP articles and ISO standards. The results of the latest run are displayed at ISO 639 log and ISO 639 language articles missing.

Names at Spurious_languages#Spurious_according_to_Glottolog with asterisks have not been addressed.

Articles to be created

Red links should either be redirected or have their own articles.

99.9% of ISO language names have articles, though not always one-to-one (e.g. Fulani, Zhuang, and Mazatec); the 0.1% which do not are spurious, dubious, or insufficiently attested to justify their own article, and are redirected to an article stating that.

Lists for evaluation

The lists below are of self-links in our articles, language names from various sources which do not have articles or redirects, and suspicious cases to keep track of.

Requests for expansion

Images for articles in Category:Wikipedia requested photographs of languages.

Requests for attention

(no article Ashéninka people; Keres functions as the lang article but reads as a family article)

Tagged categories

Category:Articles lacking sources

Only language varieties are included here. Subjects such as 'French language in Jordan' and 'Westernized Chinese language', though in bad shape, are not listed because they would not be representative of the many unreferenced articles that are not about specific varieties.

  • 2004–2014: (only articles with 'language', 'dialect', 'creole', or 'pidgin' in name are included; distilled from an insane number of articles)
English: Manningham accent, Jewish English languages
Germanic: Central Franconian dialects, Eastphalian dialect, Hamburgisch dialect, Norwegian dialects, Orsamål dialect, Ripuarian language, Sognamål dialect
Romance: Chipilo Venetian dialect, Comasco-Lecchese dialects, Fornes dialects, Pavese dialect, Sabino dialect, Sutsilvan dialects (Romansh)
Slavic: Debar dialect, Reka dialect, Strumica dialect
Maltese: Qormi dialect, Żejtun dialect
Chinese: Luoyang dialect, Mango dialect, Qihai dialect, Weihai dialect, Ningbo dialect, Ganyu dialect, Fu'an dialect, Xuzhou dialect
other: Kfar Kama Adyghe dialect (Adyghe), Enuani dialect (Igbo), Thanjavur Marathi dialect, South Korean standard language

Category:Orphaned articles

(same search terms as missing sources)

Ordek-Burnu language (moved to 'stele')

Open ISO issues

The following ISO change requests from previous years were still open as of September 2015. (Open requests for 2015 have been reviewed through 2015-067.) The articles should be updated if the requests are accepted. (See the current list.)

Articles proposed for deletion

including WP:AFD, WP:PROD and other processes

Articles to watch

The following are language articles which come under repeated POV attack, often for ethnic or nationalistic reasons. Feel free to add ones you've noticed, and to remove languages which have not been a problem for some time. That way, if one of us drops out from editing, the articles we've been watching hopefully won't go to pot.

(Note: Ethnologue 17 and the Swedish Nationalencyklopedin use Indian census data, which is not an RS because it does not have a consistent definition of Hindi. For example, part of the Awadhi population is listed under Awadhi, but most is counted as Hindi. This problem is acknowledged in the presentation of the census results, but has been lost in secondary sources.)
  • Serbo-Croatian & Croatian (subject to ARBMAC)
  • Saraiki dialect, Punjabi dialects, and "Panjistani" (requires text searches to purge repeated additions of contradictory claims of "Panjistani" to multiple articles)
  • Southern Luri language. It may be worthwhile splitting the Luri article, but so far the attempts to do so have been incompetent and motivated by OR redefinition of the language. The present description of the two varieties in the Luri article is so intertwined that splitting them would create something close to a content fork. — kwami (talk) 02:32, 4 September 2015 (UTC)
  • Assyrian Neo-Aramaic and Chaldean Neo-Aramaic, along with the ethnic articles. A seemingly chronic ethnic dispute.
  • Luganda and Baganda: deletion of ISO name
  • Misleading maps: Many national languages have had maps with half the world filled in because of emigration, with no apparent standard for what counts as a speaking population. Most of these will be caught by checking the top 100 at List of languages by number of native speakers.

Interpreting Ethnologue data

Ethnologue is the default source for language data on WP. There are several obvious advantages to Ethnologue besides its universal accessibility. For many languages, it's all we have. For others, it provides a check on the politicization and population inflation that we experience when we allow advocates of the language to cherry-pick sources. Nonetheless, Ethnologue data needs to be carefully evaluated: where possible, its sources should be verified and cited directly, and better sources should be used instead of Ethnologue where these are known. There are a few common and serious problems:

Such problems are understandable: Ethnologue is an enormous project with a very small editorial team and budget. For years, Ethnologue had a reputation for being unresponsive, so many linguists do not bother to correct the errors they find, but since ca. 2012 they have been appreciative of feedback.

Linguist List / Multitree includes a large number of language names not found in Ethnologue, but their identification is highly unreliable, and many can be seen to be spurious with even a cursory glance at the literature. Glottolog often does a better job than either of these sources, for instance in verifying and updating classifications, in marking languages as 'spurious' when they cannot be verified to exist, and in specifying its sources, but it cannot be relied on for dialects, where it simply copies Multitree.



Project banner

Please add {{WikiProject Languages}} to talk pages of relevant articles. Articles with this template are put into Category:WikiProject Languages articles.


Language stubs should be tagged with the most appropriate template of these:


After you sign up, you can add the project userbox to your user page by adding the following: {{User WikiProject Languages}}. Your username will then automatically be added to the Category:WikiProject Language members.

Related WikiProjects

This WikiProject is a descendant of WikiProject Linguistics. It has descendants of its own:

See also:

Project volunteers

If you'd like to help out, be contacted by others interested in this WikiProject's subject, and receive task assignments and project-related updates on your talk page, please add your name here:


