User:Iricigor/UE

From Wikipedia, the free encyclopedia

This is my proposal for modifying WP:UE. Please comment on it at the discussion page for this proposal.

Proposal

This is only a suggested text; it is open to suggestions and modifications.

Case when a person's name contains letters from a non-English Latin alphabet

This rule applies mostly to people's names that contain letters with diacritics.

Case 1

If an English version exists and is different from the original name, we should use the English version.

Case 2

When an English version does not exist and most English media use the original version with stripped diacritics, we should use the original version. In this case there should be a redirect from the stripped version and a note about it in the article's introduction. Also, the default sort should use the stripped version.
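As a rough illustration of what Case 2 asks for in practice, the sketch below builds the wikitext for the redirect page and the default sort key. The function name `case2_wikitext` and the "last word is the surname" rule are assumptions made for this example only; `#REDIRECT [[...]]` and `{{DEFAULTSORT:...}}` are the usual MediaWiki forms.

```python
def case2_wikitext(original: str, stripped: str) -> dict:
    """Build the wikitext snippets for Case 2 (hypothetical helper).

    original -- article title with diacritics, e.g. "Kimi Räikkönen"
    stripped -- the same name without diacritics, e.g. "Kimi Raikkonen"
    """
    # Assumption: the last word is the surname and everything before it
    # is the given name(s). Real names may need manual adjustment.
    *given, surname = stripped.split()
    sort_key = surname + ", " + " ".join(given) if given else surname

    return {
        # The page created under the stripped title...
        "redirect_page_title": stripped,
        # ...contains only a redirect to the article with diacritics.
        "redirect_page_text": "#REDIRECT [[" + original + "]]",
        # Placed in the article so it sorts under the stripped name.
        "default_sort": "{{DEFAULTSORT:" + sort_key + "}}",
    }


snippets = case2_wikitext("Kimi Räikkönen", "Kimi Raikkonen")
for key, value in snippets.items():
    print(key, "->", value)
```

The stripped name is passed in explicitly rather than computed, because the substitution is not always mechanical (see the lists of substitutions further down).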

Scope

In most cases these are names from countries that use the Latin alphabet but have letters not contained in the English alphabet. This rule does not apply to names that are not originally written in the Latin alphabet.

Related Wikipedia guidelines

This is a list of Wikipedia guidelines related to this topic, in order of importance and generality.

Explanation

Here I will explain why I think this is a good proposal. If some explanations require more detail, they will be put in a separate section.

De facto standard

I think it is obvious that this is the de facto standard on Wikipedia. In a separate section below I have listed some of the pages that are named like this. I have also listed the few pages I found that do not follow this rule.

Controversy

The guideline says that if there is no established usage, then we should use the original version. And when someone becomes popular and English-language media start using the stripped version, we should move the article to the stripped version.

No rules in English language

In my language (Serbian) we have strict grammar rules for how foreign names should be written. The English language does not have such rules, so media editors are free to choose what they use. The common practice is simply to strip diacritics in cases like this.

We should also be aware that Wikipedia itself is defining what the most common English usage is. The first result in most search engines will be the Wikipedia page. Try searching for some of the names from the lists below!

Relation with other rules

I believe this is also in accordance with the first of the two basic rules on naming conventions, which says that a name should be the most generally recognizable one. I do not think anyone would be confused if he or she types Kimi Raikkonen and gets the page Kimi Räikkönen.

No information or wrong information

I am also against using stripped names because they carry no meaning and no information. Those versions can sometimes simply be wrong. For example, in Finland there is a last name Raikkonen, which is different from Räikkönen!

The reason the most common version on the Internet is the one without diacritics lies in the ASCII standard. Many media still have this limitation (different keyboard layouts, different devices used for submitting news or information, etc.). Or it may be that their editors use the stripped version out of laziness. There are also still some Internet standards that use only ASCII letters, such as the standard for naming web sites.

Rules with lots of exceptions...

...are not good rules. As listed above, there are already some accepted Wikipedia guidelines that follow this rule and not WP:UE as it is now. There are also some other conventions that are not official guidelines but ARE accepted among editors.

With this modification there would be no need for special profession-specific rules (hockey players) or country-specific rules (Irish, Finnish).

Further clarifications

This is a list of letters with diacritics and how they are usually substituted in English media. Of course, correct me if I am wrong about some substitutions! The list might not be complete!

Vowels

  • a - ä, à, á, ă, å, ã, â, ą, ā
  • e - ę, é, ě, ē, ė
  • i - í, î, ī, į
  • o - ö, ō, ó, ø, õ
  • u - ü, ú, ů, ū, ų
  • y - ý

Consonants

  • c - č, ć, ç
  • d - ð, ď
  • g - ģ
  • k - ķ
  • l - ł, ļ
  • n - ñ, ň, ń, ņ
  • r - ř, ŗ
  • s - š, ś
  • t - ť
  • z - ž, ź, ż

Double characters

Some letters are substituted with two letters:

  • ae - æ
  • dj - đ
  • th - þ
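The substitutions listed above can be sketched as a small lookup table. This is only an illustration of the stripping that English media commonly perform, not a recommendation to apply it: the names `SUBSTITUTIONS`, `STRIP_MAP` and `strip_diacritics` are made up for this example, the table is copied from the lists above (so it is equally incomplete), and the handling of capital letters is an assumption, since the lists give only lowercase letters.

```python
import unicodedata

# Plain replacement -> letters with diacritics, copied from the lists above.
SUBSTITUTIONS = {
    "a": "äàáăåãâąā",
    "e": "ęéěēė",
    "i": "íîīį",
    "o": "öōóøõ",
    "u": "üúůūų",
    "y": "ý",
    "c": "čćç",
    "d": "ðď",
    "g": "ģ",
    "k": "ķ",
    "l": "łļ",
    "n": "ñňńņ",
    "r": "řŗ",
    "s": "šś",
    "t": "ť",
    "z": "žźż",
    "ae": "æ",
    "dj": "đ",
    "th": "þ",
}

# Inverted table: letter with diacritic -> plain replacement.
STRIP_MAP = {
    letter: plain
    for plain, letters in SUBSTITUTIONS.items()
    for letter in letters
}


def strip_diacritics(name: str) -> str:
    """Replace every letter found in the table; leave all others untouched."""
    # Normalize to precomposed characters so that "a" + combining mark
    # is treated the same way as a single precomposed letter.
    name = unicodedata.normalize("NFC", name)
    result = []
    for ch in name:
        plain = STRIP_MAP.get(ch.lower())
        if plain is None:
            result.append(ch)                  # not in the table (e.g. ß)
        elif ch.isupper():
            result.append(plain.capitalize())  # assumption: Đ -> Dj, Æ -> Ae
        else:
            result.append(plain)
    return "".join(result)


for example in ["Kimi Räikkönen", "Zoran Đinđić", "Tomáš Berdych"]:
    print(example, "->", strip_diacritics(example))
```

Letters that are not in the table, such as ß, pass through unchanged, which matches the fact that their substitution is still disputed below.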

Serbian language

The Serbian language uses the Latin and Cyrillic alphabets in parallel. This rule should also apply to Serbian names, but of course written in the Latin alphabet. More precisely, the single Cyrillic letters њ and љ should be written using the two letters nj and lj, as they are written in Latin script according to Serbian grammar. The letter ђ (Cyrillic) / đ (Latin) should not be written as dj, since this is not in accordance with Serbian grammar.

Similar rules apply to the Bosnian and Croatian languages.

Disputed issues

I am not sure how Wikipedia should deal with the German letter ß. Should it be substituted with ss?

I am also not sure about Hawaiian names or names from the Far East (e.g. Vietnamese names).

Also, maybe this rule can be extended to other names, not only those of persons? But we should be careful with this extension!

Lists of pages that DO NOT follow this rule

These are THE ONLY pages I found that do not follow this rule.

What is even more interesting is that only one, and I repeat, ONLY ONE administrator created pages like this (i.e. he moved them from the original name, or closed the discussion on moving to the original name as rejected).

All other editors and move-request closers support this rule, as can be seen below.

Lists of pages that already DO follow this rule

I will start this section with a list of pages that can be grouped into two important groups:

  1. pages that once used the most common English name (the stripped one), and
  2. pages for which there was a request to move them to the most common English name (the stripped version) and that request was rejected.

After each name I have put the percentage of English-language Internet pages that use the original version with diacritics (see the explanation below).

Moved to version with diacritics:

Move to stripped version rejected:

Pages that were moved, but even before moving they were not using most common English name:

This testing may not be conclusive, but it is a very good sign that this rule is already the Wikipedia standard! A version used on only 4-29% of pages is not the most common name.

And now, this is a list of names that were created according to this proposed rule and that no one wanted to move to the most common English name. The list is separated into categories (nationalities, professions). In fact, there are many more articles than those listed here. At the end of this list one can find a list of lists with articles that do not follow WP:UE as it is, but do follow it with this proposed modification.

I will start with very famous Serbs.

  1. Zoran Đinđić
  2. Slobodan Milošević
  3. Radovan Karadžić
  4. Ana Ivanović
  5. Jelena Janković

Top 20 or other famous tennis players

  1. Tomáš Berdych
  2. Fernando González
  3. Carlos Moyà
  4. Ivo Karlović
  5. Guillermo Cañas
  6. Björn Borg
  7. Mario Ančić
  8. Ilie Năstase

Names using Đ or đ

  1. Milo Đukanović
  2. Dino Rađa
  3. Srđan Lakić
  4. Milko Đurovski
  5. Duško Đurišić

Mentioned on Walesa's page

  1. Gerhard Schröder
  2. Göran Persson
  3. Wolfgang Schüssel
  4. Dag Hammarskjöld
  5. Halldór Ásgrímsson

Mentioned by User:Kubura on Talk:Franjo Tuđman
Very nice list (important persons)

  1. Bedřich Smetana
  2. Besançon
  3. Sissel Kyrkjebø
  4. Plácido Domingo
  5. Lech Kaczyński
  6. São Paulo
  7. João César Monteiro
  8. László Sólyom
  9. Călin Popescu-Tăriceanu
  10. José Luis Rodríguez Zapatero
  11. José Trinidad Cabañas
  12. Ivan Gašparovič
  13. Václav Havel
  14. Vladimír Špidla
  15. Adil Çarçani

Lists with a large number of people's names with diacritics

  1. List of Serbs
  2. List of Germans
  3. List of Spaniards
  4. List of Poles
  5. List of Czechs
  6. List of Swedes
  7. List of Croatians
  8. List of Bosnians and Herzegovinians
  9. List of Montenegrins
  10. List of Macedonians (ethnic group)
  11. List of Hungarians

In fact, you may try some lists from Category:Lists_of_people_by_nationality

Google search testing

To test which name is more common, I performed this test using the Google search engine. The test is by no means sufficient to tell you which name is more common in English, but it can give a pretty good indication. The results may also not be the same every time.

If you have Sömé Nãme, I compared it to Some Name. On the Advanced Google Search page, I put "Sömé Nãme" (with quotation marks) in the exact wording field and "Some Name" (again with quotation marks) in the unwanted words field. In the Language field I selected English. For the second test, I just swapped the positions of "Sömé Nãme" and "Some Name", i.e. "Sömé Nãme" becomes the unwanted one. After each search, I took Google's estimate of the number of pages that contain one name but not the other.
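The comparison described above can be sketched in a few lines. The formula used here (pages with only the original spelling divided by the sum of both counts) is my assumption about how such a percentage can be derived from the two estimates, the function names are made up for this example, and the hit counts are invented numbers, not real search results. The query form with quotation marks and a minus sign is assumed to be the plain search-box equivalent of the two Advanced Search fields.

```python
def exclusive_query(wanted: str, unwanted: str) -> str:
    """Search string for pages with the exact phrase `wanted`
    that do not contain the exact phrase `unwanted`."""
    return '"' + wanted + '" -"' + unwanted + '"'


def diacritics_share(original_only_hits: int, stripped_only_hits: int) -> float:
    """Percentage of pages using the original spelling, counting only
    pages that use exactly one of the two spellings."""
    total = original_only_hits + stripped_only_hits
    if total == 0:
        raise ValueError("no hits for either spelling")
    return 100.0 * original_only_hits / total


# First test: original wanted, stripped unwanted.
print(exclusive_query("Sömé Nãme", "Some Name"))
# Second test: the two names swap positions.
print(exclusive_query("Some Name", "Sömé Nãme"))

# Hypothetical estimates read off the two result pages.
print(diacritics_share(original_only_hits=290, stripped_only_hits=710))
```

Because the search engine only reports approximations, the resulting percentage should be read as an indication, not as a measurement.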

Or, for fun, you may try Googlefight via this link!
