Wikipedia:Manual of Style/Arabic

This page proposes a guideline regarding the transliteration from the Arabic alphabet to Roman letters in the English Wikipedia.

The transliteration of Arabic used by Wikipedia is based on the ALA-LC Romanization method, with a few simple changes that make it easier to manage and read. The strict transliteration uses accents, underscores, and underdots, and is used only for etymology at the beginning of an article. All other Arabic words rendered into English use the same standard, but without accents, underscores, and underdots. Some exceptions to this rule may apply.

Definitions

Arabic

For the purposes of this convention, an Arabic word is a name or phrase that was originally, and is most commonly, written in the Arabic alphabet, and that in English is not usually translated into a common English word. Such words may come from any language that uses this script, such as Arabic, Persian, or Ottoman Turkish.

Examples of transliterations from Arabic script:

Examples of titles not transliterated from Arabic script:

Primary transcription

A word has a primary transcription (anglicization) if at least 75% of all references in English use the same transcription, or if a reference shows that the individual self-identified with a particular transcription, and if that transcription does not contain any non-printable characters (including underscores). Some primary transcriptions are not transliterations because they may be ambiguous as to the original spelling.

Examples of references include the FBI, the NY Times, CNN, the Washington Post, Al-Jazeera, Encarta, Britannica, Library of Congress, and other academic sources. Examples of self-identification include a driver's license or passport in which the individual personally chose a particular form of transcription.

Google searches can be useful in determining the most common usage, but should not be heavily relied upon. The content of large searches may not be relevant to the subject being discussed. For example, the ISO transliteration of القائم is "al-Qaʾim", but the transcription "al-Qaim" receives five times as many hits. This word is used in the names of three historical Caliphs and a town in Iraq, and is also another name for the Mahdi in Shi'a Islam. Since Google searches do not discriminate between them, other sources must be used to determine if a primary transcription exists for any particular usage. Google search counts are also biased toward syndicated news articles; a single syndicated reference may generate hundreds or thousands of hits, amplifying the weight of whatever spelling happens to be used by that one reference.

If there is no primary transcription, a standard transliteration is used (see below).

Examples:

  • There is no single most-popular transcription for the name of the prophet of Islam. "Mohammed", "Mohammad", "Muhammad", and "Mohamed" are all commonly used. The standard transliteration of Muhammad is used.
  • The capital of Egypt is most widely known as Cairo. The standard transliteration of "al-Qahira" is not used.
  • The primary transcription of the name of the leader of al-Qaeda (itself a primary transcription of the standard form al-Qa`ida) is "Osama bin Laden". The standard transliteration of Usama ibn Ladin is not used.
Note: the Arabic word بن/ابن (English: son of) should be transcribed ibn unless a primary transcription requires the colloquial bin.
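
The decision described in this section (75% agreement among references, or self-identification, otherwise fall back to the standard transliteration) can be sketched in Python. This is only an illustration: the function names are hypothetical, the reading of "no non-printable characters" as "printable ASCII without underscores" is an assumption, and reference counts would in practice come from editors' own surveys of sources.

```python
from collections import Counter
from typing import Iterable, Optional

# Share of English references that must agree (the 75% rule above).
PRIMARY_THRESHOLD = 0.75


def is_plain(spelling: str) -> bool:
    """Assumed reading of the 'no non-printable characters' condition:
    printable ASCII only, and no underscores."""
    return spelling.isascii() and spelling.isprintable() and "_" not in spelling


def primary_transcription(
    reference_spellings: Iterable[str],
    self_identified: Optional[str] = None,
) -> Optional[str]:
    """Return the primary transcription, or None if there is none."""
    # A documented self-identification qualifies on its own.
    if self_identified is not None and is_plain(self_identified):
        return self_identified
    counts = Counter(reference_spellings)
    if not counts:
        return None
    spelling, uses = counts.most_common(1)[0]
    if uses / sum(counts.values()) >= PRIMARY_THRESHOLD and is_plain(spelling):
        return spelling
    return None


def choose_rendering(reference_spellings, standard, self_identified=None):
    """Fall back to the standard transliteration when nothing is primary."""
    return primary_transcription(reference_spellings, self_identified) or standard
```

With made-up counts, nine references using "Cairo" against one using "al-Qahira" would select "Cairo", while four evenly split spellings of the prophet's name would select the standard transliteration passed in as the fallback.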

Standard transliteration

The standard transliteration uses a systematic convention for rendering Arabic script. The standard transliteration from Arabic to Roman letters is found below.

The standard transliteration does not carry enough information to accurately write or pronounce the original Arabic script. For example, it does not differentiate between certain pairs of distinct letters (س vs. ص), or between long and short vowels. It does, however, increase the readability of the article to those not familiar with Arabic transliteration, and avoids characters that may be unreadable to browsers.

Strict transliteration

A strict transliteration is completely reversible, allowing the original writing to be faithfully restored. A strict transliteration need not be a 1:1 mapping of characters as long as there are clear rules for choosing one character over another. A source character may be mapped (1:n) into a sequence of several target characters without losing sequential reversibility.

A strict transliteration uses a system of accents, underscores, and underdots to render the original Arabic in a form that preserves all the information in the original Arabic.
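
As a rough illustration of how a 1:n mapping can stay reversible, the sketch below uses a small subset of the consonant table given later on this page, together with the prime (′) that the table uses to separate sequences such as t′h from the digraph th. It is a toy under stated assumptions: the function names are hypothetical, and vowels and every letter outside the subset are left out.

```python
# Toy subset of the consonant table; vowels and all other letters are omitted.
TO_LATIN = {
    "\u0628": "b",   # ba'
    "\u062a": "t",   # ta'
    "\u062b": "th",  # tha'
    "\u062e": "kh",  # kha'
    "\u062f": "d",   # dal
    "\u0630": "dh",  # dhal
    "\u0633": "s",   # sin
    "\u0634": "sh",  # shin
    "\u0643": "k",   # kaf
    "\u0647": "h",   # ha'
}
TO_ARABIC = {latin: arabic for arabic, latin in TO_LATIN.items()}
PRIME = "\u2032"  # separator used in sequences such as t′h


def to_latin(arabic: str) -> str:
    pieces = []
    for letter in arabic:
        latin = TO_LATIN[letter]
        # Without the prime, t + h would be misread as the digraph th.
        if latin == "h" and pieces and pieces[-1] + "h" in TO_ARABIC:
            pieces.append(PRIME)
        pieces.append(latin)
    return "".join(pieces)


def to_arabic(latin: str) -> str:
    letters = []
    i = 0
    while i < len(latin):
        pair = latin[i:i + 2]
        if latin[i] == PRIME:
            i += 1  # the prime only marks a letter boundary
        elif len(pair) == 2 and pair in TO_ARABIC:
            letters.append(TO_ARABIC[pair])  # digraph such as th or sh
            i += 2
        else:
            letters.append(TO_ARABIC[latin[i]])
            i += 1
    return "".join(letters)
```

For any string built from the letters in the subset, `to_arabic(to_latin(text))` returns the original text, which is the reversibility property described above.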

Other common transliteration standards are ISO 233 and DIN 31635.

Note that several letters proposed in the strict transliteration system below do not render correctly for some widespread software configurations (e.g. ḥ, ṣ, ḍ, ṭ, ṛ, ẓ and ṁ). Using the {{transl}} template to enclose transliterations will use CSS classes to address these issues.

Examples

Arabic Primary transcr. Standard transcr. Strict translit.
القاهرة Cairo al-Qahira al-Qāhirah
السلف الصالح Salaf as-Salaf as-Salih as-Salaf aṣ-Ṣāliḥ
قرآن n/a Qur'an Qur’ān
صدام حسين Saddam Hussein Saddam Husayn Ṣaddām Ḥusayn
العبّاسيّون Abbasid al-`Abbasiyun al-‘Abbāsīyūn
كربلاء Karbala Karbala' Karbalā’
محمد n/a Muhammad Muḥammad
القاعدة al-Qaeda al-Qa`ida al-Qā‘idah

Proposed standard

Article titles

See: Wikipedia:Naming conventions (Arabic)

Lead paragraphs

All Arabic articles should have a lead paragraph which includes the article title, along with the original Arabic script and the strict transliteration in parentheses, preferably in the lead sentence.

This is in accordance with the official Wikipedia policy at Wikipedia:Naming conventions (use English). Many articles that are missing this information are listed at Category:Articles needing Arabic script or text.

The standard format is shown in the following examples; as Template:Transl requires, the transliteration system is indicated:

  • Cairo (Arabic: القاهرة‎ / ALA-LC: al-Qāhirah) is ...
  • Gamal Abdel Nasser (Arabic: جمال عبد الناصر‎ / ALA-LC: Jamāl ‘Abd an-Nāṣir; January 15, 1918 – September 28, 1970) was the second President of Egypt ...

Some cases will require variations on this format. If the name is extremely long, the strict transliteration may instead be given at the first appearance of the name. Likewise, if a separate strict transliteration would be overly repetitious, it should take the place of the page title in the lead paragraph.

Example:

  • Abū al-‘Abbās ‘Abd Allāh ibn Muḥammad as-Saffāḥ (Arabic: أبو العباس عبد الله بن محمد السفاح‎) (721–754) was the first Abbasid caliph. Abu al-`Abbas was the head of...
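
The standard lead format shown in this section can be assembled mechanically. The helper below is a hypothetical sketch, not part of any Wikipedia tooling; it produces the plain-text form of the opening only and does not generate template markup.

```python
from typing import Optional


def lead_opening(
    title: str,
    arabic: str,
    strict: str,
    system: str = "ALA-LC",
    dates: Optional[str] = None,
) -> str:
    """Build 'Title (Arabic: ... / SYSTEM: ...)' as in the examples above."""
    parenthetical = f"Arabic: {arabic} / {system}: {strict}"
    if dates:
        # Life dates, when given, follow the transliteration after a semicolon.
        parenthetical += f"; {dates}"
    return f"{title} ({parenthetical})"
```

Calling it with the title, the Arabic script, and the strict transliteration yields the opening of the lead sentence, to which the rest of the sentence ("is ...", "was ...") is appended by the editor.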

Redirects

All common transliterations should redirect to the article. There will often be many redirects, but this is intentional and does not represent a problem.

Alphabetization

  • Alphabetize by family name in modern cases where there is one, otherwise by the first component in the commonly used name
  • For alphabetization, the definite article "al-" and its variants (ash-, ad-, etc.) should not be ignored.
    • Example: Al-Qaeda should be alphabetized as "Al-Qaeda".
  • For alphabetization, the family name designators ibn (or, colloquially, bin) and bint should be ignored, unless the primary transcription makes the designator a part of the name (as in the Saudi Binladin Group).
  • For alphabetization, the apostrophe (representing hamza and ‘ayn) should be ignored, and letters with diacritics should be alphabetized as if they did not have their diacritics.
    • Example: Ibn Sa'ūd should be alphabetized as "Saud".
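
The mechanical parts of these rules (keep the definite article, drop ibn/bin/bint, ignore apostrophes and diacritics) can be sketched as a sort key. The function name is hypothetical, and the first rule, choosing between family name and first component, is a judgment this sketch does not attempt.

```python
import unicodedata

# Designators ignored when they stand as separate words.
IGNORED_DESIGNATORS = {"ibn", "bin", "bint"}
# Characters used on this page for hamza and 'ayn, in both plain and curly forms.
APOSTROPHES = set("'`\u2018\u2019\u02be\u02bf")


def sort_key(name: str) -> str:
    # Decompose so that accents and underdots become separate combining marks.
    decomposed = unicodedata.normalize("NFD", name)
    plain = "".join(
        ch for ch in decomposed
        if not unicodedata.combining(ch) and ch not in APOSTROPHES
    )
    # "Binladin" written as one word is not a separate designator, so it stays.
    words = [w for w in plain.split() if w.lower() not in IGNORED_DESIGNATORS]
    return " ".join(words)
```

A list of names can then be ordered with `sorted(names, key=lambda n: sort_key(n).casefold())`.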

Transliteration

The strict transliteration is based on the ALA-LC Romanization method (1997), and standards from the United Nations Group of Experts on Geographical Names. The standard transliteration is the same, without accents, underscores and underdots.
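
Because the standard transliteration is the strict one minus accents, underscores, and underdots, the conversion can be approximated by stripping Unicode combining marks. The sketch below is a hedged illustration with a hypothetical function name; mapping the curly quotes used for ‘ayn and hamza to ` and ' is an assumption drawn from the examples table on this page. That table also drops the final h of tā’ marbūṭah in some standard forms (al-Qāhirah against al-Qahira), which this sketch does not attempt.

```python
import unicodedata

# Assumed mapping from the strict marks for 'ayn and hamza to the plain
# characters used in the standard forms on this page.
QUOTE_MAP = {
    "\u2018": "`",  # left single quote, used for 'ayn
    "\u02bf": "`",  # modifier letter left half ring
    "\u2019": "'",  # right single quote, used for hamza
    "\u02be": "'",  # modifier letter right half ring
    "\u02bc": "'",  # modifier letter apostrophe
}


def strict_to_standard(strict: str) -> str:
    # Decompose letters such as a-macron into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", strict)
    # Drop every combining mark (macrons, underdots, underscores, and so on).
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    recomposed = unicodedata.normalize("NFC", base)
    return "".join(QUOTE_MAP.get(ch, ch) for ch in recomposed)
```

The conversion only works in one direction: as noted earlier on this page, the standard form does not carry enough information to recover the strict one.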

Consonants

Arabic Name Standard translit. Strict translit. Notes
ب bā’ b b
ت tā’ t t
ث thā’ th th the sequence ته is written t′h
ج jīm j j pronounced [g] in Egyptian Arabic
ح ḥā’ h ḥ
خ khā’ kh kh the sequence كه is written k′h
د dāl d d
ذ dhāl dh dh the sequence ده is written d′h
ر rā’ r r
ز zāy z z
س sīn s s
ش shīn sh sh the sequence سه is written s′h
ص ṣād s ṣ
ض ḍād d ḍ
ط ṭā’ t ṭ
ظ ẓā’ z ẓ
ع ‘ayn ` ‘ different from hamza
غ ghayn gh gh
ف fā’ f f
ق qāf q q sometimes transliterated as "g"
ك kāf k k
ل lām l l
م mīm m m
ن nūn n n
ه hā’ h h
ء hamzah ' ’ omitted in initial position[1]
ة tā’ marbūṭah ah or at or atan ah or at or atan usually as ah, but sometimes as at or atan.[2]
و wāw w w See also long vowels
ي yā’ y y See also long vowels
ِيّ (yā’) iy or i īy or ī romanized īy except in final position[3]
آ ’alif maddah a, 'a ā, ’ā Initially ā, medially ’ā
  1. ^ "In initial position, whether at the beginning of a word, following a prefixed preposition or conjunction, or following the definite article, hamza is not represented in romanization. When medial or final, hamza is romanized." [4]
  2. ^ (Same pdf as note 1) "When the word ending in ة is in the construct state, ة is romanized t. [...] When the word ending in ة is used adverbially, ة (vocalized ةً) is romanized tan."
  3. ^ (Same pdf as note 1) "Final ِىّ is romanized ī."

Short vowels

Short vowels Name Translit. (standard and strict)
064E َ fat′ḥa a
064F ُ ḍamma u
0650 ِ kasra i

Long vowels

Long vowels Name Standard Trans. Strict Trans.
064E 0627 َا fatḥa ʼalif a ā
064E 0649 َى fatḥa ʼalif maqṣūra (Arabic) a á
064E 06CC َی fatḥa yeh (Farsi, Urdu) ā / aỳ
064F 0648 ُو ḍamma wāw u ū
0650 064A ِي kasra yāʼ i ī

Definite article

Solar letters Standard translit. Strict translit.
ت t t
ث th th
د d d
ذ dh dh
ر r r
ز z z
س s s
ش sh sh
ص s ṣ
ض d ḍ
ط t ṭ
ظ z ẓ
ن n n

Arabic has only one definite article, "ال" ("al-"). However, if it is followed by a solar letter (listed in the table above), the "l" is assimilated in pronunciation to this solar letter, and the solar letter is doubled.

  • Examples: تقي الدين (Taqi al-Din) is pronounced and transliterated as "Taqi ad-Din"

Both the non-assimilated ("al-") and the assimilated ("ad-") forms appear in various standards of transliteration, and both allow the recreation of the original Arabic. For this manual of style, the assimilated forms will be used, as they aid readers in the correct pronunciation.

The definite article "al-" and its variants (ash-, ad-, ar-, etc.) are always written in lower case (unless beginning a sentence), and a hyphen separates the article from the following word.

  • Examples: "al-Qaeda"
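
The assimilation rule can be sketched on romanized text as follows. This is a minimal illustration with hypothetical function names. It assumes the input already uses "al-" with a hyphen, and that an initial th, dh, or sh is a digraph rather than two separate letters. Whether a given name should be changed at all still depends on whether it has a primary transcription.

```python
import unicodedata

# Romanized solar letters from the table above.
SOLAR_DIGRAPHS = ("th", "dh", "sh")
SOLAR_SINGLE = set("tdrzsn") | {"\u1e63", "\u1e0d", "\u1e6d", "\u1e93"}  # s, d, t, z with underdot


def assimilate_word(word: str) -> str:
    word = unicodedata.normalize("NFC", word)
    if not word.startswith(("al-", "Al-")):
        return word
    rest = word[3:]
    start = rest[:2].lower()
    if start in SOLAR_DIGRAPHS:
        consonant = start
    elif start[:1] in SOLAR_SINGLE:
        consonant = start[:1]
    else:
        return word  # lunar letter: the article stays "al-"
    # Keep the article's original capitalization, replace its "l".
    return word[0] + consonant + "-" + rest


def assimilate(text: str) -> str:
    return " ".join(assimilate_word(w) for w in text.split(" "))
```

Applied to "Taqi al-Din", the sketch gives "Taqi ad-Din"; a word such as "al-Qaeda", whose first letter is not solar, is returned unchanged.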

Dynastic "Al "

Some Arabic names, especially in Saudi Arabia for the House of Saud dynasty, start one of their names with آل, which seems to be an altered form of أهل. This means something like "family" or "dynasty", and is distinct from the definite article ال. If a reliably-sourced version of the Arabic spelling includes آل and the person is clearly a member of a dynasty, then this is not a case of the definite article, so "Al " (capitalised and followed by a space, not a hyphen) should be used. "Ahl " should be used if the Arabic spelling is أهل. Dynasty membership alone does not necessarily imply that the dynastic آل is used - e.g. Bashar al-Assad.

Arabic meaning transcription example
ال the al- Suliman al-Reshoudi
آل family/dynasty Al Bandar bin Abdulaziz Al Saud
أهل family/dynasty Ahl Ahl al-Bayt

Capitalization

Rules for the capitalization of English should be followed, except for the definite article, as explained above.

Names

Main article: Arabic name

The standard transliteration of Arabic names comprises a variation on the following structure:

  • the given name (ism)
  • multiple patronymics (nasab), as appropriate, each preceded by the particle ibn (son) or bint (daughter).
Note: the Arabic particle بن (English: son of) should be transcribed ibn unless a primary transcription requires the colloquial form bin (e.g. Osama bin Laden)

Examples

  • Example: "Bandar ibn Sultan as-Sa`ud"
  • Counter-example: "Bandar ibn Sultan", "Bandar as-Saud", or "Bandar bin Sultan bin `Abd al-Aziz as-Sa`ud".
  • Example: "Turki ibn Faisal as-Sa`ud"
  • Counter-example: "Turki al-Faisal".
  • Example: "Saddam Hussein at-Tikriti"
  • Counter-example: "Saddam bin Hussein at-Tikriti" (bin is not typically used in Iraq)
  • Example: "Waleed ash-Shehri"
  • Counter-example: "Waleed ibn Ahmed ash-Shehri" (he was not known to use his father's name)

If the word Abū is preceded by ibn, the correct grammatical format is ibn Abī, and not ibn Abū.[5]

  • Example: "`Ali ibn Abi Talib"
  • Counter-example: "`Ali ibn Abu Talib"
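
A check for this rule is easy to automate. The following sketch, with a hypothetical function name, rewrites "ibn Abu" as "ibn Abi" (and "ibn Abū" as "ibn Abī") and leaves every other use of Abu untouched.

```python
import re
import unicodedata

# Form of Abu required after ibn, in standard and strict spelling.
_AFTER_IBN = {"Abu": "Abi", "Ab\u016b": "Ab\u012b"}
_PATTERN = re.compile(r"\b(ibn\s+)(" + "|".join(_AFTER_IBN) + r")\b")


def fix_ibn_abu(name: str) -> str:
    # Normalize first so that a decomposed u + macron still matches.
    name = unicodedata.normalize("NFC", name)
    return _PATTERN.sub(lambda m: m.group(1) + _AFTER_IBN[m.group(2)], name)
```

Names in which Abu is not preceded by ibn, such as "Abu Bakr", pass through unchanged.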

Persian

When the Arabic script was adopted for the Persian language, some sounds of Persian had no letter in the Arabic alphabet, and vice versa. The Persian alphabet therefore adds letters to the Arabic alphabet, and changes the pronunciation of some Arabic letters whose original sounds are not pronounced in Persian. In addition, Persian does not use a definite article ("al-"). All vowels, long or short, are transliterated the same as in Arabic.

Urdu

Urdu adds further letters, and some existing letters are transliterated differently. The strict transliteration is based on the ALA-LC Romanization method for Urdu (2012). The standard transliteration is the same, without accents, underscores and underdots.

Consonants

Urdu Standard translit. Strict translit. Notes
ب b b
پ p p
ت t t
ٹ t
ث s "s", combining macron below: s̱
ج j j
ch c
ح h
خ kh k͟h "k", combining double macron below, "h": k͟h
د d d
ڈ d
ذ z
ر r r
ڑ r
ز z z
zh zh
س s s
ش sh sh
ص s
ض z
ط t "t", combining diaeresis below: t̤
ظ z "z", combining diaeresis below: z̤
ع ` different from hamza
غ gh g͟h "g", combining double macron below, "h": g͟h
ف f f
ق q q
ک k k
g g
ل l l
م m m
ن n n
ں n "n", combining macron below: ṉ
و w or v w or v
ه h h
ة t t
ء ' omitted in initial position
y y

Aspirates

Urdu Standard translit. Strict translit.
ﺑﻬ bh bh
ﭘﻬ ph ph
ﺗﻬ th th
ﭨﻬ th ṭh
ﺟﻬ jh jh
ﭼﻬ chh chh
دﻫ dh dh
ڈﻫ dh ḍh
ڑﻫ rh ṛh
ﻛﻬ kh kh
ﮔﻬ gh gh

Vowels

Vowels Standard Trans. Strict Trans.
َ a a
ِ i i
ُ u u
ـَا a ā
ـَى ـَیٰ a á
ـِی ـِي ـِيـ i ī
ـُو u ū
ـو o o
ـی ـے ـيـ e e
ـَوْ au au
ـَیْ ـَيْـ ai ai

Ottoman Turkish

The Ottoman Turkish language differs from the above languages in that, since 1928, words that were once written with a Persian-influenced version of the Arabic abjad have been written using the Latin alphabet. As such, there is a long-established set of standards for writing the language in a standard transliteration; in a strict transliteration, however, the language adheres closely to the standards for strict transliteration described above.

Guidelines for writing Ottoman Turkish words according to the standard transliteration can be found on the website of the Turkish Language Association (Türk Dil Kurumu), which provides one guide for the majority of words and another for names of people.

In the following table, only those letters which differ in either their strict or their standard transliteration from the Arabic-oriented table above are shown; all others are transliterated according to that table.

Script Standard translit. Strict translit. IPA Notes
ا a, â, e ā, e [ɑ:], [e] This represents a, â, or e in initial position, and â in medial or final position.
آ a, â ā [ɑ:] This is only written in initial position.
s s [s]
ج c, ç c [dʒ], [tʃ] When choosing between c and ç in the standard transliteration, modern Turkish orthography should be followed.
ç ç [tʃ]
خ h [h]
ذ z z [z]
j j [ʒ]
ش ş ş [ʃ]
ض z, d ż, [z], [d] When choosing between ż and in the strict transliteration, and z and d in the standard transliteration, modern Turkish orthography should be followed.
ع a, 'a, ', â `a, `ā, [ɑ], [ɑ:], ø
غ g, ğ ġ [ɣ], [g], [k], [h] When choosing between g and ğ in the standard transliteration, modern Turkish orthography should be followed.
ق k [k]
ك k, g, ğ, n k, g, ñ [k], [n], [ɲ], [ŋ] When choosing between k, g, ğ, and n in the standard transliteration, modern Turkish orthography should be followed.
g, ğ g [g], [k] When choosing between g and ğ in the standard transliteration, modern Turkish orthography should be followed.
n ñ [n], [ɲ], [ŋ]
ه h, e, a, i h, e, a, i [h], [ɑ], [e], [i] When choosing between e and a in the transliteration, the Turkish rules of vowel harmony should be followed. This is only transliterated as h at the end of a word in proper nouns.
ء ', ø ø
و v, o, ö, u, ü v, o, ō, ö, u, ū, ü [v], [o], [o:], [œ], [u], [u:], [y] When making the transliteration, modern Turkish orthography should be followed.
ي y, i, ı, a y, i, ī, ı, ā [j], [i], [i:], [ɯ], [ej], [ɑ:] When making the transliteration, modern Turkish orthography should be followed.
la, lâ [lɑ:]
ة et et [et]

Definite article

In words that use the Arabic definite article ال, the article always follows the assimilation of solar letters. However, the vowel ا can be transliterated in a number of ways.

  1. For a definite article in initial position, the definite article is written as el- in both the standard and the strict transliterations; e.g. الوهاب el-Vehhāb, الرمضان er-Ramażān.
  2. For a definite article in medial position, such as is found in many names of Arabic origin, the vowel in the strict transliteration can be written in a variety of ways; e.g. u’l, ü’l, i’l, ’l, etc. In such cases, the diacritic representing the hamza or `ayin is always used, and the choice of vowel should follow modern Turkish orthography; e.g. عبد الله `Abdu’llah, عبد العزيز `Abdü’l-`Azīz, بالخاصه bi’l-ḫaṣṣa.
  3. For a definite article in medial position in the standard transliteration, this diacritic is not used, and the choice of vowel and spelling should follow modern Turkish orthography; e.g. عبد الله Abdullah, عبد العزيز Abdülâziz, بالخاصه bilhassa.

External links