Romanization of Arabic

From Wikipedia, the free encyclopedia
Arabic alphabet
ا    ب    ت    ث    ج    ح
خ    د    ذ    ر    ز    س
ش    ص    ض    ط    ظ    ع
غ    ف    ق    ك    ل
م    ن    ه    و    ي

Different approaches and methods for the romanization of Arabic exist. They vary in the way that they address the inherent problems of rendering written and spoken Arabic in the Latin alphabet; they also use different symbols for Arabic phonemes that do not exist in English or other European languages.

Method

Romanization is often termed "transliteration", but this is not technically correct. Transliteration is the direct representation of foreign letters using Latin symbols, while most systems for romanizing Arabic are actually transcription systems, which represent the sound of the language. For example, the rendering munāẓarat al-ḥurūf al-ʿarabiyyah of the Arabic مناظرة الحروف العربية is a transcription, indicating the pronunciation; an example transliteration would be mnaẓrḧ alḥrwf alʿrbyḧ.
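
The letter-by-letter nature of a transliteration can be sketched in a few lines of Python. The table below is a hypothetical, partial mapping inferred only from the example transliteration in this section; it is not a complete implementation of any published standard. Each written letter is converted in isolation and no vowels are supplied.

```python
# Partial letter table inferred from the example in this section.
# Illustrative assumption only, not a full romanization standard.
LETTERS = {
    "\u0645": "m",       # mim
    "\u0646": "n",       # nun
    "\u0627": "a",       # alif
    "\u0638": "\u1E93",  # za' -> z with dot below
    "\u0631": "r",       # ra'
    "\u0629": "\u1E27",  # ta' marbutah -> h with diaeresis
    "\u0644": "l",       # lam
    "\u062D": "\u1E25",  # ha' -> h with dot below
    "\u0648": "w",       # waw
    "\u0641": "f",       # fa'
    "\u0639": "\u02BF",  # 'ayn -> modifier letter left half ring
    "\u0628": "b",       # ba'
    "\u064A": "y",       # ya'
}


def transliterate(text: str) -> str:
    """Map each Arabic letter to one Latin symbol; pass other characters through."""
    return "".join(LETTERS.get(ch, ch) for ch in text)


# The three-word phrase used as the example in this section.
phrase = (
    "\u0645\u0646\u0627\u0638\u0631\u0629 "
    "\u0627\u0644\u062D\u0631\u0648\u0641 "
    "\u0627\u0644\u0639\u0631\u0628\u064A\u0629"
)
print(transliterate(phrase))
```

Producing the transcription instead would require knowledge the written letters do not carry (the short vowels and the doubled consonant), which is why a simple lookup table like this one cannot do it.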

Romanization standards and systems

This list is sorted chronologically. Bold face indicates column headlines as they appear in the table below.

  • IPA: International Phonetic Alphabet (1886)
  • Deutsche Morgenländische Gesellschaft (1936): Adopted by the International Convention of Orientalist Scholars in Rome. It is the basis for the very influential Hans Wehr dictionary (ISBN 0-87950-003-4). [1]
  • BS 4280 (1968): Developed by the British Standards Institution. [2]
  • SATTS: One-to-one mapping to Latin Morse equivalents.
  • UNGEGN (1972): United Nations Group of Experts on Geographical Names, or Variant A of the Amended Beirut System [3]
  • IGN System 1973 or Variant B of the Amended Beirut System, which conforms to French orthography and is preferred to Variant A in French-speaking countries, as in the Maghreb and Lebanon [4]
  • DIN 31635 (1982): Developed by the Deutsches Institut für Normung (German Institute for Standardization).
  • ISO 233 (1984).
  • Qalam (1985): A system that focuses upon preserving the spelling, rather than the pronunciation, and uses mixed case. [5]
  • ArabTeX (since 1992): Its "native" input notation is 7-bit ASCII and "has been modelled closely after the transliteration standards ISO/R 233 and DIN 31635".
  • ISO 233-2 (1993). Simplified transliteration.
  • Buckwalter Transliteration (1990s): Developed at Xerox by Tim Buckwalter [6]; doesn't require unusual diacritics. [7]
  • Bikdash Transliteration (BATR): A system [8] which is a compromise between the Qalam and Buckwalter transliterations. It represents consonants with one letter and possibly the single quotation mark as a modifier, and uses one or several Latin vowels to represent short and long Arabic vowels. It strives for minimality as well as phonetic expressiveness. It does not distinguish between the different shapes of the hamza, since it assumes that a software implementation can resolve the differences through the standard spelling rules of Arabic [9].
  • ALA-LC (1997). [10]
  • SAS: Spanish Arabists School (José Antonio Conde and others, early 19th century onwards). [11]
  • Arabic chat alphabet: Not a system; listed here merely for completeness. In some situations, such as online communication, users need a way to enter Arabic text using only the keys immediately available on a keyboard. As an ad hoc solution, Arabic letters with no obvious Latin counterpart can be replaced with Arabic numerals of similar appearance.

A (non-normative) table comparing romanizations using DIN 31635, ISO 233, ISO/R 233, UN, ALA-LC and Encyclopaedia of Islam systems is available here: [12].

Comparison table

Letter Unicode Name IPA UNGEGN ALA-LC DIN ISO SAS -2 BATR ArabTeX chat 1 Ergorabic
ء2 0621 hamzah ʔ ʼ [note 3] ʾ ˈˌ ʾ ' e ' 2 c
ا 0627 ʾalif ā ʾ ā aa aa / A a a/e/é â
ب 0628 bāʾ b b
ت 062A tāʾ t t
ث 062B ṯāʾ θ th ç c _t s/th ŧ
ج 062C ǧīm d͡ʒ~ɡ~ʒ j ǧ ŷ j j ^g j/g/dj j
ح 062D ḥāʾ ħ H .h 7 ħ
خ 062E ḫāʾ x kh j x K _h kh/7'/5 x
د 062F dāl d d
ذ 0630 ḏāl ð dh đ z' _d z/dh/th đ
ر 0631 rāʾ r r
ز 0632 zayn/zāy z z
س 0633 sīn s s
ش 0634 šīn ʃ sh š x ^s sh/ch ş
ص 0635 ṣād ş S .s s/9 s'
ض 0636 ḍād D .d d/9' d'
ط 0637 ṭāʾ ţ T .t t/6 t'
ظ 0638 ẓāʾ ðˤ~ đ̣ Z .z z/dh/6' z'
ع 0639 ʿayn ʕ ʻ [note 3] ʿ ř E ` 3
غ 063A ġayn ɣ gh ġ g ğ g .g gh/3' gh
ف4 0641 fāʾ f f
ق4 0642 qāf q q 2/g/q q
ك 0643 kāf k k
ل 0644 lām l l
م 0645 mīm m m
ن 0646 nūn n n
ه 0647 hāʾ h h
و 0648 wāw w, w w; ū w; o w; uu w w; o; ou/u/oo w; û
ي5 064A yāʾ j, y y; ī y; e y; ii y y; i/ee; ei/ai y; î
آ 0622 ʾalif maddah ʔaː ā ā, ʼā ʾā ʾâ ā 'aa eaa 'A 2a/aa câ/ã
ة 0629 tāʾ marbūṭah a, at h, t t; — ŧ t' T a/e(h); et/at e
ى5 0649 ʾalif maqṣūrah y á ā à aaa _A a; i/y
ال ʾalif lām (var.) al- ʾal al- al-; ál- Al- al- el âl
  • ^1 The chat table is only a demonstration and is based on the spoken varieties, which differ considerably from Literary Arabic, on which the IPA table and the rest of the transliterations are based.
  • ^2 Review hamzah for its various forms.
  • ^3 The original standard symbols for transliterating hamzah and ʿayn in these schemes are the modifier letter apostrophe (ʼ) and the modifier letter turned comma (ʻ), respectively. However, it is common practice to use the right single quotation mark (’) and the left single quotation mark (‘) instead.
  • ^4 Fāʾ and qāf are traditionally written in Northwestern Africa as ڢ‎ and ڧـ ـڧـ ـٯ‎, respectively; the dot of the latter is added only initially or medially.
  • ^5 In Egypt, Sudan and sometimes in other regions, the standard form for final-yāʾ is only ى (without dots) in handwriting and print, for both final /-iː/ and final /-aː/. ى for the latter pronunciation, is called ألف ليّنة ʾalif layyinah [ˈʔælef læjˈjenæ], "flexible alif".
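
As a rough illustration of the "chat" column, the following Python sketch decodes the digit substitutions listed in the table back into Arabic letters. The function name and the greedy two-character matching are assumptions made for this example. Latin letters are left untouched, and the ambiguity of "2" (hamzah, or qāf in some spoken varieties according to the table) is resolved arbitrarily in favour of hamzah.

```python
# Digit substitutions read off the "chat" column of the comparison table.
# A digit followed by an apostrophe stands for the dotted counterpart
# of the letter written with the bare digit.
CHAT_DIGITS = {
    "3'": "\u063A",  # ghayn
    "6'": "\u0638",  # za'
    "7'": "\u062E",  # kha'
    "9'": "\u0636",  # dad
    "2": "\u0621",   # hamzah (the table also lists 2 for qaf)
    "3": "\u0639",   # 'ayn
    "5": "\u062E",   # kha'
    "6": "\u0637",   # ta' (emphatic)
    "7": "\u062D",   # ha' (pharyngeal)
    "9": "\u0635",   # sad
}


def decode_chat_digits(text: str) -> str:
    """Replace chat-alphabet digits with Arabic letters, longest match first."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in CHAT_DIGITS:
            out.append(CHAT_DIGITS[pair])
            i += 2
        elif text[i] in CHAT_DIGITS:
            out.append(CHAT_DIGITS[text[i]])
            i += 1
        else:
            out.append(text[i])  # Latin letters are not handled in this sketch
            i += 1
    return "".join(out)


print(decode_chat_digits("3arabi"))
```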

Romanization issues

Any romanization system has to make a number of decisions which are dependent on its intended field of application.

Vowels

One basic problem is that written Arabic is normally unvocalized, i.e., many of the vowels are not written out, and must be supplied by a reader familiar with the language. Hence unvocalized Arabic writing does not give a reader unfamiliar with the language sufficient information for accurate pronunciation. As a result, a pure transliteration, e.g. rendering قطر as qṭr, is meaningless to an untrained reader. For this reason, transcriptions are generally used that add vowels, e.g. qaṭar.
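
What "unvocalized" means in practice can be shown with a small Python sketch. It assumes that the optional vowel and related marks are the Unicode characters U+064B to U+0652 (fatḥatān through sukūn); the function name is made up for this example.

```python
def strip_vowel_marks(text: str) -> str:
    """Drop the optional marks U+064B..U+0652, leaving only the base letters."""
    return "".join(ch for ch in text if not ("\u064B" <= ch <= "\u0652"))


# qaf + fathah, ta' (emphatic) + fathah, ra': the vocalized spelling of "Qatar"
vocalized = "\u0642\u064E\u0637\u064E\u0631"
unvocalized = strip_vowel_marks(vocalized)

print(len(vocalized), len(unvocalized))  # 5 3
```

The three remaining letters are all that a pure transliteration has to work with, which is why it yields qṭr rather than qaṭar.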

Transliteration vs. transcription

Most uses of romanization call for transcription rather than transliteration: instead of transliterating each written letter, they try to reproduce the sound of the words according to the orthography rules of the target language: Qatar. This applies equally to scientific and popular applications. A pure transliteration, for example, would need to omit vowels (e.g. qṭr), making the result difficult to interpret except for a subset of trained readers fluent in Arabic. Even if vowels are added, a transliteration system would still need to distinguish between multiple ways of spelling the same sound in the Arabic script, e.g. ʾalif vs. ʾalif maqṣūrah for the sound ā, and the six different ways (ء إ أ آ ؤ ئ) of writing the glottal stop (hamza, usually transcribed ʾ). This sort of detail is unnecessary and confusing except in a very few situations (e.g. typesetting text in the Arabic script).

Most issues related to the romanization of Arabic are about transliterating vs. transcribing – others, about what should be romanized:

  • Transliteration ignores assimilation (sandhi) of the article before the "sun letters", and may be easily misread by non-Arabs. For instance, an-nur (or an-nuur, or an-noor) would be more correctly transliterated along the lines of alnnur. In the transcription an-nur, a hyphen is added and the unpronounced 'l' removed for the convenience of the uninformed non-Arab reader, who would otherwise pronounce an 'l', probably not understand the word to be nur, pronounce only one 'n', and be confused by the role of the double 'n'. Alternatively, if the shadda is not transliterated (since it is strictly not a letter), a hypercorrect transliteration would be alnur, which presents similar problems for the uninformed non-Arab reader.
  • A transliteration must render the "closed tāʾ" (tāʾ marbūṭah, ة) faithfully, whereas a transcription must render the sound (either "a", like any other "a", or "at", like any other "at"; in a vocalized text, either nothing or "t").
    • ISO 233 assigns it a unique symbol.
  • "Short alif" (ʾalif maqṣūrah, ى) must be transliterated with a special symbol, like Iª, ıª, but is transcribed like a standing alif when it stands for a long a (ā).
  • Nunation: what is true elsewhere is also true for nunation: transliteration renders what is seen, transcription what is heard.
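
The treatment of the article before sun letters can be sketched in Python. The sketch works on already-romanized stems, assumes a DIN-style set of sun letters in which every consonant is a single character, and uses made-up function names; qamar ("moon") is added here only as a contrasting stem that does not begin with a sun letter.

```python
# Sun letters in a DIN-style romanization (an assumption of this sketch):
# t, t-line-below, d, d-line-below, r, z, s, s-caron,
# s/d/t/z with dot below, l, n.
SUN_LETTERS = set(
    "t\u1E6Fd\u1E0Frzs\u0161\u1E63\u1E0D\u1E6D\u1E93ln"
)


def transcribe_with_article(stem: str) -> str:
    """Write what is heard: the article's l assimilates to a sun letter."""
    first = stem[0]
    if first in SUN_LETTERS:
        return f"a{first}-{stem}"
    return f"al-{stem}"


def transliterate_with_article(stem: str, write_shadda: bool = True) -> str:
    """Write what is seen: alif, lam, then the noun, optionally doubling for shadda."""
    first = stem[0]
    doubled = first if (write_shadda and first in SUN_LETTERS) else ""
    return f"al{doubled}{stem}"


print(transcribe_with_article("nur"))                         # an-nur
print(transliterate_with_article("nur"))                      # alnnur
print(transliterate_with_article("nur", write_shadda=False))  # alnur
print(transcribe_with_article("qamar"))                       # al-qamar
```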

A transcription may reflect the language as spoken, for example, by the people of Baghdad, or the official standard as spoken by a preacher in the mosque or a TV news reader. A transcription is free to add phonological (such as vowels) or morphological (such as word boundaries) information. Transcriptions will also vary depending on the writing conventions of the target language; compare English Omar Khayyam with German Omar Chajjam, both for عمر خيام (unvocalized ʿmr ḫyʾm, vocalized ʿumar ḫayyām).

A transliteration is ideally fully reversible: a machine must be able to transliterate it into Arabic and back. A transliteration can be considered as flawed for any one of the following reasons:

  • A "loose" transliteration is ambiguous: it renders several Arabic phonemes with an identical transliteration, or it uses digraphs for a single phoneme (such as sh) that may be confused with two adjacent phonemes;
  • Symbols representing phonemes may be considered too similar (e.g., ` and ' or ʿ and ʾ for ayin and hamza);
  • ASCII transliterations using capital letters to disambiguate phonemes are easy to type but may be considered unaesthetic.
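
The reversibility problem caused by digraphs can be made concrete with a short Python sketch. It enumerates every Arabic letter sequence that a romanized string could stand for under a given table; the two tiny tables are hypothetical fragments chosen for this illustration, not excerpts from any standard.

```python
def parses(romanized: str, table: dict) -> list:
    """Return every Arabic string that `romanized` could stand for under `table`."""
    if not romanized:
        return [""]
    results = []
    for latin, arabic in table.items():
        if romanized.startswith(latin):
            for rest in parses(romanized[len(latin):], table):
                results.append(arabic + rest)
    return results


# "Loose" fragment: the digraph sh competes with s followed by h.
LOOSE = {"s": "\u0633", "h": "\u0647", "sh": "\u0634"}
# "Strict" fragment: a single symbol (s with caron) for shin.
STRICT = {"s": "\u0633", "h": "\u0647", "\u0161": "\u0634"}

print(len(parses("sh", LOOSE)))       # 2: sin + ha', or shin
print(len(parses("\u0161", STRICT)))  # 1: only shin
```

A table is reversible in the sense described above only if every romanized string has exactly one parse.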

A fully accurate transcription may not be necessary for native Arabic speakers, as they would be able to pronounce names and sentences correctly anyway, but it can be very useful for those who are familiar with the Roman alphabet but not fully familiar with spoken Arabic. An accurate transliteration serves as a valuable stepping stone for learning, pronouncing correctly, and distinguishing phonemes. It is a useful tool for anyone who is familiar with the sounds of Arabic but not fully conversant in the language.

One criticism is that a fully accurate system requires special training that most readers lack, so names will not be pronounced correctly by non-native speakers anyway, particularly in the absence of a universal romanization system. The precision is lost if special characters are not reproduced or if the reader is not familiar with Arabic pronunciation.

Further difficulties

Several problems can arise when designing a romanization system:

  • Repeated symbols, like h in traditional English-styled transcription:

th=ث ; kh=خ ; dh=ذ ; sh=ش ; gh=غ ; ah=ة , but h is also used for the letter ه. For example, the combination th in the word mitha:l ("an example") is hard to read: it stands for the interdental ث, but some readers may read it as t followed by h. One remedy is to underline the combination to avoid the double reading.

  • Problems with diacritic marks (dots, commas and other marks above and under the letters, as well as macrons for long vowels).

Example: ḲṪABun (an accurate transliteration, meaning "book") becomes kitabun in a simplified one;

  • Clash between the meaning of a symbol in a transcription system and its standard meaning (e.g., h in some variants stands for ħ rather than the h sound of English and German; c sounds like j in jam rather than having its usual value; e sounds like English h; and so on).
  • Clash between the meanings of a combination. E.g., الرياض is written Ar-Riyyadh in the standard literature representation (dh = ض), but dh is also used for ذ: dhikr – ذكر – memory.

  • Poor distinguishability of some symbols, i.e., two or more symbols may be read as the same sound because of similar graphical elements. For example, ' for hamza and ` for ʿayn in some systems.
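
The loss of information that occurs when diacritic marks are dropped can be illustrated in Python. This sketch uses Unicode decomposition (NFD) from the standard `unicodedata` module to separate base letters from combining marks; the function name `simplify` is an assumption of this example, and the sketch does not model any particular simplified romanization.

```python
import unicodedata


def simplify(text: str) -> str:
    """Decompose each character and drop the combining marks (dots, macrons, ...)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(simplify("kit\u0101bun"))  # kitabun (the macron on the long a is lost)
print(simplify("qa\u1E63r"))     # qasr (the dot under the s is lost)

# Two distinct consonants collapse into one symbol after simplification.
print(simplify("\u1E63") == simplify("s"))  # True
```

Once the marks are gone the original cannot be recovered mechanically, which is the trade-off between accurate and simplified romanizations described in the list above.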

Examples

Examples in Literary Arabic:

Arabic | خليفة كان له قصر | إلى المملكة المغربية
Arabic with diacritics (normally omitted) | خَلِيفَة كَانَ لَهُ قَصْر | إِلَى الْمَمْلَكَة الْمَغْرِبِيَّة
IPA | /xaliːfa kaːna lahu qasˤr/ | /ʔila l mamlaka al maɣribijja/
DIN 31635 | Ḫalīfah kāna lahu qaṣr | ʾIlā l-mamlakah al-Maġribiyyah
ALA-LC | Khalīfah kāna lahu qaṣr | Ilá l-mamlakah al-Maghribīyah
UNGEGN | Khalyfah kana lahu qaşr | ʼIly al-mamlakah al-maghribiyyah
BATR | Kaliifat' kaana lahu qaSr | ilaaa almamlakat' almagribiyyat'
ArabTeX | _halyfaT kana lahu qa.sr | il_A almamlakaT alma.gribiyyaT
English | A Caliph had a palace | To the kingdom of Morocco
