Arabic phonology

From Wikipedia, the free encyclopedia

While many languages have numerous dialects that differ in phonology, contemporary spoken Arabic is more properly described as a continuum of varieties.[1] This article deals primarily with Modern Standard Arabic (MSA), which is the standard variety shared by educated speakers throughout Arabic-speaking regions. MSA is used in writing in formal print media and orally in newscasts, speeches and formal declarations of numerous types.[2]

Modern Standard Arabic has 28 consonant phonemes and 6 vowel phonemes. The consonant system contrasts "emphatic" (pharyngealized) consonants with non-emphatic ones. Some of these phonemes have coalesced in the various modern dialects, while new phonemes have been introduced through borrowing or phonemic splits. A "phonemic quality of length" applies to consonants as well as vowels.[3]


Of the 29 Proto-Semitic consonants, only one has been lost: */ʃ/, which merged with /s/, while /ɬ/ became /ʃ/ (see Semitic languages).[4] Various other consonants have changed their sound too, but have remained distinct. An original */p/ lenited to /f/, and */ɡ/ – consistently attested in pre-Islamic Greek transcription of Arabic languages[5] – became palatalized to /ɡʲ/ or /ɟ/ by the time of the Quran and /d͡ʒ/, /ɡ/, /ʒ/ or /ɟ/ after early Muslim conquests and in MSA (see Arabic phonology#Local variations for more detail).[6] An original voiceless alveolar lateral fricative */ɬ/ became /ʃ/.[7]

Its emphatic counterpart /ɬˠ~ɮˤ/ was considered by Arabs to be the most unusual sound in Arabic (hence Classical Arabic's appellation لُغَةُ ٱلضَّادِ luɣatu‿ḍ-ḍād or "language of the ḍād"). In most modern dialects, it has become an emphatic stop /dˤ/ with loss of laterality[7] or, with complete loss of any pharyngealization or velarization, /d/. The classical pharyngealized pronunciation of ḍād, /ɮˤ/, still occurs in the Mehri language, and the similar sound without velarization, /ɮ/, exists in other Modern South Arabian languages.

The first known book printed in Arabic: Kitābu ṣalāti s-sawā'ī (كتاب صلاة السواعي), a book of hours printed with movable type in 1514.[8]

Other changes may also have happened. Classical Arabic pronunciation is not thoroughly recorded and different reconstructions of the sound system of Proto-Semitic propose different phonetic values. One example is the emphatic consonants, which are pharyngealized in modern pronunciations but may have been velarized in the eighth century and glottalized in Proto-Semitic.[7]

Reduction of /j/ and /w/ between vowels occurs in a number of circumstances and is responsible for much of the complexity of third-weak ("defective") verbs. Early Akkadian transcriptions of Arabic names show that this reduction had not yet occurred as of the early part of the 1st millennium BC.[citation needed]

The Classical Arabic language as recorded was a poetic koine that reflected a consciously archaizing dialect, chosen based on the tribes of the western part of the Arabian Peninsula, who spoke the most conservative variants of Arabic. Even at the time of Muhammad and before, other dialects existed with many more changes, including the loss of most glottal stops, the loss of case endings, the reduction of the diphthongs /aj/ and /aw/ into monophthongs /eː, oː/, etc. Most of these changes are present in most or all modern varieties of Arabic.[citation needed]

An interesting feature of the writing system of the Quran (and hence of Classical Arabic) is that it contains certain features of Muhammad's native dialect of Mecca, corrected through diacritics into the forms of standard Classical Arabic. Among these features visible under the corrections are the loss of the glottal stop and a differing development of the reduction of certain final sequences containing /j/: Evidently, the final /-awa/ became /aː/ as in the Classical language, but final /-aja/ became a different sound, possibly /eː/ (rather than again /aː/ in the Classical language). This is the apparent source of the alif maqṣūrah 'restricted alif' where a final /-aja/ is reconstructed: a letter that would normally indicate /j/ or some similar high-vowel sound, but is taken in this context to be a logical variant of alif and represent the sound /aː/.[citation needed]

Literary Arabic

Recording of a poem by Al-Ma'arri titled "I no longer steal from nature"

The "colloquial" spoken dialects of Arabic are learned at home and constitute the native languages of Arabic speakers. "Formal" Modern Standard Arabic is learned at school; although many speakers have a native-like command of the language, it is technically not the native language of any speakers. Both varieties can be both written and spoken, although the colloquial varieties are rarely written down and the formal variety is spoken mostly in formal circumstances, e.g., in radio and TV broadcasts, formal lectures, parliamentary discussions and to some extent between speakers of different colloquial dialects.

Even when the literary language is spoken, it is normally only spoken in its pure form when reading a prepared text out loud and communicating between speakers of different colloquial dialects. When speaking extemporaneously (i.e. making up the language on the spot, as in a normal discussion among people), speakers tend to deviate somewhat from the strict literary language in the direction of the colloquial varieties. There is a continuous range of "in-between" spoken varieties: from nearly pure Modern Standard Arabic (MSA), to a form that still uses MSA grammar and vocabulary but with colloquial influence, to a form of the colloquial language that imports a number of words and grammatical constructions from MSA, to a form that is close to pure colloquial but with the "rough edges" (the most noticeably "vulgar" or non-Classical aspects) smoothed out, to pure colloquial.

The particular variant (or register) used depends on the social class and education level of the speakers involved and the level of formality of the speech situation. Often it will vary within a single encounter, e.g., moving from nearly pure MSA to a more mixed language in the process of a radio interview, as the interviewee becomes more comfortable with the interviewer. This type of variation is characteristic of the diglossia that exists throughout the Arabic-speaking world.[citation needed]

Coverage in Al-Ahram in 1934 of the inauguration of the Academy of the Arabic Language in Cairo, an organization of major importance to the modernization of Arabic.

Although Modern Standard Arabic (MSA) is a unitary language, its pronunciation varies somewhat from country to country and from region to region within a country. The variation in individual "accents" of MSA speakers tends to mirror corresponding variations in the colloquial speech of the speakers in question, but with the distinguishing characteristics moderated somewhat. It is important in descriptions of "Arabic" phonology to distinguish between pronunciation of a given colloquial (spoken) dialect and the pronunciation of MSA by these same speakers.

Although they are related, they are not the same. For example, the phoneme that derives from Classical Arabic /ɟ/ has many different pronunciations in the modern spoken varieties, e.g., [d͡ʒ ~ ʒ ~ j ~ ɡʲ ~ ɡ], including the proposed original [ɟ]. Speakers whose native variety has either [d͡ʒ] or [ʒ] will use the same pronunciation when speaking MSA. Even speakers from Cairo, whose native Egyptian Arabic has [ɡ], normally use [ɡ] when speaking MSA. The [j] of Persian Gulf speakers is the only variant pronunciation that is not normally found in MSA; [d͡ʒ~ʒ] is used instead, although these speakers may use [j] in MSA for ease of pronunciation.

Another reason for differences in pronunciation is the influence of the colloquial dialects, whose own pronunciation has in turn been shaped by other languages previously spoken, and in some cases still spoken, in the regions concerned, such as Coptic in Egypt; Berber, Punic, or Phoenician in North Africa; Himyaritic, Modern South Arabian, and Old South Arabian in Yemen and Oman; and Aramaic and Canaanite languages (including Phoenician) in the Levant and Mesopotamia.[citation needed]

Another example: many colloquial varieties are known for a type of vowel harmony in which the presence of an "emphatic consonant" triggers backed allophones of nearby vowels (especially of the low vowels /a, aː/, which are backed to [ɑ(ː)] in these circumstances and very often fronted to [æ(ː)] in all other circumstances). In many spoken varieties, the backed or "emphatic" vowel allophones spread a fair distance in both directions from the triggering consonant. In some varieties, most notably Egyptian Arabic, the "emphatic" allophones spread throughout the entire word, usually including prefixes and suffixes, even at a distance of several syllables from the triggering consonant.

Speakers of colloquial varieties with this vowel harmony tend to introduce it into their MSA pronunciation as well, but usually with a lesser degree of spreading than in the colloquial varieties. For example, speakers of colloquial varieties with extremely long-distance harmony may allow a moderate, but not extreme, amount of spreading of the harmonic allophones in their MSA speech, while speakers of colloquial varieties with moderate-distance harmony may only harmonize immediately adjacent vowels in MSA.[citation needed]


Vowel chart representing the pronunciation of long vowels by a Palestinian speaker educated in Beirut. From Thelwall (1990:38). (These values vary between regions across North Africa and West Asia.)
Vowel chart representing the pronunciation of diphthongs by a Palestinian speaker educated in Beirut. From Thelwall (1990:38)

Modern Standard Arabic has six vowel phonemes forming three pairs of corresponding short and long vowels (/a, aː, i, iː, u, uː/). Many spoken varieties also include /oː/ and /eː/. Modern Standard Arabic has two diphthongs (formed by a combination of short /a/ with the semivowels /j/ and /w/). Allophony in different dialects of Arabic can occur and is partially conditioned by neighboring consonants within the same word. The following are some general rules:

  • /a, aː/
  • /i, iː, u, uː/
    • Across North Africa and West Asia, /i/ may be realized as [ɪ ~ e ~ ɨ] before or adjacent to emphatic consonants and [q], [r], [ħ], [ʕ]. /u/ can also have different realizations, i.e. [ʊ ~ o ~ ʉ]. Each vowel may have a single value for both its short and long forms, or two different values for the short and the long form. These realizations can be distinct phonemes in loanwords for a number of speakers.
    • In Egypt, close vowels have different values: short initial or medial /i, u/ are realized as [e] and [o]. In some other dialects, /i~ɪ/ and /u~ʊ/ become [e] and [o] completely. Unstressed final long /aː, iː, uː/ are most often shortened or reduced: /aː/ → [æ ~ ɑ], /iː/ → [i], /uː/ → [o~u].
Example words[11]
| Vowel | Short | Long |
| i | عِدْ /ʕid/ "promise!" | عِيد /ʕiːd/ "holiday" |
| u | عُدّ /ʕudd/ "count (command)" | عُود /ʕuːd/ "lute" |
| a | عَدّ /ʕadd/ "counted" | عَاد /ʕaːd/ "came back" |
| aj | عَيْن /ʕajn/ "eye" | |
| aw | عَوْد /ʕawd/ "return" | |

However, the actual rules governing vowel-retraction are a good deal more complex and have relatively little in the way of an agreed-upon standard, as there are often competing notions of what constitutes a "prestige" form.[12] Often, even highly proficient speakers will import the vowel-retraction rules from their native dialects.[13] Thus, for example, in the Arabic of someone from Cairo, emphatic consonants will affect every vowel between word boundaries, whereas certain Saudi speakers exhibit emphasis only on the vowels adjacent to an emphatic consonant.[14] Certain speakers (most notably Levantine speakers) exhibit a degree of asymmetry in leftward vs. rightward spread of vowel-retraction.[14][15]
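
The two extremes described above, word-wide spreading as in Cairo and spreading only to adjacent vowels as reported for certain Saudi speakers, can be sketched as a toy rule over a list of segments. This is only an illustration under stated assumptions: the function name `back_low_vowels`, the `domain` parameter and the segment-list representation are invented for this sketch, only the low vowels /a, aː/ are handled, and the asymmetric leftward vs. rightward spreading mentioned for Levantine speakers is ignored.

```python
# Toy model of emphasis spread; all names and simplifications are assumptions.
EMPHATICS = {"sˤ", "dˤ", "tˤ", "ðˤ"}   # emphatic coronals of MSA
BACKED = {"a": "ɑ", "aː": "ɑː"}        # low vowels and their backed allophones


def back_low_vowels(segments, domain="word"):
    """Return a copy of `segments` with low vowels backed near emphatics.

    domain="word":     any emphatic backs every low vowel in the word.
    domain="adjacent": only low vowels directly beside an emphatic are backed.
    """
    emphatic_positions = [i for i, seg in enumerate(segments) if seg in EMPHATICS]
    result = []
    for i, seg in enumerate(segments):
        if seg in BACKED:
            if domain == "word":
                affected = bool(emphatic_positions)
            else:
                affected = any(abs(i - p) == 1 for p in emphatic_positions)
            result.append(BACKED[seg] if affected else seg)
        else:
            result.append(seg)
    return result


talab = ["tˤ", "a", "l", "a", "b"]     # طلب /tˤalab/ 'request'
print(back_low_vowels(talab, "word"))
print(back_low_vowels(talab, "adjacent"))
```

With `domain="word"` both vowels of the example come out as [ɑ]; with `domain="adjacent"` only the vowel next to /tˤ/ does. Real speakers fall between these poles, and vowels outside the emphatic domain are left as plain /a/ here rather than fronted.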

The final heavy syllable of a root is stressed.[11]

The short vowels [u, ʊ, o, o̞, ɔ] are all possible allophones of /u/ across different dialects; e.g., قُلْت /ˈqult/ ('I said') is pronounced [ˈqʊlt] or [ˈqolt] or [ˈqɔlt], since the difference between the short mid vowels [o, o̞, ɔ] and [u, ʊ] is never phonemic, and they are mostly found in complementary distribution, except for a number of speakers where they can be phonemic but only in foreign words.

The short vowels [i, ɪ, e, e̞, ɛ] are all possible allophones of /i/ across different dialects; e.g., مِن /ˈmin/ ('from') is pronounced [ˈmɪn] or [ˈmen] or [ˈmɛn] since the difference between the short mid vowels [e, e̞, ɛ] and [i, ɪ] is never phonemic, and they are mostly found in complementary distribution, except for a number of speakers where they can be phonemic but only in foreign words.

The long mid vowels /oː/ and /eː/ appear to be phonemic in most varieties of Arabic except in general Maghrebi Arabic, where they merge with /uː/ and /iː/. For example, لون ('color') is generally pronounced /loːn/ in Mashriqi dialects but /luːn/ in most Maghrebi Arabic. The long mid vowels can be used in Modern Standard Arabic in dialectal words or in some stable loanwords or foreign names,[16] as in روما /ˈroːma/ ('Rome') and شيك /ˈʃeːk/ ('cheque').

Foreign words often have a liberal sprinkling of long vowels, as vowels tend to be written as long vowels in foreign loans, under the influence of European-language orthographies which write down every vowel with a letter.[17] The long mid vowels /eː/ and /oː/ are always rendered with the letters ي and و, respectively; word-initially, these are preceded by an alif (ا) carrying a hamzah below (إ) and above (أ), respectively. In general, the pronunciation of loanwords is highly dependent on the speaker's native variety.


Even in the most formal of conventions, pronunciation depends upon a speaker's background.[18] Nevertheless, the number and phonetic character of most of the 28 consonants have a broad degree of regularity among Arabic-speaking regions. Note that Arabic is particularly rich in uvular, pharyngeal, and pharyngealized ("emphatic") sounds. The emphatic coronals (/sˤ/, /dˤ/, /tˤ/, and /ðˤ/) cause assimilation of emphasis to adjacent non-emphatic coronal consonants.[citation needed] The standard pronunciation of ⟨ج⟩ /d͡ʒ/ varies regionally: it is most prominently [d͡ʒ] in the Arabian Peninsula, parts of the Levant, Iraq, and northern Algeria (this is also considered the predominant pronunciation of Literary Arabic outside the Arab world); [ʒ] in most of Northwest Africa and the Levant; [ɡ] in Egypt, coastal Yemen, and south coastal Oman; and [ɟ] in Sudan.

Note: the table and notes below discuss the phonology of Modern Standard Arabic among Arabic speakers and not regional dialects.

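Modern Standard Arabic consonant phonemes
| | Labial | Dental | Denti-alveolar, plain | Denti-alveolar, emphatic[a] | Post-alv./palatal | Velar | Uvular | Pharyngeal | Glottal |
| Nasal | m | | n | | | | | | |
| Plosive, voiceless[b] | | | t[c] | tˤ | | k | q[d] | | ʔ |
| Plosive, voiced | b | | d[c] | dˤ[e] | d͡ʒ[f] | (ɡ)[g] | | | |
| Fricative, voiceless | f | θ[h] | s | sˤ | ʃ | x ~ χ[i] (velar to uvular) | | ħ[j] | h |
| Fricative, voiced | | ð[h] | z | ðˤ[k] | | ɣ ~ ʁ[i] (velar to uvular) | | ʕ[j] | |
| Trill | | | r[l] | | | | | | |
| Approximant | | | l | (ɫ)[m] | j | w | | | |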
  1. ^ Emphatic consonants are pronounced with the back of the tongue approaching the pharynx (see pharyngealization). They are pronounced with velarization by the Iraqi and Arabic Gulf speakers.[citation needed] /q/, /ħ/, and /ʕ/ can be considered the emphatic counterparts to /k/, /h/, and /ʔ/ respectively.[19]
  2. ^ /t/ and /k/ are aspirated [tʰ] and [kʰ], whereas /tˤ/ and /q/ are unaspirated.[20]
  3. ^ a b Depending on the region, the plosives are either alveolar or dental.
  4. ^ The Sudanese usually pronounce /q/ (ق) as [ɢ] even in Literary Arabic.
  5. ^ ض [dˤ] was historically [ɮˤ], a value it retains among older speakers in a few isolated dialects.[21]
  6. ^ When speaking Modern Standard Arabic, the phoneme represented by the Arabic letter ǧīm (ج) is pronounced [d͡ʒ], [ʒ], [ɡ], or [ɟ] depending on the speaker's native dialect.[22] Outside the Arab League, [d͡ʒ] is the preferred taught variant.
  7. ^ In Modern Standard Arabic /ɡ/ is either the standard pronunciation[23] for ǧīm (ج) or is used in foreign words which may be transcribed more commonly with ج, غ, ق or ك or less commonly ݣ‎ (used in Morocco) or ڨ‎ (used in Tunisia and Algeria), mainly depending on the regional spoken variety of Arabic or the commonly diacriticized Arabic letter.
  8. ^ a b /θ/ and /ð/ may be approximated to [t] and [d] or [s] and [z], respectively.
  9. ^ a b In most regions, uvular fricatives of the classical period have become velar or post-velar.[24]
  10. ^ a b The "voiced pharyngeal fricative" /ʕ/ (ع) is described as neither pharyngeal nor fricative, but a creaky-voiced epiglottal approximant.[25] Its unvoiced counterpart /ħ/ (ح) is likewise epiglottal, although it is a true fricative. Thelwall asserts that the sound of ع is actually a pharyngealized glottal stop [ʔˤ].[26] Similarly, McCarthy (1994) points to dialectal and idiolectal variation between stop and continuant variations of /ʕ/ in Iraq and Kuwait, noting that the distinction is superficial for Arabic speakers and carries "no phonological consequences."[27]
  11. ^ The voiced emphatic dental fricative ظ [ðˤ] is mostly pronounced as a voiced emphatic alveolar fricative [zˤ] in Egypt and Lebanon.[28]
  12. ^ Emphatic [rˤ] exists in Northwestern African pronunciations and in Egypt when accompanied by /a/ or /u/ and plain when accompanied by /i/ or /j/; in closed syllables, then it is plain when the first preceding voweled consonant has /i/ or if /j/ is present, but emphatic if the first preceding voweled letter is accompanied by /a/ or /u/. The trill /r/ is sometimes reduced to a single vibration when single, but it remains potentially a trill, not a flap [ɾ]: the pronunciation of this single trill is between a trill [r] and a flap [ɾ]. ⟨r⟩ is in free variation between a trill [r] and a flap [ɾ] in Egypt and the Levant.
  13. ^ In most pronunciations, /ɫ/ as a phoneme occurs in a handful of loanwords. It also occurs in الله Allah /ʔaɫˈɫaːh/, the name of God,[22] except when it follows long or short /i/ when it is not emphatic: بسم الله bismi l-lāh /bis.milˈlaːh/ ("in the name of God").[29] However, /ɫ/ is absent in many regions, such as the Nile Valley, and is more widespread in certain regions, such as Iraq, where the uvulars have velarized surrounding instances of /l/ in the environment of emphatic consonants when the two are not separated by /i/.[30]

Long (geminate or double) consonants are pronounced exactly like short consonants, but last longer. In Arabic, they are called mushaddadah ("strengthened", marked with a shaddah). Between a long consonant and a pause, an epenthetic [ə] occurs,[11] but this is only common across regions in West Asia.

The foreign sounds /p/ and /v/ are not necessarily pronounced by all Arabic speakers and their usage is optional. They can be written with the dedicated letters پ and ڤ, but as these letters are not present on standard keyboards, the sounds are usually written with ب /b/ and ف /f/ instead, e.g. باكستان or پاکستان /pa(ː)kistaːn, ba(ː)kistaːn/ "Pakistan", فيروس or ڤيروس /vi(ː)ru(ː)s, vajru(ː)s, fi(ː)ru(ː)s, fajru(ː)s/ "virus", etc.[17][31]


Standard Arabic syllables come in only five forms:[32]

  • C V (light)
  • C V V (heavy)
  • C V C (heavy)
  • C V V C (super-heavy)
  • C V C C (super-heavy)

Arabic syllable structure does not allow syllables to start with a vowel or with a consonant cluster.[32] Where a word starts with a consonant cluster, it is preceded by an epenthetic /ʔi/ utterance-initially, or by /i/ when the preceding word ends with a consonant. There are exceptions, however: من /min/ and ـهم /-hum/ connect to a following word-initial consonant cluster with /a/ and /u/ respectively. If the preceding word ends with a long vowel, that vowel is shortened.

Super-heavy syllables are usually not allowed except word-finally.[32] The exception is CVV- before a geminate, which creates a non-final CVVC- syllable; such syllables are found in the active participles of geminate Form I verbs, as in مادة /maːd.da/ ('substance, matter') and كافة /kaːf.fa/ ('entirely'). In the pausal form, a final geminate behaves as a single consonant; only before another word or with vocalization does the geminate surface, with its two halves belonging to separate syllables, e.g. سام /saːm(.m)/ ('poisonous'), جاف /d͡ʒaːf(.f)/ ('dry'), عام /ʕaːm(.m)/ ('public, general'), خاص /χaːsˤ(.sˤ)/ ('private, special'), and حار /ħaːr(.r)/ ('hot, spicy').[32]

Loanwords can break some phonotactic rules, for example by allowing initial consonant clusters (with an initial epenthetic /i/, or often another vowel repeated from the word, optionally inserted after the first consonant), as in پلوتو /pluː.toː, bu.luː.toː/ "Pluto" and پراج /praːɡ, be.raːɡ/ "Prague", or by allowing CVVC syllables non-finally without geminates, as in روسيا /ruːs.jaː/ "Russia" and سوريا /suːr.jaː/ "Syria", which can be modified to /ruː.si.jaː, suː.ri.jaː/ to fit the phonotactics better.[32]
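
The five syllable forms and the restriction on non-final super-heavy syllables described above can be sketched as a small checker. This is a hedged illustration, not an established tool: the names `skeleton` and `check_word`, and the ASCII transliteration in which a long vowel is written as a doubled letter (`aa`, `ii`, `uu`), are assumptions made for this sketch, and the input must already be split into syllables.

```python
# Sketch only: the ASCII transliteration with doubled long vowels is an assumption.
VOWELS = set("aiu")
FORMS = {
    "CV": "light",
    "CVV": "heavy",
    "CVC": "heavy",
    "CVVC": "super-heavy",
    "CVCC": "super-heavy",
}


def skeleton(syllable):
    """Map a transliterated syllable to its C/V skeleton, e.g. 'taab' -> 'CVVC'."""
    return "".join("V" if ch in VOWELS else "C" for ch in syllable)


def check_word(syllables):
    """Return a list of phonotactic problems for a syllabified word (empty if none)."""
    problems = []
    last = len(syllables) - 1
    for i, syl in enumerate(syllables):
        shape = skeleton(syl)
        if shape not in FORMS:
            problems.append(f"{syl}: {shape} is not one of the five forms")
        elif FORMS[shape] == "super-heavy" and i < last:
            # Non-final CVVC is tolerated only as the first half of a geminate.
            before_geminate = shape == "CVVC" and syllables[i + 1][0] == syl[-1]
            if not before_geminate:
                problems.append(f"{syl}: non-final super-heavy syllable")
    return problems


print(check_word(["maad", "da"]))    # CVVC before a geminate, as in /maːd.da/
print(check_word(["ruus", "jaa"]))   # loanword shape /ruːs.jaː/
print(check_word(["praag"]))         # loanword with an initial cluster
```

The first call reports no problems, while the two loanword shapes are flagged; the adapted form `["ruu", "si", "jaa"]` passes, mirroring the repair /ruː.si.jaː/ mentioned above.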

Word stress

The placement of word stress in Arabic varies considerably from one dialect to another, and has been the focus of extensive research and debate.

In determining stress, Arabic distinguishes three types of syllables:[33]: 2991 

  • Light:
    • An open syllable containing a short vowel (i.e. CV), such as وَ wa 'and'
  • Heavy:
    • An open syllable containing a long vowel (i.e. CVV), such as سَافَرَ sā.fara 'he travelled'
    • A closed syllable containing a short vowel followed by one consonant (i.e. CVC), such as مِن min 'from' or كَتَبْتُ ka.tab.tu 'I wrote'
  • Super-heavy:
    • A closed syllable containing a long vowel followed by one consonant (i.e. CVVC), such as باب bāb 'door' or مادٌّ mād.dun 'stretching (NOM)'
    • A closed syllable containing a short vowel followed by two consonants (i.e. CVCC), such as بِنْت bint 'girl', or a long vowel followed by a geminate consonant (i.e. CVVCiCi), such as مادّ mādd 'stretching'

The word stress of Classical Arabic has been the subject of debate. However, there is consensus as to the general rule, even though there are some exceptions. A simple rule of thumb is that word-stress falls on the penultimate syllable of a word if that syllable is closed, and otherwise on the antepenultimate.[34]

A more precise description is J. C. E. Watson's. Here the stressed syllable follows the marker ˈ and variant rules are in brackets:[33]: 3003

  1. Stress a pre-pausal superheavy (CVVC, CVVCC, or CVCC) syllable: كِتاب [kiˈtāb] 'book', مادّ [ˈmādd] 'stretching (MASC SG)', شَرِبْت [ʃaˈribt] 'I/you (MASC SG) drank'.
  2. Otherwise, stress the rightmost non-final heavy (CVV or CVC) syllable: دَرَسْنا [daˈrasnā] 'we learnt', صابُونٌ [ṣāˈbūnun] 'soap (NOM)', مَكْتَبة [ˈmaktabah] 'library', مادٌّ [ˈmāddun] 'stretching (NOM)', مَكْتَبةٌ [ˈmaktabatun] 'library'.
  3. Otherwise, stress the antepenult (or leftmost syllable if there is no antepenult): كَتَبَ [ˈkataba] 'he wrote'.
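
The three rules above can be sketched as a short function over an already syllabified word. This is a minimal illustration under stated assumptions: the names `weight` and `stressed_syllable` are invented here, words are given in an ASCII transliteration with long vowels doubled (`aa`, `ii`, `uu`), non-final super-heavy syllables are treated as heavy for rule 2 (as the example مادٌّ requires), and the bracketed variant rules are not modelled.

```python
VOWELS = set("aiu")


def weight(syllable):
    """1 = light, 2 = heavy, 3 = super-heavy, counted from the first vowel onward."""
    first_vowel = next(i for i, ch in enumerate(syllable) if ch in VOWELS)
    rhyme = syllable[first_vowel:]      # a long vowel is two letters, e.g. 'aab'
    return min(len(rhyme), 3)


def stressed_syllable(syllables):
    """Return the index of the stressed syllable following rules (1) to (3)."""
    weights = [weight(s) for s in syllables]
    # Rule 1: a final super-heavy syllable takes the stress.
    if weights[-1] == 3:
        return len(syllables) - 1
    # Rule 2: otherwise the rightmost non-final heavy (or heavier) syllable.
    for i in range(len(syllables) - 2, -1, -1):
        if weights[i] >= 2:
            return i
    # Rule 3: otherwise the antepenult, or the first syllable in shorter words.
    return max(len(syllables) - 3, 0)


for word in ["ki.taab", "da.ras.naa", "mak.ta.bah", "ka.ta.ba"]:
    sylls = word.split(".")
    k = stressed_syllable(sylls)
    print(".".join(sylls[:k] + ["ˈ" + sylls[k]] + sylls[k + 1:]))
```

The loop prints each word with ˈ placed before the syllable the rules select, which can be compared with the transcriptions in the list above.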

Modern Arabic dialects all maintain rules (1) and (2). But if there is neither a final superheavy syllable nor a heavy penultimate syllable, their behaviour varies. Thus in Palestinian, rule (3) is instead 'otherwise stress the first syllable (up to the antepenult)': كَتَب [ˈkatab] 'he wrote', زَلَمة [ˈzalamah] 'man', whereas the basic rules of Cairene (to which there are exceptions) are:[33]: 2993, 3004

  1. Stress a superheavy ultima.
  2. Otherwise, stress a heavy penult.
  3. Otherwise, stress the penult or antepenult, whichever is separated by an even number of syllables from the rightmost non-final heavy syllable, or, if there is no non-final heavy syllable, from the left boundary of the word.
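
The Cairene rules can be sketched in the same way. The code below is a hedged reading of rule (3), not a definitive implementation: "separated by an even number of syllables" is interpreted as the count of syllables lying strictly between the reference point and the candidate, the names `weight` and `cairene_stress` and the ASCII transliteration with doubled long vowels are assumptions of this sketch, and the exceptions mentioned above are not handled.

```python
VOWELS = set("aiu")


def weight(syllable):
    """1 = light, 2 = heavy, 3 = super-heavy, counted from the first vowel onward."""
    first_vowel = next(i for i, ch in enumerate(syllable) if ch in VOWELS)
    return min(len(syllable) - first_vowel, 3)


def cairene_stress(syllables):
    """Return the index of the stressed syllable under the basic Cairene rules."""
    n = len(syllables)
    weights = [weight(s) for s in syllables]
    if weights[-1] == 3:                  # rule 1: super-heavy ultima
        return n - 1
    if n == 1:
        return 0
    if weights[-2] >= 2:                  # rule 2: heavy penult
        return n - 2
    # Rule 3: reference point is the rightmost non-final heavy syllable,
    # or the left word boundary (index -1) if there is none.
    heavy = [i for i in range(n - 1) if weights[i] >= 2]
    reference = heavy[-1] if heavy else -1
    between = (n - 2) - reference - 1     # syllables between reference and penult
    return n - 2 if between % 2 == 0 else n - 3


for word in ["mad.ra.sa", "ka.ta.bit", "sha.ga.ra.tun"]:
    sylls = word.split(".")
    k = cairene_stress(sylls)
    print(".".join(sylls[:k] + ["ˈ" + sylls[k]] + sylls[k + 1:]))
```

Because the penult and antepenult are always one syllable apart, exactly one of them is an even distance from the reference point, so checking the penult alone is enough to decide between the two.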

Local variations of Modern Standard Arabic

Spoken varieties differ from Classical Arabic and Modern Standard Arabic not only in grammar but also in pronunciation. Outside of the Arabian peninsula, a major linguistic division is between sedentary, largely urban, varieties and rural varieties. Inside the Arabian peninsula and in Iraq, the two types are less distinct; but the language of the urbanized Hejaz, at least, strongly looks like a conservative sedentary variety.[citation needed]

Some examples of variation:


In Modern Standard Arabic (not in Egypt's use), /ɡ/ is used as a marginal phoneme to pronounce some dialectal and loan words. On the other hand, it is considered a native phoneme or allophone in most modern Arabic dialects, mostly as a variant of ق /q/ (as in Arabian Peninsula and Northwest African dialects) or as a variant of /d͡ʒ/ ج (as in Egyptian and a number of Yemeni and Omani dialects). It is also considered a separate foreign phoneme that appears only in loanwords, as in most urban Levantine dialects where ق is /ʔ/ and ج is /d͡ʒ~ʒ/.

The phoneme represented by the Arabic letter ǧīm (ج) has many standard pronunciations: [d͡ʒ] in most of the Arabian Peninsula and as the predominant pronunciation of Literary Arabic outside the Arab world, [ɡ] in most of Egypt and some regions in southern Yemen and southwestern Oman. This is also a characteristic of colloquial Egyptian and southern Yemeni dialects.[22] In Morocco and western Algeria, it is pronounced as [ɡ] in some words, especially colloquially. In most of North Africa and most of the Levant, the standard pronunciation is [ʒ], and in certain regions of the Persian Gulf it is pronounced colloquially as [j]. In some Sudanese and Yemeni dialects, it may be either [ɡʲ] or [ɟ] as it used to be in Classical Arabic.

The foreign phonemes /p/ and /v/ are not necessarily pronounced by all Arabic speakers, but are often pronounced in names and loanwords. /p/ and /v/ are usually transcribed with their own letters پ and ڤ, but as these letters are not present on standard keyboards, they are simply written with ب /b/ and ف /f/; e.g. both نوفمبر and نوڤمبر /nu(ː)fambar/, /novambar, -ber/ or /nofember/ "November", and both كاپريس and كابريس /ka(ː)pri(ː)s, ka(ː)bri(ː)s/ "caprice", can be used.[17][31] The use of both sounds may be considered marginal and Arabs may pronounce the words interchangeably; besides, many loanwords have become Arabized, e.g. باكستان or پاکستان /pa(ː)kistaːn, ba(ː)kistaːn/ "Pakistan", فيروس or ڤيروس /vi(ː)ru(ː)s, vajru(ː)s/ "virus".

/t͡ʃ/ is another possible loanword phoneme, as in the word سندوتش‎ or ساندوتش‎ (sandawitš or sāndwitš 'sandwich'), though a number of varieties instead break up the [t] and [ʃ] sounds with an epenthetic vowel.[35] Egyptian Arabic treats /t͡ʃ/ as two consonants ([tʃ]) and inserts [e], as [teʃC] or [Cetʃ], when it occurs before or after another consonant. /t͡ʃ/ is found as normal in Iraqi Arabic and Gulf Arabic.[36] Normally the combination تش (tā’-shīn) is used to transliterate the [tʃ], while in rural Levantine dialects /k/ is usually substituted with /t͡ʃ/ while speaking and would be written as ك. Otherwise Arabic usually substitutes other letters in the transliteration of names and loanwords like the Persian character چ which is used for writing [tʃ].

Other variations include:

  • Development of highly distinctive allophones of /a/ and /aː/, with highly fronted [a(ː)], [æ(ː)] or [ɛ(ː)] in non-emphatic contexts, and retracted [ɑ(ː)] in emphatic contexts.[citation needed] The more extreme distinctions are characteristic of sedentary varieties, while Bedouin and conservative Arabian-peninsula varieties have much closer allophones. In some of the sedentary varieties, the allophones are gradually splitting into new phonemes under the influence of loanwords, where the allophone closest in sound to the source-language vowel often appears regardless of the presence or absence of nearby emphatic consonants.[citation needed]
  • Spread of "emphasis", visible in the backing of phonemic /a(ː)/. In conservative varieties of the Arabian Peninsula, only /a/ adjacent to emphatic consonants is affected, while in Cairo, an emphatic consonant anywhere in a word tends to trigger emphatic allophones throughout the entire word.[citation needed] Dialects of the Levant are somewhere in between. Moroccan Arabic is unusual in that /i/ and /u/ have clear emphatic allophones as well (typically lowered, e.g. to [e] and [o]).[citation needed]
  • Monophthongization of diphthongs such as /aj/ and /aw/ to /eː/ and /oː/, respectively (/iː/ and /uː/ in parts of the Maghrib, such as in Moroccan Arabic). Mid vowels may also be present in loanwords such as ملبورن (/milboːrn/ Melbourne), سكرتير (/sikriteːr/ '(male) secretary') and دكتور (/duktoːr/ 'doctor').[16]
  • Raising of word-final /a/ to [e]. In some parts of the Levant, word-medial /aː/ is also raised to [eː]. See Lebanese Arabic.
  • Loss of final short vowels (with /i/ sometimes remaining), and shortening of final long vowels. This triggered the loss of most Classical Arabic case and mood distinctions.[citation needed]
  • Collapse and deletion of short vowels. In many varieties, such as North Mesopotamian, many Levantine dialects, many Bedouin dialects of the Maghrib, and Mauritanian, short /i/ and /u/ have collapsed to schwa and exhibit very little distinction so that such dialects have two short vowels, /a/ and /ə/.[citation needed] Many Levantine dialects show partial collapse of /i/ and /u/, which appear as such only in the next-to-last phoneme of a word (i.e. followed by a single word-final consonant), and merge to /ə/ elsewhere.[citation needed] A number of dialects that still allow three short vowels /a/ /i/ /u/ in all positions, such as Egyptian Arabic, nevertheless show little functional contrast between /i/ and /u/ as a result of past sound changes converting one sound into the other.[37] Arabic varieties everywhere have a tendency to delete short vowels (especially other than /a/) in many phonological contexts. When combined with the operation of inflectional morphology, disallowed consonant clusters often result, which are broken up by epenthetic short vowels, automatically inserted by phonological rules. In these respects (as in many others), Moroccan Arabic has the most extreme changes, with all three short vowels /a/, /i/, /u/ collapsing to a schwa /ə/, which is then deleted in nearly all contexts.[citation needed] This variety, in fact, has essentially lost the quantitative distinction between short and long vowels in favor of a new qualitative distinction between unstable "reduced" vowels (especially /ə/) and stable, half-long "full" vowels /a/, /i/, /u/ (the reflexes of original long vowels).[citation needed] Classical Arabic words borrowed into Moroccan Arabic are pronounced entirely with "full" vowels regardless of the length of the original vowel.[citation needed]

Phonologies of different Arabic dialects

The main dialectal variations in Arabic consonants revolve around six consonants, ج, ق, ث, ذ, ض and ظ:

Letter Classical Modern Standard Dialectal Main Variations Less Common Variations
ث /θ/ /θ/ [θ] [t] [s] [f]
ج /ɡʲ/ or /ɟ/ /d͡ʒ/ [d͡ʒ] [ʒ] [ɡ] [ɟ] [j] [d͡z] [d]
ذ /ð/ /ð/ [ð] [d] [z] [v]
ض /ɮˤ/ /dˤ/ [] [ðˤ] [] [d] []
ظ /ðˤ/ /ðˤ/ [ðˤ] [] []
ق /q/ or /ɡ/ /q/ [q] [ɡ] [ʔ] [ɢ] [k] [d͡ʒ] [d͡z] [ɣ]


The Arabic of Cairo (often called "Egyptian Arabic" or more correctly "Cairene Arabic") is a typical sedentary variety and a de facto standard variety among certain segments of the Arabic-speaking population, due to the dominance of Egyptian media. Watson adds emphatic labials [mˤ] and [bˤ][38] and emphatic [rˤ][22] to Cairene Arabic with marginal phonemic status. Cairene has also merged the interdental consonants with the dental plosives (e.g., ثلاثة /θalaːθa/ → [tæˈlæːtæ] 'three') except in loanwords from Classical Arabic, where they are nativized as sibilant fricatives (e.g., ثانوية /θaːnawijja/ → [sænæˈwejja] 'secondary school'). Cairene speakers pronounce /d͡ʒ/ as [ɡ] and debuccalize /q/ to [ʔ] (again, loanwords from Classical Arabic have reintroduced the earlier sound[37] or approximate it to [k], with the adjacent front vowel [æ] changed to the back vowel [ɑ]). Classical Arabic diphthongs /aj/ and /aw/ became realized as [eː] and [oː] respectively. Still, Egyptian Arabic sometimes has minimal pairs like شايلة [ˈʃæjlæ] 'carrying FEM SG' vs. شيلة [ˈʃeːlæ] 'burden'. Because Cairene phonology cannot have long vowels before two consonants, جيب [ɡeːb] 'pocket' + -نا [næ] 'our' yields [ˈɡebnæ], which is homophonous with جبنة 'cheese' and so can mean either 'cheese' or جيبنا 'our pocket'.[39] Cairene also has [ʒ] as a marginal phoneme from loanwords from languages other than Classical Arabic.[40]


Varieties such as that of Sanaa, Yemen, are more conservative and retain most phonemic contrasts of Classical Arabic. Sanaani possesses [ɡ] as a reflex of Classical /q/ (which still functions as an emphatic consonant).[39] In unstressed syllables, Sanaani short vowels may be reduced to [ə].[41] /tˤ/ is voiced to [dˤ] in initial and intervocalic positions.[38]


The most frequent consonant phoneme is /r/ and the rarest is /ðˤ/. The frequency distribution of the 28 consonant phonemes, based on the 2,967 triliteral roots listed by Wehr,[31] is as follows (each percentage is the share of roots in which the phoneme occurs):

Phoneme Frequency Phoneme Frequency
/r/ 24% /w/ 18%
/l/ 17% /m/ 17%
/n/ 17% /b/ 16%
/f/ 14% /ʕ/ 13%
/q/ 13% /d/ 13%
/s/ 13% /ħ/ 12%
/j/ 12% /ʃ/ 11%
/d͡ʒ/ 10% /k/ 9%
/h/ 8% /z/ 8%
/tˤ/ 8% /χ/ 8%
/sˤ/ 7% /ʔ/ 7%
/t/ 6% /dˤ/ 5%
/ʁ/ 5% /θ/ 3%
/ð/ 3% /ðˤ/ 1%

This distribution does not necessarily reflect the actual frequency of occurrence of the phonemes in speech, since pronouns, prepositions and suffixes are not taken into account, and the roots themselves occur with varying frequency. In particular, /t/ occurs in several extremely common affixes (as a prefix in the marker for second-person or feminine third-person, as a suffix in the marker for first-person or feminine third-person, and as an infix forming the second element of Forms VIII and X) despite being sixth from last on Wehr's list. The list does, however, give an idea of which phonemes are more marginal than others. Note that the five least frequent phonemes correspond to five of the six letters added to those inherited from the Phoenician alphabet, namely ḍād, ṯāʾ, ḫāʾ, ẓāʾ, ḏāl and ġayn.
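The kind of count behind the table, the percentage of roots in which each phoneme occurs, can be sketched as follows. This is a minimal illustration and not the procedure used for Wehr's data: the function name `phoneme_root_percentages` and the four-root sample are assumptions chosen only to show the arithmetic.

```python
from collections import Counter


def phoneme_root_percentages(roots):
    """Return, for each phoneme, the percentage of roots that contain it.

    A phoneme is counted once per root, even if it occurs twice in that root.
    """
    counts = Counter()
    for root in roots:
        counts.update(set(root))  # set() so a doubled consonant counts once
    total = len(roots)
    return {phoneme: 100 * n / total for phoneme, n in counts.items()}


# Toy sample of four triliteral roots (illustrative only, not Wehr's list).
sample_roots = [
    ("k", "t", "b"),
    ("d", "r", "s"),
    ("m", "r", "r"),
    ("q", "r", "ʔ"),
]

percentages = phoneme_root_percentages(sample_roots)
for phoneme, pct in sorted(percentages.items(), key=lambda item: -item[1]):
    print(f"/{phoneme}/ {pct:.0f}%")
```

Because every root contributes up to three phonemes, percentages computed this way add up to far more than 100%, as they do in the table above.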


The Literary Arabic sample text is a reading of The North Wind and the Sun by a speaker who was born in Safed, lived and was educated in Beirut from age 8 to 15, subsequently studied and taught in Damascus, studied phonetics in Scotland and since then has resided in Scotland and Kuwait.[42]

Normal orthographic version

كانت ريح الشمال تتجادل والشمس في أي منهما كانت أقوى من الأخرى، وإذ بمسافر يطلع متلفعا بعباءة سميكة. فاتفقتا على اعتبار السابق في إجبار المسافر على خلع عباءته الأقوى. عصفت ريح الشمال بأقصى ما استطاعت من قوة. ولكن كلما ازداد العصف ازداد المسافر تدثرا بعباءته، إلى أن أسقط في يد الريح فتخلت عن محاولتها. بعدئذ سطعت الشمس بدفئها، فما كان من المسافر إلا أن خلع عباءته على التو. وهكذا اضطرت ريح الشمال إلى الاعتراف بأن الشمس كانت هي الأقوى.

Diacriticized orthographic version

كَانَتْ رِيحُ الشَّمَالِ تَتَجَادَلُ وَالشَّمْسَ فِي أَيٍّ مِنْهُمَا كَانَتْ أَقْوَى مِنَ الأُخْرَى، وَإِذْ بِمُسَافِرٍ يَطْلَعُ مُتَلَفِّعًا بِعَبَاءَةٍ سَمِيكَةٍ. فَاتَّفَقَتَا عَلَى اعْتِبارِ السَّابِقِ فِي إِجْبارِ المُسَافِرِ عَلَى خَلْعِ عَباءَتِهِ الأَقْوى. عَصَفَتْ رِيحُ الشَّمالِ بِأَقْصَى مَا اسْتَطَاعَتْ مِن قُوَّةٍ. وَلٰكِنْ كُلَّمَا ازْدَادَ العَصْفُ ازْدَادَ المُسَافِرُ تَدَثُّرًا بِعَبَاءَتِهِ، إِلَى أَنْ أُسْقِطَ فِي يَدِ الرِّيحِ فَتَخَلَّتْ عَنْ مُحَاوَلَتِهَا. بَعْدَئِذٍ سَطَعَتِ الشَّمْسُ بِدِفْئِهَا، فَمَا كَانَ مِنَ المُسَافِرِ إِلَّا أَنْ خَلَعَ عَبَاءَتَهُ عَلَى التَّوِّ. وَهٰكَذَا اضْطُرَّتْ رِيحُ الشَّمَالِ إِلَى الاِعْتِرَافِ بِأَنَّ الشَّمْسَ كَانَتْ هِيَ الأَقْوَى.[43]

Phonemic transcription (with i‘rāb)

/kaːnat riːħ uʃ.ʃamaːli tatad͡ʒaːdalu waʃ.ʃamsa fiː ʔaj.jin minhumaː kaːnat ʔaqwaː min al ʔuxraː | wa ʔið bi musaːfirin jatˤlaʕu mutalaf.fiʕan bi ʕabaːʔatin samiːka || fat.tafaqataː ʕala ʕ.tibaːr is.saːbiqi fiː ʔid͡ʒbaːr il.musaːfiri ʕalaː xalʕi ʕabaːʔatihi l.ʔaqwaː || ʕasˤafat riːħ uʃ.ʃamaːli bi ʔaqsˤaː mas.tatˤaːʕat min quw.wa || wa laːkin kul.lama z.daːd al.ʕasˤfu z.daːd al musaːfiru tadaθ.θuran bi ʕabaːʔatih | ʔilaː ʔan ʔusqitˤa fiː jad ir.riːħi fataxal.lat ʕan muħaːwalatihaː || baʕda.ʔiðin satˤaʕat iʃ.ʃamsu bi difʔihaː | fa maː kaːna min al musaːfiri ʔil.laː ʔan xalaʕa ʕabaːʔatahu ʕalat.taw || wa haːkaða t.tˤur.rat riːħ uʃ.ʃamaːli ʔila l.ʔiʕtiraːfi bi ʔan.n aʃ.ʃamsa kaːnat hija l.ʔaqwaː/[43]

Phonemic transcription (without i‘rāb)

/kaːnat riːħ uʃ.ʃamaːl tatad͡ʒaːdal waʃ.ʃams fiː ʔaj.jin minhumaː kaːnat ʔaqwaː min al ʔuxraː | wa ʔið bi musaːfir jatˤlaʕ mutalaf.fiʕan bi ʕabaːʔa samiːkah || fa t.tafaqataː ʕala ʕ.tibaːri s.saːbiq fiː ʔid͡ʒbaːri l musaːfir ʕalaː xalʕ ʕabaːʔatihi l.ʔaqwaː || ʕasˤafat riːħu ʃ.ʃamaːl bi ʔaqsˤaː ma statˤaːʕat min quw.wa || wa laːkin kul.lama z.daːda l.ʕasˤfu z.daːd al.musaːfir tadaθːuran bi ʕabaːʔatih | ʔilaː ʔan ʔusqitˤ fiː jad ir.riːħ fa taxal.lat ʕan muħaːwalatihaː || baʕdaʔiðin satˤaʕat iʃ.ʃams bi difʔihaː | fa maː kaːn min al musaːfir ʔil.laː ʔan xalaʕa ʕabaːʔatahu ʕala t.taw || wa haːkaða t.tˤur.rat riːħ uʃ.ʃamaːl ʔila l.ʔiʕtiraːf bi ʔan.n aʃ.ʃams kaːnat hija l.ʔaqwaː/

Phonetic transcription (Egypt)

[ˈkæːnæt riːħ æʃ.ʃæˈmæːl tætæˈɡæːdæl wæʃˈʃæm.se fiː ˈʔæj.jin menˈhomæ ˈkæːnæt ˈʔɑqwɑ mɪn ælˈʔʊxrɑ | ʔɪð bi mʊˈsæːfer ˈjɑtˤlɑʕ mʊtæˈlæf.feʕ bi ʕæˈbæːʔæ sæˈmiːkæ || t.tæfɑqɑˈtæː ˈʕælæ ʕ.teˈbɑːrɪ sˈsɑːbeq fiː ʔeɡbɑːr æl mʊˈsæːfer ˈʕælæ ˈxælʕe ʕæbæːˈʔæt(i)hi lˈʔɑqwɑː || ˈʕɑsˤɑfɑt riːħ æʃ.ʃæˈmæːl bi ˈʔɑqsˤɑ s.tɑˈtˤɑːʕɑt mɪn ˈqow.wɑ || ˈlæːken kʊlˈlæmæ zˈdæːd æl ʕɑsˤf ɪzˈdæːd æl.mʊˈsæːfer tædæθˈθʊræn bi ʕæbæːˈʔætih | ˈʔilæ ʔæn ˈʔosqetˤ fiː jæd ærˈriːħ tæˈxæl.læt ʕæn mʊħæːwæˈlæt(i)hæ || bæʕdæˈʔiðin ˈsɑtˤɑʕɑt æʃˈʃæm.se bi dɪfˈʔihæ | mæː kæːn mɪn æl.mʊˈsæːfer ˈʔil.læ ʔæn ˈxælæʕ ʕæbæːˈʔætæh ʕælætˈtæw || hæːˈkæðæ tˈtˤor.rɑt riːħ æʃ.ʃæˈmæːl ˈʔilæ l.ʔeʕteˈrɑːf biˈʔænn æʃˈʃæm.se ˈkæːnæt ˈhɪ.jæ lˈʔɑqwɑ]

ALA-LC transliteration

Kānat rīḥ al-shamāl tatajādalu wa-al-shams fī ayyin minhumā kānat aqwá min al-ukhrá, wa-idh bi-musāfir yaṭlaʻu mutalaffiʻ bi-ʻabāʼah samīkah. Fa-ittafaqatā ʻalá iʻtibār al-sābiq fī ijbār al-musāfir ʻalá khalʻ ʻabāʼatihi al-aqwá. ʻAṣafat rīḥ al-shamāl bi-aqṣá mā istaṭāʻat min qūwah. Wa-lākin kullamā izdāda al-ʻaṣf izdāda al-musāfir tadaththuran bi-ʻabāʼatih, ilá an usqiṭ fī yad al-rīḥ fa-takhallat ʻan muḥāwalatihā. Baʻdaʼidhin saṭaʻat al-shams bi-difʼihā, fa-mā kāna min al-musāfir illā an khalaʻa ʻabāʼatahu ʻalá al-taww. Wa-hākadhā iḍṭurrat rīḥ al-shamāl ilá al-iʻtirāf bi-an al-shams kānat hiya al-aqwá.

English Wiktionary transliteration (based on Hans Wehr)

kānat rīḥu š-šamāli tatajādalu wa-š-šamsa fī ʾayyin minhumā kānat ʾaqwā mina l-ʾuḵrā, wa-ʾiḏ bi-musāfirin yaṭluʿu mutalaffiʿan bi-ʿabāʾatin samīkatin. fa-t-tafaqatā ʿalā ʿtibāri s-sābiqi fī ʾijbāri l-musāfiri ʿalā ḵalʿi ʿabāʾatihi l-ʾaqwā. ʿaṣafat rīḥu š-šamāli bi-ʾaqṣā mā staṭāʿat min quwwatin. walākin kullamā zdāda l-ʿaṣfu zdāda l-musāfiru tadaṯṯuran bi-ʿabāʾatihi, ʾilā ʾan ʾusqiṭa fī yadi r-rīḥi fataḵallat ʿan muḥāwalatihā. baʿdaʾiḏin saṭaʿati š-šamsu bi-difʾihā, famā kāna mina l-musāfiri ʾillā ʾan ḵalaʿa ʿabāʾatahu ʿalā t-tawwi. wa-hakaḏā ḍṭurrat rīḥu š-šamāli ʾilā l-ʾiʿtirāfi biʾanna š-šamsa kānat hiya l-ʾaqwā.

English Translation

The North Wind and the Sun were disputing which was the stronger, when a traveler came along wrapped in a warm cloak. They agreed that the one who first succeeded in making the traveler take his cloak off should be considered stronger than the other. Then the North Wind blew as hard as he could, but the more he blew the more closely did the traveler fold his cloak around him; and at last the North Wind gave up the attempt. Then the Sun shone out warmly, and immediately the traveler took off his cloak. And so the North Wind was obliged to confess that the Sun was the stronger of the two.


  1. Kirchhoff & Vergyri (2005:38)
  2. Kirchhoff & Vergyri (2005:38–39)
  3. Holes (2004:57)
  4. Lipinski (1997:124)
  5. Al-Jallad, 42
  6. Watson (2002:5, 15–16)
  7. Watson (2002:2)
  8. "Recently catalogued: an enigma in the Senior Library | Lincoln College Oxford". lincoln.ox.ac.uk. Retrieved 2022-04-11.
  9. Thelwall (1990:39)
  10. Holes (2004:60)
  11. Thelwall (1990:38)
  12. Abd-El-Jawad (1987:359)
  13. Abd-El-Jawad (1987:361)
  14. Watson (1999:290)
  15. Davis (1995:466)
  16. Elementary Modern Standard Arabic: Volume 1, by Peter F. Abboud (Editor), Ernest N. McCarus (Editor)
  17. Teach Yourself Arabic, by Jack Smart (Author), Frances Altorfer (Author)
  18. Holes (2004:58)
  19. Watson (2002:44)
  20. Thelwall (1990:38), Al Ani (1970:32, 44–45)
  21. Al-Azraqi. (2019). Delateralisation in Arabic and Mehri. Dialectologia, 23: 1–23. https://raco.cat/index.php/Dialectologia/article/download/366597/460520/
  22. Watson (2002:16)
  23. al Nassir, Abdulmunʿim Abdulamir (1985). Sibawayh the Phonologist (PDF) (in Arabic). University of New York. p. 80. Retrieved 23 April 2024.
  24. Watson (2002:18)
  25. Ladefoged & Maddieson (1996:167–168)
  26. Thelwall (1990), citing Gairdner (1925), Al Ani (1970), and Kästner (1981).
  27. McCarthy (1994:194–195)
  28. Watson (2002:19)
  29. Holes (2004:95)
  30. Ferguson (1956:449)
  31. Hans Wehr, Dictionary of Modern Written Arabic (transl. of Arabisches Wörterbuch für die Schriftsprache der Gegenwart, 1952)
  32. Ryding, Karin C. (2005-08-25). A Reference Grammar of Modern Standard Arabic. Cambridge University Press. ISBN 978-0-521-77151-1.
  33. Watson, Janet C. E. (2011). "Word stress in Arabic". In Marc van Oostendorp (ed.). The Blackwell Companion to Phonology. Vol. 5. Oxford: Wiley-Blackwell. pp. 2990–3019. ISBN 9781405184236.
  34. Versteegh, Kees (1997). The Arabic Language. Edinburgh: Edinburgh University Press. p. 90.
  35. Watson (2002:60–62), citing Ṣan‘ā’ni and Cairene as examples with and without this phoneme, respectively.
  36. Gulf Arabic Sounds
  37. Watson (2002:22)
  38. Watson (2002:14)
  39. Watson (2002:23)
  40. Watson (2002:21)
  41. Watson (2002:40)
  42. Thelwall (1990:37)
  43. Thelwall (1990:40)