Urdu alphabet

اردو تہجی
Figure: Example of writing in the Urdu alphabet: Urdu
Languages: Urdu, Balti, Burushaski, others
Unicode range: U+0600 to U+06FF, U+0750 to U+077F, U+FB50 to U+FDFF, U+FE70 to U+FEFF
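
The code point ranges listed above can be checked programmatically. The following is a minimal Python sketch, assuming the four ranges are treated as inclusive intervals; the names `URDU_RANGES`, `in_listed_ranges` and `all_in_listed_ranges` are illustrative and do not come from any library. The comments give the Unicode block that each range corresponds to.

```python
# Code point ranges listed above, as inclusive (start, end) pairs.
URDU_RANGES = [
    (0x0600, 0x06FF),  # Arabic
    (0x0750, 0x077F),  # Arabic Supplement
    (0xFB50, 0xFDFF),  # Arabic Presentation Forms-A
    (0xFE70, 0xFEFF),  # Arabic Presentation Forms-B
]


def in_listed_ranges(ch: str) -> bool:
    """Return True if the single character ch falls in one of the ranges."""
    cp = ord(ch)
    return any(start <= cp <= end for start, end in URDU_RANGES)


def all_in_listed_ranges(text: str) -> bool:
    """Return True if every non-space character of text is in the ranges."""
    return all(in_listed_ranges(ch) for ch in text if not ch.isspace())


if __name__ == "__main__":
    # The word "Urdu" written in Urdu script: alif, re, dal, wao.
    print(all_in_listed_ranges("\u0627\u0631\u062F\u0648"))
    # A Latin letter lies outside all four ranges.
    print(in_listed_ranges("a"))
```

A range check like this only says that a character belongs to the Arabic-script blocks; it does not distinguish Urdu from Arabic or Persian text, which share the same blocks.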

The Urdu alphabet is the right-to-left alphabet used for the Urdu language. It is a modification of the Persian alphabet, which is itself a derivative of the Arabic alphabet. With 38 letters and no distinct letter cases, the Urdu alphabet is typically written in the calligraphic Nastaʿlīq script, whereas Arabic is more commonly written in the Naskh style. Bare transliterations of Urdu into Roman letters (called Roman Urdu) usually omit many phonemic elements that have no equivalent in English or other languages commonly written in the Latin script. The National Language Authority of Pakistan has developed a number of systems with specific notations to signify non-English sounds, but these can only be properly read by someone already familiar with the loan letters.[citation needed]


The Urdu language emerged as a distinct register of Hindustani well before the Partition of India. It is distinguished most by its extensive Persian influences (Persian having been the official language of the Mughal government and the most prominent lingua franca of the Indian subcontinent for several centuries before the solidification of British colonial rule during the 19th century). The standard Urdu script is a modified version of the Perso-Arabic script, expanded to accommodate the phonology of Hindustani.

Despite the invention of the Urdu typewriter in 1911, Urdu newspapers continued to publish prints of handwritten scripts by calligraphers known as katibs or khush-navees until the late 1980s. The Pakistani national newspaper Daily Jang was the first Urdu newspaper to use Nastaʿlīq computer-based composition. There are efforts under way to develop more sophisticated and user-friendly Urdu support on computers and the internet. Nowadays, nearly all Urdu newspapers, magazines, journals, and periodicals are composed on computers with Urdu software programs.

Apart from differing in their degree of Persian influence, Urdu and Hindi are mutually intelligible.

Countries where the Urdu language has been spoken

Afghanistan, Bahrain, Bangladesh, Botswana, Burma, France, Fiji, Germany, Guyana, India, Kenya, Malaysia, Malawi, Mauritius, Norway, Oman, Pakistan, Qatar, Saudi Arabia, Singapore, South Africa, Thailand, Tajikistan, the UAE, the UK, Uganda, Uzbekistan, Canada and Zambia.[1]


Main article: Nastaʿlīq script

The Nastaʿlīq calligraphic writing style began as a Persian blend of the Naskh and Taʿlīq scripts. After the Mughal conquest, Nastaʿlīq became the preferred writing style for Urdu. It is the dominant style in Pakistan, and many Urdu writers elsewhere in the world use it. Nastaʿlīq is more cursive and flowing than its Naskh counterpart.


The Urdu alphabet, with names in the Devanagari and Latin alphabets

A list of the letters of the Urdu alphabet and their pronunciation is given below. Urdu contains many historical spellings from Arabic and Persian, and therefore has many irregularities. The Arabic letters yaa and haa both have two variants in Urdu: one of the yaa variants is used at the ends of words for the sound [eː], and one of the haa variants is used to indicate the aspirated consonants. The retroflex consonants needed to be added as well; this was accomplished by placing a small ط (tō'ē) above the corresponding dental consonants. Several letters which represent distinct consonants in Arabic are conflated in Persian, and this has carried over to Urdu. The table below lists the Urdu letters with their consonant pronunciations. Some of these letters also represent vowel sounds.

No. | Name | Transcription | IPA | Final | Medial | Initial | Isolated
1 | ʾalif | ā, ', – | /ɑː, ʔ, ∅/ | ـا | ـا | ا، آ | ا، آ
2 |  | b | /b/ | ـب | ـبـ | بـ | ب
3 |  | p | /p/ | ـپ | ـپـ | پـ | پ
4 |  | t | /t̪/ | ـت | ـتـ | تـ | ت
5 | ṭā |  | /ʈ/ | ـٹ | ـٹـ | ٹـ | ٹ
6 | thā | th | /s/ | ـث | ـثـ | ثـ | ث
7 | jīm | j | /d͡ʒ/ | ـج | ـجـ | جـ | ج
8 | chīm | c | /t͡ʃ/ | ـچ | ـچـ | چـ | چ
9 | baṛī hā | h | /h, ɦ/ | ـح | ـحـ | حـ | ح
10 |  | x | /x/ | ـخ | ـخـ | خـ | خ
11 | dāl | d | /d̪/ | ـد | ـد | د | د
12 | ḍāl |  | /ɖ/ | ـڈ | ـڈ | ڈ | ڈ
13 | thāl | th | /z/ | ـذ | ـذ | ذ | ذ
14 |  | r | /r/ | ـر | ـر | ر | ر
15 | ṛā |  | /ɽ/ | ـڑ | ـڑ | ڑ | ڑ
16 | zāy | z | /z/ | ـز | ـز | ز | ز
17 | žāy | zh | /ʒ/ | ـژ | ـژ | ژ | ژ
18 | sīn | s | /s/ | ـس | ـسـ | سـ | س
19 | šīn | sh | /ʃ/ | ـش | ـشـ | شـ | ش
20 | ṡu'ād | s | /s/ | ـص | ـصـ | صـ | ص
21 | du'ād | d | /d/ | ـض | ـضـ | ضـ | ض
22 | ṫā | t | /t/ | ـط | ـطـ | طـ | ط
23 | thā | th | /z/ | ـظ | ـظـ | ظـ | ظ
24 | ʿain | ā, ō, ē, ', | /ɑː, oː, eː, ʔ, ʕ, Ø/ | ـع | ـعـ | عـ | ع
25 | ğain | gh | /ɣ/ | ـغ | ـغـ | غـ | غ
26 | fa | f | /f/ | ـف | ـفـ | فـ | ف
27 | qāf | q | /q/ | ـق | ـقـ | قـ | ق
28 | kāf | k | /k/ | ـك | ـكـ | كـ | ك
29 | gāf | g | /ɡ/ | ـگ | ـگـ | گـ | گ
30 | lām | l | /l/ | ـل | ـلـ | لـ | ل
31 | mīm | m | /m/ | ـم | ـمـ | مـ | م
32 | nūn | n | /n, ɲ, ɳ, ŋ/ | ـن | ـنـ | نـ | ن
33 | chōtī hā | h | /hː/ | ـه | ـهـ | هـ | ه
34 | wa'ō | w | /u/ | ـو | ـو | و | و
35 | lāmalif |  | /la/ or /lā/ | ـلا | ـلا | لا | لا
36 | hamzah | ', – | /ʔ/, /Ø/ | ء | ء | ء | ء
37 | chōṭī yā | y, ī | /j, iː/ | ـی | ـيـ | یـ | ی
38 | baṛī yā | ai or ē | /ɛː, eː/ | ـے |  |  | ے
39 |  | y | /y:/[dubious] | ـي | ـيـ | يـ | ي
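
The final, medial, initial and isolated shapes in the table are normally chosen by the font from a single stored letter. Unicode also encodes these shapes as separate compatibility characters in the U+FB50 to U+FDFF and U+FE70 to U+FEFF ranges. The following minimal Python sketch looks them up for one letter using the standard `unicodedata` module; the function name `contextual_forms` and the constant `FORM_TAGS` are illustrative, and the approach assumes that each positional form carries a compatibility decomposition tagged with its position.

```python
import unicodedata

# Decomposition tags that mark a positional shape of a single letter.
FORM_TAGS = ("<isolated>", "<final>", "<medial>", "<initial>")

# Arabic Presentation Forms-A and Forms-B, as inclusive ranges.
PRESENTATION_RANGES = ((0xFB50, 0xFDFF), (0xFE70, 0xFEFF))


def contextual_forms(letter: str) -> dict:
    """Map position name -> presentation-form character for one base letter."""
    forms = {}
    for start, end in PRESENTATION_RANGES:
        for cp in range(start, end + 1):
            # A single-letter form decomposes as e.g. "<initial> 0628".
            parts = unicodedata.decomposition(chr(cp)).split()
            if len(parts) != 2 or parts[0] not in FORM_TAGS:
                continue  # unassigned, a ligature, or not a positional form
            if int(parts[1], 16) == ord(letter):
                forms[parts[0].strip("<>")] = chr(cp)
    return forms


if __name__ == "__main__":
    for position, ch in contextual_forms("\u0628").items():  # the letter be
        print(position, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Letters that do not join to a following letter, such as dāl, only have isolated and final entries, which matches the rows in the table where the medial and initial columns repeat the other shapes.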

اَ بَ پَ تَ ٹَ ثَ جَ چَ حَ خَ دَ ڈَ ذَ ڎَ رَ ڑَ زَ ژَ سَ شَ صَ ضَ طَ ظَ عَ غَ فَ قَ كَ گَ لَ مَ نَ ںَ وَ هَ ھَ لاَ ءَ ىَ ےَ يَ


Vowels in Urdu are represented by letters that are also considered consonants. Many vowel sounds can be represented by one letter. Confusion can arise, but context is usually enough to figure out the correct sound.

Vowel chart

This is a list of Urdu vowels found in the initial, medial, and final positions.

Romanization Pronunciation Final Medial Initial
a /ə/ Zabar-malplena.svg Zabar-malplena.svg AlifZabar-komenca-malplena.svg
ā /aː/ Alif-fina-malplena.svg Alif-meza-malplena.svg AlifMadd-komenca-malplena.svg
i /ɪ/ Hamzah-Urdua-fina-malplena.svg Zer-malplena.svg AlifZer-komenca-malplena.svg
ī /iː/ CHoTTiiYe-fina-malplena.svg CHoTTiiYe-meza-malplena.svg CHoTTiiYe-komenca-malplena.svg
u /ʊ/ Pesh-malplena.svg Pesh-malplena.svg AlifPesh-komenca-malplena.svg
ū /uː/ VaaoUlTTaapesh-fina-malplena.svg VaaoPesh-meza-malplena.svg AlifVaao-komenca-malplena.svg
ē /eː/ BaRRiiYe-fina-malplena.svg CHoTTiiYe-meza-malplena.svg AlifCHoTTiiYe-komenca-malplena.svg
ai /ɛː/ BaRRiiYeZabar-fina-malplena.svg CHoTTiiYe-meza-malplena.svg CHoTTiiYe-komenca-malplena.svg
ō /oː/ Vaao-fina-malplena.svg Vaao-meza-malplena.svg AlifVaao-komenca-malplena.svg
au /ɔː/ VaaoZabar-fina-malplena.svg VaaoZabar-meza-malplena.svg AlifZabarVaao-komenca-malplena.svg

Short vowels

Short vowels ("a", "i", "u") are represented by marks above and below a consonant.

Vowel Name Transcription IPA
اَ zabar a /ə/
اِ zer i /ɪ/
اُ pesh u /ʊ/
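
Because the short vowel marks are separate combining characters placed on a base letter, they can be detected or removed in software. The following is a minimal Python sketch, assuming that zabar, zer and pesh are stored as the Unicode characters U+064E, U+0650 and U+064F (named FATHA, KASRA and DAMMA in Unicode); the names `strip_short_vowels` and `is_combining` are illustrative.

```python
import unicodedata

# Urdu names of the marks mapped to their Unicode code points.
ZABAR = "\u064E"  # ARABIC FATHA, written above the letter
ZER = "\u0650"    # ARABIC KASRA, written below the letter
PESH = "\u064F"   # ARABIC DAMMA, written above the letter

SHORT_VOWEL_MARKS = (ZABAR, ZER, PESH)


def is_combining(ch: str) -> bool:
    """Return True if ch is a combining mark that attaches to a base letter."""
    return unicodedata.combining(ch) != 0


def strip_short_vowels(text: str) -> str:
    """Remove zabar, zer and pesh, leaving the base letters untouched."""
    return "".join(ch for ch in text if ch not in SHORT_VOWEL_MARKS)


if __name__ == "__main__":
    marked = "\u0627" + ZABAR  # alif carrying a zabar
    print(len(marked), len(strip_short_vowels(marked)))
    for mark in SHORT_VOWEL_MARKS:
        print(f"U+{ord(mark):04X}", unicodedata.name(mark), is_combining(mark))
```

Stripping the marks mirrors ordinary Urdu writing, where short vowels are usually left unwritten and recovered from context.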


Alif (ا) is the first letter of the Urdu alphabet, and it is used exclusively as a vowel. At the beginning of a word, alif can be used to represent any of the short vowels, e.g. اب ab, اسم ism, اردو urdū. Written with a madd (آ), it represents the long vowel ā at the beginning of a word, e.g. آپ āp, آدمی ādmī. In the middle or at the end of a word, long ā is written with a plain alif, e.g. بات bāt, آرام ārām.


Wā'ō is used to render the vowels "ū", "ō", "u" and "au" ([uː], [oː], [ʊ] and [ɔː] respectively), and it is also used to render the labiodental approximant, [ʋ].


Ye is divided into two variants: choṭī ye and baṛī ye.

Choṭī ye (ی) is written in all forms exactly as in Persian. It is used for the long vowel "ī" and the consonant "y".

Baṛī ye (ے) is used to render the vowels "e" and "ai" (/eː/ and /ɛː/ respectively). Baṛī ye is distinguished in writing from choṭī ye only when it comes at the end of a word.

Use of specific letters

Retroflex letters

Retroflex consonants were not present in the Persian alphabet, and therefore had to be created specifically for Urdu. This was accomplished by placing a superscript ط (to'e) above the corresponding dental consonants.

Letter Name IPA
ٹ ṭē [ʈ]
ڈ ḍāl [ɖ]
ڑ ṛē [ɽ]
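
Although the retroflex letters are written as a base letter with a small ط above it, Unicode encodes each one as a single character rather than as a base letter plus a combining mark. The following minimal Python sketch illustrates this with the standard `unicodedata` module; the mapping `RETROFLEX_TO_BASE` and the helper names are illustrative.

```python
import unicodedata

# Retroflex letter -> the base letter it was derived from, per the text above.
RETROFLEX_TO_BASE = {
    "\u0679": "\u062A",  # ٹ from ت
    "\u0688": "\u062F",  # ڈ from د
    "\u0691": "\u0631",  # ڑ from ر
}


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def is_atomic(ch: str) -> bool:
    """Return True if Unicode defines no decomposition for ch."""
    return unicodedata.decomposition(ch) == ""


if __name__ == "__main__":
    for retroflex, base in RETROFLEX_TO_BASE.items():
        print(describe(retroflex), "<-", describe(base), is_atomic(retroflex))
```

One practical consequence is that Unicode normalization does not turn a retroflex letter into its base letter, so a search that should treat the two alike needs an explicit mapping such as the one above.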

Do chashmī he

The letter do chashmī he (ھ) is used in native Hindustānī words, for aspiration of certain consonants. The aspirated consonants are sometimes classified as separate letters, although it takes two characters to represent them.

Letter Transcription IPA
بھا bhā [bʱɑː]
پھا phā [pʰɑː]
تھا thā [t̪ʰɑː]
ٹھا ṭhā [ʈʰɑː]
جھا jhā [d͡ʒʱɑː]
چھا chā [t͡ʃʰɑː]
دھا dhā [dʱɑː]
ڈھا ḍhā [ɖʱɑː]
ڑھا ṛhā [ɽʱɑː]
کھا khā [kʰɑː]
گھا ghā [ɡʱɑː]
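
Since each aspirated consonant takes two characters, software that counts or sorts Urdu letters may want to treat the pair as one unit. The following is a minimal Python sketch under that assumption; the function name `segment` is illustrative, and the rule it applies (attach every ھ to the character before it) is a simplification that does not check whether the preceding character is one of the consonants in the table.

```python
DO_CHASHMI_HE = "\u06BE"  # ھ, ARABIC LETTER HEH DOACHASHMEE in Unicode


def segment(word: str) -> list:
    """Split a word into units, attaching each ھ to the preceding character."""
    units = []
    for ch in word:
        if ch == DO_CHASHMI_HE and units:
            units[-1] += ch  # consonant + ھ forms one aspirated unit
        else:
            units.append(ch)
    return units


if __name__ == "__main__":
    bha = "\u0628\u06BE\u0627"  # the first row of the table: be, do chashmī he, alif
    print(len(bha), len(segment(bha)))
```

With this segmentation the first table entry is counted as two units (the aspirated consonant and the vowel) even though it is stored as three characters.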

Uddin and Begum Urdu-Hindustani Romanization

Uddin and Begum Urdu-Hindustani Romanization is another romanization system for Hindustani. It was proposed by the late Syed Fasih Uddin and the late Quader Unissa Begum, and was adopted by the First International Urdu Conference (Chicago) in 1992 as "The Modern International Standard Letters of Alphabet for URDU-(HINDUSTANI) - The INDIAN Language script for the purposes of hand written communication, dictionary references, published material and Computerized Linguistic Communications (CLC)".

There are significant advantages to this transcription system:

  • It provides a standard based on the original work undertaken at Fort William College, Calcutta, India (established 1800), under John Borthwick Gilchrist (1759–1841), which became the de facto standard for Hindustani during the late 1800s.
  • There is a one-to-one representation for each of the original Urdu and Hindi characters.
  • Vowel sounds are written rather than being assumed as they are in the Urdu alphabet.
  • Unlike Gilchrist’s alphabet, which used many special non-ASCII characters, the proposed alphabet only uses ASCII.
  • Since it is ASCII based, more resources and tools are available.
  • It frees the Urdu–Hindustani language to be written and communicated using all of the available standards, without the drudgery of Unicode conversion.
  • With this character set, Urdu–Hindustani can make full use of paper and electronic print media.
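
The ASCII-only claim in the list above can be illustrated with a short check. The following minimal Python sketch compares three spellings of the word "Urdu"; the sample strings are illustrative and are not taken from the Uddin and Begum alphabet, whose letter assignments are not given here, and the helper names are hypothetical.

```python
def is_ascii_only(text: str) -> bool:
    """Return True if every character of text is in the ASCII range."""
    return text.isascii()


def utf8_length(text: str) -> int:
    """Return the number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


SAMPLES = {
    "plain Latin letters": "urdu",
    "Latin with a macron": "urd\u016B",            # urdū, as romanized earlier
    "Urdu script": "\u0627\u0631\u062F\u0648",     # alif, re, dal, wao
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(label, is_ascii_only(text), utf8_length(text))
```

Only the first spelling passes the ASCII check; romanizations that rely on diacritics such as the macron fall outside ASCII just as the Urdu script does.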
