Persian alphabet

From Wikipedia, the free encyclopedia

The Persian alphabet (Persian: الفبای فارسی‎, alefbā-ye fârsi), or Perso-Arabic alphabet, is a writing system used for the Persian language.

The Persian script is a modified version of the Arabic script. It is an abjad, meaning vowels are underrepresented in writing. The writing direction is mostly but not exclusively right-to-left; mathematical expressions, numeric dates and numbers bearing units are embedded from left to right. The script is cursive, meaning most letters in a word connect to each other; when text is typed, contemporary word processors automatically join adjacent letterforms. However, some Persian compounds do not join. Persian also adds four letters to the basic Arabic set, for a total of 32 characters.
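As a rough illustration of the mixed directionality described above, Python's standard unicodedata module can report the Unicode bidirectional class of each character. The helper name bidi_classes and the sample phrase are assumptions made for this sketch. Persian letters should report AL (right-to-left Arabic letter), while the Persian digits report EN (European number), which is the property rendering engines use to lay out digit runs left to right.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidirectional class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hypothetical sample: the word for "year" followed by the number 1402
# written with Persian digits (U+06F0..U+06F9).
sample = "\u0633\u0627\u0644 \u06F1\u06F4\u06F0\u06F2"

for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {cls}")
```

The letters and the digits are stored in the same logical order in which they are read aloud; the display order is worked out by the renderer from these classes.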

The Tahirid dynasty replaced the Pahlavi scripts with the Persian alphabet for writing the Persian language in 9th-century Greater Khorasan.[1][2]


Example showing the Nastaʿlīq calligraphic style's proportion rules

Below are the 32 letters of the modern Persian alphabet. Since the script is cursive, the appearance of a letter changes depending on its position: isolated, initial (joined on the left), medial (joined on both sides) and final (joined on the right) of a word.[3]
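The four positional shapes can be inspected programmatically. In modern text only the base letter is stored and the font selects the shape, but Unicode also keeps legacy "Arabic Presentation Forms" code points, one per shape, whose compatibility decompositions are tagged with the position. The sketch below relies on that and uses only the standard unicodedata module; the function name contextual_forms is an assumption for illustration.

```python
import unicodedata

def contextual_forms(letter):
    """Map position name -> legacy presentation-form character for one letter."""
    target = f"{ord(letter):04X}"
    forms = {}
    # Arabic Presentation Forms-A (U+FB50..U+FDFF) and Forms-B (U+FE70..U+FEFF).
    for cp in list(range(0xFB50, 0xFE00)) + list(range(0xFE70, 0xFF00)):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Entries look like "<initial> 0628"; keep single-letter forms only.
        if len(parts) == 2 and parts[1] == target:
            forms[parts[0].strip("<>")] = chr(cp)
    return forms

for name, glyph in contextual_forms("\u0628").items():  # be (U+0628)
    print(name, f"U+{ord(glyph):04X}")

print(sorted(contextual_forms("\u0627")))  # alef (U+0627) has fewer forms
```

These presentation-form code points are meant for compatibility with older encodings; ordinary Persian text should use the base letters from the table below and leave shaping to the renderer.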

The names of the letters are mostly the ones used in Arabic, but with Persian pronunciation. The only ambiguous name is he, which is used for both ح and ه. For clarification, they are often called ḥâ-ye ḥotti or ḥâ-ye jimi (literally "jim-like ḥe" after jim, the name for the letter ج that uses the same base form) and hâ-ye havvaz or hâ-ye do-češm (literally "two-eyed he", after the contextual middle letterform ـهـ), respectively.

Overview table

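| # | Name (in Persian) | Name | DIN 31635 | IPA | Unicode | Final | Medial | Initial | Isolated |
|---|---|---|---|---|---|---|---|---|---|
| 0 | همزه | hamzeh[4] | ʾ | [ʔ] | U+0621 | N/A | N/A | N/A | ء |
| | | | | | U+0623 | ـأ | | | أ |
| | | | | | U+0626 | ـئ | ـئـ | ئـ | ئ |
| | | | | | U+0624 | ـؤ | | | ؤ |
| 1 | الف | ʾalef | â | [ɒ] | U+0627 | ـا | | | ا |
| 2 | به | be | b | [b] | U+0628 | ـب | ـبـ | بـ | ب |
| 3 | په | pe | p | [p] | U+067E | ـپ | ـپـ | پـ | پ |
| 4 | ته | te | t | [t] | U+062A | ـت | ـتـ | تـ | ت |
| 5 | ثه | s̱e | | [s] | U+062B | ـث | ـثـ | ثـ | ث |
| 6 | جیم | jim | j | [d͡ʒ] | U+062C | ـج | ـجـ | جـ | ج |
| 7 | چه | che | č | [t͡ʃ] | U+0686 | ـچ | ـچـ | چـ | چ |
| 8 | حه | ḥe (ḥâ-ye ḥotti, ḥâ-ye jimi) | | [h] | U+062D | ـح | ـحـ | حـ | ح |
| 9 | خه | khe | kh | [x] | U+062E | ـخ | ـخـ | خـ | خ |
| 10 | دال | dâl | d | [d] | U+062F | ـد | | | د |
| 11 | ذال | ẕâl | | [z] | U+0630 | ـذ | | | ذ |
| 12 | ره | re | r | [ɾ] | U+0631 | ـر | | | ر |
| 13 | زه | ze | z | [z] | U+0632 | ـز | | | ز |
| 14 | ژه | že | ž | [ʒ] | U+0698 | ـژ | | | ژ |
| 15 | سین | sin | s | [s] | U+0633 | ـس | ـسـ | سـ | س |
| 16 | شین | šin | š | [ʃ] | U+0634 | ـش | ـشـ | شـ | ش |
| 17 | صاد | ṣâd | | [s] | U+0635 | ـص | ـصـ | صـ | ص |
| 18 | ضاد | zâd | z | [z] | U+0636 | ـض | ـضـ | ضـ | ض |
| 19 | طی، طا | tâ, toy (in Dari) | ṭ | [t] | U+0637 | ـط | ـطـ | طـ | ط |
| 20 | ظی، ظا | ẓâ, ẓoy (in Dari) | | [z] | U+0638 | ـظ | ـظـ | ظـ | ظ |
| 21 | عین | ʿayn | ʿ | [ʔ] | U+0639 | ـع | ـعـ | عـ | ع |
| 22 | غین | ġayn | ġ | [ɣ] | U+063A | ـغ | ـغـ | غـ | غ |
| 23 | فه | fe | f | [f] | U+0641 | ـف | ـفـ | فـ | ف |
| 24 | قاف | q̈âf | | [ɣ] | U+0642 | ـق | ـقـ | قـ | ق |
| 25 | کاف | kâf | k | [k] | U+06A9 | ـک | ـکـ | کـ | ک |
| 26 | گاف | gâf | g | [ɡ] | U+06AF | ـگ | ـگـ | گـ | گ |
| 27 | لام | lâm | l | [l] | U+0644 | ـل | ـلـ | لـ | ل |
| 28 | میم | mim | m | [m] | U+0645 | ـم | ـمـ | مـ | م |
| 29 | نون | nun | n | [n] | U+0646 | ـن | ـنـ | نـ | ن |
| 30 | واو | vâv | v / ū / ow / (w / aw / ō in Dari) | [v] / [uː] / [o] / [ow] / ([w] / [aw] / [oː] in Dari) | U+0648 | ـو | | | و |
| 31 | هه | he (hâ-ye havvaz, hâ-ye do-češm) | h | [h] | U+0647 | ـه | ـهـ | هـ | ه |
| 32 | یه | ye | y / ī / á / (ay / ē in Dari) | [j] / [i] / [ɒː] / ([aj] / [eː] in Dari) | U+06CC | ـی | ـیـ | یـ | ی |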

Letters that do not link to a following letter

Seven letters (و, ژ, ز, ر, ذ, د, ا) do not connect to a following letter, unlike the rest of the letters of the alphabet. These seven letters have one form in isolated and initial position and a second form in medial and final position. For example, when the letter ا alef is at the beginning of a word such as اینجا injâ ("here"), it takes the same form as an isolated alef. In امروز emruz ("today"), the letter ر re takes the final form and the letter و vâv takes the isolated form even though both are in the middle of the word, and ز ze takes its isolated form even though it occurs at the end of the word.
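The joining rule above can be sketched in a few lines of Python. This is a simplified model, not a real text-shaping engine: it assumes the input contains only letters (no diacritics, hamza, spaces or zero-width characters), and the names NON_CONNECTING and positional_forms are made up for the illustration.

```python
# The seven letters that never connect to a following letter.
NON_CONNECTING = set("\u0627\u062F\u0630\u0631\u0632\u0698\u0648")  # ا د ذ ر ز ژ و

def positional_forms(word):
    """Return the positional form taken by each letter of a word."""
    forms = []
    for i, ch in enumerate(word):
        # Joined to the preceding letter only if that letter connects forward.
        joined_to_previous = i > 0 and word[i - 1] not in NON_CONNECTING
        # Joined to the following letter only if this letter connects forward.
        joined_to_next = i < len(word) - 1 and ch not in NON_CONNECTING
        if joined_to_previous and joined_to_next:
            forms.append("medial")
        elif joined_to_previous:
            forms.append("final")
        elif joined_to_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms

emruz = "\u0627\u0645\u0631\u0648\u0632"  # امروز
print(positional_forms(emruz))
# ['isolated', 'initial', 'final', 'isolated', 'isolated']
```

The output matches the description of emruz: re comes out as final, while vâv and ze come out as isolated despite their positions in the word.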


The Persian script has adopted a subset of the Arabic diacritics: zebar /æ/ (fatḥah in Arabic), zir /e/ (kasrah in Arabic), pish /o/ (ḍammah in Arabic, pronounced zamme in Western Persian), tanvin e nasb /æn/ and shadda (gemination). Other Arabic diacritics may be seen in Arabic loanwords.

Short vowels

Of the four Arabic short vowels, the Persian language has adopted the following three. The last one, sukūn, is not adopted.

Short vowels

| Name | Trans. | Value |
|---|---|---|
| zebar/zibar | a | Ir. /æ/; D. /a/ |
| zer/zir | Ir. e; D. i | Ir. /e/; D. /ɪ/ |
| pesh/pish | Ir. o; D. u | Ir. /o/; D. /ʊ/ |

In Iranian Persian, none of these short vowels may be the initial or final grapheme in an isolated word, although they may appear in final position as an inflection when the word is part of a noun group. In a word that starts with a vowel, the first grapheme is a silent alef which carries the short vowel, e.g. اُمید (omid, meaning "hope"). In a word that ends with a vowel, the letters ع, ه and و respectively become the proxy letters for zebar, zir and pish, e.g. نو (no, meaning "new") or بسته (bas-teh, meaning "package").
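Because the short-vowel marks are combining characters layered on top of the letters, fully vocalized text can be reduced to its everyday unvocalized spelling by dropping them. The sketch below uses the standard unicodedata module; the dictionary DIACRITICS and the function strip_diacritics are illustrative names, and the mapping from the Persian names to Unicode code points is stated here as an assumption based on the Arabic equivalents.

```python
import unicodedata

# Persian name -> code point (Unicode character name in the comment).
DIACRITICS = {
    "zebar": "\u064E",          # ARABIC FATHA
    "zir": "\u0650",            # ARABIC KASRA
    "pish": "\u064F",           # ARABIC DAMMA
    "tanvin e nasb": "\u064B",  # ARABIC FATHATAN
    "tashdid": "\u0651",        # ARABIC SHADDA
}

def strip_diacritics(text):
    """Drop every combining mark, leaving only the base letters."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))

vocalized = "\u0627\u064F\u0645\u06CC\u062F"  # omid written with pish on the alef
plain = strip_diacritics(vocalized)
print(len(vocalized), len(plain))  # 5 4
```

Note that this removes every combining mark, so it would also strip marks that a text uses deliberately to disambiguate a word.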

Tanvin (nunation)

| Nunation (fully vocalized text) | Name (in Persian) | Name | Notes |
|---|---|---|---|
| َاً، ـاً، ءً | تنوین نصب | Tanvin e nasb | |
| | تنوین جرّ | Tanvin e jarr | Never used in the Persian language. Taught in Islamic nations to complement Quran education. |
| | تنوین رفع | Tanvin e rafe | |


| Name (in Persian) | Name |
|---|---|
| تشدید | tashdid |

Other characters

The following are not actual letters but different orthographical shapes for letters, or a ligature in the case of lâm alef. As for ء (hamza), it has only one graphic form, since it is never tied to a preceding or following letter. However, it is sometimes 'seated' on a vâv, ye or alef, and in that case the seat behaves like an ordinary vâv, ye or alef respectively. Technically, hamza is not a letter but a diacritic.

| Name | Pronunciation | IPA | Unicode | Final | Medial | Initial | Stand-alone | Notes |
|---|---|---|---|---|---|---|---|---|
| alef madde | â | [ɒ] | U+0622 | ـآ | | آ | آ | The final form is very rare and is freely replaced with ordinary alef. |
| he ye | -eye or -eyeh | [eje] | U+06C0 | ـۀ | | | ۀ | Validity of this form depends on region and dialect. Some may use the three-letter ـه‌ی combination instead. |
| lām alef | | [lɒ] | U+0644 (lām) and U+0627 (alef) | ـلا | | | لا | |
| kashida | | | U+0640 | | ـ | | | This is the medial character which connects other characters. |
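The table lists lām alef under two code points because the ligature is normally produced by the font from an ordinary lām followed by an ordinary alef. Unicode normalization makes this visible; the sketch below uses the standard unicodedata.normalize function, and the constant names are assumptions for illustration. It unfolds a legacy precomposed ligature (U+FEFB) into its two letters and splits alef madde into alef plus a combining madda.

```python
import unicodedata

LAM_ALEF_ISOLATED = "\uFEFB"  # legacy presentation-form ligature
ALEF_MADDE = "\u0622"         # alef with madda above

def codepoints(text):
    """Format a string as a list of U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]

# Compatibility normalization unfolds the ligature into lam + alef.
print(codepoints(unicodedata.normalize("NFKC", LAM_ALEF_ISOLATED)))
# ['U+0644', 'U+0627']

# Canonical decomposition splits alef madde into alef + combining madda.
print(codepoints(unicodedata.normalize("NFD", ALEF_MADDE)))
# ['U+0627', 'U+0653']
```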

Although the Persian and Arabic alphabets may seem similar at first glance, the two languages use them differently in many ways. For example, similar words are written differently in Persian and Arabic.

Novel letters

The Persian alphabet adds four letters to the Arabic alphabet, for the sounds /p/, /ɡ/, /t͡ʃ/ (ch in chair) and /ʒ/ (s in measure).

| Sound | Shape | Unicode name | Unicode code point |
|---|---|---|---|
| /p/ | پ | peh | U+067E |
| /t͡ʃ/ (ch) | چ | tcheh | U+0686 |
| /ʒ/ (zh) | ژ | jeh | U+0698 |
| /ɡ/ | گ | gāf | U+06AF |
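The Unicode names in the table can be checked against the character database that ships with Python. This is a small sketch using the standard unicodedata module; the constant NOVEL_LETTERS is a name chosen for the example.

```python
import unicodedata

# The four added letters, given by code point: U+067E, U+0686, U+0698, U+06AF.
NOVEL_LETTERS = "\u067E\u0686\u0698\u06AF"

for ch in NOVEL_LETTERS:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Each name is reported with the prefix "ARABIC LETTER", because Unicode encodes the Persian additions in the Arabic block rather than in a separate Persian block.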

Deviations from the Arabic script

The Persian digits have different code points from the Arabic digits, and the shapes of four (۴), five (۵), and six (۶) also differ from those used in Arabic.[5] The letters ye and kāf likewise have their own code points in Persian.

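| Name | Persian | Unicode | Arabic | Unicode |
|---|---|---|---|---|
| 0 | ۰ | U+06F0 | ٠ | U+0660 |
| 1 | ۱ | U+06F1 | ١ | U+0661 |
| 2 | ۲ | U+06F2 | ٢ | U+0662 |
| 3 | ۳ | U+06F3 | ٣ | U+0663 |
| 4 | ۴ | U+06F4 | ٤ | U+0664 |
| 5 | ۵ | U+06F5 | ٥ | U+0665 |
| 6 | ۶ | U+06F6 | ٦ | U+0666 |
| 7 | ۷ | U+06F7 | ٧ | U+0667 |
| 8 | ۸ | U+06F8 | ٨ | U+0668 |
| 9 | ۹ | U+06F9 | ٩ | U+0669 |
| ye | ی | U+06CC | ي | U+064A |
| kāf | ک | U+06A9 | ك | U+0643 |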
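Because the Persian and Arabic variants are separate code points, a search for a word typed on an Arabic keyboard can miss the same word typed on a Persian one. A common workaround is to map one set onto the other before comparing. The sketch below builds such a mapping directly from the table above; the names ARABIC_TO_PERSIAN and to_persian_codepoints are hypothetical, and applying the mapping blindly to genuine Arabic text would be inappropriate.

```python
# Arabic code point -> Persian code point, following the table above.
ARABIC_TO_PERSIAN = {0x0660 + i: 0x06F0 + i for i in range(10)}  # digits 0-9
ARABIC_TO_PERSIAN[0x064A] = 0x06CC  # ye
ARABIC_TO_PERSIAN[0x0643] = 0x06A9  # kāf

def to_persian_codepoints(text):
    """Replace Arabic digits, ye and kāf with their Persian counterparts."""
    return text.translate(ARABIC_TO_PERSIAN)

arabic_456 = "\u0664\u0665\u0666"
persian_456 = "\u06F4\u06F5\u06F6"
print(to_persian_codepoints(arabic_456) == persian_456)  # True

# Python's int() already accepts either digit set.
print(int("\u06F4\u06F2"))  # 42
```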

Word boundaries

Typically, words are separated from each other by a space. Certain morphemes (such as the plural ending '-hâ'), however, are written without a space. On a computer, they are separated from the word using the zero-width non-joiner.
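The zero-width non-joiner is the code point U+200C. A minimal sketch of attaching the plural ending with it follows; the helper add_plural and the sample noun (the word for "book") are assumptions chosen for illustration and do not come from the text above.

```python
ZWNJ = "\u200C"                # ZERO WIDTH NON-JOINER
PLURAL_SUFFIX = "\u0647\u0627"  # the plural ending -hâ

def add_plural(noun):
    """Attach -hâ without a space and without joining it to the noun."""
    return noun + ZWNJ + PLURAL_SUFFIX

ketab = "\u06A9\u062A\u0627\u0628"  # hypothetical sample noun
plural = add_plural(ketab)

print([f"U+{ord(ch):04X}" for ch in plural])
print(len(plural.split()))  # 1: the ZWNJ is not whitespace, so this is one token
```

Since the ZWNJ is invisible and is not whitespace, naive tokenizers keep the noun and its suffix together, while code that strips unknown control characters can silently change how the word is displayed.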

References

  1. Ira M. Lapidus (2012). Islamic Societies to the Nineteenth Century: A Global History. Cambridge University Press. pp. 256–. ISBN 978-0-521-51441-5.
  2. Ira M. Lapidus (2002). A History of Islamic Societies. Cambridge University Press. pp. 127–. ISBN 978-0-521-77933-3.
  3. "ویژگى‌هاى خطّ فارسى". Academy of Persian Language and Literature.
  4. "??" (PDF). Retrieved 2015-09-05.
  5. "Unicode Characters in the 'Number, Decimal Digit' Category".
