yaɣnobī́ zivók, яғнобӣ зивок
Native to Tajikistan
Region originally from Yaghnob Valley, in 1970s relocated to Zafarobod, in 1990s some speakers returned to Yaghnob
Ethnicity Yaghnobi people
Native speakers
12,000 (2004)[1]
Early form
  • Eastern Yaghnobi
  • Western Yaghnobi
Cyrillic script
Latin script
Arabic script
Language codes
ISO 639-3 yai
Glottolog yagn1238[4]
Linguasphere 58-ABC-a
Yaghnobi-speaking areas and enclaves of Yaghnobi-speakers among a Tajik majority

The Yaghnobi language[5] is a living Eastern Iranian language (the other living members being Pashto, Ossetic and the Pamir languages). Yaghnobi is spoken in the upper valley of the Yaghnob River in the Zarafshan area of Tajikistan by the Yaghnobi people. It is considered to be a direct descendant of Sogdian and has often been called Neo-Sogdian in academic literature.[6]

There are some 12,500 Yaghnobi speakers. They are divided into several communities. The principal group lives in the Zafarobod area. There are also resettlers in the Yaghnob Valley. Some communities live in the villages of Zumand and Kůkteppa and in Dushanbe or in its vicinity.

Most Yaghnobi speakers are bilingual in Tajik, a Western Iranian language. Yaghnobi is mostly used for daily family communication, and Tajik is used for business and formal transactions. A single Russian ethnographer was told by nearby Tajiks, long hostile to the Yaghnobis, who were late to adopt Islam, that the Yaghnobis used their language as a "secret" mode of communication to confuse the Tajiks. The account led some, especially those reliant solely on Russian sources, to believe that Yaghnobi or some derivative of it was used as a code for nefarious purposes.[7]

There are two main dialects: a western and an eastern one. They differ primarily in phonetics. For example, historical *θ corresponds to t in the western dialects and s in the eastern: W. met × E. mes 'day' from Sogdian mēθ ⟨myθ⟩. Western ay corresponds to Eastern e: W. wayš × E. weš 'grass' from Sogdian wayš or wēš ⟨wyš⟩. The early Sogdian group θr (later ṣ̌) is reflected as sar in the east but tir in the west: E. saráy × W. tiráy 'three' from Sogdian θrē/θray or ṣ̌ē/ṣ̌ay ⟨δry⟩. There are also some differences in verbal endings and in the lexicon. Between the two main dialects is a transitional dialect that shares some features of both.


Yaghnobi was unwritten until the 1990s,[8] but according to Andreyev, some of the Yaghnobi mullahs used the Arabic script for writing the language before 1928, mainly when they needed to hide some information from the Tajiks.[9] Nowadays, the language is transcribed by scholars using a modified Latin alphabet, with the following symbols: a (á), ā (ā́), b, č, d, e (é), f, g, ɣ, h, ḥ, i (í), ī (ī́), ǰ, k, q, l, m (m̃), n (ñ), o (ó), p, r, s, š, t, u (ú), ū (ū́), ʏ (ʏ́), v, w (u̯), x, x°, y, z, ž, ع

TITUS transcribes the alphabet as follows: a (á), b, č, d, e (é), ĕ (ĕ́), ẹ (ẹ́), ẹ̆ (ẹ̆́), ə (ə́), f, g, ɣ, h, x̣, i (í), ĭ (ĭ́), ī (ī́), ǰ, k, q, l, m (m̃), n (ñ), o (ó), ọ (ọ́), p, r, s, š, t, u (ú), ŭ (ŭ́), ı̥ (í̥), v, u̯, x, x°, y, z, ž, ع

In recent times, Sayfiddīn Mīrzozoda, from the Tajik Academy of Sciences, uses a modified Tajik alphabet for writing Yaghnobi. The alphabet is quite unsuitable for Yaghnobi, as it does not distinguish short and long vowels or v and w and it does not mark stress. Latin equivalents are given in parentheses:

А а (a) Б б (b) В в (v) Ԝ ԝ (w) Г г (g) Ғ ғ (ɣ) Д д (d) Е е (e/ye) Ё ё (yo) Ж ж (ž) З з (z) И и (i, ī) Ӣ ӣ (ī) Й й (y) К к (k) Қ қ (q) Л л (l) М м (m) Н н (n) О о (o) П п (p) Р р (r) С с (s) Т т (t) У у (u, ū, ʏ) Ӯ ӯ (ū, ʏ) Ф ф (f) Х х (x) Хԝ хԝ (x°) Ҳ ҳ (h, ḥ) Ч ч (č) Ҷ ҷ (ǰ) Ш ш (š) Ъ ъ (ع) Э э (e) Ю ю (yu, yū, yʏ) Я я (ya)

Cyrillic script

А а Б б В в Ԝ ԝ Г г Ғ ғ
Д д Е е Ё ё Ж ж З з И и
Ӣ ӣ Й й К к Қ қ Л л М м
Н н О о П п Р р С с Т т
У у Ӯ ӯ Ф ф Х х Ҳ ҳ Ч ч
Ҷ ҷ Ш ш Ъ ъ Э э Ю ю Я я

Notes to Cyrillic:

1) The letter й never appears at the beginning of a word. Words beginning with ya-, yo- and yu-/yū-/yʏ- are written with я-, ё- and ю-, and the same letters are used for these combinations in the middle of a word: viyóra is виёра [vɪ̆ˈjoːra].

2) The use of ӣ and ӯ is uncertain, but they seem to distinguish similar-sounding words: иранка and ӣранка, рупак and рӯпак. The letter ӣ may also be used as a stress marker, as it is in Tajik, and ӯ may be used in Tajik loanwords to indicate the Tajik vowel ⟨ů⟩ [ɵː], but it may have some other, unknown use.

3) In older texts, the alphabet did not use the letters Ъ ъ and Э э. Instead of Tajik ъ, Yaghnobi used an apostrophe (’), and е covered both Tajik е and э for /e/. Later, the letters were integrated into the alphabet, so the older етк was changed into этк to represent the pronunciation [ˈeːtkʰ] (and not *[ˈjeːtkʰ]). Older ша’мак was changed to шаъмак [ʃʲɑʕˈmak].

4) /ji/ and /je/ are written и and е. Yaghnobi и can be */ji/ after a vowel, as in Tajik, and ӣ after a vowel is */jiː/. Also, е has two values: word-initially and after a vowel, it is pronounced [jeː], but after a consonant, it is [eː]. /je/ is rare in Yaghnobi and occurs only in Tajik or Russian loans; the only example of /je/ is Европа [ˈjeːvrɔpa], a Russian loanword.

5) The Russian letters Ц ц, Щ щ, Ы ы and Ь ь, which can be used in Tajik loans from Russian, are not used in Yaghnobi. Such words are written as they are pronounced by Yaghnobi speakers, not as they are originally written in Russian: aeroplane is самолет/самолёт in Russian, written самолёт in Tajik and pronounced [səmʌˈʎot] in Russian and in Tajik. In Yaghnobi, it is written самалиёт and follows the Yaghnobi pronunciation [samalɪˈjoːtʰ] or [samajlˈoːtʰ]. The word concert is borrowed into Yaghnobi from Russian концерт [kʌnˈtse̠rt] in the form кансерт [kʰanˈseːrtʰ]. Compare Tajik консерт.

6) According to consultation with Sayfiddīn Mīrzozoda, the distinction between the sounds /v/ and /w/ needs to be established. For /v/, в is used, but for /w/, another letter should be adopted. W w would be the best choice. For /x°/, Хw хw should be used. Mīrzozoda uses w in some texts, but not consistently.
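The letter correspondences listed above, together with note 4 on the two values of е, can be turned into a rough Cyrillic-to-Latin converter. The following Python sketch is illustrative only: the names `transliterate` and `CYR_TO_LAT` are hypothetical, and wherever a Cyrillic letter is ambiguous (и, у, ӯ, ҳ, ю) the sketch simply takes the first Latin value listed, so vowel length and stress are not recovered.

```python
import unicodedata

# First Latin value listed for each Cyrillic letter (an assumption where
# the source gives several options, e.g. у = u, ū or ʏ).
_RAW = {
    "а": "a", "б": "b", "в": "v", "ԝ": "w", "г": "g", "ғ": "ɣ", "д": "d",
    "ё": "yo", "ж": "ž", "з": "z", "и": "i", "ӣ": "ī", "й": "y", "к": "k",
    "қ": "q", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ӯ": "ū", "ф": "f", "х": "x", "ҳ": "h",
    "ч": "č", "ҷ": "ǰ", "ш": "š", "ъ": "ع", "э": "e", "ю": "yu", "я": "ya",
}
# Normalise keys so that precomposed and decomposed input both match.
CYR_TO_LAT = {unicodedata.normalize("NFC", k): v for k, v in _RAW.items()}
CYR_VOWELS = set(unicodedata.normalize("NFC", "аеёиӣоуӯэюя"))


def transliterate(text):
    """Convert Yaghnobi Cyrillic text to the Latin transcription (lossy)."""
    text = unicodedata.normalize("NFC", text)
    out = []
    prev = ""  # previous Cyrillic character, lower-cased
    i = 0
    while i < len(text):
        ch = text[i]
        low = ch.lower()
        if low == "х" and i + 1 < len(text) and text[i + 1].lower() == "ԝ":
            lat = "x°"  # digraph хԝ
            i += 1      # skip the ԝ as well
        elif low == "е":
            # Note 4: ye word-initially and after a vowel, e after a consonant.
            at_word_start = not prev.isalpha()
            lat = "ye" if at_word_start or prev in CYR_VOWELS else "e"
        else:
            lat = CYR_TO_LAT.get(low, ch)  # unknown characters pass through
        if ch.isupper():
            lat = lat.capitalize()
        out.append(lat)
        prev = text[i].lower()
        i += 1
    return "".join(out)
```

For instance, `transliterate("виёра")` yields `viyora`, which matches viyóra in note 1 apart from the stress mark that the Cyrillic spelling does not record.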


Yaghnobi includes 9 vowels (3 short, 6 long) and 27 consonants.


short: i [i-ɪ-e], a [(æ-)a(-ɑ)], u [(y-)u-ʊ-o] (all short vowels might be reduced approximately to [ə] in pretonic positions)

long: ī [i:], e [ɛ:-e:], ā [(a:)-ɑ:], o [(ɒ:-)ɔ:(-o:-u:)], ū [u:], ʏ [(u:-)y:(-i:)]

diphthongs: ay [ai̯] (in native words, ay appears only in the western dialects; eastern dialects change it to e, except in loanwords), oy [ɔ:i̯], uy [ʊi̯], ūy [u:i̯], ʏy [y:i̯], iy [ɪi̯]; ow [ɔ:u̯], aw [au̯]

[Vowel chart (vowel trapezoid) with columns Front, Near-front, Central, Near-back, Back; plotted vowels: ɪ, ʊ, ɛː, ɔː, a, ɑː]


1) Long e, o and ʏ are conventionally not written with the lengthening sign.

2) Long ā is recognised, but it appears only as a result of compensatory lengthening (ǰām < ǰaعm < ǰamع).

3) In recent loans, Tajik ů [ɵ:] and/or the corresponding Uzbek vowel [ɵ, ø] can also appear, but its pronunciation usually merges with ū.

4) The vowel ʏ is recognised only by some authorities. It seems that it is an allophone of ū. ʏ originates from historical stressed *ū, whereas historical *ō, which changed to ū in Yaghnobi, remains unchanged. The status of ʏ seems to be unstable: it is not recorded in all varieties of Yaghnobi, and it is often realised as ū, ūy/ūy, uy/uy or ʏ. In summary: *ū́ (under stress) > ū/ūy/uy/ʏ or ū, *ō > ū (vʏz/vūz, goat; Tajik buz, Avestan buza-). Some authorities transcribe ʏ as ü.

5) An o can change to ū in front of a nasal (Toǰīkistón × Toǰīkistū́n, nom × nūm).

6) An e is considered a long vowel, but in front of h or ع, its pronunciation is somewhat shorter, and e is realised as a half-short (or even short) vowel. Etymologically, the "short" e in front of h or ع comes from older *i (there is an alternation e/i in front of h/ع): if the historical cluster *ih or *iع appears in a closed syllable, *i changes to e. In open syllables, the change did not take place (that is similar to Tajik). The change can be seen in the verb dih-/deh-: infinitive díhak × 3rd sg. present déhči.

7) The Yaghnobi dialects show different developments of the historical svarabhakti vowel: in the Western and Transitional dialects, it is rendered as i (or u under certain circumstances), but in the Eastern dialects it changes to a (but also i or u): *θray > *θəráy > W./Tr. tiráy × E. saráy but *βrāt > *vərāt > W./Tr./E. virót. When the second vowel is a back vowel, *ə usually changes to u in the Western or Transitional dialects: *(čə)θβār > *tfār > *təfór > W./Tr. tufór (but also tifór) × E. tafór, *pδūfs- > *bədū́fs > W./Tr./E. budū́fs-. The latter change also appears in morphology: the verb tifárak (the form is the same in all three dialects) has the 3rd sg. present form tufórči < *təfár- < *tfar- < *θβar-. The alternation i/a can also be seen in Tajik loans, where an unstressed vowel can undergo this change: W./Tr. širī́k × E. šarī́k < Tajik šarīk /šarīk/, W./Tr. xipár × E. xapár < Tajik xabar /xabar/. The former svarabhakti vowels are often ultra-short or reduced in pronunciation, and they can even disappear in fast speech: xišáp /xišáp × xⁱšáp × xšap/ < *xəšáp < *xšap.

8) The a changes to o in verbal stems of the type -Car- if an ending containing historic *θ or *t is added: tifár-, infinitive tifárak, 1st sg. present tifarómišt but 3rd sg. present tufórči (the ending -či comes from older -tišt), 2nd pl. present W./Tr. tufórtišt × E. tufórsišt; x°ar-: x°árak : x°arómišt : xórči : xórtišt/xórsišt (when a changes to o after x°, x° loses its labialisation). The change takes place with all verbs of Yaghnobi origin and also with older loans from Tajik. For new loans, a remains unchanged: gudár(ak) : gudórči × pár(ak) : párči. The first verb is an old loan from Tajik guzaštan < guδaštan, the latter a recent loan from parrīdan.


Stops: /p/, /b/, /t/, /d/, /k/, /ɡ/, /q/ (/k/ and /ɡ/ are palatalised to [c] and [ɟ] respectively before a front vowel or after a front vowel at the end of a word)

Fricatives: /f/, /v/, /s/, /z/, /ʃʲ/ ⟨š⟩, /ʒʲ/ ⟨ž⟩, /χ/ ⟨x⟩, /ʁ/ ⟨ɣ⟩, /χʷ/ ⟨x°⟩, /h/ ([ɦ] appears as an allophone between vowels or voiced consonants), /ħ/ ⟨ẖ⟩, /ʕ/ ⟨ع⟩

Affricates: /tʃ/ ⟨č⟩, /dʒ/ ⟨ǰ⟩

Nasals: /m/, /n/ (both have the allophones [ŋ] and [ɱ] before /k, ɡ/ and /f, v/, respectively)

Trill: /r/

Lateral: /l/

Approximants: /β̞/ ⟨w⟩, /j/ ⟨y⟩

Manner of articulation | Bilabial | Labio-dental | Alveolar | Post-alveolar or Palatal | Velar | Uvular or Labialised Uvular | Pharyngeal | Glottal
Nasal | m | | n | | | | |
Plosive | p b | | t d | c ɟ | k ɡ | q | |
Fricative | | f v | s z | ʃʲ ʒʲ | | χ χʷ ʁ | ħ ʕ | h
Approximant | β̞ | | | j | | | |
Trill | | | r | | | | |
Lateral Approximant | | | l | | | | |

All voiced consonants are pronounced voiceless at the end of a word. When an unvoiced consonant is followed by a voiced one, the unvoiced consonant is voiced by assimilation. When q is voiced, its voiced counterpart is ɣ, not [ɢ].

Also, b, g, h, ḥ, ǰ, q, l and ع appear mostly in loanwords; native words with those sounds are rare and mostly onomatopoeic.


W, E and Tr. refer to the Western, Eastern and Transitional dialects.


Case endings:

Case | Stem ends in a consonant | Stem ends in a vowel other than -a | Stem ends in -a
Sg. Direct (Nominative) | (no ending) | (no ending) | -a
Sg. Oblique | -i | -y | -ay (W), -e (E)
Pl. Direct (Nominative) | -t | -t | -ot
Pl. Oblique | -ti | -ti | -oti


  • kat : obl.sg. káti, pl. katt, obl.pl. kátti
  • mayn (W) / men (E) : obl.sg. máyni/méni, pl. maynt/ment, obl.pl. máynti/ménti
  • póda : obl.sg. póday/póde, pl. pódot, obl.pl. pódoti
  • čalló : obl.sg. čallóy, pl. čallót, obl.pl. čallóti
  • zindagī́ : obl.sg. zindagī́y, pl. zindagī́t, obl.pl. zindagī́ti
  • mórti : obl.sg. mórtiy, pl. mórtit, obl.pl. mórtiti

The izofa construction is also used in Yaghnobi; it appears in phrases and constructions adopted from Tajik or with words of Tajik origin.
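The case endings tabulated above are regular enough to be sketched as a small function. The Python sketch below is an illustration under stated assumptions, not an authoritative model: the name `decline` and the dictionary keys are hypothetical, only an unstressed plain final -a is treated as an "-a stem", and the stress shifts visible in forms such as káti are ignored.

```python
import unicodedata

VOWEL_BASES = set("aeiouʏ")  # base letters of the Latin transcription vowels


def decline(stem, dialect="W"):
    """Return direct/oblique, singular/plural forms of a noun.

    dialect is "W" (Western) or "E" (Eastern); they differ only in the
    oblique singular of -a stems (-ay versus -e).
    """
    if not stem:
        raise ValueError("empty stem")
    if dialect not in ("W", "E"):
        raise ValueError("dialect must be 'W' or 'E'")

    # Assumption: only a plain, unaccented final 'a' counts as an -a stem.
    if stem.endswith("a"):
        root = stem[:-1]
        return {
            "sg_direct": stem,
            "sg_oblique": root + ("ay" if dialect == "W" else "e"),
            "pl_direct": root + "ot",
            "pl_oblique": root + "oti",
        }

    # Strip length and stress diacritics to find the final base letter.
    decomposed = unicodedata.normalize("NFD", stem)
    last_base = next(c for c in reversed(decomposed)
                     if not unicodedata.combining(c))
    oblique = "y" if last_base in VOWEL_BASES else "i"
    return {
        "sg_direct": stem,
        "sg_oblique": stem + oblique,
        "pl_direct": stem + "t",
        "pl_oblique": stem + "ti",
    }
```

Called as `decline("póda", "E")`, the sketch gives the oblique singular póde and the plural pódot, in line with the examples listed above.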


Person | Nominative Singular | Oblique Singular | Enclitic Singular | Nominative Plural | Oblique Plural | Enclitic Plural
1st | man | man | -(i)m | mox | mox | -(i)mox
2nd | tu | taw | -(i)t | šumóx | šumóx | -šint
3rd | ax | áwi, (aw), íti, (īd) | -(i)š | áxtit, íštit | áwtiti, ítiti | -šint

The 2nd person plural šumóx is also used as the polite form of the 2nd person.


Numeral | Eastern Yaghnobi | Western Yaghnobi | Tajik loan
1 | ī | ī | yak, yag, ya
2 | du
3 | saráy | tⁱráy | se, say
4 | tafór | tᵘfór, tⁱfór | čor
5 | panč | panč | panǰ
6 | uxš | uxš | šiš, šaš
7 | avd | aft | haft
8 | ašt | ašt | hašt
9 | nau̯ | nau̯ | nuʰ
10 | das | das | daʰ
11 | das ī | das ī | yozdáʰ
12 | das dū | das dʏ | dᵘwozdáʰ
13 | das saráy | das tⁱráy | senzdáʰ
14 | das tafór | das tᵘfór / tⁱfór | čordáʰ
15 | das panč | das panč | ponzdáʰ
16 | das uxš | das uxš | šonzdáʰ
17 | das avd | das aft | habdáʰ, havdáʰ
18 | das ašt | das ašt | haždáʰ
19 | das nau̯ | das nau̯ | nūzdáʰ
20 | bīst
30 | bī́st-at das | bī́st-at das |
40 | dū bīst | dʏ bīst | čil
50 | dū nī́ma bīst | dʏ nī́ma bīst | pinǰóʰ, panǰóʰ
60 | saráy bīst | tⁱráy bīst | šast
70 | saráy nī́ma bīst | tⁱráy nī́ma bīst, tⁱráy bī́st-u das | haftód
80 | tafór bīst | tᵘfór / tⁱfór bīst | haštód
90 | tafór nī́ma bīst | tᵘfór / tⁱfór nī́ma bīst | navád
100 | sad
1000 | hazór
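The native numerals in the table follow a vigesimal pattern: the teens are das plus a unit, and the tens from 40 to 90 are multiples of bīst 'twenty', with nī́ma inserted for the odd tens. The Python sketch below reproduces only the Eastern forms that are actually listed. The function name `eastern_numeral` is hypothetical, reading nī́ma as marking an extra half-score is an inference from the arithmetic of the table, and numbers that the table does not list (21, 35 and so on) are deliberately rejected rather than guessed.

```python
# Eastern Yaghnobi units as listed in the table.
UNITS = {1: "ī", 2: "du", 3: "saráy", 4: "tafór", 5: "panč",
         6: "uxš", 7: "avd", 8: "ašt", 9: "nau̯"}
# In compounds the table shows dū rather than du for 'two'.
COMPOUND_UNITS = {**UNITS, 2: "dū"}


def eastern_numeral(n):
    """Return the Eastern Yaghnobi numeral for n, if the table lists it."""
    if 1 <= n <= 9:
        return UNITS[n]
    if n == 10:
        return "das"
    if 11 <= n <= 19:
        return "das " + COMPOUND_UNITS[n - 10]   # ten + unit
    if n == 20:
        return "bīst"
    if n == 30:
        return "bī́st-at das"                     # twenty and ten
    if n in (40, 50, 60, 70, 80, 90):
        scores, rest = divmod(n, 20)             # e.g. 70 -> 3 scores, rest 10
        middle = " nī́ma " if rest == 10 else " "
        return COMPOUND_UNITS[scores] + middle + "bīst"
    if n == 100:
        return "sad"
    if n == 1000:
        return "hazór"
    raise ValueError(f"{n} is not covered by the table")
```

For 70, `divmod(70, 20)` gives three scores with a remainder of ten, so the sketch returns saráy nī́ma bīst, the form given in the table.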


Personal endings – present:

Person Singular Plural
1st -omišt -īmišt
2nd -īšt -tišt (W, Tr.), -sišt (E)
3rd -tišt (W), -či (E, Tr.) -ošt

Personal endings – preterite (with augment a-):

Person Singular Plural
1st a- -im a- -om (W), a- -īm (E, Tr.)
2nd a- -ī a- -ti (W, Tr.), a- -si (E)
3rd a- a- -or

By adding the ending -išt (-št after a vowel; but -or+išt > -ošt) to the preterite, the durative preterite is formed.

The present participle is formed by adding -na to the verbal stem. The past participle (or perfect participle) is formed by adding -ta to the stem.

The infinitive is formed by adding the ending -ak to the verbal stem.

Negation is formed with the prefix na-; in combination with the augment in the preterite, it changes to nē-.
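The formation rules above (personal endings, the augment a-, the durative ending, the infinitive and negation) can be combined in a short sketch. This Python example is a simplification under several assumptions: all function names are hypothetical, stress shifts and the stem change a > o described in the phonology notes are ignored, and the 2nd sg. preterite ending -ī is taken from the sample-text form axirī́nī.

```python
import unicodedata


def present_ending(person, dialect="W"):
    """Present endings; dialect is "W", "Tr" or "E"."""
    return {
        "1sg": "omišt",
        "2sg": "īšt",
        "3sg": "tišt" if dialect == "W" else "či",
        "1pl": "īmišt",
        "2pl": "sišt" if dialect == "E" else "tišt",
        "3pl": "ošt",
    }[person]


def preterite_ending(person, dialect="W"):
    return {
        "1sg": "im",
        "2sg": "ī",   # assumption: as in the sample-text form axirī́nī
        "3sg": "",
        "1pl": "om" if dialect == "W" else "īm",
        "2pl": "si" if dialect == "E" else "ti",
        "3pl": "or",
    }[person]


def ends_in_vowel(form):
    # Ignore length and stress marks when inspecting the last letter.
    decomposed = unicodedata.normalize("NFD", form)
    bases = [c for c in decomposed if not unicodedata.combining(c)]
    return bool(bases) and bases[-1] in "aeiouʏ"


def infinitive(stem):
    return stem + "ak"


def present(stem, person, dialect="W", negative=False):
    form = stem + present_ending(person, dialect)
    return "na" + form if negative else form


def preterite(stem, person, dialect="W", negative=False, durative=False):
    form = stem + preterite_ending(person, dialect)
    if durative:
        if person == "3pl":
            form = form[:-2] + "ošt"      # -or + -išt > -ošt
        elif ends_in_vowel(form):
            form += "št"
        else:
            form += "išt"
    # na- plus the augment a- gives nē-.
    return ("nē" if negative else "a") + form
```

With the stem purs-, `preterite("purs", "3pl", durative=True)` gives apursošt, which corresponds to apursóšt in the Nasreddin anecdote once stress is disregarded.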

The forms of the copula are:

Person Singular Plural
1st īm om
2nd išt ot (W, Tr.), os (E)
3rd ast, -x, xast, ásti, xásti or


Knowledge of the Yaghnobi lexicon comes from three main works: a Yaghnobi-Russian dictionary presented in Yaghnobi texts by Andreyev and Peščereva, a supplementary word list presented in Yaghnobi grammar by Xromov, and the Yaghnobi-Tajik Dictionary compiled by Xromov's student Sayfiddīn Mīrzozoda, himself a native speaker of Yaghnobi. Tajik words represent the majority of the Yaghnobi lexicon (some 60%), followed by words of Turkic origin (up to 5%, mainly from Uzbek) and a few Russian words (about 2%; many international words also came into Yaghnobi through Russian). Only a third of the lexicon is of Eastern Iranian origin and can easily be compared with words known from Sogdian, Ossetian, the Pamir languages or Pashto.

Sample texts

A group of Yaghnobi-speaking schoolchildren from Tajikistan

"Fálɣar-at Yáɣnob asosī́ láfz-šint ī-x gumū́n, néki áxtit toǰīkī́-pi wó(v)ošt, mox yaɣnobī́-pi. 'Mʏ́štif' wó(v)omišt, áxtit 'Muždív' wó(v)ošt." [ˈfalʁɑratʰ ˈjɑʁnɔˑb asɔˑˈsiː ˈlafzʃʲɪntʰ ˈiːχ ɡʊˈmoːn ˈneːcʰe ˈɑχtʰɪtʰ tʰɔˑdʒʲiˑˈcʰiːpʰe ˈβ̞oːˀɔˑʃʲtʰ moːʁ jɑʁnɔˑˈbiːpʰe ˈmyːʃʲtʰɪf ˈβ̞oːˀɔˑmɪʃʲtʰ ˈɑχtʰɪtʰ mʊʒʲˈdɪv ˈβ̞oːˀɔˑʃʲtʰ]

"In Falghar and in Yaghnob is certainly one basic language, but they speak Tajik and we speak Yaghnobi. We say 'Müštif', they say 'Muždiv'."

In edited Cyrillic orthography it could have been written this way: "Фалғарат Яғноб асосӣ лафзшинт ӣх гумун, неки ахтит тоҷикипӣ ԝоошт, мох яғнобипӣ. 'Мӯштиф' ԝоомишт, ахтит 'Муждив' ԝоошт."

An anecdote about Nasreddin

Latin version: 1. Nasriddī́n ī xūd či bozór uxš tangái axirī́n. 2. Kaxík woxúrdš avī́, čáwi apursóšt: 3. "Xūd čof pūl axirī́nī?" 4. Nasriddī́n ī́ipiš ǰawób atifár, dúipiš ǰawób atifár, tiráyipiš ǰawób atifár, aɣór: 5. "Hámaipi ǰawób tifaróm, zīq vómišt." 6. Ax xūdš či sarš anós, bozórisa adáu̯, fayród akún: 7. "E odámt! 8. Daràu̯-daráwi maydónisa šau̯t, īyóka ǰām vʏt! 9. Kattóti šumóxpi árkšint ast!" 10. Odámt hamáš maydóni īyóka ǰām avór, áni šáhri hičúxs nàapiráxs. 11. Nasriddī́n balandī́i sári asán, fayród akún: 12. "E odámt, ɣiríft, nihíš xūd man uxš tangái axirī́nim".

IPA Transcription: 1. nasre̝ˈdːiːn ˈiː ˈχuːd ˈtʃɪ̞ bɔˑˈzoːr ˈʋ̘χʃʲ tʰaŋˈɟa̝jĕ̝ ɑχĕ̝ˈriːn. 2. cʰaˈχecʰ β̞ɔˑˈχʋˑrdʃʲ aˈve̝ː, ˈtʃaβ̞e apʰʋrˈsoːɕt: 3. "ˈχuːd ˈtʃoːf ˈpʰuːl ɑχĕ̝ˈriːne̝ˑ?" 4. nasre̝ˈdːiːn ˈiːjĕ̝pʰe̝ʃʲ dʒaˈβ̞oːb atʰĕ̝ˈfar, ˈdʋ̘je̝pʰe̝ʃʲ dʒaˈβ̞oːb atʰĕ̝ˈfar, tʰɪ̆ˈraje̝pʰe̝ʃʲ dʒaˈβ̞oːb atʰĕ̝ˈfar, ɑˈʁoːr: 5. "ˈhama̝jĕ̝pʰe̝ dʒaˈβ̞oːb tʰĕ̝faˈro̝ːm, ˈze̝ˑqʰ ˈvo̝ːmɪʃʲtʰ." 6. ˈaχ ˈχuːdʃʲ ˈtʃɪ̞ ˈsarɪ̆ʃʲ aˈnoːs, bɔˑˈzoːrɪsa aˈdau̯, fai̯ˈroːd aˈkʰʋn: 7. "ˈeː ɔˑˈdamtʰ! 8. darˌau̯-daˈraβ̞e mai̯ˈdoːne̝sa ˈʃʲau̯tʰ, iˑjˈoːcʰa ˈdʒɑːm ˈvyːtʰ! 9. cʰaˈtʰːoːtʰe̝ ʃʲʋ̆ˈmoːχpʰe̝ ˈarcʃʲɪ̞nt ˌastʰ!" 10. ɔˑˈdamtʰ haˈmaʃʲ mai̯ˈdoːne̝ iˑjˈoːcʰa ˈdʒɑːm aˈvoːr, ˈane̝ ˈʃʲahrɪ he̝ˑˈtʃʋ̝χs ˌna̝ˀa̝pʰĕ̝ˈraχs. 11. nasre̝ˈdːiːn balanˈdiːjĕ̝ ˈsare̝ aˈsan, fai̯ˈroːd aˈkʰʋn: 12. "ˈeː ɔˑˈdamtʰ, ʁĕ̝ˈre̝ftʰ, nĕ̝ˈhe̝ˑʃʲ ˈχūd ˈman ˈʋ̘χʃʲ tʰaŋˈɟa̝jĕ̝ ɑχĕ̝ˈriːne̝m".

Cyrillic version: 1. Насриддин ӣ хӯд чи бозор ухш тангаи ахирин. 2. Кахик ԝохурдш авӣ, чаԝи апурсошт: 3. "Худ чоф пул ахиринӣ?" 4. Насриддин ӣипиш ҷаԝоб атифар, дуипиш ҷаԝоб атифар, тирайипиш ҷаԝоб атифар, ағор: 5. "Ҳамаипӣ ҷаԝоб тифаром, зиқ вомишт." 6. Ах хӯдш чи сарш анос, бозориса адаԝ, файрод акун: 7. "Э одамт! 8. Дараԝ-дараԝи майдониса шаԝт, ӣёка ҷаъм вӯйт! 9. Каттоти шумохпӣ аркшинт аст." 10. Одамт ҳамаш майдони ӣёка ҷаъм авор, ани шаҳри ҳичухс наапирахс. 11. Насриддин баландии сари асан, файрод акун: 12. "Э одамт, ғирифт, ниҳиш хӯд ман ухш тангаи ахириним."

Translation: 1. Nasreddin bought a tubeteika at the bazaar for six tangas. 2. Everyone he met asked him: 3. "For how much money did you buy the tubeteika?" 4. Nasreddin answered the first of them, he answered the second of them, he answered the third of them, then he said: 5. "If I answer everyone, I will go crazy." 6. He took the tubeteika off his head, ran to the bazaar and cried: 7. "Hey, people! 8. Go quickly to the square and gather somewhere there! 9. The big ones have something to deal with you." 10. All the people gathered somewhere at the square; no one else remained in the city. 11. Nasreddin climbed up to a high place and cried: 12. "Hey people, let it be known to you: I bought this tubeteika for six tangas."


