Macedonian language

Further to, and in support of, the Iranians who lived in Tajikistan, just below me, I want to make a comment on the so-called 'Macedonian' language. I am Bulgarian and can say the same about the Macedonian language. Namely, it is a Serbianized Bulgarian dialect which was created on a single date, in 1945, and given an official status in Tito's multi-ethnic Yugoslav communist federation, which was designed to follow the model of the 'Big Brother', the USSR. Since then, the (previously ethnic Bulgarian) population in the new People's Republic of Macedonia was declared 'Macedonian', and those who opposed the denationalization were sent to 'correction' labour camps, until the goal of creating a separate 'Macedonian' national identity was more or less achieved. Its primary target was to make Bulgaria weaker and to prevent it from controlling the strategically important road Belgrade - Thessaloniki. The Big Brother, or USSR, itself was the place of a number of such national experiments, and a series of artificial languages were created for political reasons: Moldavian, a Romanian dialect, Karelian, a Finnish dialect, Tajik, a Persian dialect, Buryatian, a Mongol dialect, and even Belarusian or White Russian, which is a Russian dialect, although with a different historical development (This was an anti-Russian move taken by Lenin and the Bolsheviks in 1918, let's not forget that they feared and hated the Great Russian imperialism and the memory of Great Russia just as much as any foreign imperialism). Georgi Stojchev, Sofia

These cases are not all the same. I agree that Moldavian was a very blatant example, because apart from the writing system (but only after forced Cyrillisation; I guess before that there was no separate standard at all) the Moldavian standard is effectively completely identical to the Romanian standard, and there is hardly any difference on the spoken dialect level, either (Romanian dialect differences are small in general, unless you include Aromanian, Megleno-Romanian and Istro-Romanian), even on the lexical level. I wouldn't even call Moldavian, especially the written language, a Romanian dialect: it is Romanian, pure and simple, just spelt in Cyrillic.
However, other languages initially received written standards (in Latin script!) in the 1920s for idealistic, not strategic reasons, and are appreciably different from the established standards. Karelian, for example, is, despite its close relationship to Finnish, very distinctive, especially from the written standard. While Northern Karelian is classified as essentially a form of Eastern Finnish by Tiit-Rein Viitso, though not to be confused with the "Karelian" dialects spoken in Finland (which are actually Savonian dialects) and rather more divergent, Southern (including Tver) Karelian and especially Olonets Karelian (also known as Livvi) are too different to be counted as Finnish dialects and do not historically descend from Finnish, but from East Ladoga North Finnic. Sure, they could have decided to use Finnish as a written language, but that would have led to a sharp diglossia, as if Bulgarians had to use Serbian or Russian as a written language. And I haven't even mentioned the older Slavic and younger Russian influences in Karelian, which also hamper mutual intelligibility.
My Iranist friend understands Persian from Iran, but Tajik is obscure to her: the language is strongly influenced by Uzbek and in some ways the grammar is very different, having copied a typically Turkic pattern of verb-formation comparable to Slavic aspect, which is very productive. Add the well-known phonological differences and the Turkic and Russian lexical influences, and eventually it turns out that even native speakers of Persian from Iran have trouble, so much so that Tajiks use Tajik only among themselves and use Tajik-accented Iranian Persian to communicate with Persian-speaking foreigners.
Buryatian is not a Mongolian dialect, either. Mongolian has numerous dialects such as Khalkha and Chakhar, listed at Mongolian language, but Buryatian is not among them and quite distinct, including in grammar. In fact, it has numerous dialects of its own. Sure, Buryatian and Mongolian are very similar, but all the Mongolic languages are, just like the Slavic languages are all very similar, but no-one says Bulgarian is just a Russian dialect. The "X is just a Y dialect" meme is very tiring and usually linguistically untenable. People use it only to make a political point, and so do you.
When it comes to Macedonian, you have a somewhat stronger case, admittedly. Ultimately, what happened is this: Macedonian and Bulgarian are part of a large dialect continuum which could be called Eastern South Slavic. Standard written Bulgarian is based on the easternmost dialects in this continuum, in eastern Bulgaria. Macedonian instead is based on the dialects spoken in the extreme west of this continuum. This leads to maximum distinctness. However, given that the dialects chosen for Standard Bulgarian were on the eastern end rather than in the centre, which might have been more advantageous, Standard Bulgarian will inevitably strike speakers in the state of Macedonia as rather strange, stranger than would, for example, dialects in western Bulgaria. And the differences between the dialects at the extreme ends, while not huge, are appreciable. In fact, the dialectal diversity within Macedonia itself is quite striking. Standard Bulgarian vs. Standard Macedonian is sort of like Shetland Scots vs. Scouse (the dialect of the Liverpool region), I guess, if not Scots (especially Aberdeen) vs. London English. Or Irish vs. Scottish Gaelic. Or Swedish vs. Danish, or even Icelandic vs. Faroese. Or Czech vs. Slovak, or Ukrainian vs. Russian. That point where you're not quite sure if they should be counted as the same language because they're really close, but on the other hand the differences are really obvious even for the layperson.
Belarusian is a similarly tough case. On the one hand, it's really similar to Southern Russian dialects, though not quite identical. On the other, it's quite distinct from Standard Russian (including in grammar), even if you factor in the fact that akanye is present in Standard Russian, too, just not written. I mean, if Belarusian were just a Russian dialect, how could there be that intermediate form known as Trasyanka? (Basically Belarusian grammar with Russian lexicon pronounced like Belarusian.)
Now when it comes to Moldovan vs. Romanian, or Serbian vs. Croatian vs. Bosnian vs. Montenegrin, or Indonesian vs. Malaysian (Malay) or Hindi vs. Urdu, I'm with you. These are just silly, like declaring "Canadian" and "American" or "Australian" and "New Zealander" independent languages. --Florian Blaschke (talk) 02:06, 14 July 2016 (UTC)


I do wonder; Tajik is virtually only spoken in Tajikistan/Uzbekistan and some other Central Asian Republics, yet, including on this article, the Perso-Arabic script is spammed everywhere, even though not one of the aforementioned nations officially uses this script. I know there are a lot of sock IPs/accounts who have been frantically spamming this stuff on virtually every related article (e.g. this no-brainer), but I was wondering whether an actual legit user perhaps had a specific reason to add the script here on this article (given that it's relatively well patrolled, unlike the vast majority of the other Tajikistan-related articles, I assume others have noticed this as well). Bests - LouisAragon (talk) 04:20, 29 June 2016 (UTC)

@LouisAragon: I agree those IPs/sockpuppets spam irrelevant script in Tajikistan-related articles (there is no reason to add Perso-Arabic script when it's not an official and standard writing system), but I don't think Perso-Arabic script is spammed on this article. The lead, grammar, vocabulary and writing system sections use it for their own reasons (e.g. comparison, and Perso-Arabic was the main writing system prior to Soviet Union). However, I don't understand why the lead section lacks the native name in Cyrillic?! Wario-Man (talk) 05:16, 30 June 2016 (UTC)
Thanks for your prompt response Wario ( :-) ). Yeah, you're right in your assessment. I strongly believe, however, that that as well (the removal of Cyrillic everywhere) is part of the result of this constant, structural sock-spamming. The Cyrillic script should be re-added wherever it is needed, indeed including in the lede here.
Also, I will try to keep an eye on these IP/sock accounts who are responsible for this long-term nonsense - I'm pretty sure they're all operated by merely a handful of individuals whose IQ ranges somewhere, as they say, "in the double digits" (aka individuals with whom it's not really possible to have any logical convo, and who are clearly not here to actually build this encyclopaedia). Bests - LouisAragon (talk) 00:20, 1 July 2016 (UTC)
Well, why did you remove correct info and add an irrelevant language (Russian)?![1] Plus, if you read Tajik alphabet, you see they still use the Persian alphabet (for educational and academic purposes), so there is no reason to remove it.[2] It just needs clarification. If you don't agree with me, I'll ask other editors to write their opinions. --Wario-Man (talk) 14:31, 13 July 2016 (UTC)
@Wario-Man: Because Russian is an official language of interethnic communication in Tajikistan, per article 2 of the Constitution of the Republic of Tajikistan. (Pavlenko, Aneta (2008). Multilingualism in Post-Soviet Countries. Multilingual Matters. ISBN 978-1847690876, p. 228; Fishman & Garcia (2010). Handbook of Language and Ethnic Identity. Oxford University Press. ISBN 978-0195374926, p. 440). Furthermore, it's widely used; that's why I added it. Anyway, I don't mind removing it either, as after all this article is about the Tajik language.
Can you provide any sources that state that the Perso-Arabic alphabet is used for educational and academic purposes? I don't think this is verifiable, and I'm 99% sure it has absolutely no official status. Also, please don't forget to revert the part of the body that was changed -- I know you didn't do it intentionally, but that previous info was clearly bogus, as you most likely realize. Quite a few SPA/sock users in the past have been obsessively spamming across articles the claim that Medieval Persians/Persian-speakers of Central Asia (e.g. those of Samarqand and Bukhara) were outright "Tajiks" or some kind of "Proto-Tajiks" who spoke "Tajik language", which is utter pseudo-historical bogus at its finest. Bests - LouisAragon (talk) 15:27, 13 July 2016 (UTC)
@LouisAragon: I think you're right. There must be a reason behind this inclusion.
It could be motivated either by Islamism or by a misplaced concern for Persian heritage.
@Wario-Man: Perso-Arabic being the "main writing system prior to Soviet Union" is not a good reason.
And I read the Tajik alphabet article. No such thing was stated. On the other hand, it was noted that Cyrillic is the official and standard script, and furthermore "only a very small part of the population can read the Persian alphabet."
You can include Perso-Arabic in this article while using it for Dari, as well as for Western Persian, but not for the Tajik variety.
Rye-96 (talk) 20:38, 13 July 2016 (UTC)
@LouisAragon: Well, at least for the modern denizens of Samarkand and Bukhara, it's definitely not bogus; their dialects do appear to be close to the Persian dialects of Tajikistan (especially the northern dialects, which are strongly influenced by Uzbek, as is the standard language, while the southern dialects are closer to the Persian dialects of Afghanistan). The Persian-speakers of Samarkand and Bukhara also identify as Tajiks, and are already called Tajiks in 19th-century literature (and may well already have called themselves Tajiks at the time), according to an Iranist friend. --Florian Blaschke (talk) 21:45, 13 July 2016 (UTC)
  • @Florian Blaschke, LouisAragon, and Rye-96: Okay. So what do you suggest? Remove Perso-Arabic script from the lead section? For example, something like this:
    • Tajik or Tajiki (Тоҷикӣ, Toçikī/Tojikî /tɔːdʒɪˈkiː/),[3] also called Tajiki Persian is the variety of Persian...
  • Removing Persian alphabet from infobox?
  • I just restored last accepted revision. My concerns are lead section and infobox. I have no idea about "Geographical distribution" section (POV, biased or NPOV):
LouisAragon's edit is better. It's cleaner English and more understandable. --Taivo (talk) 08:51, 14 July 2016 (UTC)
@Taivo: Note the Iranica article I linked above, it answers many questions:
[After ca. 1000 AD] [s]poken Persian of Central Asia evolved independently of Persian of Iran, and northern dialects in particular were strongly influenced by Turkic speech.
So the "northern dialects" of Tajik are actually those of Samarkand and Bukhara, not northern Tajikistan. I misinterpreted that. No wonder that they are strongly influenced by Uzbek.
Persian speakers of the region came to be called Tajiks, in contradistinction to Turks, but their language was still called fārsi ‘Persian’ until the Soviet period.
See, the ethnic (as opposed to linguistic) designation Tajik for Central Asian Persian speakers is actually quite old.
The Uzbek Emirate of Bukhara, which ruled most of the Persian-speaking regions of the Oxus basin and the Pamirs since the middle of the 18th century, was reduced to a dependency of imperial Russia in 1868. After a Bolshevik-aided revolution at Bukhara in 1920, in accordance with Soviet nationalities policy, an ethnic Tajik Soviet republic was established, and a literary language called “Tajik” was engineered on a vernacular base close to the Uzbekized spoken Persian of Bukhara and Samarqand [!!] (these Tajik cultural centers, ironically, were incorporated into the Uzbek SSR).
So that's the basis of Standard Tajik, actually! The Persian dialects of Bukhara and Samarkand themselves! How ridiculous and wrongheaded LouisAragon's concern looks in retrospect. He got it exactly the wrong way round (no doubt unintentionally)! It's the Persian speakers of Bukhara and Samarkand who are the original Tajiks speaking the original "Tajik Persian". Ha! No nationalistically motivated distortions involved at all.
Arguably, and further ironically, the (early) medieval speakers of these cities are also those who spoke the Late Middle Persian dialect influenced by Arabic, Eastern Iranian and Turkic we now know as Early Modern or Classical Persian and from which Standard Iran Persian, Standard Afghan Persian or Dari and Standard Tajik all descend. Indo-Persian (whose influence is still strongly felt in Urdu especially) was in the 15th century based on essentially an early form of Central Asian "Tajik" Persian. No wonder the inhabitants of these cities have a special cachet in the Persian world!
Under the guidance of writers who were mostly of Bukharan or other northern origin, such as Sadriddin Ayni (Ṣadr-al-Din ʿAyni, 1878–1954; see ʿAYNI, ṢADR-AL-DIN), this language became the vehicle of a considerable native literature and a lively periodical press. From 1926 to 1939 a modified Latin alphabet was in use, and a concerted educational campaign produced impressive gains in adult literacy.
The implication here is that between 1920 and 1926, newly standardised Tajik Persian was actually still written in Arabo-Persian script, as not-yet-really-standardised Central Asian "Tajik" Persian, and of course Classical/Early New Persian, its direct ancestor and immediate local predecessor without a break, had been written all the time before.
[...] During the period of ca. 1948-88, Tajik lost much of its prestige, vocabulary, and domain of use to Russian. With perestroyka and glasnost’ in the 1980s came a sudden revival and re-Persianization of the national language, which continues (at a slower rate) in post-Soviet Tajikistan. Policies legislated by the Language Laws of 1989 and 1992 included the official use of Tajik in government and public domains, replacement of Russian vocabulary by Persian (both native coinages and copies from Persian of Iran), and teaching of the Perso-Arabic writing system in schools (Perry, 1996).
This makes it at least conceivable that Tajik Persian is occasionally written in Arabo-Persian script nowadays, even though this does not seem to be an officially sanctioned practice. Speculation: Maybe it is done for decorative purposes especially, like Modern English is sometimes written in blackletter, or by anti-Russian/Uzbek, pro-Persian/Muslim activists. (Addition: My Iranist friend confirms my conjecture regarding activists.) --Florian Blaschke (talk) 17:02, 14 July 2016 (UTC)
@Florian Blaschke:, uhm, I really appreciate this nice but unrelated elaborate discourse presented by you (I really do), but unfortunately for you, and for the overall coherence of this discussion here, you still did not provide any references that 1) confirm that the Perso-Arabic writing system is one of the officially employed writing systems in the Republic of Tajikistan 2) and that Bukhara/Samarqand are "historical Tajik" cities, while Medieval Persians from these cities are "Ancient Tajiks".[3] That was what the whole matter was about. Given that Wario-Man's only concern related to the infobox/lede, I believe we're done here? Oh, yeah, before I forget, regarding the usage of the Perso-Arabic system; perhaps adding "Perso-Arabic (historically)" to the infobox would be the best thing to do? Bests - LouisAragon (talk) 20:42, 14 July 2016 (UTC)
@LouisAragon: My point was that to call the Persian-speakers of Samarkand and Bukhara "Tajiks" is not a recent invention by nationalists, but probably goes back centuries. That does not mean that there was something like "Ancient Tajik" before 1000 AD. There was Early New Persian, and Middle Persian before, and there was Sogdian etc. It makes no sense to speak of Tajik (Persian) that early, I completely agree. I just wasn't sure what those nationalists really claimed. I suppose they do speak of "Ancient Tajik" in the early medieval period or earlier? Of course, that makes no more sense than to speak of "Ancient Scots/Faroese/Corsican/Yiddish/Ukrainian/Māori". That's just jingoistic silliness.
Of course Perso-Arabic is not an official writing system for Tajik Persian. That's exactly what I said. I said it's being used, but unofficially. So I think your proposal is a good solution. --Florian Blaschke (talk) 21:17, 14 July 2016 (UTC)

Vowel length

The article transcribes three of the vowel phonemes – /eː, ɵː, ɔː/ – with the long-vowel symbol, but are they really distinctively long? They did descend from long vowels, but so did some cases of /i u/ (see Persian phonology § Historical shifts). I wouldn't be surprised if they aren't distinctively long, because they are different enough in quality to be distinguished without vowel length. In that case, it would be better to remove the length symbol: /e, ɵ, ɔ/. — Eru·tuon 07:41, 14 January 2017 (UTC)

I don't think the length is phonemic, and nothing in the article seems to suggest that... Mr KEBAB (talk) 22:54, 14 January 2017 (UTC)
I went ahead and removed length symbols here and on the IPA help page. If sources disagree with this change, someone can easily revert me. — Eru·tuon 08:13, 15 January 2017 (UTC)

