Romani alphabets

The Romani language has for most of its history been an entirely oral language, with no written form in common use. Although the first example of written Romani dates from 1542,^[1] it was not until the twentieth century that vernacular writing by native Romani people arose.

Printed anthologies of Romani folktales and poems began to appear in Eastern Europe in the 20th century, using the respective national scripts (Latin or Cyrillic).^[2] Written Romani in the 20th century used the writing systems of the respective host societies, mostly Latin alphabets (Romanian, Czech, Croatian, etc.).

Standardization

Currently, there is no single standard orthography used by both scholars and native speakers. Efforts of language planners have been hampered by the significant dialectal divisions in Romani: the absence of a standard phonology, in turn, makes the selection of a single written form problematic.

In an effort to overcome this, during the 1980s and 1990s Marcel Courthiade proposed a model for orthographic unification based on the adoption of a meta-phonological orthography, which "would allow dialectal variation to be accommodated at the phonological and morpho-phonological level".^[1] This system was presented in 1990 to the International Romani Union, which adopted it as the organization's "official alphabet". This recognition by the International Romani Union allowed Courthiade's system to qualify for funding from the European Commission.

Despite being used in several publications, such as the grammar of Romani compiled by Gheorghe Sarău^[3] and the Polish publication Informaciaqo lil,^[4] the IRU standard has yet to find a broad base of support from Romani writers. One reason for the reluctance to adopt this standard, according to Canadian Rom Ronald Lee, is that the proposed orthography contains a number of specialised characters not regularly found on European keyboards, such as θ and ʒ.^[5]

Instead, the most common pattern among native speakers is for individual authors to use an orthography based on the writing system of the dominant contact language: thus Romanian in Romania, Hungarian in Hungary and so on. A currently observable trend, however, appears to be the adoption of a loosely English-oriented orthography, developed spontaneously by native speakers for use online and through email.^[1]

Descriptive linguistics has, however, a long and established tradition of transcription.^[1] Despite small differences between individual linguists in the representation of certain phonemes, most adhere to a system which Ian Hancock terms Pan-Vlax.^[4]

Latin script

The overwhelming majority of academic and non-academic literature currently produced in Romani is written using a Latin-based orthography.^[1] Three main systems are likely to be encountered: the Pan-Vlax system, the International Standard and various Anglicised systems.^[4]

Pan-Vlax

Most recent descriptive literature uses a variety of orthography which Ian Hancock terms Pan-Vlax.^[4] This orthography is not a single standardised form, but rather a set of orthographical practices which exhibit a basic "core" of shared graphemes and a small amount of divergence in several areas. The Pan-Vlax script is based on the Latin script, augmented by several diacritics common to the languages of eastern Europe, such as the caron. Stress is sometimes indicated with an acute accent.

In the following table, the most common variants of the graphemes are shown. The phonemes used in the table are somewhat arbitrary and are not specifically based on any one dialect (for example, the phoneme denoted /d͡ʒ/ in the table can be realised as /ʒ/, /ʐ/ or /ɟ/, depending on dialect):

Romani "Pan-Vlax" alphabet
Grapheme	Phoneme	Example
A a	/a/	akanik now
B b	/b/	barvalo rich
C c	/ts/	círdel he pulls
Č č	/t͡ʃ/	čačo true
Čh čh	/t͡ʃʰ/	čhavo boy
D d	/d/	drom road
Dž dž	/d͡ʒ/	džukel dog
E e	/e/	efta seven
F f	/f/	fóro town
G g	/ɡ/	gadžo non-Rom
H h	/h/	herdelézi Saint George's Day
X x	/x/	xal he eats
I i	/i/	jilo heart
J j	/j/	jag fire
K k	/k/	ka where
Kh kh	/kʰ/	khami sunny
L l	/l/	lašhoj good
M m	/m/	manuš man
N n	/n/	anav name
O o	/o/	oxto eight
P p	/p/	paramísi fairy tale
Ph ph	/pʰ/	phabaj apple
R r	/r/	rakli non-Romani girl
S s	/s/	somnakaj gold
Š š	/ʃ/	šukar beautiful
T t	/t/	taxtaj cup
Th th	/tʰ/	them land
U u	/u/	vušt lip
V v	/ʋ/	vurdon cart
Z z	/z/	zor power
Ž ž	/ʒ/	žoja Thursday

The use of the above graphemes is relatively stable and universal, allowing for dialectal mergers. In certain areas, however, there is more variation. One such area is the representation of sounds not present in most varieties of Romani. For example, the centralised vowel phonemes of several varieties of Vlax and Xaladitka, when they are indicated separately from the non-centralised vowels, can be represented using ə, ъ or ă.^[4] Another area of considerable variation is the representation of palatalised consonants, which are absent from a number of dialects. Variant graphemes for /tʲ/ include tj, ty, ć, čj and t᾿.^[1] Finally, the representation of the second rhotic, which in several dialects has merged with /r/, varies between ř, rr, rh and sometimes even gh, with the first two being the most frequently found variants.^[4]

International Standard

The International Standard orthography, as devised by Marcel Courthiade and adopted by the International Romani Union, uses similar conventions to the Pan-Vlax system outlined above. Several of the differences are simply graphical: carons are replaced with acute accents, so that č š ž become ć ś ź, and acute accents are replaced with grave accents. However, its most distinctive features are the use of "meta-notations", which are intended to cover cross-dialectal phonological variation, particularly in degrees of palatalisation; "morpho-graphs", which are used to represent the morphophonological alternation of case suffixes^[6] in different phonological environments;^[7] and a double dot (¨) to indicate a centralised vowel.

The "meta-notations" are ćh, ʒ, and the caron (ˇ; named ćiriklo after the word for bird), the realisation of which varies by dialect. The first two are respectively pronounced as /t͡ʃʰ/ and /d͡ʒ/ in the first stratum but /ɕ/ and /ʑ/ in the third stratum.^[8] The caron on a vowel represents palatalisation; ǒ and ǎ are pronounced /o/ and /a/ in Lovaricka, but /jo/ and /ja/ in Kalderash.^[4]
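The dialect-dependent readings just listed can be laid out as a small lookup table. The following Python sketch is only an illustration: the names `META_NOTATIONS` and `realise` are hypothetical, and the table holds only the values stated in this section, so any other combination of grapheme and variety returns `None`.

```python
# Values taken from the paragraph above; nothing else is assumed.
META_NOTATIONS = {
    "ćh": {"first stratum": "t͡ʃʰ", "third stratum": "ɕ"},
    "ʒ": {"first stratum": "d͡ʒ", "third stratum": "ʑ"},
    "ǒ": {"Lovaricka": "o", "Kalderash": "jo"},
    "ǎ": {"Lovaricka": "a", "Kalderash": "ja"},
}


def realise(grapheme, variety):
    """Return the stated phonemic value, or None when the text gives none."""
    return META_NOTATIONS.get(grapheme, {}).get(variety)


print(realise("ʒ", "third stratum"))
print(realise("ǎ", "Kalderash"))
```

Keeping the variety as an explicit argument mirrors the intent of the meta-notations: the written form stays the same and only the reading changes.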

The three "morpho-graphs" are ç, q and θ, which represent the initial phonemes of a number of case suffixes; these are realised as /s/, /k/ and /t/ after a vowel and as /ts/, /ɡ/ and /d/ after a nasal consonant.
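The context rule for the morpho-graphs can be sketched in a few lines of Python. This is a rough illustration rather than a description of any published tool: the function name `realise_morphographs` and the sets `VOWELS` and `NASALS` are assumptions made for the example, and environments the text does not describe (for instance, after a non-nasal consonant) are left untouched.

```python
# Morpho-graph -> (value after a vowel, value after a nasal), as stated above.
MORPHO_GRAPHS = {
    "ç": ("s", "ts"),
    "q": ("k", "ɡ"),
    "θ": ("t", "d"),
}

# Assumptions for illustration only: which letters count as vowels and nasals.
VOWELS = set("aeiouǎěǐǒǔ")
NASALS = set("nm")


def realise_morphographs(word):
    """Replace each morpho-graph by the value its left-hand neighbour selects."""
    out = []
    prev = ""
    for ch in word:
        values = MORPHO_GRAPHS.get(ch.lower())
        if values and prev in NASALS:
            out.append(values[1])
        elif values and prev in VOWELS:
            out.append(values[0])
        else:
            # Ordinary letter, or an environment the text does not describe.
            out.append(ch)
        prev = ch.lower()
    return "".join(out)


print(realise_morphographs("Informaciaqo"))  # q follows the vowel a
```

Applied to "Informaciaqo", from the publication title Informaciaqo lil mentioned earlier, the sketch yields "Informaciako", because q follows a vowel.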

Anglicised

The English-based orthography commonly used in North America is, to a degree, an accommodation of the Pan-Vlax orthography to English-language keyboards, replacing graphemes that carry diacritics with digraphs, such as the substitution of ts ch sh zh for c č š ž.^[4] This particular orthography seems to have arisen spontaneously as Romani speakers have communicated using email, a medium in which graphemes outside the Latin-1 charset have until recently been difficult to type.^[1] This orthography is also the one recommended by Romani scholar and activist Ronald Lee.^[5]
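The digraph substitution can be illustrated with a short Python sketch. It is a mechanical approximation only: the name `anglicise` is hypothetical, the mapping covers just the four substitutions named above, and actual usage by writers differs in other respects (the comparison table, for example, lists J rather than a digraph for [d͡ʒ] in American Romani).

```python
import unicodedata

# Digraph substitutions named in the paragraph above.
DIGRAPHS = {"c": "ts", "č": "ch", "š": "sh", "ž": "zh"}


def anglicise(text):
    """Replace c, č, š, ž by ts, ch, sh, zh, keeping an initial capital."""
    # NFC so that a letter plus combining caron is compared as one character.
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        digraph = DIGRAPHS.get(ch.lower())
        if digraph is None:
            out.append(ch)
        else:
            out.append(digraph.capitalize() if ch.isupper() else digraph)
    return "".join(out)


# Example words taken from the Pan-Vlax table.
for word in ["čačo", "šukar", "žoja"]:
    print(word, "->", anglicise(word))
```

With the words from the Pan-Vlax table, this prints "chacho", "shukar" and "zhoja".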

Romani in Macedonia

Romani in Macedonia is written with the following alphabet, which is used in the Macedonian educational system for Romani-speaking students:^[9]

A a	B b	C c	Ć ć	Č č	D d	Dž dž	E e
F f	G g	Gj gj	H h	I i	J j	K k	Kh kh
L l	Lj lj	M m	N n	Nj nj	O o	P p	Ph ph
R r	S s	Š š	T t	Th th	U u	V v	Y y
X x	Z z	Ž ž

Kepeski & Jusuf (1980) noted that the following alphabet is used by Romani people in Macedonia and Serbia (Kosovo):^[10]

A a	Ä ä	B b	C c	Č č	Kj kj (Ćć)	D d	Gj gj (Ǵǵ)
Dž dž	E e	F f	G g	H h	X x	I i	J j
K k	L l	Lj lj	M m	N n	Nj nj	O o	P p
Q q	R r	S s	Š š	T t	U u	V v	Z z
Ž ž

Finnish Romani

Finnish Romani (or Finnish Kalo) is written with the following alphabet:^[11]

A a	B b	(C c)	D d	E e	F f	G g	H h
Ȟ ȟ	I i	J j	K k	L l	M m	N n	O o
P p	(Q q)	R r	S s	Š š	T t	U u	V v
(W w)	Y y	(Z z)	Ž ž	(Å å)	Ä ä	Ö ö

The letters in parentheses are used only in loanwords and are therefore not always counted as part of the alphabet. The digraphs dž, kh, ph, th and tš are used but are not letters of the alphabet. Š and Ž are used only in these digraphs.

Cyrillic script

Cyrillic alphabet of Kalderash dialect^[10]
Upper case	А	Б	В	Г	Ғ	Д	Е	Ё	Ж	З	И	Й	К	Кх	Л	М	Н	О	П	Пх	Р	Рр	С	Т	Тх	У	Ф	Х	Ц	Ч	Ш	Ы	Ь	Э	Ю	Я
Lower case	а	б	в	г	ғ	д	е	ё	ж	з	и	й	к	кх	л	м	н	о	п	пх	р	рр	с	т	тх	у	ф	х	ц	ч	ш	ы	ь	э	ю	я

Cyrillic alphabet of Ruska Roma dialect^[12]
Upper case	А	Б	В	Г	Ґ	Д	Е	Ё	Ж	З	И	Й	К	Л	М	Н	О	П	Р	С	Т	У	Ф	Х	Ц	Ч	Ш	Ы	Ь	Э	Ю	Я
Lower case	а	б	в	г	ґ	д	е	ё	ж	з	и	й	к	л	м	н	о	п	р	с	т	у	ф	х	ц	ч	ш	ы	ь	э	ю	я

Greek script

In Greece, Romani is mostly written with the Greek alphabet (although very little seems to be written in Romani in Greece).^[13]

Arabic script

The Arabic script has also been used, for example in Iran.^[13]^[14] Notably, the first periodical produced by Roma for Roma was printed in the Arabic script in the 1920s in Edirne, Turkey. It was called "Laćo", which means "good".^[13]

Comparison of alphabets

IPA	1971 Romani World Congress	Hungarian Lovari	Hungarian Carpathian Romani	Pan-Vlax	International Romani Union Standard	American Romani	Macedonian Official Teaching Alphabet	Macedonian Folk Alphabet Kepeski & Jusuf (1980)^[10]^[15]	Finnish Romani^[11]	Cyrillic script	Cyrillic alphabet of the Kalderash dialect^[10]	Cyrillic alphabet of the Ruska Roma dialect
[a]	A	A	A	A	A	A	A	A		А,^[16] Я ^[17]	А,^[16] Я ^[17]	А,^[16] Я ^[17]
[ɑ]									A
[æ]									Ä
[b]	B	B	B	B	B	B	B	B	B	Б	Б	Б
[ts]	C	C	C	C	C, Ç^[18]	Ts	C	C		Ц	Ц	Ц
[t͡ʃ]	Ch	Ch	Ch	Č	Ć	Ch	Č	Č	Tš	Ч	Ч	Ч
[t͡ʃʰ]			Chh	Čh	Ćh^[19]					Чх
[d]	D	D	D	D	D, Θ^[20]	D	D	D	D	Д	Д	Д
[dz]		Dz	Dz
[d͡ʒ]	J	Dzh	Dzh	Dž	Ʒ^[21]	J	Dž	Dž	Dž	Дж	Дж
[ɟ]	Dy	Dy	Dy				Gj	Gj (Ǵǵ)
[e]	E	E	E	E	E	E	E	E	E	Э,^[16] Е ^[17]	Э,^[16] Е ^[17]	Э,^[16] Е ^[17]
[ə]				Ə,^[22] Ê^[23]	Ë^[24]			Ä			Ъ
[f]	F	F	F	F	F	F	F	F	F	Ф	Ф	Ф
[ɡ]	G	G	G	G	G, Q^[25]	G	G	G, Q^[25]	G	Г	Ғ	Ґ
[h]	H	H	H	H	H	H	H	H	H	Г^[26]	Г	Г
[x]	X^[27]	X^[27]	X^[27]	X	X	X	X	X^[27]	Ȟ	Х	Х	Х
[i]	I	I	I	I	I	I	I	I	I	Ы,^[16] И ^[17]	Ы,^[16] И ^[17]	Ы,^[16] И ^[17]
[ɨ]					Ä
[j]	Y	J	J	J	J	Y		J	J	Й	Й	Й
[k]	K	K	K	K	K, Q^[25]	K	K	K, Q^[25]	K	К	К	К
[kʰ]	Kh	Kh	Kh	Kh	Kh	Kh	Kh		Kh	Кх	Кх
[l]	L	L	L	L	L	L	L	L	L	Л	Л	Л
[ʎ]	Ly	Ly	Ly				Lj	Lj
[m]	M	M	M	M	M	M	M	M	M	М	М	М
[n]	N	N	N	N	N	N	N	N	N	Н	Н	Н
[ɲ]	Ny	Ny	Ny				Nj	Nj
[o]	O	O	O	O	O	O	O	O	O	О,^[16] Ё ^[17]	О,^[16] Ё ^[17]	О,^[16] Ё ^[17]
[ø]					Ö^[24]				Ö
[p]	P	P	P	P	P	P	P	P	P	П	П	П
[pʰ]	Ph	Ph	Ph	Ph	Ph	Ph	Ph		Ph	Пх	Пх
[r]	R	R	R	R	R	R	R	R	R	Р	Р	Р
[ɽ], [ɻ], [rː], [ʀ]				Ř, Rr, Rh, Gh^[28]	Rr							Рр
[s]	S	S	S	S	S, Ç^[18]	S	S	S	S	С	С	С
[ʃ]	Sh	Sh	Sh	Š	Ś	Sh	Š	Š		Ш	Ш	Ш
[ɕ]				Ś	Ćh^[19]
[t]	T	T	T	T	T, Θ^[20]	T	T	T	T	Т	Т	Т
[tʰ]	Th	Th	Th	Th	Th	Th	Th		Th	Тх	Тх
[c]	Ty	Ty	Ty	Tj, Ty, Ć, Čj, T’^[28]			Ć	Kj (Ć)
[u]	U	U	U	U	U	U	U	U	U	У,^[16] Ю ^[17]	У,^[16] Ю ^[17]	У,^[16] Ю ^[17]
[y]				Ü	Ü^[24]
[y]									Y
[v]	V	V	V	V	V	V	V	V		В	В	В
[ʋ]									V
[z]	Z	Z	Z	Z	Z	Z	Z	Z		З	З	З
[ʒ]	Zh	Zh	Zh	Ž	Ź	Zh	Ž	Ž		Ж	Ж	Ж
[ʑ]				Ź	Ʒ^[21]

Notes

^[1] Matras (2002)
^[2] Bagchi (2016)
^[3] Sarău (1994)
^[4] Hancock (1995)
^[5] Lee (2005:272)
^[6] Whether these endings are to be analysed as postpositions or case endings is still a matter of debate in Romani linguistics. See, for example, Hancock (1995) and Matras (2002) for varying approaches.
^[7] Matras (1999)
^[8] Courthiade (2009:43–44)
^[9] Petrovski (2021)
^[10] Everson (2001)
^[11] Granqvist (2011)
^[12] Serghievsky & Barannikov (1938)
^[13] Bakker & Kyuchukov (2000:90)
^[14] Djonedi (1996)
^[15] Phonetic assignment provisional (not in source)
^[16] After hard consonants
^[17] After soft (palatal) consonants
^[18] Represents /s/ after vowels and /tˢ/ after nasals.
^[19] Represents /t͡ʃʰ/ in the first stratum and /ɕ/ in the third stratum.
^[20] Represents /t/ after vowels and /d/ after nasals.
^[21] Represents /d͡ʒ/ in the first stratum and /ʑ/ in the third stratum.
^[22] Boretzky & Igla (1994:XVI)
^[23] "Writing System Phonemic Values". ROMLEX. Retrieved 2022-01-28.
^[24] Courthiade (2009:496–499)
^[25] Represents /k/ after vowels and /ɡ/ after nasals.
^[26] As in Russian, this orthography does not distinguish between /ɡ/ and /h/.
^[27] This is the Greek letter Chi and was ordered alphabetically after H.
^[28] Only exists in some dialects and varies according to dialects.

References

Bagchi, Tista (December 28, 2016). "Romany language". Britannica Online Encyclopedia. Retrieved January 28, 2022.
Bakker, Peter; Kyuchukov, Hristo, eds. (2000), What Is the Romani Language?, Interface Collection, vol. 21, Centre de Recherches Tsiganes; University of Hertfordshire Press, ISBN 1-902806-06-9
Boretzky, Norbert; Igla, Birgit (1994), Wörterbuch Romani-Deutsch-Englisch für den südosteuropäischen Raum : mit einer Grammatik der Dialektvarianten, Wiesbaden: Harrassowitz Verlag, ISBN 3-447-03459-9
Courthiade, Marcel (2009), Rézműves, Melinda (ed.), Morri angluni rromane ćhibǎqi evroputni lavustik (in Romany, Hungarian, English, French, Spanish, German, Ukrainian, Romanian, Croatian, Slovak, and Greek), Budapest: Fővárosi Onkormányzat Cigány Ház--Romano Kher, ISBN 978-963-85408-6-7
Djonedi, Fereydun (1996). "Romano Glossar. Gesammelt von Schir-ali Tehranizade" (PDF). Grazer Linguistische Studien (in German). 46: 31–59. Archived from the original (PDF) on February 5, 2012.
Everson, Michael (October 7, 2001). "Romani" (PDF). Everytype: The Alphabets of Europe. Retrieved January 28, 2022.
Granqvist, Kimmo (2011), Lyhyt Suomen romanikielen kielioppi [Concise grammar of Finnish Romani], Helsinki: Kotimaisten kielten keskus, ISBN 978-952-5446-69-2
Hancock, Ian (1995), A Handbook of Vlax Romani, Columbus: Slavica Publishers, ISBN 0-89357-258-6
Kepeski, Krume; Jusuf, Šaip (1980), Romani gramatika = Ромска граматика (in Macedonian and Romany), Skopje: Naša Kniga
Lee, Ronald (2005), Learn Romani: Das-dúma Rromanes, Hatfield: University of Hertfordshire Press, ISBN 1-902806-44-1
Matras, Yaron (December 1999). "Writing Romani: The pragmatics of codification in a stateless language" (PDF). Applied Linguistics. 20 (4): 481–502. doi:10.1093/applin/20.4.481.
Matras, Yaron (2002), Romani: A Linguistic Introduction, Cambridge: Cambridge University Press, ISBN 0-521-02330-0
Petrovski, Trajko (2021), I čhib thaj i kultura romengiri bašo III klasi (PDF) (in Romany) (2nd ed.), Skopje: Ministry of Education and Science of the Republic of Northern Macedonia, ISBN 978-608-226-933-7, retrieved January 28, 2022
Sarău, Gheorghe (1994), Limba Romani (ţigănească): Manual pentru Clasele de Invățători Romi ale Școlilor Normale, Bucharest: Editura Didactică și Pedagogică
Serghievsky, M. V.; Barannikov, A. P. (1938), Цыганско-русский словарь [Romani-Russian dictionary] (in Russian), Moscow, archived from the original on April 26, 2012