Romanisation of Bengali

From Wikipedia, the free encyclopedia

The Romanisation of Bengali is the representation of the Bengali language in the Latin script. Various romanisation systems for Bengali have been created in recent years, but they have failed to represent Bengali pronunciation accurately. While different standards for romanisation have been proposed for Bengali, these have not been adopted with the degree of uniformity seen in languages such as Japanese or Sanskrit.[note 1] The Bengali script has often been grouped with other Indic scripts for romanisation, in schemes that do not represent the true phonetic value of Bengali. These include the "International Alphabet of Sanskrit Transliteration" or IAST system (based on diacritics),[1] "Indian languages Transliteration" or ITRANS (which uses upper-case letters and is suited to ASCII keyboards),[2] and the National Library at Calcutta romanisation.[3]

In the context of Bengali Romanisation, it is important to distinguish transliteration from transcription. Transliteration is orthographically accurate (i.e. the original spelling can be recovered), whereas transcription is phonetically accurate (the pronunciation can be reproduced). Since English does not have the sounds of Bengali, and since pronunciation does not completely reflect the spelling, it is not possible to be faithful to both.

Although it might be desirable to use a transliteration scheme where the original Bengali orthography is recoverable from the Latin text, Bengali words are currently Romanised on Wikipedia using a phonemic transcription, where the true phonetic pronunciation of Bengali is represented with no reference to how it is written. The Wikipedia Romanisation scheme is given in the table below, with the IPA transcriptions as used above.


The Portuguese missionaries stationed in Bengal in the 16th century were the first people to employ the Latin alphabet in writing Bengali books, the most famous of which are the Crepar Xaxtrer Orth, Bhed and the Vocabolario em idioma Bengalla, e Portuguez dividido em duas partes, both written by Manuel da Assumpção. But the Portuguese-based romanisation did not take root. In the late 18th century Augustin Aussant used a romanisation scheme based on the French alphabet. At the same time, Nathaniel Brassey Halhed used a romanisation scheme based on English for his Bengali grammar book. After Halhed, the renowned English philologist and oriental scholar Sir William Jones devised a romanisation scheme for Bengali and for Indian languages in general, and published it in the Asiatick Researches journal in 1801.[4] This scheme came to be known as the "Jonesian System" of romanisation, and served as a model for the next century and a half.

Transliteration vs transcription

The Romanisation of a language written in a non-Roman script can be based on transliteration (orthographically accurate, i.e. the original spelling can be recovered) or transcription (phonetically accurate, i.e. the pronunciation can be reproduced). This distinction is important in Bengali because its orthography was adopted from Sanskrit and ignores several millennia of sound change. To some degree, all writing systems differ from the way the language is pronounced, but the difference may be more extreme for languages like Bengali. For example, the three letters শ, ষ, and স had distinct pronunciations in Sanskrit, but over several centuries the standard pronunciation of Bengali (usually modeled on the Nadia dialect) has lost these phonetic distinctions (all three are usually pronounced as IPA [ʃɔ]), while the spelling distinction nevertheless persists in orthography.
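The recoverability point can be illustrated with a minimal Python sketch. It assumes only the three sibilant letters discussed above, romanised as ś, ṣ and s in a diacritic-based transliteration and as "sh" in a transcription; the names TRANSLITERATION, TRANSCRIPTION, is_recoverable and invert are hypothetical and are not part of any romanisation standard.

```python
# Illustrative sketch: three Bengali sibilants under two romanisation models.
# The mappings are assumptions based on the letters discussed in this section.
TRANSLITERATION = {"শ": "ś", "ষ": "ṣ", "স": "s"}     # one symbol per letter
TRANSCRIPTION = {"শ": "sh", "ষ": "sh", "স": "sh"}    # one symbol per sound


def is_recoverable(mapping):
    """Return True if every romanised form points back to exactly one letter."""
    return len(set(mapping.values())) == len(mapping)


def invert(mapping):
    """Group the original letters by their romanised form."""
    inverse = {}
    for letter, roman in mapping.items():
        inverse.setdefault(roman, []).append(letter)
    return inverse


print(is_recoverable(TRANSLITERATION))  # the spelling can be recovered
print(is_recoverable(TRANSCRIPTION))    # the spelling cannot be recovered
print(invert(TRANSCRIPTION))
```

Under the transcription mapping, "sh" points back to all three letters, so the original spelling cannot be reconstructed from the Latin text alone.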

In written texts, it is easy to distinguish between homophones such as শাপ shap "curse" and সাপ shap "snake". Such a distinction could be particularly relevant in searching for the term in an encyclopedia, for example. However, the fact that the words sound identical means that they would be transcribed identically; thus, some important meaning distinctions cannot be rendered in a transcription model. Another issue with transcription systems is that cross-dialectal and cross-register differences are widespread, and thus the same word or lexeme may have many different transcriptions. Even simple words like মন "mind" may be pronounced "mon", "môn", or (in poetry) "mônô" (e.g. the Indian national anthem, Jana Gana Mana).

Often, different phonemes (meaningfully different sounds) are represented by the same symbol or grapheme. Thus, the vowel এ can represent either [e] (এল elo [elɔ] "came") or [æ] (এক êk [æk] "one"). Occasionally, words written in the same way (homographs) may have different pronunciations for differing meanings: মত can mean "opinion" (pronounced môt) or "similar to" (môtô). Thus, some important phonemic distinctions cannot be rendered in a transliteration model. In addition, when representing a Bengali word to allow speakers of other languages to pronounce it easily, it may be better to use a transcription, which does not include the silent letters and other idiosyncrasies (e.g. স্বাস্থ্য sbasthyô, spelled <swāsthya>, or অজ্ঞান ôggên, spelled <ajñāna>) that make Bengali romanisation so complicated. Such letter-for-letter spellings do not reflect Bengali pronunciation; they result from the frequent grouping of the Bengali script with other Indic scripts for romanisation, even though those scripts do not share the Bengali inherent vowel ô, which makes Bengali romanisation inconsistent.
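The trade-off can be sketched at the word level in Python. The four entries below are taken from the example words table in this article (NLK transliteration and "Wiki" transcription); the names ROWS, NLK, WIKI and collisions are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict

# (Bengali spelling, meaning, NLK transliteration, "Wiki" transcription),
# copied from the example words table in this article.
ROWS = [
    ("সাপ", "snake", "sāpa", "shap"),
    ("শাপ", "curse", "śāpa", "shap"),
    ("মত", "opinion", "mata", "môt"),
    ("মত", "like", "mata", "moto"),
]
NLK, WIKI = 2, 3  # column indexes into each row


def collisions(rows, column):
    """Return the romanised forms in `column` that cover more than one meaning."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[column]].add(row[1])
    return {form: meanings for form, meanings in groups.items() if len(meanings) > 1}


print(collisions(ROWS, WIKI))  # homophones merge under transcription
print(collisions(ROWS, NLK))   # homographs merge under transliteration
```

With this sample, the transcription column merges "snake" and "curse" under shap, while the transliteration column merges "opinion" and "like" under mata, so neither model keeps every distinction.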

Comparison of romanisations

Standard romanisation schemes for Bengali are compared in the table below. Two kinds of standard are commonly used for the transliteration of Indic languages, including Bengali. Many standards (e.g. NLK / ISO) use diacritic marks and permit case marking for proper nouns. Newer forms (e.g. Harvard-Kyoto) are better suited to ASCII-derivative keyboards; they use upper- and lower-case letters contrastively and so forgo the normal standards of English capitalization.

  • "NLK" stands for the diacritic-based letter-to-letter transliteration schemes, best represented by the National Library at Kolkata romanisation, ISO 15919, or IAST. This is the ISO standard, and it uses diacritic marks (e.g. ā) to reflect the additional characters and sounds of Bengali letters.
  • ITRANS is an ASCII representation for Sanskrit; it is one-to-many, i.e. there may be more than one way of transliterating a character, which can make internet searching more complicated. ITRANS representations forgo the capitalization norms of English so that the characters can be represented using a normal ASCII keyboard.
  • "HK" stands for two other case-sensitive letter-to-letter transliteration schemes: the Harvard-Kyoto and XIAST schemes. These are similar to the ITRANS scheme, but use only one form for each character.
  • "XHK" stands for the case-sensitive letter-to-letter Extended Harvard-Kyoto transliteration. This adds some specific characters for handling Bengali text to IAST.
  • "Wiki" stands for a phonemic transcription-based romanisation. It is a sound-preserving transcription based on what is perceived to be the standard pronunciation of the Bengali words, with no reference to how they are written in Bengali script. It uses diacritics often used by linguists specializing in Bengali (other than IPA), and is the transcription system used to represent Bengali sounds in Wikipedia articles.
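The relationship between the case-sensitive ASCII schemes and the diacritic-based ones can be sketched in Python. The mapping below is an assumption inferred only from the example words in this article (for instance HK bAMlAdeza against NLK bāṃlādēśa); it is incomplete, and the names HK_TO_NLK and hk_to_nlk are hypothetical.

```python
import unicodedata

# Partial HK -> NLK letter mapping, inferred from this article's example words.
# It is an illustrative assumption, not a full statement of either scheme.
HK_TO_NLK = {
    "A": "ā",  # long a
    "z": "ś",  # palatal sibilant
    "M": "ṃ",  # anusvara
    "J": "ñ",  # palatal nasal
    "e": "ē",  # NLK marks e with a macron
    "o": "ō",  # NLK marks o with a macron
}


def hk_to_nlk(text):
    """Convert an HK string to NLK letter by letter, leaving other letters as-is."""
    converted = "".join(HK_TO_NLK.get(ch, ch) for ch in text)
    # Normalise so precomposed and combining diacritics compare equal.
    return unicodedata.normalize("NFC", converted)


for word in ["sApa", "zApa", "bAMlAdeza", "byaJjanadhvani"]:
    print(word, "->", hk_to_nlk(word))
```

Because HK uses exactly one ASCII form per character, a single pass over the letters is enough; this is the property that distinguishes it from the one-to-many ITRANS scheme.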


The following table includes examples of Bengali words Romanised using the various systems mentioned above.

Example words
In orthography | Meaning | NLK | XHK | ITRANS | HK | Wiki | IPA
মন | mind | mana | mana | mana | mana | mon | [mon]
সাপ | snake | sāpa | sApa | saapa | sApa | shap | [ʃap]
শাপ | curse | śāpa | zApa | shaapa | zApa | shap | [ʃap]
মত | opinion | mata | mata | mata | mata | môt | [mɔt̪]
মত | like | mata | mata | mata | mata | moto | [mɔt̪o]
তেল | oil | tēla | tela | tela | tela | tel | [t̪el]
গেল | went | gēla | gela | gela | gela | gêlô | [ɡɛlɔ]/[ɡælo]
জ্বর | fever | jvara | jvara | jvara | jvara | jôr | [dʒɔr]
স্বাস্থ্য | health | svāsthya | svAsthya | svaasthya | svAsthya | shasththo | [ʃast̪ʰːo]
বাংলাদেশ | Bangladesh | bāṃlādēśa | bAMlAdeza | baa.mlaadesha | bAMlAdeza | Bangladesh | [baŋlad̪eʃ]
ব্যঞ্জনধ্বনি | consonant | byañjanadhvani | byaJjanadhvani | bya~njanadhvani | byaJjanadhvani | bênjondhoni | [bændʒɔnd̪ʱoni]
আত্মহত্যা | suicide | ātmahatyā | AtmahatyA | aatmahatyaa | AtmahatyA | attohotta | [at̪ːohɔt̪ːa]
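The ITRANS and HK columns of the example words differ only in how a few letters are spelled in ASCII, and ITRANS may accept more than one spelling for the same letter. The following Python sketch folds ITRANS spellings into a single HK-style form, which is one possible way to make searching less sensitive to the variant used. The mapping is an assumption inferred from the example words and covers only the letters that occur in them; ITRANS_TO_HK and itrans_to_hk are hypothetical names.

```python
import re

# Partial ITRANS -> HK mapping, inferred from the example words in this article.
# Single-letter ITRANS variants such as "A" are already valid HK and pass through.
ITRANS_TO_HK = {
    "aa": "A",
    "ii": "I",
    "uu": "U",
    "sh": "z",
    ".m": "M",
    "~n": "J",
}

# Try longer keys first so multi-character spellings are matched as a unit.
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(ITRANS_TO_HK, key=len, reverse=True))
)


def itrans_to_hk(text):
    """Rewrite the ITRANS digraphs listed above into their HK single letters."""
    return _PATTERN.sub(lambda match: ITRANS_TO_HK[match.group(0)], text)


for word in ["saapa", "sApa", "shaapa", "baa.mlaadesha", "bya~njanadhvani"]:
    print(word, "->", itrans_to_hk(word))
```

Both saapa and sApa reduce to the same string, so a search index keyed on the normalised form would treat them as one entry.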

Romanisation reference

The IPA (International Phonetic Alphabet) transcription is provided in the rightmost column, representing the most common pronunciation of the glyph in Standard Colloquial Bengali, alongside the various romanisations described above.

Vowels & Miscellaneous
a a a a a ô/o [ɔ]/[o]
ā ā ā A~aa A a [a]
i i i i i i [i]
ī ī ī I~ii I i [i]
u u u u u u [u]
ū ū ū U~uu U u [u]
r RRi~R^i R ri [ri]
e ē e e e e/æ [e]/[æ]
ai ai ai ai ai oi [oi]
o ō o o o o [o]
au au au au au ou [ou]
H H varies varies
ng .m M ng [ŋ]
◌̃ ɱ .N ~ ~ [~] (nasalization)
্য y y y y y varies varies
্ব w/v v v v v varies varies
ক্ষ kṣ kṣ kṣ x kS kkhô [kʰːɔ]
জ্ঞ GY jJ ggô [ɡːɔ]
শ্র śr śr śr shr zr shrô [ʃɾɔ]
k k k k k [kɔ]
kh kh kh kh kh khô [kʰɔ]
g g g g g [ɡɔ]
gh gh gh gh gh ghô [ɡʱɔ]
ng ~N G ngô [ŋɔ]/[uõ]
c c c ch c chô [tʃɔ]
ch ch ch Ch ch chhô [tʃʰɔ]
j j j j j [dʒɔ]
jh jh jh jh jh jhô [dʒʱɔ]
ñ ñ ñ ~n J niô [nɔ]
T T ţô [ʈɔ]
ṭh ṭh ṭh Th Th ţhô [ʈʰɔ]
D D đô [ɖɔ]
ড় .D P ŗô [ɽɔ]
ḍh ḍh ḍh Dh Dh đhô [ɖʱɔ]
ঢ় ṛh ḍh ḏh .Dh Ph ŗhô [ɽɔ]
N N [nɔ]
t t t t t [t̪ɔ]
th th th th th thô [t̪ʰɔ]
d d d d d [d̪ɔ]
dh dh dh dh dh dhô [d̪ʱɔ]
n n n n n [nɔ]
p p p p p [pɔ]
ph ph ph ph ph fô/phô [ɸɔ~pʰɔ]
b b b b b [bɔ]
bh bh bh bh bh bhô [bʱɔ]
m m m m m [mɔ]
y/j y y y [dʒɔ]
য় y Y Y yô/e [e̯ɔ]/–
r r r r r [rɔ]
l l l l l [lɔ]
ś/sh ś ś sh z shô [ʃɔ]
ṣ/sh Sh S shô [ʃɔ]
s s s s s [sɔ]
h h h h h [ɦɔ]


  1. In Japanese there exists some debate as to whether to mark certain distinctions, such as Tōhoku vs Tohoku. Sanskrit is well standardized because the speaking community is relatively small and sound change is not a large concern.


  1. "Learning International Alphabet of Sanskrit Transliteration". Sanskrit 3 - Learning transliteration. Gabriel Pradiipaka & Andrés Muni. Archived from the original on 12 February 2007. Retrieved 2006-11-20.
  2. "ITRANS — Indian Language Transliteration Package". Avinash Chopde. Retrieved 2006-11-20.
  3. "Annex-F: Roman Script Transliteration" (PDF). Indian Standard: Indian Script Code for Information Interchange — ISCII. Bureau of Indian Standards. 1 April 1999. p. 32. Retrieved 2006-11-20.
  4. Jones 1801
  5. বাংলা একাডেমী ব্যবহারিক বাংলা অভিধান Bangla Academy Byaboharik Bangla Abhidhan (Bangla Academy Functional Bengali Dictionary) (16th reprint ed.). Dhaka 1000, Bangladesh: Bangla Academy. Nov 2012. p. আট্রিশ (তালিকা -৪). ISBN 984-07-5071-2.