Romanization of Bengali
The Romanization of Bengali is the representation of the Bengali language in the Latin script. Various romanization systems for Bengali have been created in recent years, but they have failed to represent Bengali pronunciation accurately. While different standards for romanization have been proposed for Bengali, these have not been adopted with the degree of uniformity seen in languages such as Japanese or Sanskrit.[note 1] The Bengali script has often been grouped with other Indic scripts for romanization, in schemes that do not represent the actual phonetic values of Bengali. These include the "International Alphabet of Sanskrit Transliteration" or IAST system (based on diacritics), "Indian languages Transliteration" or ITRANS (which uses upper-case letters suited to ASCII keyboards), and the National Library at Calcutta romanization.
In the context of Bengali Romanization, it is important to distinguish transliteration from transcription. Transliteration is orthographically accurate (i.e. the original spelling can be recovered), whereas transcription is phonetically accurate (the pronunciation can be reproduced). Since English does not have all the sounds of Bengali, and since Bengali pronunciation does not completely reflect the spelling, a romanization cannot be faithful to both.
Although it might be desirable to use a transliteration scheme where the original Bengali orthography is recoverable from the Latin text, Bengali words are currently Romanized on Wikipedia using a phonemic transcription, where the true phonetic pronunciation of Bengali is represented with no reference to how it is written. The Wikipedia Romanization scheme is given in the table below, with the IPA transcriptions as used above.
The Portuguese missionaries stationed in Bengal in the 16th century were the first people to employ the Latin alphabet in writing Bengali books, the most famous of which are the Crepar Xaxtrer Orth, Bhed and the Vocabolario em idioma Bengalla, e Portuguez dividido em duas partes, both written by Manuel da Assumpção. But the Portuguese-based romanization did not take root. In the late 18th century Augustin Aussant used a romanization scheme based on the French alphabet. At the same time, Nathaniel Brassey Halhed used a romanization scheme based on English for his Bengali grammar book. After Halhed, the renowned English philologist and oriental scholar Sir William Jones devised a romanization scheme for Bengali and for Indian languages in general, and published it in the Asiatick Researches journal in 1801. This scheme came to be known as the "Jonesian System" of romanization, and served as a model for the next century and a half.
Transliteration vs transcription
The Romanization of a language written in a non-Roman script can be based on transliteration (orthographically accurate, i.e. the original spelling can be recovered) or transcription (phonetically accurate, i.e. the pronunciation can be reproduced). This distinction is important in Bengali, as its orthography was adopted from Sanskrit and ignores several millennia of sound change. To some degree, all writing systems differ from the way the language is pronounced, but the gap may be more extreme for languages like Bengali. For example, the three letters শ, ষ, and স had distinct pronunciations in Sanskrit, but over several centuries the standard pronunciation of Bengali (usually modeled on the Nadia dialect) has lost these phonetic distinctions (all three are usually pronounced as IPA [ʃɔ]), while the spelling distinction nevertheless persists in orthography.
In written texts, it is easy to distinguish between homophones such as শাপ shap "curse" and সাপ shap "snake". Such a distinction could be particularly relevant in searching for the term in an encyclopedia, for example. However, the fact that the words sound identical means that they would be transcribed identically; thus, some important meaning distinctions cannot be rendered in a transcription model. Another issue with transcription systems is that cross-dialectal and cross-register differences are widespread, and thus the same word or lexeme may have many different transcriptions. Even simple words like মন "mind" may be pronounced "mon", "môn", or (in poetry) "mônô" (e.g. the Indian national anthem, Jana Gana Mana).
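The trade-off described above can be illustrated with a minimal Python sketch. The two lookup tables are hypothetical, cover only the letters needed for শাপ and সাপ, and ignore the inherent vowel for brevity; the transliteration column assumes IAST-style letters (ś, ṣ, s), and the transcription column follows the "shap" spelling used in the text.

```python
# Illustrative subset only (an assumption for this sketch): a real scheme
# would cover the whole script and handle the inherent vowel.
TRANSLITERATION = {"শ": "ś", "ষ": "ṣ", "স": "s", "প": "p", "া": "ā"}
TRANSCRIPTION = {"শ": "sh", "ষ": "sh", "স": "sh", "প": "p", "া": "a"}


def romanize(word, table):
    """Romanize a word one character at a time using the given table."""
    return "".join(table[ch] for ch in word)


def is_reversible(table):
    """A table is reversible only if no two letters share a Latin value."""
    return len(set(table.values())) == len(table)


curse, snake = "শাপ", "সাপ"

# Transliteration keeps the two spellings apart.
print(romanize(curse, TRANSLITERATION), romanize(snake, TRANSLITERATION))

# Transcription merges them, because the words sound identical.
print(romanize(curse, TRANSCRIPTION), romanize(snake, TRANSCRIPTION))
```

In this sketch the transliteration table is one-to-one, so the Bengali spelling could be recovered from the Latin text, whereas the transcription table maps শ, ষ, and স to the same value and cannot be inverted.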
Often, different phonemes (meaningfully different sounds) are represented by the same symbol or grapheme. Thus, the vowel এ can represent either [e] (এল elo [elɔ] "came") or [æ] (এক êk [æk] "one"). Occasionally, words written in the same way (homographs) may have different pronunciations for differing meanings: মত can mean "opinion" (pronounced môt), or "similar to" (môtô). Thus, some important phonemic distinctions cannot be rendered in a transliteration model. In addition, when representing a Bengali word to allow speakers of other languages to pronounce it easily, it may be better to use a transcription, which does not include the silent letters and other idiosyncrasies (e.g. স্বাস্থ্য sbasthyô, spelled <swāsthya>, or অজ্ঞান ôggên, spelled <ajñāna>) that make Bengali romanization so complicated. Such letter-by-letter spellings do not reflect Bengali pronunciation; they result from the frequent grouping of the Bengali script with other Indic scripts for romanization, even though those scripts do not share Bengali's inherent vowel ô.
Comparison of romanizations
Comparisons of standard romanization schemes for Bengali are given in the table below. Two standards are commonly used for transliteration of Indic languages including Bengali. Many standards (e.g. NLK / ISO) use diacritic marks and permit case markings for proper nouns. Newer forms (e.g. Harvard-Kyoto) are better suited to ASCII-derivative keyboards; they use upper- and lower-case letters contrastively and forgo the normal conventions of English capitalization.
- "NLK" stands for the diacritic-based letter-to-letter transliteration schemes, best represented by the National Library at Kolkata romanization, ISO 15919, or IAST. This is the ISO standard, and it uses diacritic marks (e.g. ā) to reflect the additional characters and sounds of Bengali letters.
- ITRANS is an ASCII representation for Sanskrit; it is one-to-many, i.e. there may be more than one way of transliterating characters, which can make internet searching more complicated. ITRANS representations forgo capitalization norms of English so as to be able to represent the characters using a normal ASCII keyboard.
- "HK" stands for two other case-sensitive letter-to-letter transliteration schemes: the Harvard-Kyoto and XIAST schemes. These are similar to the ITRANS scheme, and use only one form for each character.
- "XHK" stands for the case-sensitive letter-to-letter Extended Harvard-Kyoto transliteration. This adds some specific characters for handling Bengali text to IAST.
- "Wiki" stands for a phonemic transcription-based romanization. It is a sound-preserving transcription based on what is perceived to be the standard pronunciation of the Bengali words, with no reference to how it is written in Bengali script. It uses diacritics often used by linguists specializing in Bengali (other than IPA), and is the transcription system used to represent Bengali sounds in Wikipedia articles.
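The difference between a diacritic-based scheme and a case-sensitive ASCII scheme can be sketched in Python. The mapping below assumes the conventional IAST to Harvard-Kyoto correspondences for Sanskrit and is only a subset; it does not cover Bengali-specific letters, and the function name `iast_to_hk` is hypothetical.

```python
import unicodedata

# Assumed subset of IAST -> Harvard-Kyoto correspondences (Sanskrit-oriented).
# Letters without diacritics are the same in both schemes.
_RAW_IAST_TO_HK = {
    "ā": "A", "ī": "I", "ū": "U", "ṛ": "R",
    "ṅ": "G", "ñ": "J", "ṭ": "T", "ḍ": "D", "ṇ": "N",
    "ś": "z", "ṣ": "S", "ṃ": "M", "ḥ": "H",
}

# Normalize the keys so lookups do not depend on how this file was typed.
IAST_TO_HK = {
    unicodedata.normalize("NFC", key): value
    for key, value in _RAW_IAST_TO_HK.items()
}


def iast_to_hk(text):
    """Convert lower-case IAST text to Harvard-Kyoto, one letter at a time."""
    # NFC composes "n" + combining tilde into the single character "ñ", etc.
    composed = unicodedata.normalize("NFC", text)
    return "".join(IAST_TO_HK.get(ch, ch) for ch in composed)


print(iast_to_hk("ajñāna"))  # ajJAna
```

Because upper and lower case carry different values in Harvard-Kyoto, the sketch expects lower-case input; a capitalized proper noun in IAST would otherwise collide with letters such as "A" or "S".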
The following table includes examples of Bengali words Romanized using the various systems mentioned above.
The IPA (International Phonetic Alphabet) transcription is provided in the rightmost column, representing the most common pronunciation of the glyph in Standard Colloquial Bengali, alongside the various romanizations described above.
- In Japanese there exists some debate as to whether to accent certain distinctions, such as Tōhoku vs Tohoku. Sanskrit is well standardized, because the speaking community is relatively small and sound change is not a large concern.
- "Learning International Alphabet of Sanskrit Transliteration". Sanskrit 3 - Learning transliteration. Gabriel Pradiipaka & Andrés Muni. Archived from the original on 12 February 2007. Retrieved 2006-11-20.
- "ITRANS — Indian Language Transliteration Package". Avinash Chopde. Retrieved 2006-11-20.
- "Annex-F: Roman Script Transliteration" (PDF). Indian Standard: Indian Script Code for Information Interchange — ISCII. Bureau of Indian Standards. 1 April 1999. p. 32. Retrieved 2006-11-20.
- Jones 1801
- বাংলা একাডেমী ব্যবহারিক বাংলা অভিধান Bangla Academy Byaboharik Bangla Abhidhan (Bangla Academy Functional Bengali Dictionary) (16th reprint ed.). Dhaka 1000, Bangladesh: Bangla Academy. Nov 2012. p. আট্রিশ (তালিকা -৪). ISBN 984-07-5071-2.