|ISO 15924||Khmr, 355
The Khmer script (Khmer: អក្សរខ្មែរ; IPA: [ʔaʔsɑː kʰmaːe]) is an abugida (alphasyllabary) used to write the Khmer language, the official language of Cambodia. It is also used to write Pali in the Buddhist liturgy of Cambodia and Thailand.
It was adapted from the Pallava script, a variant of the Grantha alphabet descended from the Brahmi script, which was used in southern India and Southeast Asia during the 5th and 6th centuries AD. The oldest dated inscription in Khmer was found at Angkor Borei District in Takéo Province, south of Phnom Penh, and dates from 611. The modern Khmer script differs somewhat from the earlier forms seen in the inscriptions at the ruins of Angkor.
Khmer is written from left to right. Words within the same sentence or phrase are generally run together with no spaces between them. Consonant clusters within a word are "stacked", with the second (and occasionally third) consonant being written in reduced form under the main consonant. Originally there were 35 consonant characters, but modern Khmer uses only 33. Each such character in fact represents a consonant sound together with an inherent vowel – either â or ô.
There are some independent vowel characters, but vowel sounds are more commonly represented as dependent vowels – additional marks accompanying a consonant character, and indicating what vowel sound is to be pronounced after that consonant (or consonant cluster). Most dependent vowels have two different pronunciations, depending in most cases on the inherent vowel of the consonant to which they are added. In some positions, a consonant written with no dependent vowel is taken to be followed by the sound of its inherent vowel. There are also a number of diacritics used to indicate further modifications in pronunciation.
There are 35 Khmer consonant symbols, although modern Khmer only uses 33, two having become obsolete. Each consonant has an inherent vowel: â /ɑː/ or ô /ɔː/; equivalently, each consonant is said to belong to the A-series or O-series. A consonant's series determines the pronunciation of the dependent vowel symbols which may be attached to it, and in some positions the sound of the inherent vowel is itself pronounced, if the consonant is not accompanied by a dependent vowel (for details see the section below on vowels).
Each consonant (with one exception) also has a subscript form. These may also be referred to as "sub-consonants"; the Khmer term is cheung âksâr (ជើងអក្សរ), meaning "foot of a letter". Most subscript consonants resemble the corresponding consonant symbol, but in a smaller and possibly simplified form, although in a few cases there is no obvious resemblance. Most subscript consonants are written directly below other consonants, although subscript r appears before, while a few others have ascending elements which appear after. Subscripts are used in writing consonant clusters (consonants pronounced consecutively in a word with no vowel sound between them). Clusters in Khmer normally consist of two consonants, although occasionally in the middle of a word there will be three. The first consonant in a cluster is written using the main consonant symbol, with the second (and third, if present) attached to it in the form of subscripts. Subscripts were previously also used to write final consonants; this is not done in modern Khmer, but is retained in the word ឲ្យ aôy /aːoj/ ("give").
The consonants and their subscript forms are listed in the following table. Normal phonetic values are given using the International Phonetic Alphabet (IPA); the sound system is described at Khmer phonology. Transliterations are given using the UNGEGN system; for other systems see Romanization of Khmer.
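|Consonant||Subscript form||Full value with inherent vowel (IPA)||Full value with inherent vowel (UNGEGN)||Consonant value (IPA)||Consonant value (UNGEGN)||Notes|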
|ក||្ក||[kɑː]||kâ||[k]||k||Final [k] is reduced to a glottal stop after some vowels.|
|ប||្ប||[ɓɑː]||bâ||[ɓ], [p]||b, p||Pronounced [p] when final or followed by a subscript consonant.|
|រ||្រ||[rɔː]||rô||[r]||r||Silent when final.|
|ស||្ស||[sɑː]||sâ||[s]||s||Pronounced [h] when final.|
|ឡ||-||[lɑː]||lâ||[l]||l||Has no subscript form in standard orthography, but some fonts include one (្ឡ).|
The Khmer writing system includes supplementary consonants, used in certain loanwords, particularly from French and Thai. These mostly represent sounds which do not occur in native words, or for which the native letters are restricted to one of the two vowel series. Most of them are digraphs, formed by stacking a subscript under the letter ហ hâ, with an additional diacritic if required to change the inherent vowel to ô. The character for pâ, however, is formed by placing the musĕkâtônd ("mouse teeth") diacritic over the character ប bâ.
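|Supplementary consonant||Description||Full value with inherent vowel (IPA)||Full value with inherent vowel (UNGEGN)||Consonant value (IPA)||Consonant value (UNGEGN)||Notes|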
|ហ្គ||hâ + kô||[gɑː]||gâ||[g]||g|
|ហ្គ៊||hâ + kô + diacritic||[gɔː]||gô||[g]||g|
|ហ្ន||hâ + nô||[nɑː]||nâ||[n]||n|
|ប៉||bâ + diacritic||[pɑː]||pâ||[p]||p||For the normal use of this diacritic, see below.|
|ហ្ម||hâ + mô||[mɑː]||mâ||[m]||m|
|ហ្ល||hâ + lô||[lɑː]||lâ||[l]||l|
|ហ្វ||hâ + vô||[fɑː], [wɑː]||fâ, wâ||[f], [w]||f, w|
|ហ្វ៊||hâ + vô + diacritic||[fɔː], [wɔː]||fô, wô||[f], [w]||f, w|
|ហ្ស||hâ + sâ||[ʒɑː], [zɑː]||žâ, zâ||[ʒ], [z]||ž, z|
|ហ្ស៊||hâ + sâ + diacritic||[ʒɔː], [zɔː]||žô, zô||[ʒ], [z]||ž, z|
Most Khmer vowel sounds are written using dependent, or diacritical, vowel symbols, known in Khmer as srăk nissăy (ស្រៈនិស្ស័យ) or srăk phsâm (ស្រៈផ្សំ) ("connecting vowel"). These can only be written in combination with a consonant (or consonant cluster). The vowel is pronounced after the consonant (or cluster), even though some of the symbols have graphical elements which appear above, below or to the left of the consonant character. Most of the vowel symbols have two possible pronunciations, depending on the inherent vowel of the consonant to which they are added. Their pronunciations may also differ in weak syllables, and when they are shortened (e.g. by means of a diacritic). Absence of a dependent vowel implies that the consonant is followed by the sound of its inherent vowel, unless it is a final consonant, in which case no vowel sound follows it.
In determining the inherent vowel of a consonant cluster (i.e. how a following dependent vowel will be pronounced), stops and fricatives are dominant over sonorants. For any consonant cluster including a combination of these sounds, a following dependent vowel is pronounced according to the dominant consonant, regardless of its position in the cluster. When both members of a cluster are dominant, the subscript consonant determines the pronunciation of a following dependent vowel. A non-dominant consonant may also have its inherent vowel changed by a preceding dominant consonant in the same word, even when there is a vowel between them.
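The dominance rule above can be sketched in Python. This is only an illustration under stated assumptions: the helper names (`letters`, `consonant_series`, `cluster_series`) are hypothetical, a cluster is assumed to be encoded as consonants joined by the Unicode subscript sign U+17D2, and the case where every member is a sonorant (which the text does not cover) is assumed to follow the last consonant. Series-shifting diacritics and the carry-over of a series across an intervening vowel are not modelled.

```python
import unicodedata

COENG = "\u17D2"  # KHMER SIGN COENG: marks the following consonant as a subscript

def letters(names):
    """Look up Khmer letters by the last word of their Unicode names."""
    return {unicodedata.lookup("KHMER LETTER " + n) for n in names.split()}

# Series membership of the 33 modern consonants. U+178E belongs to the
# a-series even though its Unicode name is "NNO".
A_SERIES = letters("KA KHA CA CHA DA TTHA NNO TA THA BA PHA SA HA LA QA")
O_SERIES = letters("KO KHO NGO CO CHO NYO DO TTHO TO THO NO PO PHO MO YO RO LO VO")
# Sonorants (nasals and approximants); every other consonant counts as dominant.
SONORANTS = letters("NGO NYO NNO NO MO YO RO LO VO LA")

def consonant_series(ch):
    """Return 'a' or 'o' for a single modern Khmer consonant."""
    if ch in A_SERIES:
        return "a"
    if ch in O_SERIES:
        return "o"
    raise ValueError(f"not a modern Khmer consonant: {ch!r}")

def cluster_series(cluster):
    """Return the series that governs a dependent vowel after the cluster."""
    consonants = cluster.split(COENG)
    for ch in consonants:
        consonant_series(ch)  # validate every member of the cluster
    dominant = [ch for ch in consonants if ch not in SONORANTS]
    # A stop or fricative wins wherever it stands; if there are several,
    # the last (subscript) one decides. Assumption: with sonorants only,
    # the last consonant decides.
    governing = dominant[-1] if dominant else consonants[-1]
    return consonant_series(governing)

kha, mo, sa, ko = (unicodedata.lookup("KHMER LETTER " + n)
                   for n in ("KHA", "MO", "SA", "KO"))
print(cluster_series(kha + COENG + mo))  # a  (stop over sonorant)
print(cluster_series(sa + COENG + ko))   # o  (both dominant: subscript decides)
```

Looking letters up by their Unicode names rather than typing them keeps the table easy to check against the code chart.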
The dependent vowels are listed below, in conventional form with an ellipse as a dummy consonant symbol (these may not display correctly on all browsers), and in combination with the a-series letter អ ’â.
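|Dependent vowel||Example||IPA (a-series)||IPA (o-series)||UNGEGN (a-series)||UNGEGN (o-series)||Notes|
|(none)||អ||[ɑː]||[ɔː]||â||ô||When shortened: [ɑ] (a-series), [ʊə] (o-series)|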
|ា||អា||[aː]||[iːə]||a||éa||When shortened: [a] (a-series), [oə] (o-series)|
|ិ||អិ||[ə], [e]||[ɨ], [i]||ĕ||ĭ|
The following table shows combinations of dependent vowels with the diacritics representing final [m] and [h]. They are shown with the a-series consonant អ ’â.
Independent vowels are non-diacritical vowel characters that stand alone (i.e. without being attached to a consonant symbol). In Khmer they are called srăk pénhtuŏ (ស្រៈពេញតួ), which means "complete vowels". They are used in some words to represent vowels or sonorant–vowel combinations that come at the start of a word or syllable, preceded only by a glottal stop. The independent vowels are used in a small number of words, mostly of Indic origin, and consequently there is some inconsistency in their use and pronunciations. However, a few words in which they occur are used quite frequently; these include: ឥឡូវ [ʔəjləw] "now", ឪពុក [ʔəwpuk] "father", ឬ [ʔrɨː] ~ [rɨː] "or", ឮ [lɨː] "hear", ឲ្យ [ʔaoj] "give, let", ឯង [ʔaeŋ] "oneself, I, you", ឯណា [ʔaenaː] "where".
(other variations may occur)
|ឧ||[ʔu], [ʔo]||ŭ, ŏ|
|ឯ||[ʔae], [ʔɛː], [ʔeː]||ê|
The Khmer writing system contains several diacritics, used to indicate further modifications in pronunciation.
|ំ||nĭkkôhĕt (និគ្គហិត)||The Pali niggahīta, related to the anusvara. A small circle written over a consonant or a following dependent vowel, it nasalizes the inherent or dependent vowel, with the addition of [m]; long vowels are also shortened. For combinations in which it is used, see Dependent vowels. Sometimes represents [aɲ] in Sanskrit loanwords.|
|Related to the visarga. A pair of small circles written after a consonant or a following dependent vowel, it modifies and adds final aspiration [h] to the inherent or dependent vowel. For combinations in which it is used, see Dependent vowels.|
|ៈ||yŭkôleăkpĭntŭ (យុគលពិន្ទុ)||A "pair of dots", a fairly recently introduced diacritic, written after a consonant to indicate that the inherent vowel is to be shortened and followed by a glottal stop.|
|Two short vertical lines, written above a consonant, used to convert some o-series consonants (ង ញ ម យ រ វ) to the a-series. It is also used with ប bâ to convert it to a p sound (see Supplementary consonants).|
|៊||treisâpt (ត្រីសព្ទ)||A wavy line, written above a consonant, used to convert some a-series consonants (ស ហ ប អ) to the o-series.|
|ុ||kbiĕh kraôm (ក្បៀសក្រោម)||Also known as bŏkcheung (បុកជើង), "collision foot"; a vertical line written under a consonant, used in place of the diacritics treisâpt and musĕkâtônd when they would be impeded by superscript vowels.|
|់||bântăk (បន្តក់)||A small vertical line written over the last consonant of a syllable, indicating shortening (and corresponding change in quality) of certain vowels.|
|Corresponding to the Devanagari diacritic repha, this originally represented an r sound. Now, in most cases, the consonant above which it appears, and the diacritic itself, are unpronounced.|
|៍||tôndâkhéat (ទណ្ឌឃាដ)||Written over a final consonant to indicate that it is unpronounced.|
|៎||kakâbat (កាកបាទ)||Also known as a "crow's foot", used in writing to indicate the rising intonation of an exclamation or interjection; often placed on particles such as /na/, /nɑː/, /nɛː/, /vəːj/, and the feminine response /cah/.|
|Denotes stressed intonation in some single-consonant words.|
|័||sanhyoŭk sannha (សំយោគសញ្ញា)||Used in some Sanskrit and Pali loanwords (although alternative spellings usually exist); it is written above a consonant to change the inherent vowel to the sound that would be produced by the dependent vowel ា with shortening.|
|៑||vĭréam (វិរាម)||A mostly obsolete diacritic, corresponding to the virama.|
There is also ្, called ceung (ជើង), meaning "foot", a Unicode sign used to input subscript consonants; its appearance varies among fonts.
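As a rough illustration of how this sign works in encoded text, the Python sketch below builds the word "Khmer" code point by code point and lists the Unicode name of each character. The helper name `describe` is hypothetical; the code points are those of the Unicode Khmer block.

```python
import unicodedata

COENG = "\u17D2"  # KHMER SIGN COENG, the "foot" sign that introduces a subscript

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# The word "Khmer": the letter KHA, then COENG + MO (a subscript m),
# then the dependent vowel AE, then the letter RO.
word = "\u1781" + COENG + "\u1798" + "\u17C2" + "\u179A"

for code, name in describe(word):
    print(code, name)
```

The subscript consonant is therefore not a separate character in the encoding: it is the ordinary consonant preceded by the sign, and the font is responsible for drawing the reduced form.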
The sign ៘ means "et cetera" ("etc.").
The reduplication sign ៗ indicates that the preceding word or phrase is to be repeated.
Most consonants, including a few of the subscripts, form ligatures with all dependent vowels that contain the symbol used for the vowel a (ា). Many of these ligatures are easily recognizable, but a few are not. One of the less recognizable is the ligature of bâ and a, បា, which was created to differentiate it from the consonant symbol hâ (ហ) and from the ligature of châ and a (ចា). It is not always necessary to connect consonants with the dependent vowel a.
Examples of ligatured symbols:
- chba (/cɓaː/) Subscript consonants with ascending strokes above the baseline also form ligatures with the dependent vowel a (ា).
- msau (/msaw/) Another example of a subscript consonant forming a ligature. In this case, it is with the digraph dependent vowel au. The digraph dependent vowel au includes the cane-like stroke of the vowel a.
- bau (/ɓaw/) The combination of the consonant bâ (ប) with any vowel or digraph vowel based on the vowel a (ា) is written with a stroke in the center of the ligature to distinguish it from the consonant hâ (ហ).
The numerals of the Khmer script, similar to those used by other civilizations in Southeast Asia, are also derived from the southern Indian script. Arabic numerals are also used, but to a lesser extent.
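A minimal Python sketch of converting between the two numeral sets is given below. It assumes the Unicode encoding of the Khmer digits (ten contiguous code points starting at U+17E0); the function names are hypothetical.

```python
KHMER_ZERO = 0x17E0  # KHMER DIGIT ZERO; the ten digits are contiguous in Unicode

# Translation table from the ASCII digits 0-9 to the Khmer digits.
TO_KHMER = {ord(str(d)): chr(KHMER_ZERO + d) for d in range(10)}

def to_khmer_digits(n: int) -> str:
    """Render an integer using Khmer digits."""
    return str(n).translate(TO_KHMER)

def from_khmer_digits(s: str) -> int:
    """Parse a string of Khmer digits; int() accepts any Unicode decimal digits."""
    return int(s)

year = to_khmer_digits(611)
print(year)
print(from_khmer_digits(year))  # 611
```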
Several styles of Khmer writing are used for varying purposes. The two main styles are âksâr chriĕng (literally "slanted script") and âksâr mul ("round script").
- Âksâr chriĕng (អក្សរជ្រៀង) refers to oblique letters. Entire bodies of text such as novels and other publications may be produced in âksâr chriĕng. Unlike in written English, oblique lettering does not represent any grammatical differences such as emphasis or quotation. Handwritten Khmer is often written in the oblique style.
- Âksâr chhôr (អក្សរឈរ) or Âksâr tráng (អក្សរត្រង់) refers to upright or 'standing' letters, as opposed to oblique letters. Most modern Khmer typefaces are designed in this manner rather than obliquely, since word processors and other applications can italicize text to represent the oblique manner of âksâr chriĕng.
- Âksâr khâm (អក្សរខម) is a style used in Pali palm-leaf manuscripts. It is characterized by sharper serifs and angles and by the retention of some antique characteristics, notably in the consonant kâ (ក). This style is also used for yantra tattoos and for yantras on cloth, paper, or engraved brass plates in Cambodia as well as in Thailand.
- Âksâr mul (អក្សរមូល) is a calligraphic style similar to âksâr khâm, as it also retains some characters reminiscent of antique Khmer script. Its name in Khmer, lit. 'round script', refers to the bold and thick lettering style. It is used for titles and headings in Cambodian documents, books, and currency, and on shop signs and banners. It is sometimes used to emphasize royal names or other important nouns, with the surrounding text in a different style.
The Unicode block for basic Khmer characters is U+1780–U+17FF:
Official Unicode Consortium code chart (PDF)
The Unicode block for additional Khmer symbols is U+19E0–U+19FF:
Official Unicode Consortium code chart (PDF)
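The two ranges above can be used to test whether a character belongs to the Khmer script. The Python sketch below is only a range check under that assumption: the names `KHMER_BLOCKS`, `is_khmer` and `khmer_ratio` are hypothetical, and unassigned code points inside the blocks are not excluded.

```python
KHMER_BLOCKS = (
    (0x1780, 0x17FF),  # basic Khmer characters
    (0x19E0, 0x19FF),  # additional Khmer symbols
)

def is_khmer(ch: str) -> bool:
    """Return True if the single character ch lies in one of the Khmer blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in KHMER_BLOCKS)

def khmer_ratio(text: str) -> float:
    """Return the fraction of non-space characters that fall in the Khmer blocks."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(is_khmer(ch) for ch in chars) / len(chars)

print(is_khmer("\u1780"))            # True
print(is_khmer("k"))                 # False
print(khmer_ratio("\u1780\u17B6 ab"))  # 0.5
```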