Romanization of Japanese

The romanization of Japanese is the use of the Latin alphabet (called rōmaji (ローマ字) in Japanese) to write the Japanese language, which is normally written in logographic characters borrowed from Chinese (kanji) and syllabic scripts (kana). This is done in any context where Japanese text is targeted at those who do not know the language: for example, for names on street signs, passports, and in dictionaries and textbooks for foreign learners of the language. The word "rōmaji" is sometimes incorrectly transliterated as romanji or rōmanji (note the "n" before the "j").

There are a number of different romanization systems. The three main ones are Hepburn romanization, Kunrei-shiki Rōmaji (ISO 3602), and Nihon-shiki Rōmaji (ISO 3602 Strict). Variants of Hepburn are the most widely used.

All Japanese who have attended elementary school since World War II have been taught to read and write romanized Japanese. Romanization is also the most common way to input Japanese into word processors and computers. Therefore, almost all Japanese are able to read and write Japanese using rōmaji. Rōmaji is used primarily on computers and other electronic devices that do not support the display or input of Japanese characters, in educational materials for foreigners, and in academic papers in English on Japanese topics such as linguistics or literature.

History

The earliest Japanese romanization system was based on the orthography of Portuguese. It was developed around 1548 by a Japanese Catholic named Yajiro. Jesuit presses used the system in a series of printed Catholic books so that missionaries could preach and teach their converts without learning to read Japanese ideographs. The most useful of these books for the study of early modern Japanese pronunciation and early attempts at romanization was the Nippo jisho, a Japanese-Portuguese dictionary written in 1603. In general, the early Portuguese system was similar to Nihon-shiki in its treatment of vowels. Some consonants were transliterated differently: for instance, the /k/ consonant was rendered as "c", and the /ɸ/ consonant (now pronounced /h/) as "f", so Nihon no kotoba ("The language of Japan") was spelled "Nifon no cotoba". The Jesuits also printed some secular books in romanized Japanese, including the first printed edition of the Japanese classic The Tale of the Heike, romanized as Feiqe no monogatari, and a collection of Aesop's Fables (romanized as Esopo no fabvlas). The latter continued to be printed and read after the suppression of Christianity (Chibbett, 1977).

Following the expulsion of Christians from Japan in the late 1590s and early 1600s, rōmaji fell out of use and was used only sporadically in foreign texts until the mid-1800s, when Japan opened up again. The systems used today all developed in the latter half of the 19th century.

The first of these was the Hepburn system, created for James Curtis Hepburn's dictionary of Japanese words and intended for use by foreigners.

In the Meiji era, some Japanese scholars advocated abolishing the Japanese writing system entirely and using rōmaji in its stead. The Nihon-shiki romanization was an outgrowth of this movement. Several Japanese texts were published entirely in rōmaji during this period, but the practice failed to catch on, perhaps because of the large number of homophones in Japanese, which are pronounced alike but written with different characters. Later, in the early 20th century, some scholars devised syllabary systems with characters derived from Latin; these were even less popular, because they were not based on any historical use of the Latin alphabet.

Modern systems

Hepburn

The Revised Hepburn system of romanization uses a macron to indicate some long vowels, and an apostrophe to note the separation of easily confused phonemes. For example, the name じゅんいちろう is written with the kana characters ju-n-i-chi-ro-u and romanized as Jun'ichirō in Revised Hepburn. This system is widely used in Japan and among foreign students and academics.
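The macron and apostrophe rules described above can be sketched in a few lines of Python. This is a minimal illustration, not a full romanizer: the table KANA_TO_HEPBURN and the function names are hypothetical, the table covers only the kana in the example name, and treating every o + u as a long vowel is a simplification that real text would need morpheme boundaries to get right.

```python
# Illustrative subset only: just the kana needed for the example name.
KANA_TO_HEPBURN = {
    "じゅ": "ju", "ん": "n", "い": "i", "ち": "chi", "ろ": "ro", "う": "u",
}
VOWELS_AND_Y = "aiueoy"
LONG = {"o": "ō", "u": "ū"}


def split_kana(text):
    """Split kana into units, preferring two-character digraphs such as じゅ."""
    units, i = [], 0
    while i < len(text):
        pair = text[i:i + 2]
        size = 2 if len(pair) == 2 and pair in KANA_TO_HEPBURN else 1
        units.append(text[i:i + size])
        i += size
    return units


def to_revised_hepburn(text):
    """Romanize kana, applying the macron and apostrophe rules."""
    out = ""
    previous = None
    for unit in split_kana(text):
        roma = KANA_TO_HEPBURN[unit]
        if unit == "う" and out[-1:] in LONG:
            # o + u and u + u are written with a macron (simplified rule).
            out = out[:-1] + LONG[out[-1]]
        else:
            # An apostrophe separates ん from a following vowel or y.
            if previous == "ん" and roma[0] in VOWELS_AND_Y:
                out += "'"
            out += roma
        previous = unit
    return out


print(to_revised_hepburn("じゅんいちろう").capitalize())
```

Without the apostrophe, the result would be indistinguishable from a hypothetical じゅにちろう (ju-ni-chi-ro-u), which is the ambiguity the rule exists to prevent.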

Hepburn romanization generally follows English phonology with Romance vowels, and is an intuitive method of showing Anglophones the pronunciation of a word in Japanese. It was standardized in the USA as American National Standard System for the Romanization of Japanese (Modified Hepburn), but this status was abolished on October 6, 1994. Hepburn is the most common romanization system in use today, especially in the English-speaking world. The Hepburn system has been criticized because its distortion of the Japanese phonology can make it harder to teach Japanese to non-natives.

Nihon-shiki

Nihon-shiki is probably the least used of the three main systems. It was originally invented as a method for the Japanese to write their own language. It follows Japanese phonology and the syllabary order very strictly and is hence the only major system of romanization that allows lossless mapping to and from kana. It has also been standardized as ISO 3602 strict form.

Kunrei-shiki

Kunrei-shiki is a slightly modified version of Nihon-shiki which eliminates differences between the kana syllabary and modern pronunciation. For example, when the words kana かな and tsukai つかい are combined, the result is written in kana as かなづかい, with a dakuten (voicing sign) ゛ on the つ (tsu) kana to indicate that the tsu つ is now voiced. The づ kana is pronounced in the same way as a different kana, ず, which is す (su) with dakuten. Kunrei-shiki and Hepburn ignore the difference in kana and represent the sound in the same way, writing kanazukai with the same letters "zu" that are used to romanize ず. Nihon-shiki retains the difference and romanizes the word as kanadukai, distinguishing づ (du) from ず (zu) even though the two are pronounced identically. The same applies to the pair じ and ぢ, which are both zi in Kunrei-shiki and both ji in Hepburn romanization, but are zi and di respectively in Nihon-shiki. See the table below for full details.

Kunrei-shiki has been standardized by the Japanese Government and ISO (ISO 3602). Kunrei-shiki is taught to Japanese elementary school students in their fourth year.
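The treatment of づ/ず and ぢ/じ in the three systems can be illustrated with a small table-driven sketch. The dictionaries below are hypothetical names and hold only the kana discussed in this section; the check in is_reversible simply tests whether two kana share a spelling, which is what prevents a lossless mapping back to kana.

```python
# Kana that all three systems spell the same way (subset for the example).
SHARED = {"か": "ka", "な": "na", "い": "i"}

# The four kana whose spelling differs between systems.
VOICED = {
    "Hepburn":      {"ず": "zu", "づ": "zu", "じ": "ji", "ぢ": "ji"},
    "Kunrei-shiki": {"ず": "zu", "づ": "zu", "じ": "zi", "ぢ": "zi"},
    "Nihon-shiki":  {"ず": "zu", "づ": "du", "じ": "zi", "ぢ": "di"},
}


def romanize(text, system):
    """Romanize kana one character at a time using the chosen system."""
    table = {**SHARED, **VOICED[system]}
    return "".join(table[kana] for kana in text)


def is_reversible(system):
    """True if no two of the four kana share a romanized spelling."""
    spellings = list(VOICED[system].values())
    return len(set(spellings)) == len(spellings)


for system in VOICED:
    print(system, romanize("かなづかい", system), is_reversible(system))
```

Only the Nihon-shiki table passes the reversibility check, which matches the statement that it alone allows kana to be recovered from the romanized form.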

Other variants

It is possible to elaborate these romanizations to enable non-native speakers to pronounce Japanese words more correctly. Typical additions include tone marks to note the Japanese pitch accent and diacritic marks to distinguish phonological changes, such as the assimilation of the moraic nasal /n/ (see Japanese phonology).

JSL

JSL is a romanization system based on Japanese phonology, designed using the linguistic principles used by linguists in designing writing systems for languages that do not have any. It is a purely phonemic system, using exactly one symbol for each phoneme, and marking pitch accent using diacritics. It was created for Eleanor Harz Jorden's system of Japanese language teaching. Its principle is that such a system enables students to better internalize the phonology of Japanese. Since it does not have any of the advantages for non-native speakers that the other rōmaji systems have, and the Japanese already have a writing system for their language, JSL is not widely used outside the educational environment.

Non-standard romanization

In addition to the standardized systems above, there are many variations in romanization, arising from simplification, from error or confusion between different systems, or from deliberate stylistic choice.

Notably, the various mappings that Japanese input methods use to convert keystrokes on a Roman keyboard to kana often combine features of all of the systems; when used as plain text rather than being converted, these are usually known as wāpuro rōmaji. (Wāpuro is a portmanteau of wādo purosessā [word processor].) Unlike the standard systems, wāpuro rōmaji requires no characters from outside the ASCII character set.

While there may be arguments in favour of some of these variant romanizations in specific contexts, their use, especially if mixed, leads to confusion when romanized Japanese words are indexed.

The following variant romanizations are common:

Japanese words and names that have established English spellings, such as kudzu and jiu jitsu, or loanwords such as kyatto for "cat", are sometimes written as they are in English, without regard for the rules of romanization.
Jya for じゃ, which is ja in Hepburn and zya in Nihon-shiki and Kunrei-shiki, and similarly jyu for じゅ and jyo for じょ. The extraneous y seems to be the result of confusion between the romanization systems.
Cchi for っち (Hepburn tchi) and so on. This is wāpuro rōmaji, but is often used for stylistic reasons when rendering nicknames (for example, あきこ Akiko becoming あっちゃん Acchan rather than Atchan).
La for ら (Hepburn ra) and so on. The Japanese consonant /r/ has a sound (IPA [ɺ]) that is close, but not identical, to both English "r" and English "l", and both "r" and "l" are transcribed into Japanese using the Japanese /r/. An example of "l" in romanized Japanese is the Japanese children's doll リカ, romanized as Licca.
Na for んあ (Hepburn n'a) and so on. This form of romanized Japanese is used in public information such as road and railway signs in Japan.
Nn for ん (Hepburn n). This is also an example of wāpuro rōmaji (although many Japanese input methods also accept the Hepburn n'). This leads to ambiguity with the more widespread Hepburn system. For example, the cluster nna, which is んな in Hepburn, represents んあ in this system. The double n is sometimes seen in names.
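The ambiguity of the cluster nna mentioned in the last item can be shown with two toy readers, one following the Hepburn convention and one following the "nn" convention. Both functions and the SYLLABLES table are hypothetical and cover only the syllables needed here; real input methods apply many more rules.

```python
SYLLABLES = {"na": "な", "a": "あ"}  # tiny illustrative subset
VOWELS_AND_Y = ("a", "i", "u", "e", "o", "y")


def _syllable(text, i):
    """Match the longest known syllable starting at position i."""
    for size in (2, 1):
        chunk = text[i:i + size]
        if len(chunk) == size and chunk in SYLLABLES:
            return SYLLABLES[chunk], i + size
    raise ValueError(f"cannot read {text[i:]!r}")


def read_hepburn(text):
    """Hepburn reading: n is ん unless a vowel or y follows; n' forces ん."""
    out, i = [], 0
    while i < len(text):
        following = text[i + 1:i + 2]
        if text[i] == "n" and following not in VOWELS_AND_Y:
            out.append("ん")
            i += 2 if following == "'" else 1
            continue
        kana, i = _syllable(text, i)
        out.append(kana)
    return "".join(out)


def read_double_n(text):
    """'nn' convention: a doubled n is always ん."""
    out, i = [], 0
    while i < len(text):
        if text.startswith("nn", i):
            out.append("ん")
            i += 2
            continue
        kana, i = _syllable(text, i)
        out.append(kana)
    return "".join(out)


print(read_hepburn("nna"), read_double_n("nna"))
```

Under the "nn" convention, んな has to be spelled with three n's (nnna), which is why mixing the two conventions in one text leads to misreadings.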

Long vowels

The most common variant romanization is to omit the macrons or circumflexes used to indicate a long vowel. This is extremely common in the romanized version of Japanese words used in English. For example, the capital city of Japan, correctly written Tōkyō in romanized Japanese, is almost universally written as Tokyo. In Japan, since romanized Japanese is seen mostly as a convenience that lets foreigners read signs easily, macrons and circumflexes are usually omitted for simplification.

Many typewriters, word processors, and computerized systems cannot easily deal with the macron used in Hepburn romanization. Nihon-shiki and Kunrei-shiki use a circumflex accent (thus, Tôkyô). This may allow for easier input, since all of â, î, û, ê, and ô are in the ISO-8859-1 character set, and may be easily input on a variety of systems.
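The difference between the macron and the circumflex can be checked directly with Python's standard codecs. The sketch below is only an illustration; the function names are made up for this example. It tests whether a string can be encoded as ISO-8859-1 and shows one common way of dropping the long-vowel marks altogether.

```python
import unicodedata


def fits_iso_8859_1(text):
    """Return True if every character exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


def strip_long_vowel_marks(text):
    """Remove macrons and circumflexes by dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


print(fits_iso_8859_1("Tōkyō"))         # Hepburn macrons
print(fits_iso_8859_1("Tôkyô"))         # Kunrei-shiki circumflexes
print(strip_long_vowel_marks("Tōkyō"))  # the spelling usual in English
```

Note that stripping the marks discards the vowel-length distinction, so the original spelling cannot be recovered from the result.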

The following methods of representing long vowels also commonly occur:

Oh for おお or おう (Hepburn ō). This is sometimes known as "passport Hepburn", as the Japanese Foreign Ministry has authorized (but not required) this usage in passports [1].
Ou for おう (also Hepburn ō). This is also an example of wāpuro rōmaji.
Ô for おお or おう (Hepburn ō). This is valid Nihon-shiki and Kunrei-shiki, but occasionally occurs in otherwise Hepburn-romanized words (as described above).
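The competing spellings of a long o listed above can be generated from a Revised Hepburn form with a simple substitution. This is a rough sketch under stated assumptions: the style names and the function are hypothetical, only lowercase ō is handled, and the "ou" style is correct only when the underlying kana is おう rather than おお, which the Hepburn spelling alone does not reveal.

```python
# Hypothetical style names mapped to the spelling each uses for a long o.
LONG_O_STYLES = {
    "macron": "ō",       # Revised Hepburn
    "circumflex": "ô",   # Nihon-shiki and Kunrei-shiki
    "oh": "oh",          # "passport Hepburn"
    "ou": "ou",          # wāpuro rōmaji, assuming the kana is おう
    "omitted": "o",      # long-vowel mark dropped
}


def respell_long_o(hepburn, style):
    """Rewrite every lowercase ō in a Hepburn word in the given style."""
    return hepburn.replace("ō", LONG_O_STYLES[style])


for style in LONG_O_STYLES:
    print(style, respell_long_o("Tōkyō", style))
```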

Archaic variants


In older texts, other variant romanizations which are now no longer used are sometimes seen. Some of them have survived to the present day, although few of them are still actively used. Examples include:

The vowel i plus o was sometimes used to represent the Japanese yōon sound: hence Tokyo becomes "Tokio" and Kyoto becomes "Kioto". This romanization can still be seen in the species name "mioga" of the Japanese vegetable myōga.
The kana ゑ was rendered as ye. The actual pronunciation of this kana was once we, but the w had already been lost by the time that (e.g.) ゑど "Wedo" was first romanized as Yedo.
The kana づ (Nihon-shiki du) was romanized as dzu, as seen in the plant names adzuki and kudzu. This enjoys some currency even today as Hepburn-like wāpuro rōmaji, and has a phonetic value distinct from zu in many dialects of Japanese.
The vowel e has sometimes been rendered ye, e.g. "Iyeyasu" instead of "Ieyasu", "Inouye" instead of "Inoue", and "yen" instead of "en".

Romanization of Japanese names

Names can be subject to even more variation, with spellings depending on the individual's preference. For example, the manga artist Yasuhiro Nightow's family name would be more conventionally written in Hepburn romanization as Naitō.

Other variations seen in names include the substitution of K with C, as in the name of television celebrity Ricaco or the snack food Jagarico, or the removal of unvoiced vowels, as in the name of film director Macoto Tezka (the son of manga artist Osamu Tezuka), where the u is dropped.

Example words written in each romanization system

English	Japanese	Kana spelling	Revised Hepburn	Kunrei-shiki	Nihon-shiki
Roman characters	ローマ字	ローマじ	rōmaji	rômazi	rômazi
Mount Fuji	富士山	ふじさん	Fujisan	Huzisan	Huzisan
tea	お茶	おちゃ	ocha	otya	otya
governor	知事	ちじ	chiji	tizi	tizi
to shrink	縮む	ちぢむ	chijimu	tizimu	tidimu
to continue	続く	つづく	tsuzuku	tuzuku	tuduku

Chart of romanizations

This chart shows the significant differences between the major romanization systems.

Kana	Revised Hepburn	Kunrei-shiki	Nihon-shiki
うう	ū	û	û
おう, おお	ō	ô	ô
し	shi	si	si
しゃ	sha	sya	sya
しゅ	shu	syu	syu
しょ	sho	syo	syo
じ	ji	zi	zi
じゃ	ja	zya	zya
じゅ	ju	zyu	zyu
じょ	jo	zyo	zyo
ち	chi	ti	ti
つ	tsu	tu	tu
ちゃ	cha	tya	tya
ちゅ	chu	tyu	tyu
ちょ	cho	tyo	tyo
ぢ	ji	zi	di
づ	zu	zu	du
ぢゃ	ja	zya	dya
ぢゅ	ju	zyu	dyu
ぢょ	jo	zyo	dyo
ふ	fu	hu	hu
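Because Kunrei-shiki merges no distinction that Hepburn keeps, the chart above is enough to convert Kunrei-shiki spellings to Revised Hepburn mechanically. The following is a minimal sketch using longest-match substitution; the table and function names are hypothetical, input is assumed to be lowercase, and the reverse conversion to Nihon-shiki is not possible this way because zi and zu each stand for two different kana.

```python
# Differences taken from the chart: Kunrei-shiki spelling -> Revised Hepburn.
KUNREI_TO_HEPBURN = {
    "sya": "sha", "syu": "shu", "syo": "sho",
    "zya": "ja", "zyu": "ju", "zyo": "jo",
    "tya": "cha", "tyu": "chu", "tyo": "cho",
    "si": "shi", "zi": "ji", "ti": "chi", "tu": "tsu", "hu": "fu",
    "û": "ū", "ô": "ō",
}


def kunrei_to_hepburn(word):
    """Convert a lowercase Kunrei-shiki word by longest-match replacement."""
    out, i = [], 0
    while i < len(word):
        for size in (3, 2, 1):
            chunk = word[i:i + size]
            if len(chunk) == size and chunk in KUNREI_TO_HEPBURN:
                out.append(KUNREI_TO_HEPBURN[chunk])
                i += size
                break
        else:
            # No chart entry starts here: the letter is the same in both.
            out.append(word[i])
            i += 1
    return "".join(out)


for word in ["rômazi", "huzisan", "otya", "tizi", "tizimu", "tuzuku"]:
    print(word, "->", kunrei_to_hepburn(word))
```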

Historical romanizations

Kana	Vocabvlario da Lingoa de Iapam (1603)	Arte da Lingoa de Iapam (1604-1608)	Arte Breve da Lingoa Iapoa (1620)
あ	a	a	a
い	i, j, y	i	y
う	v, u	v	v
え	ye	ye	ye
お	vo, uo	vo	vo
か	ca	ca	ca, ka
き	qi, qui	qui	ki
く	cu, qu	cu, qu	cu, ku
け	qe, que	que	ke
こ	co	co	co
きゃ	qia	quia	kia
きょ	qio, qeo	quio	kio
くゎ	qua	qua	qua
が	ga	ga	ga, gha
ぎ	gui	gui	ghi
ぐ	gu, gv	gu	gu, ghu
げ	gue	gue	ghe
ご	go	go	go, gho
ぐゎ	gua	gua	gua
ぎゃ	guia		ghia
ぎゅ	guiu	guiu	ghiu
ぎょ	guio	guio	ghio
さ	sa	sa	sa
し	xi	xi	xi
す	su	su	su
せ	xe	xe	xe
そ	so	so	so
しゃ	xa	xa	xa
しゅ	xu	xu	xu
しょ	xo	xo	xo
ざ	za	za	za
じ	ii, ji	ji	ii
ず	zu	zu	zu
ぜ	ie, ye		ie
ぞ	zo	zo	zo
じゃ	ia, ja	ia	ia
じゅ	iu, ju	ju	iu
じょ	io, jo	jo	io
た	ta	ta	ta
ち	chi	chi	chi
つ	tçu	tçu	tçu
て	te	te	te
と	to	to	to
ちゃ	cha	cha	cha
ちゅ	chu	chu	chu
ちょ	cho	cho	cho
だ	da	da	da
ぢ	gi	gi	gi
づ	zzu	dzu	dzu
で	de	de	de
ど	do	do	do
ぢゃ	gia	gia	gia
ぢゅ	giu	giu	giu
ぢょ	gio	gio	gio
な	na	na	na
に	ni	ni	ni
ぬ	nu	nu	nu
ね	ne	ne	ne
の	no	no	no
にゃ	nha	nha	nha
にゅ	nhu, niu	nhu	nhu
にょ	nho, neo	nho	nho
は	fa	fa	fa
ひ	fi	fi	fi
ふ	fu	fu	fu
へ	fe	fe	fe
ほ	fo	fo	fo
ひゃ	fia
ひゅ	fiu
ひょ	fio, feo	fio	fio
ば	ba	ba	ba
び	bi	bi	bi
ぶ	bu	bu	bu
べ	be	be	be
ぼ	bo	bo	bo
びゃ	bia	bia
びゅ	biu		biu
びょ	bio, beo
ぱ	pa	pa	pa
ぴ	pi	pi	pi
ぷ	pu	pu	pu
ぺ	pe	pe	pe
ぽ	po	po	po
ぴゃ	pia		pia
ぴゅ
ぴょ	pio
ま	ma	ma	ma
み	mi	mi	mi
む	mu	mu	mu
め	me	me	me
も	mo	mo	mo
みゃ	mia, mea
みょ	mio, meo		mio
や	ya	ya	ya
ゆ	yu	yu	yu
よ	yo	yo	yo
ら	ra	ra	ra
り	ri	ri	ri
る	ru	ru	ru
れ	re	re	re
ろ	ro	ro	ro
りゃ	ria, rea
りゅ	riu		riu
りょ	rio, reo	rio	rio
わ	va, ua	va	va
ゐ		y	y
ゑ		ye	ye
を	vo, uo	vo	vo
ん	n, m, ~ (tilde)	n	n, m
っ	-t, -cc-, -cch-, -cq-, -dd-, -pp-, -ss-, -tt-, -xx-, -zz-	-t, -cc-, -cch-, -pp-, -cq-, -ss-, -tt-, -xx-	-t, -cc-, -cch-, -pp-, -ck-, -cq-, -ss-, -tt-, -xx-

Alphabet letter names in Japanese

The list below shows how the letters of the Latin alphabet are named in Japanese when Latin-character words or acronyms are spelled out. For example, NHK is spelled enu-eichi-kei (エヌエイチケイ).

A; ē or ei (エー or エイ)
B; bī (ビー, alternative pronunciation bē, ベー)
C; shī (シー or シィー, sometimes pronounced sī, スィー)
D; dī (ディー, alternative pronunciation dē, デー)
E; ī (イー)
F; efu (エフ)
G; jī (ジー)
H; eichi (エイチ)
I; ai (アイ)
J; jē or jei (ジェー or ジェイ)
K; kē or kei (ケー or ケイ)
L; eru (エル)
M; emu (エム)
N; enu (エヌ)
O; ō (オー)
P; pī (ピー, alternative pronunciation pē, ペー)
Q; kyū (キュー)
R; āru (アール)
S; esu (エス)
T; tī (ティー, though sometimes pronounced chī, チー, and alternatively pronounced tē, テー)
U; yū (ユー)
V; vi (ヴィ, though often pronounced bui, ブイ)
W; daburyū (ダブリュー)
X; ekkusu (エックス)
Y; wai (ワイ)
Z; zetto, zeddo, or zī (ゼット, ゼッド, or ズィー, though sometimes pronounced jī, ジー)
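Spelling out an acronym with the letter names above amounts to a table lookup. The sketch below uses a hypothetical LETTER_NAMES table; where the list gives several readings for a letter, one has been picked arbitrarily for illustration (for instance kei rather than kē for K, to match the NHK example).

```python
# One reading per letter, chosen arbitrarily where the list offers several.
LETTER_NAMES = {
    "A": "ei", "B": "bī", "C": "shī", "D": "dī", "E": "ī", "F": "efu",
    "G": "jī", "H": "eichi", "I": "ai", "J": "jei", "K": "kei", "L": "eru",
    "M": "emu", "N": "enu", "O": "ō", "P": "pī", "Q": "kyū", "R": "āru",
    "S": "esu", "T": "tī", "U": "yū", "V": "vi", "W": "daburyū",
    "X": "ekkusu", "Y": "wai", "Z": "zetto",
}


def spell_out(acronym):
    """Join the Japanese names of the letters with hyphens."""
    return "-".join(LETTER_NAMES[letter] for letter in acronym.upper())


print(spell_out("NHK"))
```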

Kana without romanized forms

There is no generally accepted romanization for some kana. In particular, none exists for the small kana when they stand on their own rather than in combination with a full-sized kana: the smaller versions of the vowel kana, "ぁ", "ぃ", "ぅ", "ぇ" and "ぉ", the smaller versions of the y kana, "ゃ", "ゅ", and "ょ", and the sokuon or small tsu kana "っ". Although these are usually regarded as merely phonetic marks or diacritics, they do appear on their own, for example at the end of sentences or in some names.

There is also no commonly accepted way of romanizing common combinations such as "トゥ" (katakana to followed by small u), used to represent sounds such as that of the English word "too". Some people write this pair as tu, but this is easily confused with tu, the Nihon-shiki and Kunrei-shiki romanization of the kana ツ, which is tsu in Hepburn romanization.

On a computer or word processor, these smaller kana may be produced in various ways. For example, an "x" or an "l" preceding the romanization of the full-sized kana produces a small version on some systems; thus xtu gives "っ" with Microsoft's Japanese input method. However, this is not standardized, and these forms are restricted to use in input systems; they are not used to represent the smaller kana in romanized Japanese.
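The prefix convention described above can be modelled as a lookup keyed on the romanization of the full-sized kana. This is a hedged sketch, not a description of any particular input method: the table, the accepted prefixes and the function name are assumptions made for illustration, and actual systems differ in which sequences they accept.

```python
# Romanization of the full-sized kana -> its small counterpart.
SMALL_KANA = {
    "a": "ぁ", "i": "ぃ", "u": "ぅ", "e": "ぇ", "o": "ぉ",
    "ya": "ゃ", "yu": "ゅ", "yo": "ょ",
    "tu": "っ",
}
PREFIXES = ("x", "l")  # assumed prefixes, as described in the text


def small_kana_for(keys):
    """Return the small kana for a prefixed key sequence, or None."""
    if keys[:1] in PREFIXES:
        return SMALL_KANA.get(keys[1:])
    return None


print(small_kana_for("xtu"), small_kana_for("lya"), small_kana_for("tu"))
```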

References

Chibbett, David (1977). The History of Japanese Printing and Book Illustration. Kodansha International Ltd. ISBN 0-87011-288-0.
Jun'ichirō Kida (紀田順一郎, Kida Jun'ichirō). Nihongo Daihakubutsukan (日本語大博物館) (in Japanese). Just System (ジャストシステム, Jasuto Shisutemu). ISBN 4-88309-046-9.
Tadao Doi (土井忠生) (1980). Hōyaku Nippo Jisho (邦訳日葡辞書) (in Japanese). Iwanami Shoten (岩波書店).
Tadao Doi (土井忠生) (1955). Nihon Daibunten (日本大文典) (in Japanese). Sanseido (三省堂).
Mineo Ikegami (池上岑夫) (1993). Nihongo Shōbunten (日本語小文典) (in Japanese). Iwanami Shoten (岩波書店).
Hiroshi Hino (日埜博) (1993). Nihon Shōbunten (日本小文典) (in Japanese). Shin-Jinbutsu-Ôrai-Sha (新人物往来社).

External links

Convert Kanji to Rōmaji and Hiragana
Rōmaji sōdan shitsu (in Japanese) contains an extremely extensive and accurate collection of materials relating to rōmaji, including standards documents and even HTML versions of Hepburn's original dictionaries.
The rōmaji conundrum from Andrew Horvat's Total Quality Japanese contains a discussion of the problems caused by the variety of confusing romanization systems in use in Japan today.
All free Japanese rōmaji dictionaries
Rōmaji to Kana translator
Converts Rōmaji to Kana, Hepburn System