Konjaku Mojikyō
The Mojikyō character map highlighting the Taiwanese kana [note 1]
Developer(s): Tadahisa Ishikawa, Tokio Furuya, Mojikyō Institute
Initial release: 1.0 / July 1997
Final release: 4.0 / December 15, 2018
Operating system: Microsoft Windows
Available in: Japanese
Type: Character set bundled with fonts and a character map

Mojikyō (Japanese: 文字鏡), also known by its full name Konjaku Mojikyō (今昔文字鏡, lit. '(the) past and present character mirror'), is a character encoding scheme. The Mojikyō Institute (文字鏡研究会, Mojikyō Kenkyūkai), which publishes the character set, also published computer software and TrueType fonts to go along with it. The Mojikyō Institute, chaired by Tadahisa Ishikawa (石川忠久),[1] originally had its character set and related software and data redistributed on CD-ROM by Kinokuniya.[2] Conceptualized in 1996,[3] the first version of the CD-ROM was released in July 1997.[4] For a time, the institute also offered a web subscription, "Mojikyō WEB" (文字鏡WEB), which had more up-to-date characters.[5]

As of September 2006 it encoded 174,975 characters.[6] Among those, 150,366 characters then belonged to the extended CJKV[note 2] family.[5] Many of the encoded characters are considered obsolete or otherwise obscure, and are not encoded by any other character set, including the international standard, Unicode.

Mojikyō was originally a paid product. In 2015, the Mojikyō Institute began uploading its latest releases to the Internet Archive as freeware,[7] as a memorial to one of its developers, Tokio Furuya (古家時雄), who had died that year.[3] On December 15, 2018, version 4.0 was released. The next day, Ishikawa announced that this would be the final release of Mojikyō.[3]


The Mojikyō encoding was created to provide a complete index of Chinese, Korean, and Japanese characters. It also encodes a large number of characters in ancient scripts, such as the oracle bone script, the seal script, and Sanskrit (Siddhaṃ). For many characters, it is the only character encoding to encode them, and its data is often used as a starting point for Unicode proposals.[8][9] However, Mojikyō has much looser standards than Unicode for encoding, which leads Mojikyō to have many encoded glyphs of dubious, or even fictional, origin.[10][11] As such, while many unencoded Mojikyō characters are suitable for encoding in Unicode, not all can become Unicode characters, due to the differing standards of evidence required by each.


The Mojikyō fonts (文字鏡フォント) are TrueType fonts that come in a ZIP file and are each around 2–5 megabytes; the different fonts contain different numbers of characters.[note 3] Also included is a Windows executable that implements a character map, the "Mojikyō Character Map" (文字鏡MAP), MOCHRMAP.EXE.[note 4][note 5] This allows users to browse through the Mojikyō fonts and to copy and paste characters in lieu of typing them on the keyboard. Unlike the regular Windows character map or KCharSelect, both of which support TrueType fonts, MOCHRMAP.EXE displays the Mojikyō encoding of the requested character.[12][note 6] For MOCHRMAP.EXE to work, all the Mojikyō fonts must be installed for all users (into C:\Windows\Fonts).


When referring to a character encoded in Mojikyō, the format MJXXXXXX is often used, similar to the U+XXXX format used for Unicode. For example, the hentaigana 𛀈 has Mojikyō encoding MJ090007 and Unicode encoding U+1B008.[13] A difference, however, is that Mojikyō encodings displayed this way are decimal, while Unicode's U+ encoding is hexadecimal.
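The difference between the two notations can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of any Mojikyō software; the sketch also assumes that the MJ label always carries exactly six zero-padded decimal digits, as in the examples in this article.

```python
import re

# Assumption: an MJ label is "MJ" followed by exactly six decimal digits.
MJ_PATTERN = re.compile(r"MJ(\d{6})")
# A U+ label is "U+" followed by four to six hexadecimal digits.
UNICODE_PATTERN = re.compile(r"U\+([0-9A-Fa-f]{4,6})")


def parse_mj(label: str) -> int:
    """Return the Mojikyō number of a label such as 'MJ090007' (read as decimal)."""
    match = MJ_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"not an MJ label: {label!r}")
    return int(match.group(1), 10)


def parse_unicode(label: str) -> int:
    """Return the code point of a label such as 'U+1B008' (read as hexadecimal)."""
    match = UNICODE_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"not a U+ label: {label!r}")
    return int(match.group(1), 16)


def format_mj(number: int) -> str:
    """Format a Mojikyō number as a zero-padded decimal MJ label."""
    return f"MJ{number:06d}"


def format_unicode(code_point: int) -> str:
    """Format a code point as an uppercase hexadecimal U+ label."""
    return f"U+{code_point:04X}"
```

With these helpers, `parse_mj("MJ090007")` yields the decimal number 90007, whereas `parse_unicode("U+1B008")` yields the code point 0x1B008; the digits in the two labels are therefore not directly comparable.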

From the earliest days of Unicode, Mojikyō has both influenced and been influenced by the standard. Its glyphs first appear in a proposal of 18 April 2002[16] to the Ideographic Rapporteur Group (IRG),[note 7] which is responsible for all CJK blocks in Unicode.[14][15] In May 2007, Mojikyō played a minor role in an eventually successful series of proposals to encode the Tangut script in Unicode;[17][note 8] Mojikyō already had within its encoding 6,000 Tangut characters by October 2002.[6]

The Unicode Standard's Unihan Database refers to Mojikyō as the "Japanese KOKUJI Collection" (日本国字集), abbreviated "JK".[18][not in cited source] For example, U+2B679 𫙹,[note 9] an ideograph read in Japanese as burizādo (ブリザード, lit. 'blizzard'), has a J-Source[note 10] equal to JK-66038. All Unicode characters with a JK-prefixed J-Source originate from Mojikyō.[19][note 11] According to Ken Lunde, a subject matter expert in character encodings and East Asian languages, as of Unicode 13.0, 782 ideographs in Unicode originate from Mojikyō, split somewhat evenly between two blocks: CJK Unified Ideographs Extension C, with 367, and CJK Unified Ideographs Extension E, with 415.[20][21] Not all Unicode characters with Mojikyō origins (JK-prefixed J-Sources) have the same representative glyph in the code chart as in the Mojikyō font;[note 12] some characters had their shapes changed before final encoding, as investigation showed that the shapes assigned by the Mojikyō Institute were wrong.[11][note 13]
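As a rough illustration of how JK-prefixed J-Sources could be picked out of Unihan data, the following Python sketch filters lines by the kIRG_JSource field. It assumes the tab-separated "code point, field, value" layout of the Unihan text files; the function name and the sample lines are hypothetical, and only the U+2B679 entry is taken from this article.

```python
def jk_sources(lines):
    """Map code points to their J-Source for entries with a 'JK-' prefix.

    Assumes each data line has three tab-separated fields:
    'U+XXXX', a field name, and a value. Comment and blank lines are skipped.
    """
    result = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # ignore lines that do not fit the assumed layout
        code_point, field, value = fields
        if field == "kIRG_JSource" and value.startswith("JK-"):
            result[int(code_point[2:], 16)] = value
    return result


# Sample input: the first data line reflects the example in the text;
# the remaining lines are illustrative placeholders, not verified entries.
SAMPLE = [
    "# sample lines in the assumed tab-separated layout",
    "U+2B679\tkIRG_JSource\tJK-66038",
    "U+4EE4\tkIRG_JSource\tJ0-4E61",
    "U+4EE4\tkIRG_GSource\tG0-416E",
]

mojikyo_origin = jk_sources(SAMPLE)
```

Counting the resulting entries per block over a full data file would be one way to check figures such as those quoted above, although the exact totals depend on the Unihan version used.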


Mojikyō puts CJKV characters in different blocks according to their traditional Kangxi radical. Common radicals containing an especially high number of characters, such as Radicals 9 (人) and 162 (辵), are split further by stroke order.[note 14]

No unification

Unlike Unicode, Mojikyō purposely avoids Han unification; no attempt at compactness of the encoding is made, nor is there an attempt to keep all common characters below U+FFFF as there is in Unicode.

Unicode, on the other hand, sorts its CJK characters into blocks based on how common they are: the most common are generally placed in the Basic Multilingual Plane,[note 13] while rare or obscure ones are placed in the supplementary ("astral") planes.

For example, under Radical 9 Mojikyō encodes two characters where Unicode has one: MJ054435 and MJ059031 are both represented in Unicode as U+4EE4 令.
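Because Mojikyō does not unify variants, a conversion table from Mojikyō to Unicode is many-to-one, and the reverse direction is ambiguous. A minimal Python sketch of this, using only the correspondences mentioned in this article (the table and function names are hypothetical):

```python
from collections import defaultdict

# Correspondences taken from the examples in this article; a real
# conversion table would be far larger.
MJ_TO_UNICODE = {
    "MJ054435": 0x4EE4,
    "MJ059031": 0x4EE4,
    "MJ090007": 0x1B008,
}


def unicode_to_mj(table):
    """Invert an MJ-to-Unicode table, keeping every MJ label per code point."""
    reverse = defaultdict(list)
    for mj_label, code_point in table.items():
        reverse[code_point].append(mj_label)
    # Sort the labels so the result does not depend on insertion order.
    return {cp: sorted(labels) for cp, labels in reverse.items()}


REVERSE = unicode_to_mj(MJ_TO_UNICODE)
```

Here `REVERSE[0x4EE4]` lists both MJ054435 and MJ059031, so converting from Unicode back to Mojikyō requires additional information about which variant is intended.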


Mojikyō is proprietary software under a restrictive license. Originally, the Mojikyō Institute tried to prevent its character data from being reused, and threatened those who published conversion tables to and from its character set. As of July 2010, the Mojikyō Institute has abandoned its efforts to stop users from publishing conversion tables or converting characters encoded in Mojikyō to Unicode or other character sets.[22][unreliable source] Such legal claims were probably never enforceable,[citation needed] as the material at issue is mere data and the shapes of letters, which are considered common property and as such do not meet the threshold of originality.[note 15]

Due to this legacy, however, GlyphWiki disallowed Mojikyō data as of 2020.[23]

Collected writing systems


Dead or obsolete

  1. ^ As yet, lacks a Unicode encoding, so is approximated here with CSS and U+30BB KATAKANA LETTER SE.
  2. ^ a b For Korean, Hanja are referred to. For Vietnamese, Chữ Nôm.
  3. ^ Download the file MojikyoCmap400ALL49TTF.7z from the official website
  4. ^ English name from the title of the window produced by running the executable; Japanese name from the icon of the executable.
  5. ^ Also called the "Mojikyō Cmap".
  6. ^ See the screenshots on the official website
  7. ^ As of 2019, the IRG rebranded as the Ideographic Research Group.
  8. ^ The history of the encoding of the Tangut script is quite complicated, see Tangut (Unicode block) § History for a full listing of all the related proposals and a timeline.
  9. ^ Ideographic Description Sequence: ⿰魚嵐
  10. ^ This is a column name in the Unihan database; ⟨J⟩ here is short for "Japanese glyph source". The full name of the column is kIRG_JSource. Under Han unification, there are nine such sources. See §3.1 of UAX#38 for a complete list and more information.
  11. ^ Other J-Source prefixes exist, such as J4, meaning the character originates from JIS X 0213:2004.
  12. ^ That is to say, a glyph made up of the same radicals in the same positions.
  13. ^ a b Errors in large collections of ideographs are, of course, not uncommon. Such errors occur by accident even in well-funded government-produced collections, such as the famous kanji from unknown sources in the Japanese Industrial Standards Committee's JIS X 0208 double-byte character encoding standard. All of these JIS X 0208 error kanji (幽霊漢字, "ghost kanji") have made their way into Unicode despite not being "real" kanji.
  14. ^ For proof, see the list in the Mojikyō Character Map, MOCHRMAP.EXE.
  15. ^ See also: fictitious entry; trap street.

