Half-width kana
Half-width kana (半角カナ) are katakana characters displayed at half the width of the usual full-width forms. The term refers to the katakana portion of the character set specified by JIS X 0201.
Although the official name is JIS X 0201 katakana, half-width kana is the commonly used name, and it is the term used in this article.
History
ASCII is defined as a 7-bit character set and has room for 128 characters. However, since this standard was designed for the United States, it does not contain characters and symbols (for example, the ¥ yen currency symbol) needed for representation of Japanese.
JIS X 0201 was developed in 1969. Computers at that time did not have the processing power and memory needed to handle the thousands of kanji (Chinese-derived characters) used in written Japanese, so as a simplification, text that would normally be written in kanji was represented in katakana instead.
Half-width kana were developed as "...the first Japanese characters encoded on computers because they are used for Japanese telegrams. As single-byte characters..." [1]
To make katakana fit into the available code space, some compromises were made: the diacritical marks dakuten and handakuten are encoded as separate characters instead of being part of the character they modify. The result was the so-called "half-width kana". These compromises still cause problems for computer programs today, and the characters are also frequently considered visually unattractive.
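The effect of encoding the voicing marks separately can still be observed in Unicode, which retains the half-width forms for compatibility. The following Python sketch uses the standard `unicodedata` module; the helper name `to_fullwidth` is purely illustrative, and NFKC normalization is only one common way to fold these forms.

```python
import unicodedata


def to_fullwidth(text: str) -> str:
    """Fold half-width kana into full-width kana via NFKC normalization.

    Caveat: NFKC also rewrites other compatibility characters (for example
    full-width Latin letters), so it may change more than just the kana.
    """
    return unicodedata.normalize("NFKC", text)


half = "\uFF76\uFF9E"        # half-width KA followed by a separate half-width dakuten
full = to_fullwidth(half)

print(len(half), len(full))  # 2 1
print(full == "\u30AC")      # True: a single full-width GA
```

A voiced syllable therefore occupies two characters in half-width form but one in full-width form, which is one reason string lengths and column counts can disagree between the two.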
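Half-width table
"J" is the high four bits (the first hexadecimal digit) of the byte in JIS X 0201 and in other sets such as CP932 (although, as discussed below, the standard itself does not specify half-width); "U" is the row of the code point in Unicode. The column headings give the low four bits.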
J | U | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
A | FF6 | | ｡ | ｢ | ｣ | ､ | ･ | ｦ | ｧ | ｨ | ｩ | ｪ | ｫ | ｬ | ｭ | ｮ | ｯ |
B | FF7 | ｰ | ｱ | ｲ | ｳ | ｴ | ｵ | ｶ | ｷ | ｸ | ｹ | ｺ | ｻ | ｼ | ｽ | ｾ | ｿ |
C | FF8 | ﾀ | ﾁ | ﾂ | ﾃ | ﾄ | ﾅ | ﾆ | ﾇ | ﾈ | ﾉ | ﾊ | ﾋ | ﾌ | ﾍ | ﾎ | ﾏ |
D | FF9 | ﾐ | ﾑ | ﾒ | ﾓ | ﾔ | ﾕ | ﾖ | ﾗ | ﾘ | ﾙ | ﾚ | ﾛ | ﾜ | ﾝ | ﾞ | ﾟ |
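The relationship between the "J" and "U" columns is a fixed offset: the JIS X 0201 katakana bytes 0xA1 to 0xDF correspond to Unicode U+FF61 to U+FF9F. A minimal Python sketch of that mapping follows; the function and constant names are illustrative and not taken from any standard or library.

```python
KANA_FIRST = 0xA1                    # first JIS X 0201 katakana byte
KANA_LAST = 0xDF                     # last JIS X 0201 katakana byte
OFFSET = 0xFF61 - KANA_FIRST         # distance to the Unicode half-width block


def jis_kana_to_unicode(byte: int) -> str:
    """Map a single JIS X 0201 katakana byte to its Unicode character."""
    if not KANA_FIRST <= byte <= KANA_LAST:
        raise ValueError(f"0x{byte:02X} is outside the katakana range")
    return chr(byte + OFFSET)


# Row B, column 1 of the table: byte 0xB1
print(hex(ord(jis_kana_to_unicode(0xB1))))  # 0xff71
```

Byte 0xA0 is unassigned, which is why the first cell of row A is empty.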
Half-width kana on the Internet
Since the SMTP and NNTP protocols (used to deliver e-mail and Usenet, respectively) were formerly able to transmit only 7-bit data, the convention was to use ISO-2022-JP for sending e-mail in Japanese.
ISO-2022-JP does not contain half-width kana, so they cannot properly be included in such a message; when half-width kana were included by accident, they could become garbled during transmission.
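The absence of half-width kana from ISO-2022-JP can be checked with Python's built-in `iso2022_jp` codec, which is expected to reject them. The sketch below is an illustration only: the helper names are hypothetical, and folding to full-width with NFKC before encoding is an assumed workaround, not something the protocols prescribe.

```python
import unicodedata


def can_encode_iso2022jp(text: str) -> bool:
    """Return True if every character in text is encodable as ISO-2022-JP."""
    try:
        text.encode("iso2022_jp")
        return True
    except UnicodeEncodeError:
        return False


def fold_for_iso2022jp(text: str) -> str:
    """Assumed workaround: fold half-width kana to full-width via NFKC.

    NFKC also alters other compatibility characters, so this is lossy.
    """
    return unicodedata.normalize("NFKC", text)


half = "\uFF76\uFF9E\uFF72\uFF84\uFF9E"   # "guide" written in half-width kana
folded = fold_for_iso2022jp(half)

print(can_encode_iso2022jp(half))
print(can_encode_iso2022jp(folded))
```

With the codec behaving as described, the first check fails and the second succeeds, because the folded text uses only JIS X 0208 katakana.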
This is much less of a problem today, since most e-mail servers support ESMTP and can therefore accept 8-bit characters. Alternatively, a transfer encoding such as Base64 can be applied and declared in the message using MIME.
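As an illustration of the MIME approach, the following Python sketch builds a message whose body contains half-width kana, encoded as UTF-8 and wrapped in Base64 so that only 7-bit bytes travel over the wire. It uses the standard `email` package; the function name and the subject line are placeholders, and UTF-8 is an assumed choice of charset.

```python
from email.message import EmailMessage


def build_message(body: str) -> EmailMessage:
    """Build a text message whose body is Base64-encoded UTF-8."""
    msg = EmailMessage()
    msg["Subject"] = "half-width kana test"   # placeholder subject
    # charset and cte are declared in the MIME headers of the message
    msg.set_content(body, charset="utf-8", cte="base64")
    return msg


body = "\uFF76\uFF80\uFF76\uFF85"             # "katakana" in half-width kana
msg = build_message(body)

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
```

The receiving side reads the declared charset and transfer encoding from the headers and reverses both steps, so the kana survive a 7-bit transport.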
Web pages
The problems that exist in e-mail do not exist with Web pages, since HTTP accepts 8-bit characters.
A problem that does exist is that computer programs have difficulty determining whether a byte sequence should be treated as Shift JIS, EUC-JP, or UTF-8; hence character encoding information should be specified with an HTTP response header or a meta tag.
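The ambiguity is easy to reproduce. In the Python sketch below, the same two bytes are valid in both Shift JIS and EUC-JP but decode to different text, which is why the encoding has to be declared rather than guessed. The helper `content_type_header` is hypothetical and simply formats the header value a server might send.

```python
def content_type_header(charset: str) -> str:
    """Format a Content-Type header value that declares the charset."""
    return f"text/html; charset={charset}"


raw = b"\xb1\xb2"

as_sjis = raw.decode("shift_jis")   # two single-byte half-width kana
as_euc = raw.decode("euc_jp")       # one double-byte JIS X 0208 character

print(len(as_sjis), len(as_euc))    # 2 1
print(as_sjis == as_euc)            # False

# The same half-width kana (U+FF71) has a different byte form in each encoding.
kana = "\uFF71"
for name in ("shift_jis", "euc_jp", "utf-8"):
    print(name, kana.encode(name).hex())

print(content_type_header("Shift_JIS"))  # text/html; charset=Shift_JIS
```

Without a declared charset, a program receiving `raw` has no reliable way to tell which of the two readings was intended.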
Misunderstanding of JIS X 0201
Strictly speaking, JIS X 0201 katakana is not half-width katakana. The standard does not define character width; it defines only the code representation of the katakana characters. The term "half-width" is a holdover from older devices that displayed single-byte characters at half the width of double-byte ones. In the JIS X 0201 standard itself, the katakana characters in the code chart are printed at normal width, not half-width.
However, the misunderstanding that the standard defines "half-width" characters is widespread. People who know the standard will often say "so-called half-width kana."
References
- [1] Lunde, Ken. CJKV Information Processing. 1st ed. O'Reilly, 1999. pp. 144-145.