VISCII

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by HarJIT (talk | contribs) at 22:13, 8 May 2020 (→‎History and naming). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

VISCII
MIME / IANA: VISCII
Language(s): Vietnamese, English
Created by: Viet-Std Group
Definitions: RFC 1456
Classification: 8-bit SBCS
Based on: ASCII

VISCII is an unofficially defined, modified ASCII character encoding for using the Vietnamese language with computers. It should not be confused with the similarly named VSCII encoding, which is an official Vietnamese national standard. VISCII keeps the 95 printable characters of ASCII unmodified, but it replaces 6 of the 33 control characters with printable characters and adds 128 precomposed characters in the upper half of the 8-bit range. Unicode and the Windows-1258 code page are now used for virtually all Vietnamese computer data, but legacy VSCII and VISCII files may need conversion.

History and naming

VISCII was designed in 1992 by the Vietnamese Standardization Working Group (Viet-Std Group),[1] based in Silicon Valley, California, while the group was working with the Unicode Consortium to include precomposed Vietnamese characters in the Unicode standard. VISCII, along with VIQR, was first published in a bilingual report in September 1992, in which it was dubbed the "Vietnamese Standard Code for Information Interchange".[2] The report noted that computer usage in Vietnam was proliferating, that existing applications used vendor-specific encodings which were unable to interoperate with one another, and that standardisation between vendors was therefore necessary.[2]

The next year, in 1993, Vietnam adopted TCVN 5712, its first national standard in the information technology domain.[3] This defined a character encoding named VSCII, which had been developed by the TCVN Technical Committee on Information Technology (TCVN/TC1) and whose name also stands for "Vietnamese Standard Code for Information Interchange".[3] VSCII is incompatible with, and otherwise unrelated to, the earlier-published VISCII.[4] Unlike VISCII, VSCII is a "Vietnamese Standard" in the sense of a national standard.

VISCII and VIQR were approved as the informational-status RFC 1456, attributed to the Viet-Std Group and dated May 1993. The RFC describes them as "conventions" used by overseas Vietnamese speakers on Usenet, and states that it "specifies no level of standard". In spite of this, it continues to call VISCII the "VIetnamese Standard Code for Information Interchange" (the same name taken by VSCII).[5] The labels VISCII and csVISCII are registered with the IANA for VISCII, with reference to RFC 1456.[6] (There is, on the other hand, no official IANA label for TCVN 5712 / VSCII, although x-viet-tcvn5712 was previously supported by Mozilla Firefox.[7])

Design

A traditional extended ASCII character set consists of the ASCII set plus up to 128 characters. Vietnamese requires 134 additional letter-diacritic combinations, which is six too many. There are (short of dropping tone mark support for capital letters, as in VSCII-3) essentially four different ways to handle this problem:

  1. Use variable-width encoding (as does UTF-8)
  2. Include combining diacritical marks for tone marks (as do VSCII-2 and Windows-1258) or for diacritics in general (as do ANSEL and VNI)
  3. Replace some ASCII punctuation, preferably punctuation which is not invariant in ISO 646 (as does VNI for DOS)
  4. Replace at least six of the basic ASCII control characters (as do VPS and VSCII-1)
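The first two approaches can be tried out with Python's standard codecs, which include UTF-8 and Windows-1258 (under the name cp1258). The sketch below uses a single letter, ệ (U+1EC7), as an example; the choice of letter and the variable names are illustrative and not taken from this article.

```python
import unicodedata

letter = "\u1ec7"  # ệ, LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW

# Option 1: a variable-width encoding spends several bytes on one letter.
utf8_bytes = letter.encode("utf-8")

# Option 2: Windows-1258 stores the base letter ê followed by a separate
# combining tone mark, so the text is written here in that decomposed form
# (ê + COMBINING DOT BELOW) before encoding.
decomposed = "\u00ea\u0323"
cp1258_bytes = decomposed.encode("cp1258")

# Decoding gives back the combining sequence; NFC normalisation recombines
# it into the single precomposed letter.
decoded = cp1258_bytes.decode("cp1258")
recomposed = unicodedata.normalize("NFC", decoded)

print(len(utf8_bytes), len(cp1258_bytes), recomposed == letter)
```

In both cases one letter no longer corresponds to one byte, which is the property that a strict single-byte design has to give up something else to keep.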

VISCII went for the last option, replacing six of the least problematic C0 control codes (that is, those least likely to be recognised by an application and acted on specially: STX, ENQ, ACK, DC4, EM, and RS) with six of the least-used uppercase letter-diacritic combinations.[2] While this option may cause programs that use those control codes to malfunction when handling VISCII text, it creates fewer complications than the other options (the designers note that non-8-bit clean transmission had been found to pose more difficulty in practice than the control character re-use).[2] Nonetheless, the positions of C0 and C1 control characters, and the codes used for the non-breaking space in ISO-8859-1, Mac OS Roman and OEM-US, were deliberately assigned to uppercase letters. The intention was that, if graphical characters could not be displayed at those positions, using the lowercase code points with an all-capital font would be a serviceable workaround.[2]
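The six reassigned positions can be written down as a small lookup table. The sketch below is illustrative only: the byte values and code points are those shown in the character set table in this article, while the helper name reused_control_bytes is hypothetical. It flags bytes in a VISCII stream that software expecting plain ASCII would interpret as control codes.

```python
# C0 positions that VISCII reassigns to capital letters:
# byte value -> (ASCII control name, Unicode code point of the VISCII letter)
REPLACED_C0 = {
    0x02: ("STX", 0x1EB2),  # Ẳ
    0x05: ("ENQ", 0x1EB4),  # Ẵ
    0x06: ("ACK", 0x1EAA),  # Ẫ
    0x14: ("DC4", 0x1EF6),  # Ỷ
    0x19: ("EM", 0x1EF8),   # Ỹ
    0x1E: ("RS", 0x1EF4),   # Ỵ
}


def reused_control_bytes(data: bytes) -> list[tuple[int, str, str]]:
    """Hypothetical helper: return (offset, control name, VISCII letter) for
    every byte that is a letter in VISCII but a control code in ASCII."""
    hits = []
    for offset, byte in enumerate(data):
        if byte in REPLACED_C0:
            name, code_point = REPLACED_C0[byte]
            hits.append((offset, name, chr(code_point)))
    return hits


sample = b"\x02NG"  # three VISCII bytes; the first would be STX in ASCII
for offset, name, char in reused_control_bytes(sample):
    print(f"offset {offset}: ASCII {name} is VISCII U+{ord(char):04X}")
```

A check of this kind could be run before passing VISCII data through a channel or tool that acts on C0 control codes.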

However, using up all the extended code points for accented letters left no room for the useful symbols, superscripted numbers, curved quotes, proper dashes, etc. that most other extended ASCII character sets provide.
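The figure of 134 additional letters can be reproduced from the structure of the alphabet. The sketch below assumes the usual inventory of twelve vowel letters, five tone marks and the consonant đ (none of which is spelled out in this article) and counts every precomposed letter that ASCII lacks.

```python
import unicodedata

# Assumed inventory (not listed in the article): the twelve Vietnamese vowel
# letters a ă â e ê i o ô ơ u ư y, written with escapes to avoid ambiguity.
VOWELS = "a\u0103\u00e2e\u00eaio\u00f4\u01a1u\u01b0y"
# Combining grave, acute, tilde, hook above and dot below.
TONE_MARKS = "\u0300\u0301\u0303\u0309\u0323"
NO_TONE = ""

letters = set()
for vowel in VOWELS:
    for tone in (NO_TONE, *TONE_MARKS):
        # NFC folds base letter + tone mark into one precomposed character.
        lower = unicodedata.normalize("NFC", vowel + tone)
        letters.add(lower)
        letters.add(lower.upper())
letters.update("\u0111\u0110")  # đ and Đ

# Keep only the letters that plain ASCII cannot represent.
extra = {ch for ch in letters if ord(ch) > 0x7F}

print(len(extra))        # 134 letters need a code outside ASCII
print(len(extra) - 128)  # 6 of them do not fit in the upper half alone
```

With all 128 upper positions and six control positions taken by letters, a single-byte code page has nothing left over for additional symbols.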

Where the two code pages have characters in common, the location of characters in VISCII deliberately follows ISO-8859-1 in most cases (the uppercase Õ being noted as an exception); this was motivated by user-friendliness concerns.[2]
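A few positions taken from the character set table show what this alignment means in practice. The sample below is a hand-picked subset rather than the full mapping, and the variable names are illustrative; it compares each VISCII assignment with what Python's latin-1 (ISO-8859-1) codec yields for the same byte.

```python
# Hand-picked VISCII assignments (byte -> Unicode code point), copied from
# the character set table; this is a sample, not the complete code page.
VISCII_SAMPLE = {
    0xC0: 0x00C0,  # À
    0xE9: 0x00E9,  # é
    0xF5: 0x00F5,  # õ
    0xA0: 0x00D5,  # Õ (the noted exception)
    0xD5: 0x1EA1,  # ạ
}

for byte, code_point in sorted(VISCII_SAMPLE.items()):
    viscii_char = chr(code_point)
    latin1_char = bytes([byte]).decode("latin-1")
    relation = "same as" if viscii_char == latin1_char else "differs from"
    print(f"0x{byte:02X}: VISCII {viscii_char!r} {relation} ISO-8859-1 {latin1_char!r}")

# ISO-8859-1 keeps Õ at a different byte from the one VISCII uses for it.
latin1_position = "\u00d5".encode("latin-1")[0]
print(f"ISO-8859-1 has Õ at 0x{latin1_position:02X}")
```

The lowercase õ stays at its ISO-8859-1 position, while the uppercase Õ is found at the byte that ISO-8859-1 uses for the non-breaking space.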

Support

VISCII is partially supported by the TriChlor Software Group in California, which has released various VISCII-compliant software packages, libraries, and fonts for MS-DOS and Windows, Unix, and Macintosh. VISCII-compliant software is available at many FTP sites.

VISCII was historically offered as an encoding for outgoing email by Mozilla Thunderbird.[8]

VISCII was mostly used by overseas Vietnamese speakers, with VSCII (TCVN) being more popular in northern Vietnam and VNI being more popular in southern Vietnam.[9]

Character set

VISCII

          _0        _1        _2        _3        _4        _5        _6        _7        _8        _9        _A        _B        _C        _D        _E        _F
0_ (0)    NUL 0000  SOH 0001  Ẳ 1EB2†   ETX 0003  EOT 0004  Ẵ 1EB4†   Ẫ 1EAA†   BEL 0007  BS 0008   HT 0009   LF 000A   VT 000B   FF 000C   CR 000D   SO 000E   SI 000F
1_ (16)   DLE 0010  DC1 0011  DC2 0012  DC3 0013  Ỷ 1EF6†   NAK 0015  SYN 0016  ETB 0017  CAN 0018  Ỹ 1EF8†   SUB 001A  ESC 001B  FS 001C   GS 001D   Ỵ 1EF4†   US 001F
2_ (32)   SP 0020   ! 0021    " 0022    # 0023    $ 0024    % 0025    & 0026    ' 0027    ( 0028    ) 0029    * 002A    + 002B    , 002C    - 002D    . 002E    / 002F
3_ (48)   0 0030    1 0031    2 0032    3 0033    4 0034    5 0035    6 0036    7 0037    8 0038    9 0039    : 003A    ; 003B    < 003C    = 003D    > 003E    ? 003F
4_ (64)   @ 0040    A 0041    B 0042    C 0043    D 0044    E 0045    F 0046    G 0047    H 0048    I 0049    J 004A    K 004B    L 004C    M 004D    N 004E    O 004F
5_ (80)   P 0050    Q 0051    R 0052    S 0053    T 0054    U 0055    V 0056    W 0057    X 0058    Y 0059    Z 005A    [ 005B    \ 005C    ] 005D    ^ 005E    _ 005F
6_ (96)   ` 0060    a 0061    b 0062    c 0063    d 0064    e 0065    f 0066    g 0067    h 0068    i 0069    j 006A    k 006B    l 006C    m 006D    n 006E    o 006F
7_ (112)  p 0070    q 0071    r 0072    s 0073    t 0074    u 0075    v 0076    w 0077    x 0078    y 0079    z 007A    { 007B    | 007C    } 007D    ~ 007E    DEL 007F
8_ (128)  Ạ 1EA0†   Ắ 1EAE†   Ằ 1EB0†   Ặ 1EB6†   Ấ 1EA4†   Ầ 1EA6†   Ẩ 1EA8†   Ậ 1EAC†   Ẽ 1EBC†   Ẹ 1EB8†   Ế 1EBE†   Ề 1EC0†   Ể 1EC2†   Ễ 1EC4†   Ệ 1EC6†   Ố 1ED0†
9_ (144)  Ồ 1ED2†   Ổ 1ED4†   Ỗ 1ED6†   Ộ 1ED8†   Ợ 1EE2†   Ớ 1EDA†   Ờ 1EDC†   Ở 1EDE†   Ị 1ECA†   Ỏ 1ECE†   Ọ 1ECC†   Ỉ 1EC8†   Ủ 1EE6†   Ũ 0168†   Ụ 1EE4†   Ỳ 1EF2†
A_ (160)  Õ 00D5†   ắ 1EAF†   ằ 1EB1†   ặ 1EB7†   ấ 1EA5†   ầ 1EA7†   ẩ 1EA9†   ậ 1EAD†   ẽ 1EBD†   ẹ 1EB9†   ế 1EBF†   ề 1EC1†   ể 1EC3†   ễ 1EC5†   ệ 1EC7†   ố 1ED1†
B_ (176)  ồ 1ED3†   ổ 1ED5†   ỗ 1ED7†   Ỡ 1EE0†   Ơ 01A0†   ộ 1ED9†   ờ 1EDD†   ở 1EDF†   ị 1ECB†   Ự 1EF0†   Ứ 1EE8†   Ừ 1EEA†   Ử 1EEC†   ơ 01A1†   ớ 1EDB†   Ư 01AF†
C_ (192)  À 00C0    Á 00C1    Â 00C2    Ã 00C3    Ả 1EA2†   Ă 0102†   ẳ 1EB3†   ẵ 1EB5†   È 00C8    É 00C9    Ê 00CA    Ẻ 1EBA†   Ì 00CC    Í 00CD    Ĩ 0128†   ỳ 1EF3†
D_ (208)  Đ 0110†   ứ 1EE9†   Ò 00D2    Ó 00D3    Ô 00D4    ạ 1EA1†   ỷ 1EF7†   ừ 1EEB†   ử 1EED†   Ù 00D9    Ú 00DA    ỹ 1EF9†   ỵ 1EF5†   Ý 00DD    ỡ 1EE1†   ư 01B0†
E_ (224)  à 00E0    á 00E1    â 00E2    ã 00E3    ả 1EA3†   ă 0103†   ữ 1EEF†   ẫ 1EAB†   è 00E8    é 00E9    ê 00EA    ẻ 1EBB†   ì 00EC    í 00ED    ĩ 0129†   ỉ 1EC9†
F_ (240)  đ 0111†   ự 1EF1†   ò 00F2    ó 00F3    ô 00F4    õ 00F5    ỏ 1ECF†   ọ 1ECD†   ụ 1EE5†   ù 00F9    ú 00FA    ũ 0169†   ủ 1EE7†   ý 00FD    ợ 1EE3†   Ữ 1EEE†

Each cell gives the character (or control-code abbreviation) followed by its Unicode code point in hexadecimal; the number in parentheses after each row label is the decimal value of the first byte in that row.

Differences from ISO-8859-1 are marked with †.
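For converting legacy data, the code points listed in the table are enough to build a decoder by hand. The following is a minimal sketch rather than a reference implementation: the code points are transcribed from the table, the names HIGH, LOW_OVERRIDES and decode_viscii are made up for this example, and the input is assumed to be VISCII (bytes below 0x80 other than the six reassigned control positions are passed through as ASCII).

```python
# Upper half of VISCII (bytes 0x80-0xFF) as Unicode code points,
# transcribed row by row (8_ to F_) from the character set table.
HIGH = [
    0x1EA0, 0x1EAE, 0x1EB0, 0x1EB6, 0x1EA4, 0x1EA6, 0x1EA8, 0x1EAC,
    0x1EBC, 0x1EB8, 0x1EBE, 0x1EC0, 0x1EC2, 0x1EC4, 0x1EC6, 0x1ED0,
    0x1ED2, 0x1ED4, 0x1ED6, 0x1ED8, 0x1EE2, 0x1EDA, 0x1EDC, 0x1EDE,
    0x1ECA, 0x1ECE, 0x1ECC, 0x1EC8, 0x1EE6, 0x0168, 0x1EE4, 0x1EF2,
    0x00D5, 0x1EAF, 0x1EB1, 0x1EB7, 0x1EA5, 0x1EA7, 0x1EA9, 0x1EAD,
    0x1EBD, 0x1EB9, 0x1EBF, 0x1EC1, 0x1EC3, 0x1EC5, 0x1EC7, 0x1ED1,
    0x1ED3, 0x1ED5, 0x1ED7, 0x1EE0, 0x01A0, 0x1ED9, 0x1EDD, 0x1EDF,
    0x1ECB, 0x1EF0, 0x1EE8, 0x1EEA, 0x1EEC, 0x01A1, 0x1EDB, 0x01AF,
    0x00C0, 0x00C1, 0x00C2, 0x00C3, 0x1EA2, 0x0102, 0x1EB3, 0x1EB5,
    0x00C8, 0x00C9, 0x00CA, 0x1EBA, 0x00CC, 0x00CD, 0x0128, 0x1EF3,
    0x0110, 0x1EE9, 0x00D2, 0x00D3, 0x00D4, 0x1EA1, 0x1EF7, 0x1EEB,
    0x1EED, 0x00D9, 0x00DA, 0x1EF9, 0x1EF5, 0x00DD, 0x1EE1, 0x01B0,
    0x00E0, 0x00E1, 0x00E2, 0x00E3, 0x1EA3, 0x0103, 0x1EEF, 0x1EAB,
    0x00E8, 0x00E9, 0x00EA, 0x1EBB, 0x00EC, 0x00ED, 0x0129, 0x1EC9,
    0x0111, 0x1EF1, 0x00F2, 0x00F3, 0x00F4, 0x00F5, 0x1ECF, 0x1ECD,
    0x1EE5, 0x00F9, 0x00FA, 0x0169, 0x1EE7, 0x00FD, 0x1EE3, 0x1EEE,
]

# The six C0 positions that hold letters instead of control codes.
LOW_OVERRIDES = {
    0x02: 0x1EB2, 0x05: 0x1EB4, 0x06: 0x1EAA,
    0x14: 0x1EF6, 0x19: 0x1EF8, 0x1E: 0x1EF4,
}


def decode_viscii(data: bytes) -> str:
    """Map each VISCII byte to one Unicode character (illustrative helper)."""
    chars = []
    for byte in data:
        if byte >= 0x80:
            chars.append(chr(HIGH[byte - 0x80]))
        else:
            # Everything else in the lower half is unchanged from ASCII.
            chars.append(chr(LOW_OVERRIDES.get(byte, byte)))
    return "".join(chars)


print(decode_viscii(b"Vi\xaet Nam"))
```

Because every byte maps to exactly one precomposed character, the decoder needs no state; an encoder could be derived by inverting the same two tables.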

References

  1. Phung, Quang; Ngo, Hoc D.; Bui, Cuong. "Vietnamese-Standard Working Group Home Page". Viet-Std Group. Retrieved 2019-08-23.
  2. Vietnamese Character Encoding Standardization Report - VISCII And VIQR 1.1 Character Encoding Specifications (Technical report). Viet-Std Group. 1992.
  3. "[news] TCVN 5712:1993 (VSCII) -- Vietnamese national standard". 1993-06-02. Archived from the original on 2017-01-11.
  4. Lunde, Ken. "Chapter 1: CJKV Information Processing Overview (§ Are VISCII and VSCII identical? What about TCVN?)". CJKV Information Processing (2nd ed.). p. 17. ISBN 978-0-596-51447-1.
  5. Vietnamese Standardization Working Group. "RFC 1456: Conventions for Encoding the Vietnamese Language". IETF.
  6. "Character Sets". IANA.
  7. Sivonen, Henri (2014-09-26). "Character encoding changes in m-c require c-c action". mozilla.dev.apps.thunderbird.
  8. Sivonen, Henri (2014-09-26). "Character encoding changes in m-c require c-c action". mozilla.dev.apps.thunderbird. VISCII and armscii-8 are special in the sense that, for long time, Thunderbird itself (misguidedly) provided these encodings in the user interface for the choice of outgoing character encoding when composing a message. Therefore, it is possible that there exists a Thunderbird-created legacy of VISCII and armscii-8 email and Usenet posts.
  9. Ngo, Hoc Dinh; Tran, TuBinh. "5. Why Having Vietnamese Charset (Character Set – Encoding) Conversion?". Some special functions of WinVNKey.
