Windows-1258

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Bender the Bot (talk | contribs) at 21:51, 1 November 2020 (External links: HTTP → HTTPS for Internet Assigned Numbers Authority, replaced: http://www.iana.org/ → https://www.iana.org/). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Windows-1258
MIME / IANA: windows-1258
Language(s): Vietnamese, English
Created by: Microsoft
Standard: WHATWG Encoding Standard
Classification: extended ASCII, Windows-125x
Based on: Windows-1252

Windows-1258 is a code page used in Microsoft Windows to represent Vietnamese text. It relies on combining diacritical marks.

Windows-1258 is compatible with neither the Vietnamese standard (TCVN 5712 / VSCII) nor the other encodings used in practice (VISCII, VNI, VPS). Instead, it closely follows Windows-1252, with these differences: s-caron and z-caron (which were added to Windows-1252 later) are missing; five letters have been replaced by combining diacritics for the Vietnamese tone marks; one has been replaced by the đồng sign; and eight others (four per case) have been replaced by four otherwise-unsupported Vietnamese letters.
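The differences from Windows-1252 can be listed mechanically. The following is a minimal sketch that assumes Python's built-in `cp1252` and `cp1258` codecs reflect the two code pages; the helper names `decode_byte` and `differing_bytes` are hypothetical and chosen only for illustration.

```python
def decode_byte(value, codec):
    """Return the character a single byte decodes to, or None if undefined."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


def differing_bytes(codec_a="cp1252", codec_b="cp1258"):
    """Map each byte whose meaning differs to its (codec_a, codec_b) pair."""
    return {
        value: (decode_byte(value, codec_a), decode_byte(value, codec_b))
        for value in range(256)
        if decode_byte(value, codec_a) != decode_byte(value, codec_b)
    }


if __name__ == "__main__":
    # One line per differing position, e.g. the slot that holds the đồng sign.
    for value, (old, new) in sorted(differing_bytes().items()):
        print(f"0x{value:02X}: {old!r} -> {new!r}")
```

Under that assumption, the result should contain the positions described above: the four missing caron letters, the five combining tone marks, the đồng sign, and the eight Vietnamese letters.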

Using combining diacritics lets Windows-1258 cover the large number of letter and tone-mark combinations in Vietnamese without sacrificing coverage of control codes or symbols. However, software must convert carefully between precomposed characters and combining sequences when translating to or from other encodings, and determining the user-visible length of a string becomes more difficult.
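As an illustration of that conversion, here is a hedged sketch of encoding precomposed Unicode text into Windows-1258. It assumes Python's built-in `cp1258` codec and the standard `unicodedata` module; the function name `to_cp1258` and the splitting strategy are assumptions made for this example, not a documented algorithm. Plain NFD is not enough, because it would also split off the circumflex, breve, and horn, which Windows-1258 only provides as part of precomposed letters.

```python
import unicodedata

# The five combining tone marks that Windows-1258 encodes as separate bytes.
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}


def to_cp1258(text):
    """Encode text as Windows-1258, splitting off tone marks where needed.

    Raises UnicodeEncodeError if a character still has no representation.
    """
    out = bytearray()
    for ch in unicodedata.normalize("NFC", text):
        try:
            # Use the precomposed byte when the code page has one (e.g. "à").
            out += ch.encode("cp1258")
            continue
        except UnicodeEncodeError:
            pass
        # Otherwise keep the vowel quality on the base letter and emit the
        # tone mark as a separate combining byte after it.
        decomposed = unicodedata.normalize("NFD", ch)
        base = "".join(c for c in decomposed if c not in TONE_MARKS)
        tones = "".join(c for c in decomposed if c in TONE_MARKS)
        out += (unicodedata.normalize("NFC", base) + tones).encode("cp1258")
    return bytes(out)


if __name__ == "__main__":
    encoded = to_cp1258("Vi\u1ec7t")  # "Việt"
    print(encoded, len(encoded))
```

In this sketch the four-letter word "Việt" occupies five bytes, which is one way the byte count and the user-visible length of a string drift apart.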

IBM uses code page 1258 (CCSID 1258 and euro sign extended CCSID 5354) for Windows-1258.[1][2][3]

UTF-8 is the preferred encoding for Vietnamese in modern applications. Unicode-encoded Vietnamese may not always round-trip through Windows-1258 unchanged, because Unicode normalization can alter the text.[4] Combining diacritics are encoded after the letter in both Windows-1258 and Unicode[4] (like VNI, unlike ANSEL).
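The round-trip problem can be seen in a small experiment. This is a sketch under the assumption that Python's built-in `cp1258` codec and `unicodedata` module are used; the helper name `survives_round_trip` is hypothetical.

```python
import unicodedata

raw = b"Vi\xea\xf2t"                # "Việt" as Windows-1258 bytes
decoded = raw.decode("cp1258")      # "Vi" + "ê" + U+0323 + "t"
nfc = unicodedata.normalize("NFC", decoded)  # tone mark folded into one letter
nfd = unicodedata.normalize("NFD", decoded)  # circumflex split off as U+0302


def survives_round_trip(text):
    """Report whether text encodes to Windows-1258 and decodes back unchanged."""
    try:
        return text.encode("cp1258").decode("cp1258") == text
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for label, text in (("decoded", decoded), ("NFC", nfc), ("NFD", nfd)):
        print(label, len(text), survives_round_trip(text))
```

The freshly decoded string is in neither NFC nor NFD, so any pipeline step that normalizes it yields text the plain codec can no longer encode without extra handling.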

Character set

The following table shows Windows-1258. Each character is shown with its Unicode equivalent.

Windows-1258[5][6][7][8][9][10]
         | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F
0_ (0)   | NUL 0000 | SOH 0001 | STX 0002 | ETX 0003 | EOT 0004 | ENQ 0005 | ACK 0006 | BEL 0007 | BS 0008 | HT 0009 | LF 000A | VT 000B | FF 000C | CR 000D | SO 000E | SI 000F
1_ (16)  | DLE 0010 | DC1 0011 | DC2 0012 | DC3 0013 | DC4 0014 | NAK 0015 | SYN 0016 | ETB 0017 | CAN 0018 | EM 0019 | SUB 001A | ESC 001B | FS 001C | GS 001D | RS 001E | US 001F
2_ (32)  | SP 0020 | ! 0021 | " 0022 | # 0023 | $ 0024 | % 0025 | & 0026 | ' 0027 | ( 0028 | ) 0029 | * 002A | + 002B | , 002C | - 002D | . 002E | / 002F
3_ (48)  | 0 0030 | 1 0031 | 2 0032 | 3 0033 | 4 0034 | 5 0035 | 6 0036 | 7 0037 | 8 0038 | 9 0039 | : 003A | ; 003B | < 003C | = 003D | > 003E | ? 003F
4_ (64)  | @ 0040 | A 0041 | B 0042 | C 0043 | D 0044 | E 0045 | F 0046 | G 0047 | H 0048 | I 0049 | J 004A | K 004B | L 004C | M 004D | N 004E | O 004F
5_ (80)  | P 0050 | Q 0051 | R 0052 | S 0053 | T 0054 | U 0055 | V 0056 | W 0057 | X 0058 | Y 0059 | Z 005A | [ 005B | \ 005C | ] 005D | ^ 005E | _ 005F
6_ (96)  | ` 0060 | a 0061 | b 0062 | c 0063 | d 0064 | e 0065 | f 0066 | g 0067 | h 0068 | i 0069 | j 006A | k 006B | l 006C | m 006D | n 006E | o 006F
7_ (112) | p 0070 | q 0071 | r 0072 | s 0073 | t 0074 | u 0075 | v 0076 | w 0077 | x 0078 | y 0079 | z 007A | { 007B | | 007C | } 007D | ~ 007E | DEL 007F
8_ (128) | € 20AC | undef | ‚ 201A | ƒ 0192 | „ 201E | … 2026 | † 2020 | ‡ 2021 | ˆ 02C6 | ‰ 2030 | undef* | ‹ 2039 | Œ 0152 | undef | undef* | undef
9_ (144) | undef | ‘ 2018 | ’ 2019 | “ 201C | ” 201D | • 2022 | – 2013 | — 2014 | ˜ 02DC | ™ 2122 | undef* | › 203A | œ 0153 | undef | undef* | Ÿ 0178
A_ (160) | NBSP 00A0 | ¡ 00A1 | ¢ 00A2 | £ 00A3 | ¤ 00A4 | ¥ 00A5 | ¦ 00A6 | § 00A7 | ¨ 00A8 | © 00A9 | ª 00AA | « 00AB | ¬ 00AC | SHY 00AD | ® 00AE | ¯ 00AF
B_ (176) | ° 00B0 | ± 00B1 | ² 00B2 | ³ 00B3 | ´ 00B4 | µ 00B5 | ¶ 00B6 | · 00B7 | ¸ 00B8 | ¹ 00B9 | º 00BA | » 00BB | ¼ 00BC | ½ 00BD | ¾ 00BE | ¿ 00BF
C_ (192) | À 00C0 | Á 00C1 | Â 00C2 | Ă 0102* | Ä 00C4 | Å 00C5 | Æ 00C6 | Ç 00C7 | È 00C8 | É 00C9 | Ê 00CA | Ë 00CB | ◌̀ 0300* | Í 00CD | Î 00CE | Ï 00CF
D_ (208) | Đ 0110* | Ñ 00D1 | ◌̉ 0309* | Ó 00D3 | Ô 00D4 | Ơ 01A0* | Ö 00D6 | × 00D7 | Ø 00D8 | Ù 00D9 | Ú 00DA | Û 00DB | Ü 00DC | Ư 01AF* | ◌̃ 0303* | ß 00DF
E_ (224) | à 00E0 | á 00E1 | â 00E2 | ă 0103* | ä 00E4 | å 00E5 | æ 00E6 | ç 00E7 | è 00E8 | é 00E9 | ê 00EA | ë 00EB | ◌́ 0301* | í 00ED | î 00EE | ï 00EF
F_ (240) | đ 0111* | ñ 00F1 | ◌̣ 0323* | ó 00F3 | ô 00F4 | ơ 01A1* | ö 00F6 | ÷ 00F7 | ø 00F8 | ù 00F9 | ú 00FA | û 00FB | ü 00FC | ư 01B0* | ₫ 20AB* | ÿ 00FF

Each cell gives the character followed by its Unicode code point in hexadecimal. "undef" marks an undefined position, ◌ stands in for the base letter of a combining mark, and cells marked * differ from Windows-1252.

Code page 1129

IBM's code page 1129 (CCSID 1129 and euro sign extended CCSID 1163)[11][12][13] is similar to code page 1258, but with the following differences:

Code page 1129 (differences from code page 1258)[14][15][16][17][18][19]
         | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F
8_ (128) | undef* | undef | undef* | undef* | undef* | undef* | undef* | undef* | undef* | undef* | undef | undef* | undef* | undef | undef | undef
9_ (144) | undef | undef* | undef* | undef* | undef* | undef* | undef* | undef* | undef* | undef* | undef | undef* | undef* | undef | undef | undef*
A_ (160) | NBSP 00A0 | ¡ 00A1 | ¢ 00A2 | £ 00A3 | ¤ 00A4 | ¥ 00A5 | ¦ 00A6 | § 00A7 | œ 0153* | © 00A9 | ª 00AA | « 00AB | ¬ 00AC | SHY 00AD | ® 00AE | ¯ 00AF
B_ (176) | ° 00B0 | ± 00B1 | ² 00B2 | ³ 00B3 | Ÿ 0178* | µ 00B5 | ¶ 00B6 | · 00B7 | Œ 0152* | ¹ 00B9 | º 00BA | » 00BB | ¼ 00BC | ½ 00BD | ¾ 00BE | ¿ 00BF

Each cell gives the character followed by its Unicode code point in hexadecimal. "undef" marks an undefined position, and cells marked * differ from Windows-1258.
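The differences listed for code page 1129 are few enough to express as overrides on top of a Windows-1258 decoder. The sketch below assumes that no ready-made codec for code page 1129 is available, that Python's built-in `cp1258` codec is used for the shared positions, and that 0x80 to 0x9F are treated as undefined, as in the table. The names `CP1129_OVERRIDES` and `decode_cp1129` are hypothetical.

```python
# Positions where, per the table above, code page 1129 holds a different
# character than code page 1258 (assumption: the table is complete).
CP1129_OVERRIDES = {0xA8: "\u0153", 0xB4: "\u0178", 0xB8: "\u0152"}


def decode_cp1129(data):
    """Decode bytes as code page 1129 by patching a Windows-1258 decode."""
    chars = []
    for value in data:
        if 0x80 <= value <= 0x9F:
            # The table shows this whole range as undefined in code page 1129.
            raise ValueError(f"byte 0x{value:02X} is undefined in this sketch")
        if value in CP1129_OVERRIDES:
            chars.append(CP1129_OVERRIDES[value])
        else:
            chars.append(bytes([value]).decode("cp1258"))
    return "".join(chars)


if __name__ == "__main__":
    print(decode_cp1129(b"c\xa8ur"))  # uses the relocated oe ligature
```

Patching an existing decoder keeps the sketch short, at the cost of decoding one byte at a time.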

References

  1. ^ "Code page 1258 information document". Archived from the original on 2016-03-03.
  2. ^ "CCSID 1258 information document". Archived from the original on 2014-11-29.
  3. ^ "CCSID 5354 information document". Archived from the original on 2014-11-29.
  4. ^ a b Kaplan, Michael S. (2005-04-19). "A few of the gotchas of MultiByteToWideChar". Sorting it all out.
  5. ^ Steele, Shawn (1998-04-15). "cp1258 to Unicode table". Microsoft.
  6. ^ Unicode mappings of windows 1258 with "best fit"
  7. ^ Code Page CPGID 01258 (PDF), IBM
  8. ^ Code Page CPGID 01258 (TXT), IBM
  9. ^ International Components for Unicode (ICU), ibm-1258_P100-1997.ucm, 2002-12-03
  10. ^ International Components for Unicode (ICU), ibm-5354_P100-1998.ucm, 2002-12-03
  11. ^ "Code page 1129 information document". Archived from the original on 2010-09-21.
  12. ^ "CCSID 1129 information document". Archived from the original on 2016-03-27.
  13. ^ "CCSID 1163 information document". Archived from the original on 2014-11-29.
  14. ^ Lunde, Ken. "Appendix L: Vietnamese Character Sets" (PDF). CJKV Information Processing (2nd ed.). ISBN 978-0-596-51447-1.
  15. ^ Code Page CPGID 01129 (PDF), IBM
  16. ^ Code Page CPGID 01129 (TXT), IBM
  17. ^ International Components for Unicode (ICU), ibm-1129_P100-1997.ucm, 2002-12-03
  18. ^ Code Page CPGID 01163 (PDF), IBM
  19. ^ Code Page CPGID 01163 (TXT), IBM