Template:Chset-cell-unified
This template is the metatemplate behind {{chset-ctrl}}, {{chset-ctrl3}}, {{chset-ctrl4}}, {{chset-cell}}, {{chset-cell3}}, and {{chset-cell4}}. The intention is to implement them using this template and thus make it easier to keep them in sync.
Usage
Used with Template:chset-tableformat to indicate a table cell.
- First row:
  - Parameter char: the character in question. May link to the relevant article or Wiktionary page, if one exists. Only provide for a non-control, non-whitespace printing character. If there are alternative characters, separate them with a slash. If it is a sequence of characters, put them next to each other.
  - Parameter ctrl: XX, the name of a whitespace, control, format, separator or otherwise non-printing character (e.g., SP, LF, HT, NBSP, ZWNJ, PDO), with a link to the appropriate article if it exists. Do not provide at the same time as char. This just applies template:sc2, so you can use that template directly if you need to combine a control with a normal character. You can also use lower-case letters to get tinier text to fit a longer string in.
  - Parameter fn: printed in normal (small) size after the letter. This is useful to add a reference or template:efn footnote to the glyph.
- Second row:
  - Parameter unic: hhhh, the Unicode value in hexadecimal, 4 digits for most code points (those on the Basic Multilingual Plane) and 5 otherwise (e.g., 0020, 1D44A).
    - A little-used feature is that if the char field is blank, the matching Unicode character is placed there, but this only works if unic is just a hex number.
    - If there are multiple mappings, separate them with a slash (such as 0020/00A0); if the value translates to a series of characters, separate them with a space.
    - Set to
      for a character without a Unicode mapping. Alternatively, if a Private Use Area mapping is in established/documented use for such a character (e.g. the Apple logo in Mac OS Roman) then it may be given, but don't make them up.
    - Set to LEAD for a lead byte (rather than a character). L is not a hex digit, so this is unambiguous (or use the hex code to indicate something about what lead byte this is, for example in UTF-8).
- Subsequent rows:
  - Parameter deci: arbitrary text drawn in bold, for displaying input methods. This is most often a decimal number for the Windows Alt code input.
  - Parameter octl: a second line of arbitrary text drawn in bold. You probably should not use this unless the input method really uses a second form.
  - Parameter kuten: arbitrary text not in bold. For JIS (men)kuten, GB quwei, KS hangyol or equivalent code (English: (plane-)row-cell, or (plane-)section-position).
    - This is an important identifier for characters in CJK DBCSs such as JIS X 0208 (more so than e.g. deci, which is not usually used for a DBCS).
    - The format is (d(d)-)d(d)-d(d): two or three numbers of up to two digits each, e.g., 91-1, 2-2-1. Generally numbers 1 through 94 correspond with encoding bytes of either 0x21 through 0x7E, or 0xA1 through 0xFE.
    - For a lead byte, specify underscores in place of subsequent numbers; this may look something like 16-_.
    - For visual consistency, may be set to - for a byte which is not within the lead/trail byte range, but which is in the same line as those which are.
You should use the same entries for every cell in a table (or at least in a table row), otherwise they will not line up horizontally. Use
if a field should be blank.
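The unic and kuten conventions described above can be illustrated with a short Python sketch. It is not part of the template: the function names format_unic and kuten_to_bytes are hypothetical, and dropping the plane number of a three-part kuten value is an assumption made here, since how a plane is signalled depends on the encoding.

```python
def format_unic(code_point: int) -> str:
    """Format a code point for the unic parameter: at least 4 hex digits."""
    return format(code_point, "04X")


def kuten_to_bytes(kuten: str, high: bool = False) -> bytes:
    """Map the row and cell numbers of a kuten value to two encoding bytes.

    Numbers 1..94 map to 0x21..0x7E, or to 0xA1..0xFE when high is True.
    Assumption: a leading plane number (three-part form) is ignored.
    """
    numbers = [int(part) for part in kuten.split("-")]
    if len(numbers) not in (2, 3):
        raise ValueError(f"expected two or three numbers, got {kuten!r}")
    row_and_cell = numbers[-2:]
    if any(not 1 <= n <= 94 for n in row_and_cell):
        raise ValueError(f"row and cell must be in 1..94, got {kuten!r}")
    offset = 0xA0 if high else 0x20
    return bytes(offset + n for n in row_and_cell)
```

For instance, format_unic(0x1D44A) gives 1D44A, and kuten_to_bytes("91-1") gives the bytes 0x7B 0x21, or 0xFB 0xA1 with high=True.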
Examples
A few examples:
{| {{chset-tableformat}}
<!-- ctrl4 plus kuten -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|deci=33|octl=041|kuten=1-1}}
<!-- ctrl4 -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=00A0|ctrl=[[non-breaking space|NBSP]]|deci=160|octl=240}}
<!-- cell4 plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|char=[[⛣]]|deci=33|octl=041|kuten=91-1}}
<!-- cell4 plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|deci=33|octl=041|kuten=91-1}}
<!-- cell4 -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|char=[[inverted exclamation mark|¡]]|deci=161|octl=241}}
<!-- cell4 -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|deci=161|octl=241}}
<!-- ctrl3 plus kuten -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|deci=33|kuten=1-1}}
<!-- ctrl3 -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=00A0|ctrl=[[non-breaking space|NBSP]]|deci=160}}
<!-- cell3 plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|char=[[⛣]]|deci=33|kuten=91-1}}
<!-- cell3 plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|deci=33|kuten=91-1}}
<!-- cell3 -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|char=[[inverted exclamation mark|¡]]|deci=161}}
<!-- cell3 -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|deci=161}}
<!-- ctrl plus kuten -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|kuten=1-1}}
<!-- ctrl -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=00A0|ctrl=[[non-breaking space|NBSP]]}}
<!-- cell plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|char=[[⛣]]|kuten=91-1}}
<!-- cell plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|kuten=91-1}}
<!-- cell -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|char=[[inverted exclamation mark|¡]]|fn={{efn|A footnote next to character}}}}
<!-- cell -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1}}{{efn|A trailing footnote}}
|}
IDSP 3000 33 041 1-1 |
NBSP 00A0 160 240 |
⛣ 26E3 33 041 91-1 |
⛣ 26E3 33 041 91-1 |
¡ 00A1 161 241 |
¡ 00A1 161 241 |
IDSP 3000 33 1-1 |
NBSP 00A0 160 |
⛣ 26E3 33 91-1 |
⛣ 26E3 33 91-1 |
¡ 00A1 161 |
¡ 00A1 161 |
IDSP 3000 1-1 |
NBSP 00A0 |
⛣ 26E3 91-1 |
⛣ 26E3 91-1 |
¡[a] 00A1 |
¡ 00A1[b] |
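When a table has many cells, markup like the examples above could be generated rather than typed by hand. The following Python sketch is one way to do that; the helper cell_markup and its arguments are hypothetical and not part of the template family. It assumes that deci and octl hold the decimal and zero-padded octal forms of the byte value, as in the examples, and it leaves char out so that the template fills in the character from unic.

```python
from typing import Optional


def cell_markup(code_point: int,
                byte_value: Optional[int] = None,
                octal: bool = False,
                kuten: Optional[str] = None) -> str:
    """Build a {{chset-cell-unified}} call for a printing character."""
    params = [f"unic={code_point:04X}"]
    if byte_value is not None:
        params.append(f"deci={byte_value}")
        if octal:
            # Three octal digits, zero-padded, as in octl=041.
            params.append(f"octl={byte_value:03o}")
    if kuten is not None:
        params.append(f"kuten={kuten}")
    return "{{chset-cell-unified|" + "|".join(params) + "}}"


print(cell_markup(0x00A1, 0xA1, octal=True))
print(cell_markup(0x26E3, 0x21, octal=True, kuten="91-1"))
```

The two print calls reproduce the "cell4" and "cell4 plus kuten" examples that omit char.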
Chset family of templates
See PETSCII and Computer Braille Code for examples of usage.
Header and footer rows
- Template:chset-table-header — Header and title row for a 16 column character set table
- Template:chset-table-footer — Footer row for a 16 column character set table
Character row header
- Template:chset-left — Left row code header
Character cell colors
Note: if adjusting these colors, consult Template:Chset-table-header/family-test-sheet to check how well they work together, and whether base / variant / boxed / legend colors are properly in sync.
Boxed and slightly shaded variants of these exist to indicate some kind of additional information (depending on the article), for example a derivation from a base code page, or a definition of the corresponding code page that varies between sources (to be explained in the article) or between revisions of the code page.
For generating colors for cells by Unicode category, this script may be helpful.
Please note that the boxed variants must not be used where a cell that is not to be marked is surrounded by four cells that need to be marked, as this would make the central cell appear marked as well. The shaded variants do not exhibit this problem.
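As a rough illustration of choosing a color class by Unicode category, the Python sketch below maps a character's Unicode general category to one of the legend labels used in the test table. It is not the script mentioned above. The function name legend_class and the mapping are assumptions, and "Lead byte" and "Undefined" are never returned because they depend on the character set rather than on Unicode.

```python
import unicodedata

# Assumed mapping from the first letter of the Unicode general category
# (L, N, P, S) to a legend label; all other categories fall under "Other".
LEGEND_BY_MAJOR_CLASS = {
    "L": "Letter",
    "N": "Number",
    "P": "Punctuation",
    "S": "Symbol",
}


def legend_class(char: str) -> str:
    """Return the legend label for a single character."""
    category = unicodedata.category(char)  # e.g. "Lu", "Nd", "Po", "Zs"
    return LEGEND_BY_MAJOR_CLASS.get(category[0], "Other")
```

Under this mapping, ¡ falls under Punctuation, while spaces and control characters fall under Other.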
Character cell contents
- Template:chset-cell — Character cell with character + Unicode value
- Template:chset-cell3 — Character cell with character + Unicode value + decimal index
- Template:chset-cell4 — Character cell with character + Unicode value + decimal + octal index
- Template:chset-ctrl — Control character cell with name + Unicode value
- Template:chset-ctrl3 — Control character cell with name + Unicode value + decimal index
- Template:chset-ctrl4 — Control character cell with name + Unicode value + decimal + octal index
- Template:chset-cell-unified — Any of the above, plus optional kuten
Test table
The following colours should be in sync with one another and with the legend.
Letter Number Punctuation Symbol Other Lead byte Undefined
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0_ 0 | NUL 0000 | SOH 0001 | STX 0002 | ETX 0003 | EOT 0004 | ENQ 0005 | ACK 0006 | BEL 0007 | BS 0008 | HT 0009 | LF 000A | VT 000B | FF 000C | CR 000D | SO | SI |
| 1_ 16 | DLE 0010 | DC1 0011 | DC2 0012 | DC3 0013 | DC4 0014 | NAK 0015 | SYN 0016 | ETB 0017 | CAN 0018 | EM 0019 | SUB 001A | ESC 001B | SS2 | SS3 | RS 001E | US 001F |
| 2_ 32 | SP 0020 | ! 0021 | " 0022 | # 0023 | $ 0024 | % 0025 | & 0026 | ' 0027 | ( 0028 | ) 0029 | * 002A | + 002B | , 002C | - 002D | . 002E | / 002F |
| 3_ 48 | 0 0030 | 1 0031 | 2 0032 | 3 0033 | 4 0034 | 5 0035 | 6 0036 | 7 0037 | 8 0038 | 9 0039 | : 003A | ; 003B | < 003C | = 003D | > 003E | ? 003F |
| 4_ 64 | @ 0040 | A 0041 | B 0042 | C 0043 | D 0044 | E 0045 | F 0046 | G 0047 | H 0048 | I 0049 | J 004A | K 004B | L 004C | M 004D | N 004E | O 004F |
| 5_ 80 | P 0050 | Q 0051 | R 0052 | S 0053 | T 0054 | U 0055 | V 0056 | W 0057 | X 0058 | Y 0059 | Z 005A | [ 005B | ¥ 00A5 | ] 005D | ^ 005E | _ 005F |
| 6_ 96 | ` 0060 | a 0061 | b 0062 | c 0063 | d 0064 | e 0065 | f 0066 | g 0067 | h 0068 | i 0069 | j 006A | k 006B | l 006C | m 006D | n 006E | o 006F |
| 7_ 112 | p 0070 | q 0071 | r 0072 | s 0073 | t 0074 | u 0075 | v 0076 | w 0077 | x 0078 | y 0079 | z 007A | { 007B | \| 007C | } 007D | ‾ 203E | DEL 007F |