General Punctuation

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Drmccreedy (talk | contribs) at 23:07, 19 July 2018 (→‎History: add doc). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

General Punctuation
Range: U+2000..U+206F (112 code points)
Plane: BMP
Scripts: Common (109 char.), Inherited (2 char.)
Symbol sets: Punctuation, Spaces, Format controls
Assigned: 111 code points
Unused: 1 reserved code point
6 deprecated
Unicode version history:
1.0.0 (1991): 67 (+67)
1.1 (1993): 76 (+9)
3.0 (1999): 83 (+7)
3.2 (2002): 95 (+12)
4.0 (2003): 97 (+2)
4.1 (2005): 106 (+9)
5.1 (2008): 107 (+1)
6.3 (2013): 111 (+4)
Unicode documentation: Code chart ∣ Web page
Note: [1][2]

General Punctuation is a Unicode block containing punctuation, spacing, and formatting characters for use with all scripts and writing systems. Included are the defined-width spaces, joining formats, directional formats, smart quotes, archaic and novel punctuation such as the interrobang, and invisible mathematical operators.

Additional punctuation characters are in the Supplemental Punctuation block and scattered across dozens of other Unicode blocks.
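The composition of the block can be inspected programmatically. The following is a minimal sketch in Python, assuming the standard `unicodedata` module; the helper name `summarize_block` is hypothetical, and the counts it reports depend on the Unicode version bundled with the running interpreter. It tallies the general category of every code point in U+2000..U+206F and lists any that are unassigned.

```python
import unicodedata
from collections import Counter

# Range of the General Punctuation block as stated in the article.
BLOCK_START, BLOCK_END = 0x2000, 0x206F


def summarize_block(start=BLOCK_START, end=BLOCK_END):
    """Return (general-category counts, unassigned code points) for a range.

    Hypothetical helper for illustration only; results reflect the Unicode
    data shipped with the running Python, not necessarily the latest version.
    """
    counts = Counter()
    unassigned = []
    for cp in range(start, end + 1):
        category = unicodedata.category(chr(cp))
        if category == "Cn":  # Cn = not assigned in this Unicode version
            unassigned.append(cp)
        else:
            counts[category] += 1
    return counts, unassigned


counts, unassigned = summarize_block()
print("assigned:", sum(counts.values()))
print("unassigned:", [f"U+{cp:04X}" for cp in unassigned])
for category, n in sorted(counts.items()):
    # Zs = space separators, Cf = format controls, P* = punctuation, S* = symbols
    print(category, n)
```

With Unicode data from version 6.3 or later, the assigned total should agree with the 111 code points given for the block, provided the one reserved code point has not been assigned in the interim.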

Block

General Punctuation[1][2][3]
Official Unicode Consortium code chart (PDF)
        0     1     2     3     4     5     6     7     8     9     A     B     C     D     E     F
U+200x  NQSP  MQSP  ENSP  EMSP  3/MSP 4/MSP 6/MSP FSP   PSP   THSP  HSP   ZWSP  ZWNJ  ZWJ   LRM   RLM
U+201x  ‐     NB    ‒     –     —     ―     ‖     ‗     ‘     ’     ‚     ‛     “     ”     „     ‟
U+202x  †     ‡     •     ‣     ․     ‥     …     ‧     LSEP  PSEP  LRE   RLE   PDF   LRO   RLO   NNBSP
U+203x  ‰     ‱     ′     ″     ‴     ‵     ‶     ‷     ‸     ‹     ›     ※     ‼     ‽     ‾     ‿
U+204x  ⁀     ⁁     ⁂     ⁃     ⁄     ⁅     ⁆     ⁇     ⁈     ⁉     ⁊     ⁋     ⁌     ⁍     ⁎     ⁏
U+205x  ⁐     ⁑     ⁒     ⁓     ⁔     ⁕     ⁖     ⁗     ⁘     ⁙     ⁚     ⁛     ⁜     ⁝     ⁞     MMSP
U+206x  WJ    ƒ()   ×     ,     +     (unassigned)  LRI   RLI   FSI   PDI   ISS   ASS   IAFS  AAFS  NADS  NODS
Notes
1.^ As of Unicode version 15.1
2.^ Grey area in the chart indicates a non-assigned code point (U+2065)
3.^ Unicode code points U+206A - U+206F are deprecated as of Unicode version 3.0

Emoji

The General Punctuation block contains two emoji: U+203C and U+2049.[3][4]

The block has four standardized variants defined to specify emoji-style (U+FE0F VS16) or text presentation (U+FE0E VS15) for the two emoji, both of which default to a text presentation.[5]

Emoji variation sequences
U+                 203C  2049
base code point    ‼     ⁉
base+VS15 (text)   ‼︎    ⁉︎
base+VS16 (emoji)  ‼️    ⁉️
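The four sequences in the table above can be built by appending a variation selector to each base code point. The sketch below is illustrative only; the names `VS15`, `VS16`, `EMOJI_BASES` and `variation_sequences` are assumptions made for this example, and whether a renderer honours the selector depends on font and platform support.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation

# The two emoji in the General Punctuation block.
EMOJI_BASES = ["\u203C", "\u2049"]


def variation_sequences(base):
    """Return the text-style and emoji-style sequences for one base character."""
    return {"text": base + VS15, "emoji": base + VS16}


for base in EMOJI_BASES:
    for style, seq in variation_sequences(base).items():
        # Show each sequence as a list of code points, e.g. "U+203C U+FE0F".
        code_points = " ".join(f"U+{ord(ch):04X}" for ch in seq)
        print(f"{style:5} {code_points}")
```

Each sequence is two code points long: the base character followed by exactly one selector.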

History

The following Unicode-related documents record the purpose and process of defining specific characters in the General Punctuation block:

Version Final code points[a] Count L2 ID WG2 ID Document
1.0.0 U+2000..202E, 2030..203E, 2040..2044 67 (to be determined)
L2/11-438[b][c] N4182 Edberg, Peter (2011-12-22), Emoji Variation Sequences (Revision of L2/11-429)
L2/17-086 Burge, Jeremy; et al. (2017-03-27), Add ZWJ, VS-16, Keycaps & Tags to Emoji_Component
1.1 U+203F, 2045..2046, 206A..206F 9 (to be determined)
3.0 U+202F, 2048..2049 3 L2/98-088 N1711 The Working Meeting on Mongolian Encoding Attended by Representatives of China and Mongolia, 1998-02-15
L2/98-104 N1734 Whistler, Ken (1998-03-20), Comments on the Mongolian Encoding Proposal, WG2 N1711
L2/98-252 N1833 Moore, Richard (1998-05-04), Feedback on Ken Whistler's Comments on Mongolian Encoding: N 1734
L2/98-251 N1808 Reply to "Proposal WG2 N1734" Raised at the Seattle Meeting Regarding "Proposal WG 2 N1711", 1998-07-09
L2/98-389R Aliprand, Joan, "RESOLUTION M35.11", Consent docket re WG2 Resolutions at its Meeting #35
L2/99-075.1 N1973 Irish Comments on SC 2 N 3208, 1999-01-19
L2/99-075 N1972 Summary of Voting on SC 2 N 3208, PDAM ballot on WD for ISO/IEC 10646-1/Amd. 29: Mongolian, 1999-02-12
L2/99-113 Text for FPDAM ballot of ISO/IEC 10646, Amd. 29 - Mongolian, 1999-04-06
L2/99-304 N2126 Paterson, Bruce (1999-10-01), Revised Text for FDAM ballot of ISO/IEC 10646-1/FDAM 29, AMENDMENT 29: Mongolian
L2/99-381 Final text for ISO/IEC 10646-1, FDAM 29 -- Mongolian, 1999-12-07
L2/07-209 Whistler, Ken (2007-07-05), UTR 14 and U+202F NARROW NO-BREAK SPACE
L2/07-225 Moore, Lisa (2007-08-21), "B.11.4.1.2", UTC #112 Minutes
L2/11-438[b][c] N4182 Edberg, Peter (2011-12-22), Emoji Variation Sequences (Revision of L2/11-429)
L2/15-187 Moore, Lisa (2015-08-11), "B.14.5", UTC #144 Minutes
L2/16-258 N4752R2 Eck, Greg (2016-09-19), Mongolian Base Forms, Positional Forms, & Variant Forms
L2/16-259 N4753 Eck, Greg; Rileke, Orlog Ou (2016-09-20), WG2 #65 Mongolian Discussion Points
L2/16-266 Anderson, Deborah; Whistler, Ken; McGowan, Rick; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa (2016-09-26), "1. Mongolian", Comments on Mongolian, Small Khitan, and other WG2 #65 documents
L2/16-297 N4769 Anderson, Deborah (2016-10-27), Mongolian ad hoc report
U+204A..204D 4 (to be determined)
3.2 U+2047, 2051 2 L2/99-238 Consolidated document containing 6 Japanese proposals, 1999-07-15
N2092 Addition of forty eight characters, 1999-09-13
L2/99-365 Moore, Lisa (1999-11-23), Comments on JCS Proposals
L2/00-024 Shibano, Kohji (2000-01-31), JCS proposal revised
L2/99-260R Moore, Lisa (2000-02-07), "JCS Proposals", Minutes of the UTC/L2 meeting in Mission Viejo, October 26-28, 1999
L2/00-098 N2195 Rationale for non-Kanji characters proposed by JCS committee, 2000-03-15
L2/00-119[d] N2191R Whistler, Ken; Freytag, Asmus (2000-04-19), Encoding Additional Mathematical Symbols in Unicode
L2/00-297 N2257 Sato, T. K. (2000-09-04), JIS X 0213 symbols part-1
L2/00-342 N2278 Sato, T. K.; Everson, Michael; Whistler, Ken; Freytag, Asmus (2000-09-20), Ad hoc Report on Japan feedback N2257 and N2258
U+204E..2050, 2057, 205F..2062 8 L2/00-119[d] N2191R Whistler, Ken; Freytag, Asmus (2000-04-19), Encoding Additional Mathematical Symbols in Unicode
U+2052, 2063 2 L2/01-142[d] N2336 Beeton, Barbara; Freytag, Asmus; Ion, Patrick (2001-04-02), Additional Mathematical Symbols
L2/01-156 N2356 Freytag, Asmus (2001-04-03), Additional Mathematical Characters (Draft 10)
L2/01-344 N2353 Umamaheswaran, V. S. (2001-09-09), Minutes from SC2/WG2 meeting #40 -- Mountain View, April 2001
4.0 U+2053..2054 2 L2/02-141 N2419 Everson, Michael; et al. (2002-03-20), Uralic Phonetic Alphabet characters for the UCS
L2/02-192 Everson, Michael (2002-05-02), Everson's Reply on UPA
N2442 Everson, Michael; Kolehmainen, Erkki I.; Ruppel, Klaas; Trosterud, Trond (2002-05-21), Justification for placing the Uralic Phonetic Alphabet in the BMP
L2/02-291 Whistler, Ken (2002-05-31), WG2 report from Dublin
L2/02-292 Whistler, Ken (2002-06-03), Early look at WG2 consent docket
L2/02-166R2 Moore, Lisa (2002-08-09), "Scripts and New Characters - UPA", UTC #91 Minutes
L2/02-253 Moore, Lisa (2002-10-21), UTC #92 Minutes
4.1 U+2055 1 L2/03-151R Constable, Peter; Lloyd-Williams, James; Lloyd-Williams, Sue; Chowdhury, Shamsul Islam; Ali, Asaddar; Sadique, Mohammed; Chowdhury, Matiar Rahman (2003-05-10), Revised Proposal for Encoding Syloti Nagri Script in the BMP
L2/03-136 Moore, Lisa (2003-08-18), "Scripts and New Characters - Syloti Nagri Script", UTC #95 Minutes
U+2056, 2058..2059 3 L2/03-282R N2610R Everson, Michael; Cleminson, Ralph (2003-09-04), Final proposal for encoding the Glagolitic script in the UCS
L2/03-324 N2642 Pantelia, Maria (2003-10-06), Proposal to encode additional Greek editorial and punctuation characters in the UCS
U+205A..205C 3 L2/03-157 Pantelia, Maria (2003-05-19), Additional Beta Code Characters not in Unicode (WIP)
L2/03-193R N2612-7 Pantelia, Maria (2003-06-11), Proposal to encode additional Punctuation Characters in the UCS
U+205D 1 L2/02-312R Pantelia, Maria (2002-11-07), Proposal to encode additional Greek editorial and punctuation characters in the UCS
L2/03-324 N2642 Pantelia, Maria (2003-10-06), Proposal to encode additional Greek editorial and punctuation characters in the UCS
U+205E 1 L2/03-354 N2655 Freytag, Asmus (2003-10-10), Proposal -- Symbols used in Dictionaries
L2/03-356R2 Moore, Lisa (2003-10-22), "97-C15", UTC #97 Minutes
5.1 U+2064 1 L2/07-011R N3198R Freytag, Asmus; Beeton, Barbara; Ion, Patrick; Sargent, Murray; Carlisle, David; Pournader, Roozbeh (2007-01-15), 29 Additional Mathematical and Symbol Characters
6.3 U+2066..2069 4 L2/12-186R Lanin, Aharon; Davis, Mark; Pournader, Roozbeh (2012-07-24), A Proposal for Bidi Isolates in Unicode
L2/12-290 N4310 Lanin, Aharon; Davis, Mark; Pournader, Roozbeh (2012-07-31), Proposal for Four Characters for Bidi
L2/12-239 Moore, Lisa (2012-08-14), UTC #132 Minutes
L2/13-040 Pournader, Roozbeh; Lanin, Aharon (2013-01-29), Fasttracking Arabic Letter Mark (ALM)
L2/13-125 N4447 Constable, Peter (2013-06-10), Unicode Liaison Report to WG2
  a. ^ Proposed code points and character names may differ from final code points and names
  b. ^ See also L2/10-458, L2/11-414, L2/11-415, and L2/11-429
  c. ^ Refer to the history section of the Miscellaneous Symbols and Pictographs block for additional emoji-related documents
  d. ^ Refer to the history section of the Miscellaneous Mathematical Symbols-B block for additional math-related documents

References

  1. ^ "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. ^ "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
  3. ^ "UTR #51: Unicode Emoji". Unicode Consortium. 2018-05-21.
  4. ^ "UCD: Emoji Data for UTR #51". Unicode Consortium. 2018-05-22.
  5. ^ "UTS #51 Emoji Variation Sequences". The Unicode Consortium.