Template:General Category (Unicode)

General Category (Unicode Character Property)[a][1]
| Value | Category (major, minor) | Basic type[b] | Character assigned[b] | Fixed[c] | Remarks |
|---|---|---|---|---|---|
| | Letter | | | | |
| Lu | Letter, uppercase | Graphic | Character | | |
| Ll | Letter, lowercase | Graphic | Character | | |
| Lt | Letter, titlecase | Graphic | Character | | Ligatures containing uppercase followed by lowercase letters (e.g., Dž, Lj, Nj, and Dz) |
| Lm | Letter, modifier | Graphic | Character | | |
| Lo | Letter, other | Graphic | Character | | |
| | Mark | | | | |
| Mn | Mark, nonspacing | Graphic | Character | | |
| Mc | Mark, spacing combining | Graphic | Character | | |
| Me | Mark, enclosing | Graphic | Character | | |
| | Number | | | | |
| Nd | Number, decimal digit | Graphic | Character | | All these, and only these, have Numeric Type = De[c] |
| Nl | Number, letter | Graphic | Character | | Numerals composed of letters or letterlike symbols (e.g., Roman numerals) |
| No | Number, other | Graphic | Character | | E.g., vulgar fractions, superscript and subscript digits |
| | Punctuation | | | | |
| Pc | Punctuation, connector | Graphic | Character | | Includes "_" underscore |
| Pd | Punctuation, dash | Graphic | Character | | Includes several hyphen characters |
| Ps | Punctuation, open | Graphic | Character | | Opening bracket characters |
| Pe | Punctuation, close | Graphic | Character | | Closing bracket characters |
| Pi | Punctuation, initial quote | Graphic | Character | | Opening quotation mark. Does not include the ASCII "neutral" quotation mark. May behave like Ps or Pe depending on usage |
| Pf | Punctuation, final quote | Graphic | Character | | Closing quotation mark. May behave like Ps or Pe depending on usage |
| Po | Punctuation, other | Graphic | Character | | |
| | Symbol | | | | |
| Sm | Symbol, math | Graphic | Character | | |
| Sc | Symbol, currency | Graphic | Character | | |
| Sk | Symbol, modifier | Graphic | Character | | |
| So | Symbol, other | Graphic | Character | | |
| | Separator | | | | |
| Zs | Separator, space | Graphic | Character | | Includes the space, but not TAB, CR, or LF, which are Cc |
| Zl | Separator, line | Format | Character | | Only U+2028 line separator (LSEP) |
| Zp | Separator, paragraph | Format | Character | | Only U+2029 paragraph separator (PSEP) |
| | Other | | | | |
| Cc | Other, control | Control | Character | Fixed 65 | No name[d], <control> |
| Cf | Other, format | Format | Character | | Includes the soft hyphen, control characters to support bi-directional text, and language tag characters |
| Cs | Other, surrogate | Surrogate | Not (but abstract) | Fixed 2048 | No name[d], <surrogate> |
| Co | Other, private use | Private-use | Not (but abstract) | Fixed 6400 in BMP, 131,068 in Planes 15–16 | No name[d], <private-use> |
| Cn | Other, not assigned | Noncharacter | Not | Fixed 66 | No name[d], <noncharacter> |
| Cn | Other, not assigned | Reserved | Not | Not fixed | No name[d], <reserved> |
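
The fixed counts listed for Cc, Cs, and Co can be checked against a Unicode database. The following is a minimal sketch that assumes Python's standard unicodedata module; the function name count_fixed_categories and the dictionary keys are made up for this illustration. Noncharacters and reserved code points are left out because both report Cn and cannot be told apart by category alone.

```python
import unicodedata
from collections import Counter


def count_fixed_categories():
    """Count code points reported as Cc, Cs, or Co across all 17 planes."""
    counts = Counter()
    for cp in range(0x110000):
        gc = unicodedata.category(chr(cp))
        if gc in ("Cc", "Cs"):
            counts[gc] += 1
        elif gc == "Co":
            # Split private use into BMP and Planes 15-16, as the table does.
            key = "Co (BMP)" if cp <= 0xFFFF else "Co (Planes 15-16)"
            counts[key] += 1
    return dict(counts)


fixed_counts = count_fixed_categories()
for name, total in sorted(fixed_counts.items()):
    print(name, total)
```

The loop visits every code point from U+0000 to U+10FFFF, so it takes a moment to run, but it does not depend on which characters a given Unicode version has assigned.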
  a. Unicode 6.0, Chapter 4, Table 4-9
  b. Unicode 6.0, Chapter 2, Table 2-3: Types of code points
  c. Stability policy, Property Value Stability (with table): some gc groups will never change. gc=Nd corresponds to Numeric Type=De (decimal).
  d. Unicode 6.0, Chapter 4, Table 4-12: the Name is empty (Name=""). A code point label, such as <control-hhhh> (for example, <control-0088>), may be used to identify a nameless code point. Keeping the Name blank helps prevent documentation from inadvertently replacing a control name with an actual control code. Unicode also uses <not a character> for <noncharacter>.
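
The code point labels described in the last note can be sketched in a few lines. This is an illustration only, assuming Python's unicodedata module; the helper names is_noncharacter and code_point_label are hypothetical. Whether an unassigned code point counts as reserved depends on the Unicode version bundled with the interpreter (see unicodedata.unidata_version).

```python
import unicodedata


def is_noncharacter(cp):
    """True for the 66 noncharacters: U+FDD0..U+FDEF plus the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def code_point_label(cp):
    """Return a label such as <control-0088> for a nameless code point, else None."""
    gc = unicodedata.category(chr(cp))
    if gc == "Cc":
        kind = "control"
    elif gc == "Cs":
        kind = "surrogate"
    elif gc == "Co":
        kind = "private-use"
    elif gc == "Cn":
        # Cn covers both noncharacters and reserved (unassigned) code points.
        kind = "noncharacter" if is_noncharacter(cp) else "reserved"
    else:
        return None  # Code point belongs to a category other than Cc, Cs, Co, or Cn.
    return f"<{kind}-{cp:04X}>"


print(code_point_label(0x0088))  # <control-0088>
print(code_point_label(0xD800))  # <surrogate-D800>
print(code_point_label(0xFFFF))  # <noncharacter-FFFF>
```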

References

These references will appear in the article, but this list appears only on this page.
  1. "Characters by Unicode General Category". 2011. Retrieved 2012-01-25.
Template documentation

General Category is a Unicode character property, defined in Chapter 4 of the Unicode Standard: "Character Properties".
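
As a rough illustration of how the property is queried in practice, the sketch below assumes Python's unicodedata module. The MAJOR_CLASSES mapping and the describe helper are made up for this example; the first letter of each two-letter value gives the major class. The last column shows the decimal digit value, which is present only for Nd characters.

```python
import unicodedata

# Major class names keyed by the first letter of the two-letter value.
MAJOR_CLASSES = {
    "L": "Letter", "M": "Mark", "N": "Number", "P": "Punctuation",
    "S": "Symbol", "Z": "Separator", "C": "Other",
}


def describe(ch):
    """Return (general category, major class, decimal value or None) for one character."""
    gc = unicodedata.category(ch)
    return gc, MAJOR_CLASSES[gc[0]], unicodedata.decimal(ch, None)


# Samples: A, a, titlecase digraph, underscore, space, TAB, line separator,
# ASCII digit, superscript two, Roman numeral eight.
samples = ["A", "a", "\u01C5", "_", " ", "\t", "\u2028", "5", "\u00B2", "\u2167"]
for ch in samples:
    print(f"U+{ord(ch):04X}", *describe(ch))
```

Superscript two (No) and Roman numeral eight (Nl) have numeric values but no decimal digit value, which matches the remark that only Nd characters have Numeric Type = De.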

Usage

{{General Category (Unicode)|state=}}
  • The state parameter can be set to any of the wikitable collapsing classes. The default is collapsed.

See also