Talk:Unicode font

WikiProject Typography (Rated C-class, High-importance)
This article is within the scope of WikiProject Typography, a collaborative effort to improve the coverage of articles related to Typography on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the quality scale.
This article has been rated as High-importance on the importance scale.
WikiProject Computing (Rated C-class, High-importance)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the project's quality scale.
This article has been rated as High-importance on the project's importance scale.

Other Unicode Font

"Wide range" is subjective, but I expect it would apply to fonts that aim to cover most of Unicode, like Code2000. I don't think Junicode, Charis SIL, Doulos SIL, and Gentium belong here. They only aim to cover a small subset of Unicode: the Latin alphabet, and possibly Greek or Cyrillic.

Also, AFAIK Gentium and Arial Unicode MS are no longer being updated. --Ptcamn 09:26, 23 May 2006 (UTC)

But those at least have much wider coverage than other typefaces. Let's add a few columns next to each typeface, with info on which ranges they cover and similar related info. I'll change the table now. Thanks. ~ Tarikash.
If more users also find that those fonts are not eligible to be included here, then after consensus, let's take them out. Thanks. ~ Tarikash.
Basic Arial (1419) and Times New Roman (1419) have only slightly fewer characters than Gentium (1469) and Junicode (1433). Tahoma (1912) has more. Microsoft Sans Serif (2301) is just short of having twice as many. Should they be here too? --Ptcamn 00:39, 25 May 2006 (UTC)
Let's add ... Chrysanthi Unicode (4818 chars v3.1), Microsoft Sans Serif (2301 chars v1.41), Lucida Grande (2244 glyphs v5.0d8e1), Tahoma (1912 chars v3.14), Times New Roman (1419 chars v3.00), Arial (1419 chars v3.00). Thanks. ~ Tarikash


I have to question the definition of "Unicode font", which doesn't match my experience. The implication is that a Unicode font is one that aims to provide a lot of the Unicode characters. But I have seen this used simply to mean any font that provides Unicode information - for instance, TrueType fonts may or may not include a map to/from Unicode for the characters in the font. In this sense only symbol fonts containing characters outside Unicode are not Unicode fonts. This page [1] appears to use Unicode font in this way, and uses the term "Large font" for what this article is describing. Notinasnaid 03:53, 25 May 2006 (UTC)

Speaking as a professional in the font industry, I agree. In fact, I think the very term "Unicode font" is nonsensical in this usage. But it is a popular term, even if seemingly meaningless, so what to do? To me, a Unicode font is a font with an internal Unicode encoding. Pretty much every OpenType and TrueType font developed in the past decade is a Unicode font. Otherwise you get silliness like a Latin font is considered a "Unicode font" because it supports all of 600 characters, but then what do you do about the fact that even a really basic Chinese font supports thousands of characters? Are all Chinese fonts then Unicode fonts for that reason? Or is (as seems to be the case) the bar arbitrary and different for different writing systems? Thomas Phinney (talk) 01:34, 29 October 2010 (UTC)
I'm not a font designer, so my idea is this: a 'Unicode font' is one that has at least one 'Unicode table' matching, completely or partly, one or more Unicode blocks specified by one of the 'Standards' released by the Unicode Consortium, and that can output the correct glyph for each included character when a Unicode code point is supplied to it. Some CJK fonts already had very large character sets even before 1993, even before Lucida Sans Unicode, but they used somewhat different encodings, not Unicode, so they are not Unicode fonts. Later, parts of their encoding techniques were used and modified to form the early Unicode fonts. Once those CJK fonts included a Unicode-compliant encoding/table, they became Unicode fonts. --Tarikash 06:43, 29 October 2010 (UTC).

See also [2] (PDF), which talks about "Unicode Font Design" in connection with Lucida Sans Unicode, with 1700 glyphs. Notinasnaid 04:12, 25 May 2006 (UTC)

So, let's be specific, and encyclopedic about this: what is the source of the term as it is used in this article? Notinasnaid 04:15, 25 May 2006 (UTC)

Hi, to distinguish the real Unicode fonts from partially conforming ones, using a standard or other point of view or definition, a paragraph explaining this can be added in a new section, "Definition". If the explanation becomes very lengthy, then a separate page, for example "Unicode font definition", can be created, and a short summary paragraph can be placed here along with a link to the main article. Please feel free to update, change, or remove any part to make corrections you can justify, as long as we stay focused on "Unicode fonts". ~Tarikash 06:04, 25 May 2006 (UTC)

Reading the current article, the definition is so vague as to be nearly useless. Somebody could go in and add every single serious Japanese font Adobe and everybody else makes, because they cover Japanese, Latin, Greek and Cyrillic, and have 8,000–25,000 characters, which is many more than the 3000–4000 that some of the listed fonts have. Thomas Phinney (talk) 20:05, 6 May 2013 (UTC)

Vertical Font Name in the Table (Gif Picture)

The vertical font names displayed in the table are actually GIF picture files. The process for creating such a picture can be found in the description on the "Arial Unicode MS" image file's page, Image:Arial Unicode MS uf vt.gif. Thanks. ~ Tarikash

Using an SVG image would be better IMHO: it looks better, is editable, and can be rendered directly in modern browsers. --Hhielscher 06:12, 31 May 2006 (UTC)
As of today, SVG is not yet natively supported by web browsers. Support will catch up. Other than GIF, PNG is another good solution. ~ Tarikash 09:00, 2 June 2006 (UTC)
There is no need for every web browser to support SVG as long as mediawiki does. See the examples in commons:Category:SVG. Your statement about web browsers is wrong as well, see Scalable_Vector_Graphics#Native Support.--Hhielscher 15:56, 2 June 2006 (UTC)
I meant MAJOR web browsers, for example, Internet Explorer. ~ Tarikash 18:56, 2 June 2006 (UTC).

Vertical Font Name List and SVG images

A template, {{Unicode Vertical Font Name List}}, is used to display the vertical names of BMP fonts, to reduce this page's edit size. If you add or update the data column for a specific font in the BMP, then you must also edit this template, Template:Unicode Vertical Font Name List. Different templates are now used for the different Unicode planes (SMP, SIP, SSP, PUA-A, PUA-B), so go to the appropriate template when updating. After updating, you must also immediately change the value of the 'colspan' parameter in the main article to the count of fonts + 1 (for the block name), so that the HTML table displays correctly. Also update the font name list at the beginning of each Unicode plane in the main article, so others know the exact order in which each font's data is kept; that makes it easier for another user to update or check. --Tarikash (talk).

A few users decided to replace all the GIFs with SVGs, but they did so without checking whether each font name displays correctly and completely! I found that many of them were not displayed correctly, and I've fixed those now. Please don't make that mistake again: create the SVG file, and check it first on your own computer through a test HTML file. Place a border around the image like this <img src="New_Font_Name_uf_vt.svg" style="border: 1px solid black;"> in that test HTML file. Then load and view the test file in a web browser to see whether it appears correctly. If you need help with SVG or the template, ask for help. --Tarikash (talk) 10:43, 24 October 2010 (UTC).
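The test page described above can be generated with a few lines of script. This is just a sketch of one way to do it; the function name build_test_page is made up for illustration, and the only markup taken from the comment is the bordered img tag.

```python
from html import escape


def build_test_page(svg_filename: str) -> str:
    """Return a minimal HTML page that shows the SVG with a visible border."""
    # The border makes any cropping at the image edges easy to spot.
    img = (
        f'<img src="{escape(svg_filename, quote=True)}" '
        'style="border: 1px solid black;">'
    )
    return f"<!DOCTYPE html>\n<html>\n<body>\n{img}\n</body>\n</html>\n"
```

Saving the returned string as an .html file in the same folder as the SVG and opening it in a browser gives the check described above: if the font name touches or crosses the border, the image is cropped too tightly.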

Ticks, X's, and numbers

These abound in the charts... what do they mean? Evertype 11:43, 6 June 2006 (UTC)

Sorry for responding late in this paragraph, but I added the legend right after you pointed that out. ~ Tarikash 03:41, 6 July 2006 (UTC).
To display the tick mark (✓), please use the template code {{U2713}} (which will be replaced with a tick mark). For the X mark, simply use the capital X or {{X mark-n}}. Thanks. ~ Tarikash 07:09, 30 July 2006 (UTC).
{{subst:U2713}} should be used: I don't see a reason for transclusion, and it puts unnecessary strain on the server. dab () 09:25, 31 July 2006 (UTC)
I suggested using it that way so that the total byte count of the article page stays smaller. subst (WP:SUB) replaces the template code {{U2713}} with the actual template content, which contains the wiki image tag [[Image:...]] that places the picture character; this image tag uses more bytes than the template code itself. But you're right that calling the template many times will burden the server a bit more. Still, one of the main tasks of the wiki servers is to process templates and wiki markup with cache optimization for repeated use, which is very common because all Wikipedia pages are full of wiki markup. Besides, most tick marks have already been replaced by the actual character count for that Unicode block. ~Tarikash 10:21, 31 July 2006 (UTC).

Article name renamed from "Unicode fonts" to "Unicode typefaces"

The word "typefaces" or "typeface" is common in typography, but not commonly used in other areas. Most people don't even know what a "typeface" is. I understand that the renaming was consistent with other category names, etc. But the article name could still have been kept as "Unicode fonts" so that this topic is easy to locate. Redirects should help ordinary visitors reach "Unicode typefaces" even if they link to or search for "Unicode fonts", so it's OK. ~ Tarikash 04:03, 6 July 2006 (UTC).

Two of this article's four categories use "typefaces." The other two don't use "typefaces" or "fonts." Most other related articles use the term "typefaces." The naming consensus is clear, so I renamed the article. --Davidstrauss 05:48, 13 July 2006 (UTC)
If I were to take this step, I would definitely have mentioned my reason(s) on this talk page first, before making the change. I thought that was the Wikipedia way. But anyway, it's OK. ~ Tarikash 07:40, 13 July 2006 (UTC).
First, the Wikipedia way is "be bold," not assume the change will be disputed. Second, I gave my reason in the edit summary: "moved Unicode fonts to Unicode typefaces: Make name consistent with categories and other typeface pages." --Davidstrauss 16:38, 23 July 2006 (UTC)
Also, this article isn't primarily about typefaces; it's about fonts, i.e. those things on a computer that contain a selection of characters which may or may not be in the same typeface. Plugwash 11:15, 11 August 2006 (UTC)
Speaking as a professional typographer, I assert it was a poor change. There is a difference between the meaning of the words "font" and "typeface," and the difference is at least moderately relevant here. The assertion that "typeface" is a more common term in my profession is only relevant if the two words are synonyms. If not, that argument is equivalent to changing "beige" to "white" in an article on painting, on the grounds that "white" is a more commonly used term. If I knew how to change it back (and add any needed redirects from the temporary poor title), I would. Thomas Phinney (talk) 16:48, 26 June 2010 (UTC)

Sylfaen font

Sylfaen is also a Unicode font that is distributed with Windows and contains characters for Armenian, Georgian and other alphabets.

How is the data recorded?

What programs are used to get the numbers? What program gives accurate data about the kerning pairs included? --Hhielscher 13:28, 8 November 2006 (UTC)

"Unicode typefaces" is a generic term, and doesn't define any particular font format. You'd need to find the font file format and use a tool designed for that format. Notinasnaid 13:38, 8 November 2006 (UTC)
There are various software tools for obtaining data from fonts. For some fonts I've used FontForge. There is a MinGW port of FontForge (I found that it works a little better at the moment on XP). --Tarikash 10:25, 18 October 2010 (UTC)
Use "FontForge" to find the total number of glyphs (not "characters"), revision number and info, date, included characters/glyphs in each range (the ranges are shown sorted by hexadecimal code point, the way the main article does it), license, font family name, serif style, PFM family, weight, width, etc. from any TTF, TTC, OTF, BDF, etc. font. If you have installed the font on your Windows system, then use "BabelMap" (BabelMap \ Fonts \ Font Analysis Utility \ select-font \ Font Info) to find the total number of characters (not "glyphs"), format, license, included characters in each range (ordered alphabetically by range name), font family, etc. You may use "FontExpert" to find a TTF, TTC, etc. font's weight, width, style, PANOSE serif style, tables, version, etc. From the Microsoft download site you can get "Font Properties Extension", which shows the total number of glyphs, total standard kerning pairs, hinting/smoothing info, OpenType layout tables, supported code pages, version, etc. --Tarikash 04:52, 23 October 2010 (UTC)
The latest release of BabelMap (version now includes both the total character count and the glyph count in the Font Info dialog. BabelStone (talk) 21:04, 27 October 2010 (UTC)
If BabelMap showed the hex range for each block, or sorted by hex range, then regular users could more easily find the exact section in the main article to verify or update; as it is, they have to use the web browser's "find" / "search" feature to go to the exact section for each block. --Tarikash 22:40, 27 October 2010 (UTC).
Ah, yes, I see what you mean. I could add an option to sort by hex range in the next version; but actually if you press the "Copy" button and paste the table into a text editor the blocks are ordered by range value not alphabetically, which is convenient for what you want. BabelStone (talk) 00:17, 28 October 2010 (UTC)
Wow, I didn't know that the copied data was at least sorted by range. Thanks. This should help others verify or update a font column. I entered the character count from FontForge for each block, for each font, one by one, manually. An option to order by hex range would be great for both sides: those who prefer ordering by hex range, and those who prefer an alphabetically ordered list. --Tarikash 09:51, 28 October 2010 (UTC).
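For anyone who wants to cross-check a font column by hand, the per-block tally sorted by hex range can be sketched as below. This is not how FontForge or BabelMap work internally; it is only an illustration. The function name count_by_block and the sample code points are made up, and the block table is deliberately limited to four blocks whose ranges come from the Unicode block list.

```python
# A small excerpt of Unicode block ranges; a real check would need the full list.
BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Greek and Coptic": (0x0370, 0x03FF),
    "Cyrillic": (0x0400, 0x04FF),
    "Currency Symbols": (0x20A0, 0x20CF),
}


def count_by_block(code_points, blocks=BLOCKS):
    """Count distinct code points per block, ordered by the block's start value."""
    counts = {name: 0 for name in blocks}
    for cp in set(code_points):  # a font maps each code point at most once
        for name, (start, end) in blocks.items():
            if start <= cp <= end:
                counts[name] += 1
                break  # blocks do not overlap, so stop at the first match
    # Tuples sort by their first element, i.e. by the hex range start.
    return sorted(
        (start, end, name, counts[name]) for name, (start, end) in blocks.items()
    )


def format_row(row):
    """Render one row in a 'range name: count' style."""
    start, end, name, count = row
    return f"{start:04X}-{end:04X} {name}: {count}"
```

Sorting the same rows by name instead (for example with key=lambda row: row[2]) gives the alphabetical order that BabelMap's dialog uses, so both views come from one tally.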
Hi, this is mostly very good and useful info. However, I noticed one aspect that is incorrect. Under CJK Punctuation, Hiragana, Katakana for GNU FreeSerif, it reports that there are characters in those ranges. All these characters were deleted years ago, and the version of the font cited certainly does not have them. (Possibly an old version of the font is hanging around in the system, or a font cache is not cleared?) -- Stevan White (talk) 09:12, 11 August 2011 (UTC)

Which typefaces to include?

The current collection of typefaces appears to be rather random and includes fonts which are hardly "Unicode" (Arial, Times, etc.) I'll list the fonts currently included on the page, with comments, and make my own suggestions for additions:

  • Arial - Keep as an example of an average Microsoft font, but update the statistics according to the newest version.
  • Arial Unicode MS - Definitely keep, no question.
  • Bitstream Cyberbit - Can also be kept.
  • Cardo - I'd keep this too (but see below)
  • Caslon Roman - Can also be kept.
  • Code2000 - obvious keep.
  • Charis SIL - Since this font is identical in coverage to Doulos, which is already listed, I'd strike it out.
  • Chryſanþi Unicode (Chrysanthi Unicode) - Can be kept.
  • ClearlyU - Never heard of this font, but it seems reasonable to keep.
  • DejaVu Sans - obvious keep.
  • Doulos SIL - can be kept.
  • Everson Mono Unicode - One of the rare monospace Unicode fonts. Keep.
  • FreeSerif - Keep too.
  • Gentium Regular - Actually I'd strike this out. This contains just Latin and Greek and nothing else. There are lots of better fonts to list instead.
  • GNU Unifont - The numbers alone make me want to keep this.
  • Junicode - Can be kept.
  • Linux Libertine - Meh, keep this.
  • Lucida Grande - As an example of an average Mac OS X font, I'd keep it.
  • Lucida Sans Unicode - By today's standards hardly Unicode. But I'd still keep it because of its popularity.
  • Microsoft Sans Serif - Strike out, hardly a Unicode font and inferior in coverage to Arial.
  • New Gulim - Can be kept.
  • Tahoma - Strike out, hardly a Unicode font and inferior in coverage to Arial.
  • Times New Roman - Strike out, hardly a Unicode font and identical in coverage to Arial.
  • TITUS Cyberbit Basic - Keep.
  • Y.OzFontN - Keep as an example of a Japanese font (other Japanese fonts have nearly the same Unicode coverage because of the JIS encodings)

Also, I'd like to have these fonts added if possible:

  • Sun-ExtA - Created by various Chinese universities, this font covers 102 blocks with 50,112 characters.
  • MPH 2B Damase - I wonder why this hasn't been added already. It's one of the few PD fonts and covers mainly extremely rare blocks. 58 blocks with 2,743 characters.
  • Quivira - This is another nice font with rare blocks. 62 blocks with 6,380 characters.

Also, I'd prefer a separate section for SMP fonts, since most SMP-specialized fonts contain few BMP characters. -- Prince Kassad (talk) 09:47, 21 December 2007 (UTC)

Sun-ExtA info & details added. Sun-ExtB info added. Thanks. --Tarikash 12:27, 23 October 2010 (UTC)
I've created separate sections for SMP, SIP, SSP, PUA-A, and PUA-B plane fonts as well. So now each plane can have a different set of fonts. Thanks. --Tarikash 08:37, 24 October 2010 (UTC)

Han Nom A&B are two fonts that between them cover all of the Unified CJK extensions (as coverage of Ext. B is quite lacking in most fonts due to the fact that the majority of the characters were only used in Vietnam) 加持 (talk) 18:42, 13 January 2008 (UTC)

Han Nom A & Han Nom B info added. Thanks. --Tarikash 12:27, 23 October 2010 (UTC)

Charis is a better font than Doulos typographically, so as long as they cover the same glyphs (which I believe they do), I vote to keep Charis instead. kwami (talk) 21:03, 13 January 2008 (UTC)

Suggest adding GNU FreeFont too, again just for the numbers Ecobun (talk) 16:09, 19 August 2010 (UTC)

I've re-verified the info for the one GNU FreeFont (aka Free UCS Outline Fonts) font that is included: FreeSerif. --Tarikash 12:27, 23 October 2010 (UTC)

Suggest adding HanaMin A & B or HanaZono fonts which cover all of CJK Rtega: —Preceding undated comment added 17:14, 22 February 2012 (UTC).

Cardo contains placeholders

Encouraged by the fact that Cardo font contains full Gothic range I downloaded it (version 0.98) and took a look. There are all characters of this range, but all of them are actually placeholders, in form of letter ahsa in a box. I'm quite sure that they should be listed as 0 (or maybe 1) glyphs in the table.

Similarly the runic range is composed mostly of similar placeholders, this time of the form of rune F in an identical box.

Uzyel (talk) 19:06, 1 July 2008 (UTC)

Numbers in chart need verification...

The numbers in the "characters per code block" chart are problematic. The column for Bitstream Cyberbit seems particularly bad; the number is often based on the size of the range for that block, rather than the actual number of characters in the font or even the number of assigned characters in Unicode:

  • The chart claims Bitstream Cyberbit has 256 characters for Miscellaneous Technical. As of Unicode 5.1, only 232 code points have been assigned in that block.
  • BC supposedly has 112 characters in General Punctuation, but there are only 107 assigned code points.
  • BC supposedly has 48 characters in Bopomofo out of 41 assigned.
  • BC and ClearlyU are both listed with 2,350 precomposed Hangul syllables. (Interestingly, they aren't pink.) This isn't just larger than the number of assigned code points in Unicode; it's larger than the number of possible code points between AC00 and D7AF. If this is in any way correct (do they have extra precomposed syllables in the PUA or something?), I'd be interested to know how.
  • Three fonts supposedly include 128 Basic Latin characters. Including the control characters?

Anyway, this is a potentially really useful chart (and not bad as it is for getting the gist of a font's coverage); some of it just needs to be verified. If I find some time, I may look into that. Chris Johnson (talk) 05:13, 4 August 2008 (UTC)
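The suspicion that some chart numbers are block range sizes, not character counts, is easy to test. The sketch below uses Python's standard unicodedata module; the helper names are made up for illustration, and the assigned counts it reports depend on the Unicode version bundled with the Python interpreter, so they will usually be higher than the Unicode 5.1 figures quoted above.

```python
import unicodedata

# Block ranges for three of the chart rows questioned above.
CHART_BLOCKS = {
    "Miscellaneous Technical": (0x2300, 0x23FF),
    "General Punctuation": (0x2000, 0x206F),
    "Bopomofo": (0x3100, 0x312F),
}


def block_size(start, end):
    """Number of code points in the range, assigned or not."""
    return end - start + 1


def assigned_count(start, end):
    """Number of code points assigned in this interpreter's Unicode version."""
    # General category "Cn" marks unassigned code points.
    return sum(
        1
        for cp in range(start, end + 1)
        if unicodedata.category(chr(cp)) != "Cn"
    )


def report(blocks=CHART_BLOCKS):
    """Return (name, range size, assigned count) for each block."""
    return [
        (name, block_size(start, end), assigned_count(start, end))
        for name, (start, end) in blocks.items()
    ]
```

The range sizes come out as 256, 112 and 48, which are exactly the figures the chart gives for Bitstream Cyberbit in those three rows. That is consistent with the numbers having been taken from the block ranges.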

It took me a few days to re-verify and update the data for all existing fonts, and to add data for more fonts, using the latest font files currently available. I've also updated each row with the maximum assigned code point count, next to its hex range, and updated each table cell with the colorizing templates (for example, when all glyphs exist for that cell). It is now up to date as of 2010-10-23. Some people did vandalize the data, since many of these values are not immediately known to, or verifiable by, wiki admins or users; that's why the maximum possible character count is now included for each block. Only someone who is updating a font version should change that font's column data, and they must mention the font version as well and also change all the relevant sections for that font. Thanks. --Tarikash 12:49, 23 October 2010 (UTC)


This page has a more general definition of "family" (i.e., 'serif', 'sans-serif', 'monospace') which I find to be more useful as a reference. I was wondering if the table in this article might be changed to be more like this. SharkD (talk) 18:26, 16 September 2008 (UTC)

It's just wrong. That's why nothing in Wikipedia uses that definition of "family." That level of terms could be referred to as "classifications." Thomas Phinney (talk) 02:06, 25 August 2010 (UTC)

New Unicode fonts added

I first updated the status of GNU Unifont, which is now maintained by Paul Hardy. He incorporated all the 12pt Chinese bitmaps from WenQuanYi Bitmap Song, and hand-drew all other missing characters. GNU Unifont now has complete coverage of the BMP. The updated font has now been pushed to Debian and other Linux distributions.

I also included two other Unicode fonts developed at by myself and my team (for 4 years by now). These fonts, Zen Hei and Bitmap Song, have been set as the default desktop fonts for Chinese locales since Ubuntu 8.04, Fedora 8, and Slackware 12.1, among many other Linux distros. So I felt it was proper to add these fonts to the list, not only from the perspective of code-point coverage, but also because of their popularity.

Please let me know if there is any issue with this update that I am not aware of. —Preceding unsigned comment added by FangQ (talkcontribs) 01:08, 23 November 2008 (UTC)

Even though you are the creator of the WenQuanYi series of fonts and added them here yourself, you have stated existing facts, so I see no problem there. Besides, I've verified all the related data for 'WenQuanYi Zen Hei', its availability in Linux distributions, etc., and to the best of my knowledge that info is correct. It is now included in the block-by-block details column of fonts, with the info verified or corrected where necessary. Thanks for your help. --Tarikash 13:01, 23 October 2010 (UTC)

Minor error noted

In the charts that show details of what each font includes, near the end, "Byzantine Musical Symbols" appears twice, once as part of a corrupted table row. I didn't want to correct the table, because I'm not confident that I wouldn't risk corrupting it by trying to fix it. Regards, Nikevich (talk) 09:03, 15 April 2009 (UTC)

han unification, traditional and simplified chinese

Unicode point U+9AA8 (骨) is typographically different between simplified Chinese and traditional Chinese.

I'm a little confused by this sentence. The article on han unification states

Chinese users seem to have fewer objections to Han unification, largely because Unicode did not attempt to unify Simplified Chinese characters (an invention of the People's Republic of China, and in use among Chinese speakers in the PRC, Singapore, and Malaysia), with Traditional Chinese characters

so, which is true? —Preceding unsigned comment added by (talk) 22:57, 2 July 2009 (UTC)

some rasterized SVGs in the Vertical Font Name List need to be re-rasterized

There are several examples of too-tightly cropped rasterized images in the Vertical Font Name List. See File:Chrysanthi_Unicode_uf_vt.svg, File:Lucida_Sans_Unicode_uf_vt.svg, and File:TITUS_Cyberbit_Basic_uf_vt.svg. Notice they get cut off at the top. Here is what Lucida Sans Unicode should look like: rasterized at by Apache Batik. —Preceding unsigned comment added by Earthsound (talkcontribs) 18:26, 4 December 2009 (UTC)

I think this is a manifestation of bug 3769. Earthsound (talk) 20:16, 4 December 2009 (UTC)
Many of the svg images with vertical font names, were missing the last few characters, so I've fixed and verified those now. Thanks. --Tarikash 13:06, 23 October 2010 (UTC)

New Fonts

I would suggest adding the following fonts (implementing Unicode 5.1):

  • New Athena Unicode [3], 1627 Chars with outstanding coverage of Polytonic Greek and Coptic
  • RomanCyrillic Std [4], 3165 Chars
  • Quivira [5], 7155 Chars
  • Sacco-Vanzetti [6], 2269 Chars

--Bubuka (talk) 12:52, 19 December 2009 (UTC)

What the heck does it mean to claim these "implement Unicode 5.1"? I also don't see anything that makes most of these exceptional compared to dozens of other fonts that are also not on the list. Of course, the guidelines for the whole article as to what constitutes a "Unicode font" are so vague as to be nearly meaningless anyway... sigh. Thomas Phinney (talk) 00:10, 21 December 2009 (UTC)
A Unicode font is any one that assigns its characters to Unicode values. A notable category not too long ago, but less and less notable as Unicode comes to dominate the market. kwami (talk) 02:12, 21 December 2009 (UTC)
  • These fonts added some of the signs introduced in Unicode 5.1. Such fonts as TITUS Cyberbit, Cardo, Gentium haven't been updated since long ago, so they have none of the signs introduced in Unicode 5.1.
That's my definition of "Unicode font" as well. Unicode fonts have been commonplace for at least a decade now, and virtually all new fonts made today are "Unicode fonts" in this sense. The fundamental problem with this article is that it is taking a folk concept and using it to take over terminology which is used by experts in the field with an entirely different meaning. I understand the interest in the subject matter of the article, but I wish it could be re-titled to something so that it would no longer seem absurd to professionals in the field. Thomas Phinney (talk) 02:10, 25 August 2010 (UTC)
As far as the guidelines for the whole article are concerned, I think it should be noted that only fonts with unique Unicode coverage are listed here. I. e. universal fonts supporting rare or ancient scripts, such as Coptic, Ancient Greek, Armenian, phonetic transcription, Asian Languages etc. --Bubuka (talk) 20:20, 21 December 2009 (UTC)

Impossible to create font

I'm not quite clear what the following text is supposed to mean:

In fact, it would be impossible to create such a font in any common font format, as Unicode includes over 100,000 characters, while no widely-used font format supports more than 65,535 glyphs. So while one could make a set of related fonts to cover all of Unicode, a single Unicode font is not possible at this time.

This text starts out to assert impossibility, as if it were an absolute fact, but then weasels out with the vague "at this time". Such a statement could make sense if the distinction between fonts were clearly defined, but since there are fonts from different blocks that look very similar, that seems only a semantic finesse to me. A font marketing company that offers e.g. nonserif fonts for a number of blocks could easily just give them one name across the blocks and call them one font. Or am I missing something? — Sebastian 18:58, 19 June 2010 (UTC)

In any common font format, it is impossible to create such a font. In the future, it's possible a font format could be made that doesn't have the limitations current ones have. The distinction between fonts is clearly defined; each font is contained in a ttf or otf file, or the like for other formats, and has a unique name, and if I'm not mistaken, Windows, Mac OS X and X11 will all be less than functional about a group of font files all with the same font name.--Prosfilaes (talk) 00:33, 20 June 2010 (UTC)
Hello Prosfilaes, nice to meet you again! I'm getting it now that the problem is the font format, not the fonts. If we had a font format that supported > 2**16 glyphs, then a font provider could pack their fonts together into one big font file, correct? So, the vagueness in the article seems to be caused by the wording "no widely-used font format". It (somewhat weasely) leaves open if there are any less widely used formats that don't have that limitation. Let's assume there's one. Then presumably installing such a font would be a hassle for users, and any font provider that used such a format would have a problem selling it. Correct? — Sebastian 05:40, 20 June 2010 (UTC)
The weasely wording is appropriate. But it wouldn't be "a hassle" to install it—either some of the characters would be inaccessible, or it would be rejected/ignored by the operating system or by most apps, depending on how breaking the 64K barrier was implemented. That said, there's already a solution that has been developed, which may or may not achieve wide adoption: the ISO/IEC 14496-28 "Composite Font Representation" standard overcomes these limitations via an XML-based representation that allows linking of existing font files into a single virtual font called a "Composite Font." There have been several earlier composite font formats and representations, but this one is backed by more players, and might get broader support in the long run. Thomas Phinney (talk) 22:27, 9 February 2015 (UTC)
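The arithmetic behind "a set of related fonts" can be written out. This is a simplified sketch: the 65,535 limit is the figure quoted from the article, and the function assumes one glyph per character, which real fonts do not follow (ligatures and variant forms add glyphs, and some characters can share one). The name min_fonts_needed is made up for illustration.

```python
MAX_GLYPHS_PER_FONT = 65_535  # limit quoted in the article text above


def min_fonts_needed(glyph_count: int, limit: int = MAX_GLYPHS_PER_FONT) -> int:
    """Smallest number of font files needed to hold glyph_count glyphs."""
    if glyph_count < 0:
        raise ValueError("glyph_count must be non-negative")
    # Ceiling division using integers only, to avoid floating-point rounding.
    return -(-glyph_count // limit)
```

With "over 100,000 characters", min_fonts_needed(100_000) gives 2, so at least two font files are needed under these assumptions. A composite format of the kind mentioned above would link such files into one virtual font and leave the per-file limit in place.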

Country Flag or Code

We need to include a tiny country flag or a 3-letter country code next to each block, or in the font list, to show which country currently uses that language or where it originated. --Tarikash 07:02, 29 October 2010 (UTC).

Thanks to user 'closedmouth', who pointed out this template, Template:Flagicon, which can be used to show a tiny country flag like this: {{flagicon|country_name}}. The 'country_name' has to be given either as listed on this page (long country name) or as on this page (3-letter short code). I would prefer to use the 3-letter short code to keep this article's size smaller. --Tarikash 07:28, 29 October 2010 (UTC).
I think that is a really bad idea, and would be almost impossible to implement sensibly. It would be difficult enough to do this at the script level (how many 100s of countries and 1000s of languages use the Latin script?, and what countries "own" historic scripts that were used in regions that do not correspond to modern political entities?), and quite impractical at the block level. Blocks are collections of characters which may or may not be used for the same script and/or language in one or many countries, and it is (in my opinion) pointless to try to artificially assign them to individual countries. Who does Basic Latin belong to (long, long list of countries that use A-Z and/or 0-9); who does Braille belong to; who does Arabic belong to; who does Runic belong to; who does Gothic belong to; who does General Punctuation belong to; who does Miscellaneous Symbols belong to ... ? BabelStone (talk) 11:18, 29 October 2010 (UTC)
Yes, some 'scripts' (aka languages) could have as many as 203 flags for 203 countries, and some no country flag at all, IF we show flags based on current 'usage'. Scripts/languages are connected with one or more particular localities on earth. So we should include flag(s) for the 'ORIGIN' country/place-name, and/or the country/place-name where it is currently 'DOMINANTLY-used' or the 'PRIMARY' language, and/or where it is the primary language of one or more states of a country, or one of the 'State Languages'. For example, the Chinese script/language is dominantly used in China and in Hong Kong, so a CJK block will have the flags of those 2 countries for the 'Chinese' language; even though Chinese is used in other countries, it is not a primary language of those other areas. So my suggestion is to create a new data column or row named 'Primary Language' or 'Primary Official Language'. --Tarikash 02:49, 30 October 2010 (UTC).
I agree that this is a bad idea. Use ISO 15924 script codes - or some set of icons indicating the scripts (writing systems) not ones indicating languages or countries. Chris Fynn (talk) 09:39, 25 May 2013 (UTC)

Gentium Plus

updating for Gentium Plus. However, they show a block of 32 currency symbols, which isn't even the whole block, whereas other fonts are maxed at 25. I may have messed s.t. up in that row. — kwami (talk) 23:31, 8 November 2010 (UTC)

Verified & corrected the data for 'Gentium Plus' v1.502 in all BMP blocks and in the SMP blocks. The 'Currency Symbols' block has a maximum of 25 characters defined, and this font has 22. I tested first on my own computer and then uploaded a new SVG image file, 'Gentium_Plus_uf_vt.svg', to Wikimedia. I have also updated the Font List Template of the BMP & SMP sections to show 'Gentium Plus'. I see that you have also removed the Hangul Jamo (main/combined) section, thanks. That was there for some older font versions, which showed all types of Hangul Jamo chars combined. But now most fonts show Hangul Jamo Choseong, Hangul Jamo Jungseong & Hangul Jamo Jongseong separately, also because software such as 'FontForge', BabelMap, etc. has been updated to show them separately. Thanks. --Tarikash (talk) 13:23, 9 November 2010 (UTC).
Thanks for verifying. There's another combined-count row which I'll also remove. Pls revert if there's some reason to keep. — kwami (talk) 17:14, 9 November 2010 (UTC)
Yes, 'Alphabetic Presentation Forms'. I would recommend keeping this topmost 'combined-count' block intact, as many fonts still identify all these characters under the same range/block. It would be more time-consuming to find how many total chars each sub-block has. --Tarikash (talk) 18:46, 9 November 2010 (UTC).
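For anyone re-checking per-block counts like the "22 of 25" figure above, the counting step can be sketched as follows. This is a minimal illustration, not how FontForge or BabelMap do it: the font's code points are passed in as a plain set (in practice they would come from the font's character map), and `count_in_block` and the sample data are hypothetical names and values. The Currency Symbols block range U+20A0..U+20CF is the only real constant used.

```python
CURRENCY_SYMBOLS = (0x20A0, 0x20CF)  # block range; only part of it is assigned


def count_in_block(font_code_points, block):
    """Return how many of the font's code points fall inside the block (inclusive)."""
    start, end = block
    return sum(1 for cp in set(font_code_points) if start <= cp <= end)


if __name__ == "__main__":
    # Made-up character map for illustration: 22 currency symbols plus two Latin letters.
    hypothetical_font = set(range(0x20A0, 0x20B6)) | {0x0041, 0x0042}
    assigned = 25  # figure quoted in the comment above, not computed here
    covered = count_in_block(hypothetical_font, CURRENCY_SYMBOLS)
    print(f"Currency Symbols: {covered} of {assigned}")
```

Counting against the block range rather than a combined row keeps each sub-block's total separate, at the cost of one call per block.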

Requested move

The following discussion is an archived discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. No further edits should be made to this section.

The result of the move request was: page moved. Vegaswikian (talk) 06:31, 16 November 2011 (UTC)

Unicode typeface → Unicode font – This article should be renamed from Unicode typeface back to Unicode font (it was moved from "font" to "typeface" in 2006 after some discussion). The reasons are:

  • The word "Unicode" establishes that this article is (only) about computer fonts, which are electronic data files in computer science. "Computer font" is commonly shortened to "font" and is not often called "typeface". When talking about the digital kind, we call it a "font"; we never say something like "Bitmap typeface", and the same applies to "Unicode typeface".
  • "Unicode typeface" is a very odd usage that mixes a computer-industry term with a term not often used in the computer industry (comparing "typeface" to "font"). If you search for "Unicode typeface" in Google, it shows no results other than pages from or copied from Wikipedia. Such an uncommon title definitely does not meet WP:COMMONNAME and may even violate WP:No original research. edited @ 03:59, 9 November 2011 (UTC)
  • Even if "typeface" is considered to include "computer font", a Wikipedia article is better named specifically. We have a lot of "font" articles: Fallback font, Fonts on Macintosh, List of CJK fonts. When an article is named "typeface", such as List of typefaces, it refers to "typeface" in the global sense, which includes "computer fonts" and other "typefaces" that may or may not yet be digitized.
  • I feel that "typeface" is usually (though not always) a traditional usage; it is hard to imagine that one "typeface" could be designed for both the Latin alphabet and Chinese characters. But if it is called a "(computer) font", it is far easier to understand.
  • In the content of the article, "Unicode typeface" is rarely used; "Unicode font" is used far more often. It is odd that the title is not used in the content.
  • In the discussion, User:Tphinney, a professional typographer who worked in the type group at Adobe Systems, also opposed the previous move after it was done and gave his suggestions.

Btw, if the move is done, Open-source Unicode typefaces should also move to Open-source Unicode font for the same reason. --Tomchen1989 (talk) 08:01, 8 November 2011 (UTC)

Comment. "Unicode" in the title (or in the meaning or concept) does not point to computer font (or typeface). Unicode is only the definition of a character, not its presentation or format. That definition includes a number, a name, etc. Unicode is explicitly not about a glyph, let alone a computer glyph. With this, the title "Unicode typeface" means that characters in a typeface are identifiable by Unicode, and so use the Unicode definition. So with this in mind, I think the arguments on this are wrong. (Btw, these days typefaces do exist, Unicode typefaces, that cover many, many scripts in design, e.g. Latin and Chinese.) I do not know about the subtle differences between typeface and font, so there can be other reasons in that to change the title. -DePiep (talk) 18:23, 8 November 2011 (UTC)
Reply: "Unicode" in the title points to "computer font", because "Unicode is a computing industry standard for the consistent encoding ..." (see the definition in the Unicode article). If this article really were not only about computer fonts, as you said (in fact it is), it should be titled something like "Typefaces which contain a wide range of characters" rather than "'Unicode' typeface". "Unicode typeface" is a very odd usage that mixes a computer-industry term with a term not often used in the computer industry. It may even violate WP:No original research. If you search for "Unicode typeface" in Google, it shows no results other than pages from or copied from Wikipedia. Such an uncommon title definitely does not meet WP:COMMONNAME. --Tomchen1989 (talk) 03:59, 9 November 2011 (UTC)
Re: let me quote from your quote: "encoding". Indeed, that is what Unicode is about. These typefaces are encoded in Unicode. That does not make that typeface (or font) a "computer font". In itself, Unicode is not about a font or typeface. So surely not a "computer font" whatever that may be. The rest of your reply does not refer to my point. (E.g., I did not argue that the title is OK). Your suggested descriptive title can be more specific: "Typefaces [or fonts] that use Unicode encoding for its characters". Though not the most clear one, the title fits this meaning.
All this points to the meaning (usage) of Unicode. Unicode is the encoding, not the typeface (or font) name specifier. Again, this is not my point here, it could be "font" or "typeface". -DePiep (talk) 21:21, 9 November 2011 (UTC)
No, it couldn't be either. As someone who has actually surveyed current industry use on the subject of the definitions of these two words within the last few years, I think I can say definitively that the overwhelming industry preference in this situation would be "font" and not "typeface." The two words are not interchangeable. A typeface is the abstract design, and does not have an encoding (though it can have a character set). Thomas Phinney (talk) 08:44, 12 November 2011 (UTC)
Wow, what a professor you are. But again: I do not argue between font or typeface (remember). I say: "Unicode" is not used the way you do. Since you do not reply to (or read) my talk, I'd say end of talk. -DePiep (talk) 23:04, 12 November 2011 (UTC)
You may not argue between font and typeface, but you misuse at least the latter term. Contrary to your ad hominem comment, I did indeed read your discussion. Contrary to what you say, "typefaces" are not "encoded"; fonts are. Which is exactly why Unicode is not related much to typefaces, but is closely connected to fonts. The digital files described in this article are computer fonts, not typefaces. Thomas Phinney (talk) 00:21, 15 November 2011 (UTC)
  • Oppose. As discussed above: the word 'Unicode' is misunderstood, and there is no final conclusion on 'font' vs 'typeface'. -DePiep (talk) 23:09, 12 November 2011 (UTC)
  • Support. As discussed above, there is a final conclusion on font vs typeface. Thomas Phinney (talk) 00:21, 15 November 2011 (UTC)
  • Support. Per Thomas Phinney, typefaces are abstract designs, whereas fonts are particular implementations of a typeface (thus you can have multiple fonts that implement the same typeface). This article is clearly about specific, individual fonts rather than abstract typefaces. Moreover "Unicode typeface" makes no sense as Unicode does not define or specify typefaces; whereas "Unicode font" does make sense as the term refers to fonts that implement mappings of glyphs to Unicode characters. BabelStone (talk) 23:26, 12 November 2011 (UTC)
The above discussion is preserved as an archive of a requested move. Please do not modify it. Subsequent comments should be made in a new section on this talk page. No further edits should be made to this section.

Google "Noto"

How about Google's "Noto" family? They use an Apache license. I can't tell how far along they are toward their huge goal of covering ALL of Unicode, but Google certainly has the resources for it. Here's the project page: -- (talk) 22:24, 20 April 2013 (UTC)

"Coverage" of Fonts

I find the tables in the section Comparison of fonts to be misleading. While fonts may contain a full set of glyphs to cover a particular Unicode range or script, in the case of complex scripts such as Arabic, Devanagari or Tibetan, simple coverage is insufficient for practical purposes, since these scripts are unreadable if the font does not include the additional component glyphs and lookup tables (e.g. OpenType tables) necessary to form the numerous contextual and combined glyph forms required to display and read languages written in these scripts properly. Several "Unicode fonts" contain all the basic glyphs for various scripts, but do not contain the additional glyphs and lookup tables necessary to render these scripts usefully. Where this is the case, it could be indicated in these tables by using a color other than green, perhaps amber.

Chris Fynn (talk) 09:20, 25 May 2013 (UTC)

Those people who use complex scripts such as those are probably aware of the issue, and will take the numbers with a grain of salt. If you don't use those complex scripts, you probably don't care. The tables are useful for a large majority of scripts. ⇔ ChristTrekker 18:29, 2 October 2013 (UTC)

standard for inclusion

Has a standard to have a font included on this page ever been specified? ⇔ ChristTrekker 18:31, 2 October 2013 (UTC)

1F000-1F6FF no data

The rows for these are under the heading "0000–1D7FF", which is wrong, and contain no cells. Do none of the listed fonts support them? If so, "N/A" cells should be added. -- (talk) 15:58, 13 July 2014 (UTC)