Talk:Combining character

The merger

Oppose Combining characters and dead keys have nothing to do with each other. One's a Unicode feature and the other's a keyboard feature. --/ɛvɪs/ /tɑːk/ /kɑntrɪbjuʃ(ə)nz/ 01:35, 17 November 2005 (UTC)

I agree: the concepts, while related, are quite distinct. A-giau 03:51, 29 January 2006 (UTC)
I agree too - no reason to merge pages. Can we get rid of the 'suggested merge' tag? Richard Donkin 08:59, 29 January 2006 (UTC)

stub

I don't really think this article is a stub...

Color coding?

Could someone please explain why the tables are broken down into different colored zones? Thanks. -- Fullstop 19:49, 2 September 2007 (UTC)

Text from dead key

With the advent of Unicode character encoding it is possible to combine any available diacritical mark with any other character. The “combining diacritical marks” can be found in the Unicode range U+0300–U+036F. For example, you can combine “ ̃” (U+0303 Combining Tilde) with “p” so you get “p̃”, whether this makes sense or not.

More exotically, you can combine “ ̐” (U+0310 Combining Candrabindu) with “∞” so you get “∞̐”.

In case this is useful for this article. —Random832 13:51, 20 September 2007 (UTC)
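If it helps anyone check the two examples above, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `combine` is made up for illustration; all it does is place the combining mark directly after the base character, which is how a combining sequence is stored.

```python
import unicodedata

COMBINING_TILDE = "\u0303"        # U+0303 COMBINING TILDE
COMBINING_CANDRABINDU = "\u0310"  # U+0310 COMBINING CANDRABINDU


def combine(base: str, mark: str) -> str:
    """Hypothetical helper: a combining mark simply follows its base character."""
    return base + mark


p_tilde = combine("p", COMBINING_TILDE)
inf_candrabindu = combine("\u221e", COMBINING_CANDRABINDU)  # U+221E INFINITY

for text in (p_tilde, inf_candrabindu):
    # Each result is one visible glyph but two code points.
    print(text, len(text), [unicodedata.name(ch) for ch in text])

# Both marks sit inside the Combining Diacritical Marks range U+0300-U+036F.
in_range = all(0x0300 <= ord(m) <= 0x036F
               for m in (COMBINING_TILDE, COMBINING_CANDRABINDU))
print(in_range)
```

Whether the result looks right on screen depends on the font and the rendering engine, not on the string itself.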

Purpose

What is the purpose of combining characters, given that all the combinations used by real languages are intended to be encoded as real characters anyway? This article should answer this question. — Timwi (talk) 22:19, 29 May 2011 (UTC)

Your question is based on a false premise. There are actually many languages which use combinations of diacritics and base characters that are not precomposed in Unicode. Furthermore, the Unicode Consortium has explicitly stated that they do not want to add more precomposed characters if there are combining characters already available that can be used to construct them. That means that precomposed characters are actually deprecated in a sense, and the Unicode Consortium certainly does not intend for all combinations in use to be encoded as precomposed characters, but rather quite the opposite. Most of the precomposed characters that they have defined are actually compatibility characters that only exist because they were already defined in some other character set standard, e.g. VISCII. You also do not seem to have considered what you mean by “real language”, given that there are probably more than 6000 out there, and there are many orthographies and writing systems that have yet to be accepted into Unicode. — 128.189.187.210 (talk) 18:49, 3 September 2011 (UTC)
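The difference between combinations that have a precomposed code point and those that do not can be seen with Unicode normalization. The following is only a sketch using Python's standard `unicodedata.normalize`; the choice of “é” and “p̃” as test cases is mine, not taken from any standard document.

```python
import unicodedata

# "e" + U+0301 COMBINING ACUTE ACCENT has a precomposed form, U+00E9.
e_acute = unicodedata.normalize("NFC", "e\u0301")

# "p" + U+0303 COMBINING TILDE has no precomposed form, so NFC leaves
# the two-code-point sequence as it is.
p_tilde = unicodedata.normalize("NFC", "p\u0303")

print(len(e_acute), hex(ord(e_acute)))  # one code point after composition
print(len(p_tilde))                     # still two code points

# NFD goes the other way and splits the precomposed character again.
decomposed = unicodedata.normalize("NFD", "\u00e9")
print([unicodedata.name(ch) for ch in decomposed])
```

So a sequence like “p̃” can only ever be represented with a combining character, while “é” has two canonically equivalent representations.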

Unicode Consortium must change their policy!

The Unicode Consortium should add precomposed characters to the code table for languages that really exist (rather than lots of mindless arrows, dingbats and emoticons, as they do now). Combining diacritical marks make text very difficult to edit and often look ugly. --Jugydmort (talk) 12:40, 24 September 2012 (UTC)

  • The problem, quite simply, is that there are so many different letters and variants. Just take the Latin alphabet: French, for example, has four different variations of e (è, é, ë, and ê), and some languages, like Icelandic, add several unique characters such as 'Þ'. Multiply this by dozens of languages, each with its own quirks, and you can begin to see the problem, and the Latin alphabet is the simplest of the many alphabets that Unicode has to include. Creating precomposed characters for all possible variants of all letters for all languages would take tens of thousands of different characters, a huge number of which would be found in only a handful of words in their parent language. And then you need unique codes for all of them: to fit that many into the value chart with any sort of organization, the Unicode values would need to look like international phone numbers... not to mention I would pity the poor soul trying to make a universal font for that system. Creating combining marks is quite simply the most efficient solution to a complicated problem: a simple four-character code for the base letter forms in an alphabet, plus a second four-character code to get a variant if needed. By using the combining mark method, even extremely complex scripts like Devanagari can be represented in a mere 128 characters (enough to fit on a standard keyboard, and a manageable number for font creators) but can still create the thousands of unique letters and letter variants of the language.--Scorpion451 rant 16:03, 29 January 2014 (UTC)
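As a rough illustration of the Devanagari point, here is a short Python sketch. It assumes that the “128 characters” refers to the main Devanagari block, U+0900–U+097F, which has 128 code point positions; the two sample syllables are my own choice.

```python
import unicodedata

# Main Devanagari block (assumed to be what "128 characters" refers to).
DEVANAGARI_START, DEVANAGARI_END = 0x0900, 0x097F
block_size = DEVANAGARI_END - DEVANAGARI_START + 1

KA = "\u0915"            # DEVANAGARI LETTER KA
VOWEL_SIGN_I = "\u093f"  # DEVANAGARI SIGN: dependent vowel i (a combining mark)
VIRAMA = "\u094d"        # DEVANAGARI SIGN VIRAMA (suppresses the inherent vowel)
SSA = "\u0937"           # DEVANAGARI LETTER SSA

ki = KA + VOWEL_SIGN_I    # the syllable "ki"
ksha = KA + VIRAMA + SSA  # the conjunct "ksha", usually drawn as one ligature

print(block_size)
for text in (ki, ksha):
    # Print each syllable with its code point count and general categories.
    print(text, len(text), [unicodedata.category(ch) for ch in text])
```

Neither syllable has a code point of its own; the font builds the visible shape from the sequence, which is why a small block can cover a very large set of written forms.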