Talk:UTF-9 and UTF-18

WikiProject Computing
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on the project's quality scale.
This article has not yet received a rating on the project's importance scale.

Untitled discussion

What does "standard communication protocols are built around octets rather than nonets" mean? Is this an assertion that there are no standard protocols for computers with 9-bit bytes, such as the PDP-10? —Preceding unsigned comment added by (talk) 11:09, 20 July 2007

A protocol that requires 9-bit bytes isn't likely to become a standard. 9-bit machines on the Internet use the same octet-based protocols as everyone else, generally with some adaptation. FTP, for instance, can operate in text mode, unpacking each character into its own octet, or in image mode, packing two 36-bit words (72 bits) into 9 octets. Yuubi 20:26, 11 October 2007 (UTC)
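
The image-mode arithmetic above (2 × 36 bits = 72 bits = 9 octets) can be sketched as follows. This is only an illustration: the function names are hypothetical, and the bit order (first word in the high-order bits, octets emitted most significant first) is an assumption, not something stated in the comment above.

```python
WORD_BITS = 36
WORD_MASK = (1 << WORD_BITS) - 1


def pack_words(first, second):
    """Pack two 36-bit words into 9 octets (assumed order: first word high)."""
    if not (0 <= first <= WORD_MASK and 0 <= second <= WORD_MASK):
        raise ValueError("each word must fit in 36 bits")
    combined = (first << WORD_BITS) | second  # 72-bit integer
    return combined.to_bytes(9, "big")


def unpack_words(octets):
    """Inverse of pack_words: recover two 36-bit words from 9 octets."""
    if len(octets) != 9:
        raise ValueError("expected exactly 9 octets")
    combined = int.from_bytes(octets, "big")
    return combined >> WORD_BITS, combined & WORD_MASK


if __name__ == "__main__":
    packed = pack_words(0o777777777777, 0)
    print(packed.hex())          # 9 octets, i.e. 18 hex digits
    print(unpack_words(packed))  # round-trips to the original two words
```

Text mode would instead send each 9-bit character as its own octet, which only works when the character values fit in 8 bits.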

There is another encoding called "UTF-9" (draft-abela-utf9-00.txt) that aims at a similar target. Roytam1 (talk) 12:04, 2 May 2010 (UTC)

Indeed, that's a rather silly comment, since the whole point of these UTFs is for use on 9-bit, 18-bit and 36-bit systems. I think the only reason someone thought it funny enough for 1 April is that these machines are obsolete, though they (and hence UTF-9 and UTF-18) might be of interest to retrocomputing enthusiasts and esolangers. Obviously you wouldn't seriously use them on octet-based systems - you'd use UTF-8, UTF-16 or UTF-32. As such, I'd be inclined to reword the comment, if not remove it. I'll see what I can come up with. — Smjg (talk) 01:01, 4 September 2013 (UTC)

"these machines are obsolete"

For what it's worth, while the PDP-10 may indeed be obsolete, Unisys continues to produce, advertise, sell, maintain and operate their 1100/2200 series 36-bit hardware. I develop on such systems, which enjoy most of the usual "comforts" of modern computing environments including compilers for C and Java, a TCP/IP stack, FTP, email and other TCP/IP-based protocols such as OLTP. To someone accustomed to other hardware, it can be a bit disconcerting to realize that the internal size of a char is 9 bits. This issue needs extra care on interfaces to "the outside world." Carl Smotricz (talk) 13:09, 5 February 2015 (UTC)


UTF-12 has been invented recently, too. See it here. — Monedula (talk) 11:29, 17 June 2010 (UTC)

Looks like someone's personal invention, not a standard, so probably not worth covering in Wikipedia for now. -- intgr [talk] 14:38, 17 June 2010 (UTC)

UTF-9 and UTF-18 aren't standards either. As for "not worth covering", that applies to this article about a joke RFC ... not at all notable. -- (talk) 01:32, 5 February 2015 (UTC)

UTF-9 alleged problem

I don't think the alleged problem with UTF-9 exists, or else I don't understand the problem. Since one would never search for partial characters (a nonet sequence starting with the second or third nonet of the first character of the search string, or ending before the final nonet of the last character of the search string), an exact, unambiguous match does not require looking at any nonets prior to the first nonet of the match candidate.

If there isn't a clarification and/or an example illustrating this alleged problem, it should be removed from the article. --Brouhaha (talk) 07:07, 21 February 2015 (UTC)

I agree. At the very least this smacks of WP:OR. It's also WP:undue since the problem section is as big as the rest of the article put together. Cut this way back, or eliminate it. Kendall-K1 (talk) 14:04, 21 February 2015 (UTC)
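
For anyone who wants to test the search question raised in this section, here is a minimal sketch. It assumes the UTF-9 layout as I read RFC 4042: each nonet carries one octet of the code point in its low 8 bits, and the high bit is set on every nonet except the last one of a character. The function names are hypothetical, and this is not taken from the article text.

```python
CONTINUATION = 0x100  # assumed: high bit of a nonet means "more nonets follow"


def utf9_encode(text):
    """Encode a string as a list of 9-bit integers (nonets)."""
    nonets = []
    for ch in text:
        cp = ord(ch)
        octets = cp.to_bytes(max(1, (cp.bit_length() + 7) // 8), "big")
        for i, octet in enumerate(octets):
            is_last = i == len(octets) - 1
            nonets.append(octet if is_last else CONTINUATION | octet)
    return nonets


def naive_find(haystack, needle):
    """Plain nonet-by-nonet search that never looks before the candidate."""
    for i in range(len(haystack) - len(needle) + 1):
        if haystack[i:i + len(needle)] == needle:
            return i
    return -1


def boundary_aware_find(haystack, needle):
    """Same search, but a hit must start where a character starts."""
    for i in range(len(haystack) - len(needle) + 1):
        starts_char = i == 0 or not (haystack[i - 1] & CONTINUATION)
        if starts_char and haystack[i:i + len(needle)] == needle:
            return i
    return -1


if __name__ == "__main__":
    haystack = utf9_encode("\u0141")  # LATIN CAPITAL LETTER L WITH STROKE
    needle = utf9_encode("A")         # U+0041
    print([oct(n) for n in haystack], [oct(n) for n in needle])
    print(naive_find(haystack, needle), boundary_aware_find(haystack, needle))
```

Under that assumed layout, the single nonet for "A" equals the final nonet of U+0141, so the naive search reports a hit inside a character while the boundary-aware search does not. Whether this is the problem the article means is a separate question that the sketch does not settle.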