Talk:ISO/IEC 2022

WikiProject Computing
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on the project's quality scale.
This article has not yet received a rating on the project's importance scale.

ISO 2022 vs ISO 646

To represent large character sets, ISO 2022 builds on ISO 646's property that 1 byte can define 94 graphic (printable) characters (in addition to space and 33 control characters).

So the control characters are always available no matter which character table is currently shifted in? --Abdull 20:17, 7 June 2007 (UTC)

I don't have the ISO 2022 text, but according to JIS X 0202 (which corresponds to ISO 2022), you can designate control character sets (C0, C1) by escape sequences. See control character sets registrations at . ESC is guaranteed to have the same code in all control character sets. --Fukumoto 16:42, 8 June 2007 (UTC)
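The counts quoted at the top of this thread can be checked mechanically. This is a minimal sketch, assuming the usual 7-bit layout (0x00 to 0x1F plus 0x7F as control positions, 0x20 as space, 0x21 to 0x7E as graphic positions); the variable names are only illustrative.

```python
# Classify the 128 seven-bit code positions (layout assumed as described above).
controls = [b for b in range(128) if b < 0x20 or b == 0x7F]
space = [0x20]
graphics = list(range(0x21, 0x7F))

# ESC sits in the control range, so it is outside the 94 graphic positions
# that a shifted-in graphic set occupies.
ESC = 0x1B

print(len(controls), len(space), len(graphics))
```

Since the graphic sets only occupy 0x21 to 0x7E, shifting a different graphic set in does not touch the control positions; changing the control set itself is a separate designation, as noted above.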

Comparison with other encodings

The comment appears to indicate that ISO-2022 is not useful except with 7-bit displays. Since both GL and GR are mapped, it applies to 8-bit and 7-bit displays (with the latter requiring extra effort on the part of the application developer). Tedickey (talk) 22:15, 25 July 2009 (UTC)

The comment regarding disadvantages is also misleading, since (applying to cut/paste - apparently), it ignores the actual terminal implementations which may pass selections around as UTF-8. Tedickey (talk) 22:18, 25 July 2009 (UTC)

If a text processor needs random access to the character data, it basically has two options:
  • Normalize the text by repeating the current shift code before every character,
  • Convert everything to UTF-8 — but then why not use UTF-8 in the first place?
Both methods are unwieldy enough to be regarded as a disadvantage.
--Yecril (talk) 13:38, 25 September 2009 (UTC)
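Both options can be illustrated with Python's built-in `iso2022_jp` codec. This is only a sketch: ISO-2022-JP is one profile of ISO 2022 rather than the full standard, and the sample string is an arbitrary choice.

```python
text = "こんにちは"  # sample string, chosen only for illustration

# Stateful form: one escape sequence at the start, one at the end,
# and two bytes per character in between.
raw = text.encode("iso2022_jp")

# Option 1 (normalize): encode each character separately, so that every unit
# carries its own escape sequences and can be decoded in isolation.
units = [ch.encode("iso2022_jp") for ch in text]

# Option 2 (convert): decode once, then index code points directly
# or re-encode the result as UTF-8.
decoded = raw.decode("iso2022_jp")
utf8 = decoded.encode("utf-8")

print(len(raw), sum(len(u) for u in units), len(utf8))
```

The normalized form is several times larger than the stateful one, which is the cost of making each character independently decodable.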

"Display" is the wrong word; "system" is slightly better. IIRC, the typical PC console driver is 9-bit (512 glyphs can be used at a time, or so) plus 8 bits of colour (4 background, 4 foreground) plus a few bits for bold/underline/blink, plus some more stuff I've forgotten.

  • ISO-2022-JP is presumably useful over 7-bit transports (traditional SMTP comes to mind). In fact, all the examples listed in "ISO 2022 character sets" appear to be 7-bit.
  • You can always convert generic 8-bit ISO 2022 text (i.e. text that uses GR) into equivalent 7-bit ISO 2022 text by inserting the appropriate control codes and using GL instead. I don't think anyone uses plain ISO 2022, but this may be an advantage. I'm ignoring C1 control codes and DOCS.
  • It's actually easier for a developer to use only the 7-bit range because there's less choice (I have GL mapped to G0 and GR mapped to G1, and now I need a character from G2. What do I do?).
Optimise ...
  • "Actual terminal implementations" — which ones? Why specifically terminals? And no, it's about text processing, not copy/paste (which is simple):
    • A Perl script is parsing text. It reads the next byte, which is an "e". But what shift state am I in? Which character is that? How do I represent a "character"? Some bytes need to be accompanied by a shift state, a character number needs to be accompanied by a charset number, and control codes are a right pain...
    • Your mail client is searching for some text ("Hello world!"). But wait — it might be in GL or GR. It might have random shift codes in the middle. Sigh.
  • Any text encoding which has support for arbitrary future extensions (ESC % / in particular) is broken — it's practically impossible to write an implementation that will fail gracefully when, for example, you switch to EBCDIC. And what's "use ESC % @ to return"? Is that the relevant bytes in EBCDIC or ASCII?
ISO 2022/ECMA-35 is defined in terms of bit patterns so it's 0x1B 0x25 0x40 whatever the character set (talk) 21:58, 3 October 2011 (UTC)
  • How nice of them to support "private use F bytes". Suddenly, there's an unknown blob in your string, and you can't even let the user select inside it because you don't know where the character boundaries are. And how are you supposed to compare two private-use blobs for equality?
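The shift-state problem raised in the parsing and searching examples above can be reproduced in a few lines. This is a sketch using Python's `iso2022_jp` codec as a stand-in for a general ISO 2022 stream; the needle and the sample text are arbitrary.

```python
needle = b"$3"        # plain ASCII bytes for the text "$3"
text = "こんにちは"    # contains neither "$" nor "3"
raw = text.encode("iso2022_jp")

# Byte-level search ignores the shift state and finds a match inside
# the two-byte code of a Japanese character.
naive_hit = needle in raw

# Searching after decoding gives the answer a user would expect.
real_hit = "$3" in raw.decode("iso2022_jp")

print(naive_hit, real_hit)
```

The same bytes mean different characters depending on which set is currently invoked, so a correct search has to track the escape sequences or decode first.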

The article has many problems, such as the introduction saying that it's a 7-bit encoding (helpfully contradicting the rest of the article!), but not as many problems as ISO 2022. ⇌Elektron 04:13, 29 January 2010 (UTC)

But I agree with the rest of your rant; ISO 2022 is EVIL (talk) 21:58, 3 October 2011 (UTC)
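The bit-pattern point in the reply above can be shown concretely. This is a sketch that assumes `cp037` as a representative EBCDIC code page; that choice is mine and is not named anywhere in this discussion.

```python
# The return sequence as fixed bit patterns, as stated in the reply above.
return_seq = bytes([0x1B, 0x25, 0x40])

# The characters ESC % @ encoded as ASCII give exactly those bytes...
as_ascii = "\x1b%@".encode("ascii")

# ...but the same three characters encoded in EBCDIC (cp037, assumed) do not.
as_ebcdic = "\x1b%@".encode("cp037")

print(return_seq.hex(), as_ascii.hex(), as_ebcdic.hex())
```

So a decoder that has switched to an EBCDIC system would have to look for the raw bytes 0x1B 0x25 0x40, not for that system's own encoding of the characters ESC % @.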

DICOM ISO 2022 variation

Reference 4, "DICOM ISO 2022 variation", has an incorrect URL. It points to a simple test email message in a SourceForge project which does not appear to have any relation to DICOM. I've searched to try to find the correct link, with no success. I'd be very interested in the correct target of this link if it could be found. Dlmason (talk) 12:25, 7 April 2012 (UTC)

Rather than that link, this may be useful TEDickey (talk) 13:31, 7 April 2012 (UTC)
Thanks, those examples are helpful, but are largely directed to VRs of type PN. There should (I hope) be a link that talks about other differences or issues in general DICOM encodings compared with ISO 2022 -- for example, the DICOM standard forbids certain control characters and shifts. I'm hoping there's a nice summary somewhere to list all the differences in simpler language than is used in the DICOM standard. Dlmason (talk) 12:31, 8 April 2012 (UTC)

Missing a history section

The article is missing a history section. (talk) 19:01, 19 July 2012 (UTC)

removing POV tag with no active discussion per Template:POV

I've removed an old neutrality tag from this page that appears to have no active discussion per the instructions at Template:POV:

This template is not meant to be a permanent resident on any article. Remove this template whenever:
  1. There is consensus on the talkpage or the NPOV Noticeboard that the issue has been resolved
  2. It is not clear what the neutrality issue is, and no satisfactory explanation has been given
  3. In the absence of any discussion, or if the discussion has become dormant.

Since there's no evidence of ongoing discussion, I'm removing the tag for now. If discussion is continuing and I've failed to see it, however, please feel free to restore the template and continue to address the issues. Thanks to everybody working on this one! -- Khazar2 (talk) 04:26, 27 June 2013 (UTC)