Talk:ASCII

ASCII is a former featured article. Please see the links under Article milestones below for its original nomination page (for older articles, check the nomination archive) and why it was removed.
Article milestones
Date Process Result
January 19, 2004 Refreshing brilliant prose Kept
December 30, 2005 Featured article review Kept
May 10, 2008 Featured article review Demoted
Current status: Former featured article
          This article is of interest to the following WikiProjects:
WikiProject Computing (Rated C-class, High-importance)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the project's quality scale.
This article has been rated as High-importance on the project's importance scale.
 
WikiProject Writing systems (Rated C-class, Mid-importance)
This article falls within the scope of WikiProject Writing systems, a WikiProject interested in improving the encyclopaedic coverage and content of articles relating to writing systems on Wikipedia. If you would like to help out, you are welcome to drop by the project page and/or leave a query at the project’s talk page.
This article has been rated as C-Class on the project's quality scale.
This article has been rated as Mid-importance on the project's importance scale.
 
Wikipedia Version 1.0 Editorial Team / v0.5 (Rated C-class)
This article has been reviewed by the Version 1.0 Editorial Team.
This article has been rated as C-Class on the quality scale.
This article has not yet received a rating on the importance scale.
 
This article is within of subsequent release version of Engineering, applied sciences, and technology.
This article has been selected for Version 0.5 and subsequent release versions of Wikipedia.
WikiProject Typography (Rated C-class, Mid-importance)
This article is within the scope of WikiProject Typography, a collaborative effort to improve the coverage of articles related to Typography on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the quality scale.
This article has been rated as Mid-importance on the importance scale.
 

Second representation of the printable character list

I've added a previously removed second representation of the ASCII characters that supports easy copy-pasting. I wasn't aware that it had already been on the page. My edit got reverted; however, I think it should still be there.

In Wikipedia, there are lots of examples where information is displayed multiple times, even when we don't count efforts to help disabled people, like "spoken wikipedia" or descriptions below images: take AES as an example. The text perfectly describes the steps but the images display a second representation of those steps (and a third is in the image description, but as previously described we don't count it).

We can also see this redundancy in the principle of the lead paragraph: except for the X in "is an X", most of its information gets repeated in the article below. It gives the reader a concise definition of the topic that can be retrieved without having to read the whole article. The second representation of the ASCII list fulfills this second purpose: the reader doesn't have to read every single table row to get a full list of ASCII characters.

And, redundancy is still present in the list itself:

Binary    Oct  Dec  Hex  Glyph
010 0000  040  32   20   (space)

The list gives us three representations of the character's number.
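As a rough illustration of how redundant those columns are, each one can be derived from the character itself. This is only a sketch; the helper name `ascii_row` is made up for this example and is not how the article's table is generated.

```python
def ascii_row(ch: str) -> tuple[str, str, str, str]:
    """Return the Binary, Oct, Dec and Hex columns for one ASCII character.

    Hypothetical helper for illustration only.
    """
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    bits = format(code, "07b")           # 7-bit binary, zero padded
    binary = bits[:3] + " " + bits[3:]   # grouped 3 + 4 as in the table
    return binary, format(code, "03o"), str(code), format(code, "02X")


print(ascii_row(" "))
```

All four values are the same number written in different bases, so any one column is enough to recover the others.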

I think there are different use cases linked to the two representations. The first (currently the only) representation helps readers with various conversions between a character and its ASCII code, hence the multiple number formats. The second (disputed) representation helps readers who want to act on the whole set of characters: I've used it for password generation, and others might want to use it in a program if they write in a language that doesn't link numbers and characters as easily as C does.

What do you think? — Preceding unsigned comment added by Muelleum (talk | contribs) 21:04, 12 August 2014 (UTC)

Hello there! Hm, so the main purpose would be to make the whole ASCII set of characters easily available for copying and pasting? Maybe some kind of a compromise could be to provide it in form of a note after "There are 95 printable characters in total", using the {{Efn}} template? — Dsimic (talk | contribs) 21:54, 12 August 2014 (UTC)
I'm OK with that. Muelleum (talk) 23:24, 12 August 2014 (UTC)
Looking good, having a scrollable box was the only solution for long lines in reference tooltips. — Dsimic (talk | contribs) 23:32, 12 August 2014 (UTC)
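
For what it's worth, the "whole set" use case from this thread can also be covered without copy-pasting in languages where code-to-character conversion is easy. A minimal sketch, assuming the 95 printable characters are codes 32 (space) through 126 (~); the name `make_password` is made up for the example:

```python
import secrets

# The 95 printable ASCII characters: codes 0x20 (space) through 0x7E (~).
PRINTABLE = "".join(chr(code) for code in range(0x20, 0x7F))


def make_password(length: int) -> str:
    """Pick `length` characters at random from the printable set.

    Hypothetical helper; uses the standard library's `secrets` module.
    """
    return "".join(secrets.choice(PRINTABLE) for _ in range(length))


print(len(PRINTABLE))
print(PRINTABLE)
```

This does not argue against the note in the article; a pasted list is still simpler for readers who are not programming.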

A three-part article on ASCII

Sometime in the 1980s, I read an article in three successive issues of a personal (micro) computer magazine by the "inventor of ASCII", whoever that was, that went over all the non-alphabetic codes and was very enlightening. I've never been able to find it again. If anybody knows, please send me a message. Thanks. deisenbe (talk) 16:08, 14 August 2014 (UTC)

You are probably looking for "Inside ASCII". This was originally published as:
  • Bemer, R. W. (May 1978). "Inside ASCII - Part I". Interface Age. Portland, OR: Dilithium Press. 3 (5): 96–102. 
  • Bemer, R. W. (June 1978). "Inside ASCII - Part II". Interface Age. Portland, OR: Dilithium Press. 3 (6): 64–74. 
  • Bemer, R. W. (July 1978). "Inside ASCII - Part III". Interface Age. Portland, OR: Dilithium Press. 3 (7): 80–87. 
Unfortunately it is almost impossible to find copies of Interface Age anymore. I am not sure how available this is either but it was also republished as:
  • Bemer, R. W. (1980). "Inside ASCII". General Purpose Software. Best of Interface Age. 2. Portland, OR: Dilithium Press. Chapter 1. ISBN 0-918398-37-1. 
Bob Bemer wrote many other things on ASCII as well. Perhaps you can find one of these:
I hope that was helpful. 50.126.125.240 (talk) 00:56, 3 January 2016 (UTC)
That must be it, though the magazine doesn't ring a bell. Thank you. I met the author, and had a discussion of what he called the "Data-Link Escape" code. This must have been at the big microcomputer expo in Los Angeles in the spring of 1980. deisenbe (talk) 21:06, 3 January 2016 (UTC)

In the Order section: Numbers are sorted naïvely as strings?

The article in the Order section talks about "ASCIIbetical order". The following quote appears in the text:

"Numbers are sorted naïvely as strings; for example, "10" precedes "2""

Is naïvely the right word? Natively maybe?

I am researching one proposed TerSCII table and am curious as to why ASCIIbetical order is what it is. A lot of thinking went into how the ASCII table is built. It would be nice to carry over lessons learned from ASCII to TerSCII.

2606:6000:6042:9600:25F2:8029:9F5C:52C9 (talk) 19:46, 4 June 2015 (UTC) Wilx

No, "naïvely" is what is intended - if you just, well, *naïvely* assume that all strings should be sorted the same way, you end up with "10" being less than "2", as "1" is < "2". ("Naïvely", as in "showing a lack of experience, wisdom, or judgement", i.e. not realizing that if you sort numbers as strings they won't come out in numerical order.)
That whole section doesn't really explain what it's talking about; what it's really discussing is sorting strings by simply comparing individual characters' code values, without paying any attention to getting numbers sorted by numerical value, words sorted without regard to case, etc. That's less a characteristic of ASCII than of simple (naïve) string comparison operations. You could have EBCDICibetical order as well, for example. Guy Harris (talk) 22:47, 4 June 2015 (UTC)
In particular, you are extremely unlikely to find a character encoding scheme that would magically make naïve string comparison sort strings the way humans would want them sorted, if you're going to compare ternary strings by comparing the numerical values of the characters in the string from beginning to end. Having the encodings for upper-case and lower-case letters be adjacent, so that 'A' < 'a' < 'B' < 'b' < 'C' < 'c' etc., and putting accented letters in the appropriate places might help, but that's not all there is to sorting words, and that won't fix the problem of sorting numbers, either. Localization of sorting may end up being about as painful in ternaryland as in binaryland.... Guy Harris (talk) 22:57, 4 June 2015 (UTC)
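
The point made in the replies above can be shown in a few lines. This is just a sketch using Python's built-in string comparison, which compares code points and so matches ASCII code order for ASCII-only strings; the sample lists are made up.

```python
numbers = ["2", "10", "1"]
words = ["b", "A", "a", "B"]

# Naive comparison: character by character, by code value.
naive_numbers = sorted(numbers)      # ['1', '10', '2'] because "1" < "2"
naive_words = sorted(words)          # ['A', 'B', 'a', 'b']: upper case first

# What a human usually expects needs an explicit key, not a new encoding.
numeric_order = sorted(numbers, key=int)          # ['1', '2', '10']
caseless_order = sorted(words, key=str.lower)     # ['A', 'a', 'b', 'B']

print(naive_numbers, numeric_order)
print(naive_words, caseless_order)
```

The fix lives in the comparison key rather than in the code table, which is why rearranging the encoding alone would not make numbers sort numerically.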

Control-Z as end-of-file

First for TOPS-10. The use of Control-Z as End-Of-File existed but only from the Teletype. Control-Z on paper-tape, mag-tape, disk-files was just another character. In other words, this was specific to the terminal device driver. I don't remember if there was a standard escape mechanism for the various control characters - other than the program using a raw mode.

Second, also for TOPS-10, disk files had a count of words, not a count of characters and not a count of records. So plain-text files could have 0 to 4 NULs at the end to finish the last word. The input routine ignored all NULs coming in, which was also needed because sequence-numbered files were word-aligned for every line; see SOS and PIP.
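
A small model of that word padding may help. It assumes the usual PDP-10 text packing of five 7-bit characters per 36-bit word, which the "0 to 4 NULs" figure implies but the comment does not state; it works on bytes rather than real 36-bit words, and both function names are made up.

```python
CHARS_PER_WORD = 5  # assumption: five 7-bit characters per 36-bit word
NUL = b"\x00"


def pad_to_words(text: bytes) -> bytes:
    """Append 0 to 4 NULs so the length is a whole number of words."""
    padding = -len(text) % CHARS_PER_WORD
    return text + NUL * padding


def read_text(stored: bytes) -> bytes:
    """Mimic an input routine that ignores every NUL it sees."""
    return stored.replace(NUL, b"")


stored = pad_to_words(b"HELLO!")
print(len(stored), read_text(stored))
```

Because the reader drops every NUL, the padding is invisible to programs and no explicit end-of-file character is needed on disk.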

Third, as far as CP/M goes, the original use of Control-Z was as a filler character, since the OS only kept a count of (128-byte) records. So a character file would have 0 to 127 SUB characters at the end to fill out the last record. (Why they used SUB instead of NUL like DEC baffles me.)

Then, common usage changed this to merge TOPS-10's TTY end-of-file and the filler idea to have a Control-Z as an explicit 'end' character.

I have some TOPS-10 and CP/M manuals. I can do some better research if desired.
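
To make the CP/M convention described above concrete, here is a hedged sketch: text is padded with SUB (Control-Z, 0x1A) to a multiple of 128 bytes, and a reader treats the first SUB as the end of the text. The function names are invented for illustration and this is not taken from any CP/M source.

```python
RECORD_SIZE = 128   # CP/M counts files in 128-byte records
SUB = b"\x1a"       # Control-Z


def pad_to_records(text: bytes) -> bytes:
    """Fill the last record with 0 to 127 SUB characters."""
    padding = -len(text) % RECORD_SIZE
    return text + SUB * padding


def read_text(stored: bytes) -> bytes:
    """Treat the first SUB as an explicit end-of-text marker."""
    end = stored.find(SUB)
    return stored if end == -1 else stored[:end]


stored = pad_to_records(b"hello\r\n")
print(len(stored), read_text(stored))
```

Note the difference from the TOPS-10 approach: here the reader stops at the first filler character instead of skipping fillers, so a SUB inside the data cuts the text short.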

"No I've got sandwiches"

In the article under 7-bit the following is given to illustrate national variant problems:

Many programmers kept their computers on US-ASCII, so plain-text in Swedish, German etc. (for example, in e-mail or Usenet) contained "{, }" and similar variants in the middle of words, something those programmers got used to. For example, a Swedish programmer mailing another programmer asking if they should go for lunch, could get "N{ jag har sm|rg}sar." as the answer, which should be "Nä jag har smörgåsar." meaning "No I've got sandwiches."
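
The substitution in that quote can be reproduced with a three-entry table. This sketch only uses the three code positions visible in the quoted example ({ as ä, | as ö, } as å); the table name is made up and the rest of the Swedish national variant is deliberately left out.

```python
# Code positions that US-ASCII shows as { | } and that the quoted
# Swedish example shows as ä ö å. Only these three are taken from the quote.
SWEDISH_VARIANT = str.maketrans("{|}", "äöå")

as_seen_on_us_ascii = "N{ jag har sm|rg}sar."
as_intended = as_seen_on_us_ascii.translate(SWEDISH_VARIANT)

print(as_intended)  # Nä jag har smörgåsar.
```

The bytes are identical in both cases; only the receiving terminal's interpretation of those three codes differs.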

If "many" programmers used US-ASCII, then for any e-mail between two programmers there is some probability, (many/all)^2, that both used US-ASCII. That probability, added to the probability of both using the Swedish encoding, comes to at least one half, so the majority of e-mails sent from Swedish programmers to Swedish programmers should have ended up being displayed as they appeared on the sender's computer.
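
The "at least one half" step can be spelled out. The following assumes, as a simplification, that each programmer independently uses US-ASCII with probability p = many/all and the Swedish variant otherwise; the symbol p is introduced here and is not in the original comment.

```latex
\begin{align*}
P(\text{same encoding})
  &= p^2 + (1-p)^2 \\
  &= 2p^2 - 2p + 1 \\
  &= \tfrac{1}{2} + 2\left(p - \tfrac{1}{2}\right)^2
  \;\ge\; \tfrac{1}{2}.
\end{align*}
```

Equality holds only at p = 1/2, so under these assumptions a strict majority of messages display correctly for any other split.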

All I mean to say is that maybe one of the hypothetical people in the example shouldn't be a programmer. But I'm not going to change it because it doesn't make much difference anyway. Mattman00000 (talk) 06:53, 18 February 2016 (UTC)