Talk:ASCII

ASCII is a former featured article. Please see the links under Article milestones below for its original nomination page (for older articles, check the nomination archive) and why it was removed.
This article is of interest to the following WikiProjects:

WikiProject Computing (Rated C-class, High-importance)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

WikiProject Writing systems (Rated C-class, Mid-importance)
This article falls within the scope of WikiProject Writing systems, a WikiProject interested in improving the encyclopaedic coverage and content of articles relating to writing systems on Wikipedia. If you would like to help out, you are welcome to drop by the project page and/or leave a query at the project's talk page.

Wikipedia Version 1.0 Editorial Team / v0.5 (Rated C-class)
This article has been reviewed by the Version 1.0 Editorial Team. It has been selected for Version 0.5 and subsequent release versions of Wikipedia.

WikiProject Typography (Rated C-class, Mid-importance)
This article is within the scope of WikiProject Typography, a collaborative effort to improve the coverage of articles related to Typography on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

Font size for Unicode control character glyphs

We seem to have a potential edit war starting over whether the Unicode glyphs in the U+2400 block should be displayed at the normal font size or at an arbitrarily-increased size. Let's discuss it. For reference, these characters are:

␀␁␂␃␄␅␆␇␈␉␊␋␌␍␎␏␐␑␒␓␔␕␖␗␘␙␚␛␜␝␞␟␠␡␢␣
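For anyone who wants to reproduce the list above on their own system, here is a minimal Python sketch. It assumes the characters shown run from U+2400 through U+2423; the helper names `control_pictures` and `picture_for` are made up for illustration.

```python
def control_pictures(first=0x2400, last=0x2423):
    """Return the Control Pictures characters from first to last, inclusive."""
    return "".join(chr(cp) for cp in range(first, last + 1))


def picture_for(code):
    """Return the control picture for an ASCII code, or None if there is none.

    Codes 0x00-0x20 map to U+2400 plus the code; DEL (0x7F) maps to U+2421.
    """
    if 0x00 <= code <= 0x20:
        return chr(0x2400 + code)
    if code == 0x7F:
        return chr(0x2421)
    return None


glyphs = control_pictures()
print(glyphs)
print(len(glyphs))        # 36
print(picture_for(0x1B))  # the symbol for ESC
```

How the glyphs look when printed still depends entirely on the font the terminal or browser picks, which is the point under discussion here.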

On my Firefox on Linux, they look something like this:

Unicode 2400 block (Firefox on Linux).png

On Safari on OS X, they look something like this:

Unicode 2400 block (Safari on OS X).png

In both cases the characters are rather hard to read, but this is presumably intentional on the part of the font design and IMO there is no point in arbitrarily increasing the font size. Anomie 15:02, 25 October 2010 (UTC)

My Safari on OS X looks similar except "esc" is drawn like the other letters and ␢␣ is narrower. Spitzak (talk) 05:24, 26 October 2010 (UTC)

On Ubuntu 10.10, both Firefox and Chrome look like this:

Unicode 2400 Chrome Ubuntu.png
Spitzak (talk) 05:24, 26 October 2010 (UTC)
Interesting, it must be using a different font. Chromium for me uses the same font as Firefox. Anomie 11:15, 26 October 2010 (UTC)
Better readability seems a good enough reason to me. Screw the font designers' vision. Sometimes workarounds are necessary. None of the renderings presented are easily legible, although Linux seems to do better at least. --Cybercobra (talk) 06:28, 26 October 2010 (UTC)
The thing is, these characters really are that small. While a case might be made that all characters should be blown up to show detail, that would require it be done for ASCII#ASCII printable characters too and it would require they all be the same size. Expanding them arbitrarily so they look "right" to one random person's sensibilities with their particular browser and font is certainly not the thing to do. Anomie 11:15, 26 October 2010 (UTC)
I would certainly not put a *different* scale on some of the characters. That will look obviously wrong in all the above examples. Also, surely somebody still uses IE and Windows; can they post a screenshot? Spitzak (talk) 19:34, 26 October 2010 (UTC)
I would favour readability over strict adherence to font sizes. It is an article about ASCII, not a specific font used to represent Unicode. I'm guessing the fonts are designed the way they are so that all characters can fit into a standard size for use in a character table. This is not an issue for us as we have two separate tables. I'm happy to keep font-size:large, which is just about readable, just so we don't end up on WP:LAME. --Salix (talk): 20:57, 26 October 2010 (UTC)

Vertical bar in second picture

There is a slight problem with the vertical bar character (hex 0x7C, column 7 row 12) in the second picture. It looks identical to the slash (0x2F), but it should be completely vertical. In the original scan (jpg, see picture source) the bar appears slightly slanted, probably caused by inaccuracies in the printing and/or scanning process, but there is a clear distinction between the slash and the vertical bar. Lemming (talk) 03:07, 6 November 2010 (UTC)

I guess the "inaccuracies in the printing and/or scanning process" made me type a slash. Fixed now. Thanks for the notice on my talk page. - LWChris (talk) 12:09, 6 November 2010 (UTC)
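As a cross-check on the chart positions mentioned in this thread, the column and row of a character follow directly from its 7-bit code: the high three bits give the column and the low four bits give the row. A small Python sketch (the function name `chart_position` is just for illustration):

```python
def chart_position(code):
    """Split a 7-bit ASCII code into (column, row) of the 8x16 code chart."""
    if not 0 <= code <= 0x7F:
        raise ValueError("not a 7-bit ASCII code")
    return code >> 4, code & 0x0F


print(chart_position(ord("|")))  # (7, 12)
print(chart_position(ord("/")))  # (2, 15)
```

So the vertical bar (0x7C) sits at column 7, row 12, while the slash (0x2F) sits at column 2, row 15.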

"most recent update during 1986"

Why was the standard updated in 1986? What could have needed changing? I'd imagined it was pretty stable since the 1960s and there wouldn't have been any cause to mess with it. The control codes are obsolete but there's no need to remove them when they can be ignored (and indeed they haven't been removed), and for jobs that go beyond the capabilities of ASCII there are all kinds of more recent schemes that have superseded it, so I'm struggling to imagine what they changed. The source given is "American National Standard for Information Systems — Coded Character Sets — 7-Bit American National Standard Code for Information Interchange (7-Bit ASCII), ANSI X3.4-1986, American National Standards Institute, Inc., March 26, 1986". It would be good if someone could have a look at this, if they have access. Beorhtwulf (talk) 22:40, 28 February 2011 (UTC)

Invisible?

Ifyoutakeallthespacesoutthetextisveryhardtoread.Clearlywecanseethespacessotheyarenotinvisible. --Wtshymanski (talk) 03:36, 3 March 2011 (UTC)

Just because something is not visible doesn't mean it doesn't take up space. Anomie 03:58, 3 March 2011 (UTC)
Too subtle for me. If I can see it, I call it "visible". "Is that window closed?" someone will ask me, and "No", I'll say "I can see a space between the window and the sill". Or am I like Alice, who could see Nobody on the road a great way off? --Wtshymanski (talk) 14:42, 3 March 2011 (UTC)
Let me put it this way. In the following box, there is one (non-breaking) space character. Tell me where it is, without highlighting the text to make it a different color or looking at the source or cheating in some other manner.
 
Is it at the left? The right? In the middle? It's not visible, but it still takes up space. You can only "see" a space character normally because of the gap it leaves between the characters that are visible. Anomie 19:34, 3 March 2011 (UTC)
But if you take the spaces out, the document looks different. True, I wouldn't be able to tell by looking at the paper lying in the printer if the printer had just received a "form feed" character (apparently obsolete), or 66 lines of 72 CHR$(32) followed by a CR LF sequence (surely not obsolete control characters), or if someone had just left a blank sheet of paper in the output tray. The intent of sending 0X20 to the printer is to make a *visible* space in the line, though. If we didn't want to see the space, we'd send 0X00 (apparently another obsolete control character). Or possibly even 0X1A, but probably not 0X1B (definitely not obsolete). --Wtshymanski (talk) 20:22, 3 March 2011 (UTC)
You both have good points here. Like most things on wikipedia the resolution comes from the sources. Do the reliable sources class a space character as a visible character or not? --Salix (talk): 00:21, 4 March 2011 (UTC)
Sources? Sure:
  • "But was the Space character a control character or a graphic character?... It is, of course, both. However, from the point of view of a parallel printer, it is only one of those things, the invisible graphic. By this rather hair-splitting reasoning, the standards committee persuaded itself that the Space character must be regarded as a graphic character; that is, it must be positioned in a column of graphics, not in a column of controls." — Mackenzie, Charles E. (1980). Coded Character Sets, History and Development. Addison-Wesley. ISBN 0-201-14460-3. 
  • "SP (Space): A normally non-printing graphic character used to separate words." — RFC 20
HTH. Anomie 04:00, 4 March 2011 (UTC)
CT-1024 Terminal with monitor [1]

The ASCII space is most certainly a printing character. Please put up with my history lesson of video terminals in the early 1970s and I will explain.

When ASCII was developed, video terminals had character-only displays. This was a single font, often only upper case. The terminal had a read/write memory to hold every character on every line of the display (including the space character). The IC that converted the ASCII code to a bit pattern for display was known as a "character generator". The most popular one was the Signetics 2513 MOS ROM. This would produce characters 5 dots wide and 7 dots high for raster scan CRTs. It just handled the 64 upper case ASCII characters. Deluxe terminals would offer a lower case option that increased the read/write screen memory and added a second character generator for the lower case characters. (The early terminals used shift registers, not RAM, for screen memory.)

The control characters in the first two columns of the ASCII chart were not stored in memory. The next four columns were stored in screen memory and displayed on the CRT. (The last two columns were only used on terminals that supported lower case.) The 2513 character generator converted the ASCII code into a 35 dot display pattern for each letter. (There were additional blank dots between each character and blank lines between the rows.) The ASCII space just produced a pattern where all 35 dots were off.

The computer did not have direct access to the screen memory, but video terminals would allow cursor control to any location on the screen. The next character would print at the cursor location. To erase a six letter word on the screen, you would set the cursor at the beginning of the word and send six space characters. The spaces overprinted the characters in memory and the new 35 dot pattern was displayed. The ASCII space printed all dots off unless the terminal did inverted video, in which case it was all dots on.

Here is a Signetics data book that has the 2513 Character Generator. This online copy is the 1972 edition; my personal copy is the 1971 edition.

Here is a web site that explains how a character generator works.[2]

-- SWTPC6800 (talk) 03:00, 4 March 2011 (UTC)
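The behaviour described in the post above can be sketched in a few lines of Python. This is only an illustration: the dot patterns in `FONT` are invented for the example and are not the actual contents of the 2513 ROM, and the names `render`, `dots_on` and `write_at` are hypothetical.

```python
# Illustrative 5x7 dot patterns (1 = dot on). These are made up for the
# sketch and are NOT the actual contents of the Signetics 2513 ROM.
FONT = {
    "A": ["01110", "10001", "10001", "11111", "10001", "10001", "10001"],
    "B": ["11110", "10001", "10001", "11110", "10001", "10001", "11110"],
    " ": ["00000"] * 7,  # space: an ordinary entry whose 35 dots are all off
}


def render(ch, inverse=False):
    """Return the 7 display rows for one character cell ('#' = lit dot)."""
    return [
        "".join("#" if (bit == "1") != inverse else "." for bit in row)
        for row in FONT[ch]
    ]


def dots_on(ch, inverse=False):
    """Count the lit dots in the character cell."""
    return sum(row.count("#") for row in render(ch, inverse))


def write_at(screen, cursor, text):
    """Overwrite screen memory starting at the cursor position."""
    screen[cursor:cursor + len(text)] = list(text)
    return screen


screen = list("AB AB")
write_at(screen, 3, "  ")          # erase the second word with two spaces
print("".join(screen))             # screen memory now holds "AB" and three spaces
print(dots_on(" "))                # 0
print(dots_on(" ", inverse=True))  # 35
```

In this model the space goes through exactly the same lookup and display path as any other character; only its pattern differs.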

All of which means little. DEL was used to "print" over sections of paper tape to erase them, but it's not considered printable. Anomie 04:00, 4 March 2011 (UTC)
A DEL turned a printable character into a control character on the paper tape. In a character generator ROM, a space was one of the 64 character dot patterns. When the video terminal came to a screen memory location with a hex 20, it displayed a 5 by 7 pattern of 35 dots off. (Or 35 dots on in reverse video.) When RFC 20 was written in 1969, printing terminals such as a Teletype were far more common than video terminals. On a Teletype, a space doesn't put ink on the paper (not printing). -- SWTPC6800 (talk) 04:49, 4 March 2011 (UTC)
A DEL is a "punchable" character, however. It causes the punch mechanism to punch a pattern of holes and advance the tape. For the tape a DEL is the same class as any other character, unless there is one that prevents punching.
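To make the paper tape point concrete: holes can be added to a tape but not removed, so punching over an existing character amounts to a bitwise OR of the two hole patterns. DEL is 0x7F, all seven holes, so overpunching any character with DEL leaves DEL. A rough Python sketch, modelling only the seven data bits and ignoring any parity track (the name `overpunch` is made up):

```python
DEL = 0x7F  # all seven data holes punched


def overpunch(existing, new):
    """Punch `new` over `existing`; holes already present stay punched."""
    return (existing | new) & 0x7F


print(hex(overpunch(ord("A"), DEL)))  # 0x7f
print(all(overpunch(code, DEL) == DEL for code in range(128)))  # True
```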
I agree with the original poster, SPACE is a character. The hardware argument is pretty persuasive: it is enormously easier to treat SPACE as a character with no bits turned on (thus reusing all the existing hardware that puts any other character on the screen and moves the cursor right) than to treat it as a control character with special handling. In addition, ASCII chose bit patterns so that it was easy to group SPACE with the "printable" characters, since the hardware manufacturers demanded it, so they could treat it as a printing character. Spitzak (talk) 17:04, 4 March 2011 (UTC)
Something that has always amused me in the computer racket is the keen appreciation you develop for the different kinds of nothingness; space, blank, null, NUL, 0... --Wtshymanski (talk) 18:46, 4 March 2011 (UTC)

You might be interested in the work of Roy Sorensen. See for example "Nothingness" at the Stanford Encyclopedia of Philosophy. —Ruud 15:32, 19 March 2011 (UTC)

ASCII85 printable characters

I added a mention of ASCII85 in the section about the printable characters. Previously there was no indication of why the printable characters should be set apart from the rest of the ASCII characters. ASCII85 is an old encoding format, but the article about it says that it is still in use in modern times in PostScript and PDF. Working that info, and any other similar info, into a small paragraph will illustrate why the printable characters are distinctly important.

Off the top of my head, I can't think of anything that uses only the 95 printable characters. ASCII85 was the best that I could come up with to show the notability of the printable characters, but it does not include the space. I know this seems obvious, but there really ought to be another example that uses all 95 printable characters, and I can't think of one.

Badon (talk) 08:55, 28 August 2011 (UTC)

Do we really need to explain why it's important that you can *read* the characters? The thing that uses the 95 printable characters is *printing*. Weird encoding schemes dreamed up by grad students are a decidedly secondary application and somewhat beside the point of an alphabet. --Wtshymanski (talk) 14:31, 28 August 2011 (UTC)
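For reference, the counts discussed in this section can be checked with a short Python sketch. It assumes the 95 printable characters are codes 0x20 through 0x7E and that the ASCII85 digits are the 85 characters from "!" through "u"; the encoder used is the standard library's `base64.a85encode`, and the variable names are only illustrative.

```python
import base64

# Space through tilde: the 95 printable ASCII characters
printable = [chr(code) for code in range(0x20, 0x7F)]

# '!' through 'u': the 85 digits used by ASCII85
ascii85_alphabet = [chr(code) for code in range(0x21, 0x76)]

print(len(printable))           # 95
print(len(ascii85_alphabet))    # 85
print(" " in ascii85_alphabet)  # False: the space is not an ASCII85 digit

data = b"ASCII"
encoded = base64.a85encode(data)
print(encoded)
print(all(chr(byte) in ascii85_alphabet for byte in encoded))  # True
print(base64.a85decode(encoded) == data)                       # True
```

Outside that 85-character range, the format also uses "z" as shorthand for a group of four zero bytes, which this input does not trigger.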