Talk:Digital encoding of APL symbols

This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. It has been rated as Low-importance on the project's importance scale and is supported by WikiProject Software (assessed as Low-importance).

This article is within the scope of WikiProject Typography, a collaborative effort to improve the coverage of articles related to Typography on Wikipedia. It has been rated as Low-importance on the importance scale.

Underscored alphabetics:

When APL was first made available on IBM printing terminals in the 1960s, the typing element did not have enough room for lower case a-z. Thus underscored alphabetics were an available way to get another "case". Possibly the Unicode designers thought that underscoring was a text attribute, not unlike the underscore, bold, italic, etc. as one would find in MS Word. This is not true with APL: underscored characters were distinct from non-underscored characters and allowed a kind of upper / lower case. Only A-Z and the "delta" symbol could be underscored. With the IBM 3279 display terminals, one could now have three alphabet cases in programs: upper and lower case, plus underscored characters.

Usage of underscored alphabetic characters is considered by many APL programmers to be obsolete and bad style. APL+Win has eliminated them from its version of the language. Other implementations still support them, but consider them deprecated.

188.60.207.142 (talk) 17:21, 28 May 2011 (UTC)

I've used APL on mainframes since the 1970s and still use it on my OS/2 PC today (for math). PC APL started with lower case, via the ALT key, instead of underlined letters. In mainframe programs (OS/360, OS/370, VM, ...) it was a kind of programmers' agreement to use underlined names for global variables shared between functions in the same workspace, except for text headlines, for better readability. APL workspaces were also portable from mainframe to PC, and I still use most of the statistical work I downloaded back then. Currently I am doing some stochastic work on demographics. Besides FORTRAN programs, APL is my main "calculation suite" when Lotus 1-2-3 (OS/2) or Excel (Windows) does not fit. Regards from a retired IBM dinosaur. — Preceding unsigned comment added by 79.247.7.235 (talk) 18:18, 27 February 2015 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified 2 external links on APL (codepage). Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 07:42, 1 October 2016 (UTC)

External link gone

The external link for IBM code page 293 is gone. I didn't check the others. David Lambert, 2019-MAR-01 — Preceding unsigned comment added by 69.204.222.194 (talk) 12:46, 1 March 2019 (UTC)

Incorrect mapping of EBCDIC code page 310 (APL) to Unicode characters?

The character table in this article for EBCDIC code page 310 maps:

Byte value X'81' to the Unicode character U+2551
Byte value X'82' to the Unicode character U+2550

All example glyphs that I have seen for these two Unicode characters show their double lines close together.

However, the IBM 3270 Information Display System Character Set Reference shows a chart of APL characters (figure 10-41 on page 10-42, PDF sheet number 287 / 340) containing these characters with the double lines as far apart as possible: at the edges.

The difference between the glyphs that I have seen for the Unicode characters and the characters as presented in that IBM documentation is significant enough that I suspect that this mapping is incorrect.

However, I'm unaware of any Unicode characters that match those two characters as presented in the IBM documentation: essentially, a rectangle with either the vertical or the horizontal sides removed.

I have also seen IBM 3270 terminal emulators that use APL fonts with these characters (X'81' and X'82') as shown in that IBM documentation.

Do I have a point? Who do I contact about this mapping?

Does the Unicode standard include a written description, or graphical specification, of what a geometric or box-drawing glyph should look like? If so, where do I find that information?

Graham Hannington (talk) 16:16, 29 March 2022 (UTC)

You are correct that they're listed with those mappings because those are the non-PUA mappings listed for them in a secondary source, not because those mappings are particularly good. IBM actually maps SF630000 (the 0x81, double vertical one) to U+F892 in their corporate Private Use Area scheme, and SF620000 (the 0x82, double horizontal one) to U+F893, also in the Private Use Area (as seen in unicode.nam, included here). In terms of more recent additions to Unicode that the cited sources did not have the benefit of, 🮀 (U+1FB80) in the Symbols for Legacy Computing block is a much closer match to the double horizontal one, but there is still no particularly good match to the double vertical one. --HarJIT (talk) 21:59, 29 March 2022 (UTC)
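For anyone who wants to check which of the code points mentioned above fall in a Private Use Area, here is a minimal Python sketch. The `CANDIDATES` table and the helper names `is_private_use` and `describe` are made up for illustration; only the code points themselves come from this thread. It relies on the standard `unicodedata` module, where private-use code points have general category `Co`.

```python
import unicodedata

# Candidate code points for the two code page 310 bytes, as named in this
# thread. The dictionary layout and labels are illustrative assumptions.
CANDIDATES = {
    0x81: {"article_table": 0x2551, "ibm_pua": 0xF892},
    0x82: {"article_table": 0x2550, "ibm_pua": 0xF893, "legacy_computing": 0x1FB80},
}


def is_private_use(code_point: int) -> bool:
    """Return True if the code point has general category Co (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"


def describe(code_point: int) -> str:
    """Format a code point as U+XXXX with a private-use note."""
    kind = "private use" if is_private_use(code_point) else "not private use"
    return f"U+{code_point:04X} ({kind})"


for byte_value, options in CANDIDATES.items():
    for label, code_point in options.items():
        print(f"X'{byte_value:02X}' {label}: {describe(code_point)}")
```

A private-use result only says that the Unicode standard assigns no meaning to the code point; what glyph appears there depends entirely on the font in use.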

@HarJIT, thank you so much for your advice. I claim interest in this area, but not expertise.

My interest: years ago, I developed a tool that captures 3270 screens in various rich-text formats, including HTML. I am now redeveloping that tool from scratch. One function of the tool is to map byte values in the 3270 data stream to characters in HTML; for graphic escape characters, typically, that means a numeric character entity reference. For example, I've been using the EBCDIC code page 310 character table in this article to map byte value X'81' to ║ (U+2551). Incorrectly, as it turns out. I discovered my mistake only recently when I saw an example of X'81' as presented by a 3270 terminal emulator: the glyph was as per the IBM 3270 documentation I cited earlier, with the double lines as far apart as possible. The font used by the emulator in this case was proprietary, not a Unicode font.
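The byte-to-entity step described above might look roughly like the following Python sketch. It is not the actual tool: `GRAPHIC_ESCAPE_MAP`, `char_reference`, and the `prefer_pua` flag are hypothetical names, and the table covers only the two byte values discussed in this thread, each with the article-table mapping and the IBM PUA mapping.

```python
from typing import Optional

# Hypothetical partial table for code page 310 graphic escape bytes:
# byte value -> (mapping from the article table, IBM PUA mapping).
GRAPHIC_ESCAPE_MAP = {
    0x81: (0x2551, 0xF892),
    0x82: (0x2550, 0xF893),
}


def char_reference(byte_value: int, prefer_pua: bool = False) -> Optional[str]:
    """Return an HTML numeric character reference, or None if unmapped."""
    entry = GRAPHIC_ESCAPE_MAP.get(byte_value)
    if entry is None:
        return None
    article_cp, pua_cp = entry
    code_point = pua_cp if prefer_pua else article_cp
    return f"&#x{code_point:04X};"


print(char_reference(0x81))                   # &#x2551;
print(char_reference(0x81, prefer_pua=True))  # &#xF892;
```

Keeping both mappings side by side makes the choice explicit at the call site. The PUA reference is only useful if the page is rendered with a font that defines glyphs at those code points.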

The difference between the glyph for U+2551 and the glyph shown in the IBM 3270 docs (and that proprietary APL font), which you identified as U+F892 (I've downloaded and viewed unicode.nam, thank you!), can significantly affect the appearance, even usability, of a 3270 screen. For example, when used as a table column separator, to distinguish, say, non-scrollable columns from scrollable columns, U+2551 has white space on either side of its lines, giving the characters in the adjoining table cells some breathing space, whereas the lines in U+F892 abut adjacent characters. Arguably, then, U+2551 is usable in this context, but not U+F892.

I recently made a fool of myself by suggesting X'81' as a table column separator in a 3270 screen, thinking I was suggesting U+2551, unaware that it actually meant U+F892.

Can you point me to more information about IBM's corporate PUA scheme? For example, any resource that offers more detail than the unicode.nam file that you've already pointed me to? (Note: I document and help develop IBM-brand products, but I don't work directly for IBM.)

Some wise and generous person has anonymously published a Code page information website with content that IBM previously published on its now-defunct Globalization website and associated FTP site. (This Wikipedia article contains numerous dead links to those defunct IBM sites.)

For example, to augment my understanding of EBCDIC code page 310, in addition to the character table in this Wikipedia article, I'm now using a CCSID Explorer page on that site.

For several years, I've been lobbying the designers of the IBM Plex Mono font to increase that font's coverage of characters in EBCDIC code page 310. Most recently, in GitHub issue #433 for IBM/plex, "IBM docs present 3270 terminal screens in IBM Plex Mono, but IBM Plex Mono lacks some APL characters used in 3270 screens". So far, I've pointed the font designers to the character table in this Wikipedia article. I'm about to head over to that issue, and add another comment. Please feel free to add your own comments to that issue, including correcting anything I've written there! Graham Hannington (talk) 02:03, 30 March 2022 (UTC)

I asked:

> Can you point me to more information about IBM's corporate PUA scheme?

I've found some more detail on that "Code page information" site, mapping Unicode PUA code points to GCGIDs, with renderings of the corresponding glyphs: CCSID Explorer. That's pretty useful.

Ideally, however, I'd like a table of EBCDIC code page 310 that shows the PUA mappings: the PUA glyphs, with the mappings to the PUA code points. I'm unaware of any Unicode font that contains those glyphs: specifically, a font that contains the IBM PUA glyphs, mapped to the PUA code points. Do you know of any such fonts?

@HarJIT wrote:

> those are the non-PUA mappings listed for them in a secondary source

I'm curious: what is that secondary source? Graham Hannington (talk) 04:17, 30 March 2022 (UTC)

I was referring to the Tachyon Software mapping table. --HarJIT (talk) 08:40, 31 March 2022 (UTC)

Thanks for that Tachyon link; that's very useful!

I'm observing some discrepancies (described below) between unicode.nam and other sources.

unicode.nam has an IBM copyright notice in its header, but I downloaded it from a secondary source.

I downloaded CP00310.txt and CS00963.txt directly from IBM. I wish I knew a direct IBM source for GCGID-to-Unicode mapping.

In all of the cases I've observed so far, I'm siding against unicode.nam.

`X'B1'`

unicode.nam maps U+2373 (APL FUNCTIONAL SYMBOL IOTA) to GCGID SL720000 and calls it "iotaapl".

CP00310.txt and CS00963.txt both refer to GCGID SL720000 as "Epsilon (APL)".

Tachyon maps EBCDIC code page 310 byte value X'B1' to U+2208 (ELEMENT OF).

`X'B2'`

unicode.nam maps U+2374 (APL FUNCTIONAL SYMBOL RHO) to GCGID SL730000 and calls it "rhoaapl".

CP00310.txt and CS00963.txt both refer to GCGID SL730000 as "Iota (APL)".

Tachyon maps EBCDIC code page 310 byte value X'B2' to U+2373 (APL FUNCTIONAL SYMBOL IOTA).

`X'B3'`

unicode.nam maps U+2375 (APL FUNCTIONAL SYMBOL OMEGA) to GCGID SL740000 and calls it "omegaapl".

CP00310.txt and CS00963.txt both refer to GCGID SL740000 as "Rho (APL)".

Tachyon maps EBCDIC code page 310 byte value X'B3' to U+2374 (APL FUNCTIONAL SYMBOL RHO).

`X'B4'`

unicode.nam contains no entry for SL750000.

CP00310.txt and CS00963.txt both refer to GCGID SL750000 as "Omega (APL)".

Tachyon maps EBCDIC code page 310 byte value X'B4' to U+2375 (APL FUNCTIONAL SYMBOL OMEGA). Graham Hannington (talk) 10:39, 1 April 2022 (UTC)
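The four comparisons above can be put into a small table and checked mechanically. The Python sketch below is illustrative only: the row layout and the helper names `disagreements` and `shifted_by_one` are assumptions, and the pairing of each byte value with a GCGID is taken from how the comparisons are grouped above, not from a separate source. It tests whether each unicode.nam code point equals the Tachyon code point for the next byte value, which would be consistent with a one-position shift in unicode.nam.

```python
# Data transcribed from the comparison above. Each row is:
# (GCGID, code page 310 byte, name in CP00310.txt / CS00963.txt,
#  Tachyon code point, unicode.nam code point or None if no entry).
ROWS = [
    ("SL720000", 0xB1, "Epsilon (APL)", 0x2208, 0x2373),
    ("SL730000", 0xB2, "Iota (APL)", 0x2373, 0x2374),
    ("SL740000", 0xB3, "Rho (APL)", 0x2374, 0x2375),
    ("SL750000", 0xB4, "Omega (APL)", 0x2375, None),
]


def disagreements(rows):
    """Return the GCGIDs where unicode.nam and Tachyon differ."""
    return [gcgid for gcgid, _byte, _name, tachyon, nam in rows if nam != tachyon]


def shifted_by_one(rows):
    """True if each unicode.nam code point equals Tachyon's for the next row."""
    return all(cur[4] == nxt[3] for cur, nxt in zip(rows, rows[1:]))


print(disagreements(ROWS))
print(shifted_by_one(ROWS))
```

With the data as transcribed, all four GCGIDs disagree and the shift check holds for the three rows where unicode.nam has an entry. That fits the IBM names in CP00310.txt and CS00963.txt, but it does not by itself establish which source is authoritative.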