ZX Spectrum character set
The ZX Spectrum character set is the variant of ASCII used in the British Sinclair ZX Spectrum computers. It is based on ASCII-1967 (the standard ASCII on which modern character sets are based), but with one character from ASCII-1963 (the first version of ASCII), two non-standard graphics characters, an idiosyncratic use of the control code area and use of the 128 high-bit characters beyond the ASCII range.
Printable characters 
The printable part of the Spectrum character set, 0x20–0x7F, is almost standard, except that 0x60 is the pound sign (£) instead of the grave accent ( ` ) and 0x7F is the copyright sign (©) instead of the control code DEL. The pound sign was mapped to 0x60, and not 0x23 as in the British variant of ASCII (ISO-646-GB), making both the pound sign and the number sign (#) available universally. Code 0x5E contains an up-arrow (↑) as in ASCII-1963 instead of the ASCII-1967 caret (^); however, 0x5F has an underscore (_) and not a left-arrow (←).
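The differences from ASCII described above can be sketched as a small lookup. This is a minimal illustration in Python, not part of any Spectrum software; the names SPECTRUM_OVERRIDES, spectrum_printable_to_unicode and decode_printable are hypothetical.

```python
# Codes in 0x20-0x7F where the Spectrum set differs from ASCII-1967,
# as listed in the paragraph above.
SPECTRUM_OVERRIDES = {
    0x5E: "\u2191",  # up-arrow instead of caret
    0x60: "\u00a3",  # pound sign instead of grave accent
    0x7F: "\u00a9",  # copyright sign instead of DEL
}


def spectrum_printable_to_unicode(code):
    """Map one Spectrum code in 0x20-0x7F to a Unicode character."""
    if not 0x20 <= code <= 0x7F:
        raise ValueError("0x%02X is outside the printable range" % code)
    # Everything not overridden is the same as ASCII.
    return SPECTRUM_OVERRIDES.get(code, chr(code))


def decode_printable(data):
    """Decode a bytes object that contains only printable Spectrum codes."""
    return "".join(spectrum_printable_to_unicode(b) for b in data)
```

For example, decode_printable(b"\x60\x35") yields the string "£5", while 0x23 still decodes to "#".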
Beyond 0x7F, the Spectrum character set uses the high-bit range, 0x80–0xFF, for special purposes. 0x80–0x8F contain block graphics. 0x90–0xA4 contain the User Defined Graphics (UDGs), which the user can customise with a few lines of BASIC. 0xA5–0xFF contain tokens (BASIC keywords represented as single characters): for example, pressing P at the beginning of a line generates the code 0xF5, which causes the BASIC keyword PRINT to be displayed on the screen. Codes 0xC7–0xC9 are the relational operators <= (less-than-or-equal), >= (greater-than-or-equal) and <> (not-equal) respectively; unlike the relational operators of most other systems, these are characters in their own right and cannot be produced by typing the two constituent symbols one after the other.
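The three sub-ranges of the high-bit area can be told apart with simple comparisons. The sketch below is illustrative only: the function name classify_high_code is hypothetical, and SOME_TOKENS is deliberately a partial table holding just the tokens that appear elsewhere in this article, not the full keyword list.

```python
# Partial token table: only codes mentioned in this article.
SOME_TOKENS = {
    0xB4: "TAN",
    0xC4: "BIN",
    0xC7: "<=",
    0xC8: ">=",
    0xC9: "<>",
    0xD4: "CLOSE #",
    0xE4: "DATA",
    0xF4: "POKE",
}


def classify_high_code(code):
    """Describe a Spectrum character code in the range 0x80-0xFF."""
    if not 0x80 <= code <= 0xFF:
        raise ValueError("0x%02X is not a high-bit code" % code)
    if code <= 0x8F:
        return "block graphic"
    if code <= 0xA4:
        # The 21 UDGs are lettered A to U in code order.
        return "UDG " + chr(ord("A") + code - 0x90)
    # 0xA5-0xFF are tokens; name them when the partial table knows them.
    return "token " + SOME_TOKENS.get(code, "(not in this partial table)")
```

With this sketch, classify_high_code(0x94) returns "UDG E" and classify_high_code(0xC9) returns "token <>".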
The default bitmaps for the printable characters (32 (space) to 127 (copyright)) are stored at the end of the Spectrum's ROM, at memory addresses 15616 (0x3D00) to 16383 (0x3FFF), eight bytes per character. They are referenced by the system variable CHARS, which is held at memory addresses 23606 and 23607. The value in CHARS is 256 bytes lower than the first byte of the space character, so the bitmap of a printable character can be found by adding eight times its code to CHARS, without first subtracting the 32 control codes. By default, therefore, CHARS holds the address 15360 (0x3C00).
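The address arithmetic can be written out as follows. This is a sketch under stated assumptions: each glyph is taken to occupy eight bytes (which follows from 96 characters filling 0x3D00 to 0x3FFF), the two-byte variable is read low byte first as is usual on the Z80, and "memory" is a hypothetical 64 KB byte array standing in for the machine's address space.

```python
CHARS_SYSVAR = 23606     # address of the two-byte CHARS system variable
DEFAULT_CHARS = 0x3C00   # 15360, the default value described above
BYTES_PER_GLYPH = 8      # assumed: eight pixel rows, one byte per row


def read_chars(memory):
    """Read CHARS from a memory image (low byte first)."""
    return memory[CHARS_SYSVAR] | (memory[CHARS_SYSVAR + 1] << 8)


def glyph_address(code, chars=DEFAULT_CHARS):
    """Address of the first bitmap byte for a printable character code."""
    if not 0x20 <= code <= 0x7F:
        raise ValueError("0x%02X is not a printable code" % code)
    # No subtraction of 32 is needed because CHARS already points
    # 32 * 8 = 256 bytes below the space character.
    return chars + BYTES_PER_GLYPH * code
```

With the default value, glyph_address(0x20) is 15616 and the last byte of the copyright sign is glyph_address(0x7F) + 7, which is 16383, matching the range given above.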
The UDG characters (Gr-A to Gr-U) are stored at the end of the Spectrum's RAM, at memory addresses 65368 (0xFF58) to 65535 (0xFFFF). A POKE to this address range changes the UDG characters used in subsequent PRINT statements (though not any UDG characters already drawn to the screen). The USR keyword, when followed by a one-character string such as "a", returns the address of the corresponding UDG and so provides a quick way to reference these addresses from BASIC. As with the printable characters, the location of the UDG bitmaps is held in a system variable, UDG.
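The same arithmetic applies to the UDG area. The sketch below models what USR with a one-letter string computes and what a sequence of POKEs does to a glyph; it assumes eight bytes per UDG (21 characters filling 0xFF58 to 0xFFFF) and uses a hypothetical byte array "memory" in place of real RAM. The names udg_address and poke_udg are illustrative, not Spectrum routines.

```python
DEFAULT_UDG = 0xFF58     # 65368, start of the UDG area given above
BYTES_PER_GLYPH = 8      # assumed: eight pixel rows, one byte per row


def udg_address(letter, udg_base=DEFAULT_UDG):
    """Address of the first bitmap byte of UDG 'A'..'U' (case-insensitive)."""
    index = ord(letter.upper()) - ord("A")
    if len(letter) != 1 or not 0 <= index <= 20:
        raise ValueError("UDG letters run from A to U")
    return udg_base + BYTES_PER_GLYPH * index


def poke_udg(memory, letter, rows, udg_base=DEFAULT_UDG):
    """Write eight row bytes into the UDG's bitmap, like eight POKEs."""
    if len(rows) != BYTES_PER_GLYPH:
        raise ValueError("a UDG needs exactly eight row bytes")
    start = udg_address(letter, udg_base)
    for offset, row in enumerate(rows):
        memory[start + offset] = row & 0xFF
```

Here udg_address("a") gives 65368, and the final byte of Gr-U is udg_address("u") + 7, which is 65535.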
The final two UDG characters (Gr-T and Gr-U) are not available on the 128K Spectrums (except in the backward-compatible 48K mode), where they are replaced with two new BASIC keywords: SPECTRUM and PLAY. A side-effect of this is that some older games do not work properly, displaying the keywords SPECTRUM and PLAY instead of their intended graphics.
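The effect on older software can be illustrated by interpreting the same code in the two modes. This is a hedged sketch: describe_udg_code is a hypothetical helper, and it assumes the two keywords occupy 0xA3 and 0xA4, the codes of Gr-T and Gr-U.

```python
# Codes that 128K BASIC reuses for keywords (assumed: 0xA3 and 0xA4).
KEYWORDS_128K = {0xA3: "SPECTRUM", 0xA4: "PLAY"}


def describe_udg_code(code, mode_128k):
    """Say what a code in 0x90-0xA4 displays as in 48K or 128K BASIC."""
    if not 0x90 <= code <= 0xA4:
        raise ValueError("0x%02X is not in the UDG range" % code)
    if mode_128k and code in KEYWORDS_128K:
        return KEYWORDS_128K[code]
    return "UDG " + chr(ord("A") + code - 0x90)
```

A game that prints code 0xA4 expecting its own graphic gets "UDG U" in 48K mode but the keyword PLAY in 128K BASIC.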
Control codes 
In the control code area (the C0 range), the Spectrum uses its own proprietary controls, such as INK and PAPER to set the foreground and background colour. The only similarities to ASCII are cursor-left at 0x08 (ASCII Backspace) and ENTER at 0x0D (ASCII Carriage Return), which also generates an automatic line feed. Cursor-down, 0x0A (ASCII Line Feed), can be simulated by printing 32 spaces with OVER 1 (transparent overprint), and cursor-up, 0x0B (ASCII Vertical Tabulation), can be simulated with 32 backspaces. A fault in the system ROM prevents cursor-right, 0x09 (ASCII Horizontal Tabulation), from working.
Control code 0x0E indicates that a floating-point number follows, which speeds up program execution because numbers need not be re-parsed from text. In a Sinclair BASIC program, the ASCII digits of each number are followed by a 0x0E byte and then a 5-byte representation of the number in binary floating-point format. When listing a BASIC program, the LIST command skips the marker and these 5 bytes, but when the program is run the 5-byte representation is used and the text part is ignored. Some Spectrum programs used this behaviour to hide the real numbers from the user. For example, a BASIC line could contain the keyword GO TO and the ASCII digits 10, followed by a 0x0E byte and the floating-point representation of 100. Anyone listing the program would see the number 10, but when executed the program would jump to line 100.
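The hiding trick can be sketched by building such a statement and reading it back in the two ways described. Several details here are assumptions not stated above: GO TO is taken to be token 0xEC, and whole numbers from 0 to 65535 are taken to use the short integer form of the 5-byte format (a zero byte, a sign byte, the value low byte first, and a final zero byte). All function names are hypothetical.

```python
NUMBER_MARKER = 0x0E
GO_TO = 0xEC  # assumed token code for the GO TO keyword


def small_int_bytes(value):
    """Assumed 5-byte integer form for a whole number 0..65535."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("only 0..65535 is handled in this sketch")
    return bytes([0x00, 0x00, value & 0xFF, value >> 8, 0x00])


def make_number(shown, actual):
    """ASCII digits as listed, then the marker and the value actually used."""
    return shown.encode("ascii") + bytes([NUMBER_MARKER]) + small_int_bytes(actual)


def visible_bytes(statement):
    """What a listing walks over: each marker and its 5 bytes are skipped."""
    out = bytearray()
    i = 0
    while i < len(statement):
        if statement[i] == NUMBER_MARKER:
            i += 6  # the marker itself plus the 5-byte number
        else:
            out.append(statement[i])
            i += 1
    return bytes(out)


def hidden_values(statement):
    """What execution uses: the embedded numbers (integer form only)."""
    values = []
    i = 0
    while i < len(statement):
        if statement[i] == NUMBER_MARKER:
            body = statement[i + 1:i + 6]
            if len(body) != 5 or body[0] != 0 or body[1] != 0 or body[4] != 0:
                raise ValueError("not a non-negative integer in short form")
            values.append(body[2] | (body[3] << 8))
            i += 6
        else:
            i += 1
    return values
```

For the statement bytes([GO_TO]) + make_number("10", 100), visible_bytes returns the token followed by the digits "10", while hidden_values returns [100].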
Undefined codes 
Ranges 0x00–0x05, 0x07, 0x0A–0x0C, 0x0F and 0x17–0x1F are undefined.
Codepage layout 
|Spectrum Character Set|
|0x keypress||0x character||1x||2x||3x||4x||5x||6x||7x||8x||9x||Ax||Bx||Cx||Dx||Ex||Fx|
|x4||true video|| ||INVERSE||$||4||D||T||d||t||(block graphic)||(E)||(U)5||TAN||BIN||CLOSE #||DATA||POKE|
Characters shown as (X) are User Defined Graphics (UDGs).
1 In the standard ROM, CHR$ 8 fails when backing up from line 1 to line 0, and when backing off line 0.
2 In the standard ROM, CHR$ 9 does not actually move the text output position.
3 Used in BASIC programs as an inline marker prefixing a 5-byte floating-point number. It is not a printable character or a control code.
4 SPECTRUM in 128K BASIC.
5 PLAY in 128K BASIC.
See also 
- ZX Spectrum manual, Appendix A, the character set
- ZX Spectrum manual, Chapter 25, the system variables
- Sinclair Spectrum+ 48K Character Set From Michael Zaretski's website
- Mapping table from Sinclair Spectrum+ 48K Character Set to Unicode From the same site
- The floating point package