ZX Spectrum character set
The ZX Spectrum character set is the variant of ASCII used in the British Sinclair ZX Spectrum family of computers. It is based on ASCII-1967, but the characters ^, ` and DEL are replaced with ↑, £ and ©. It also differs in its use of the C0 control codes other than the common BS and CR, and it makes use of the 128 high-bit characters beyond the ASCII range. The ZX Spectrum's main set of printable characters and its system font are also used by the Jupiter Ace computer.
Standard US-ASCII, 0x20–0x7F, is included in the Spectrum character set, except that code point 0x5E is an up-arrow (↑) instead of a caret (^), 0x60 is the pound sign (£) instead of the grave accent (`), and 0x7F is the copyright sign (©) instead of the control character DEL. The use of 0x5E as ↑ also occurs in the older 1963 version of ASCII. The £ sign was not mapped to 0x23 as in the British variant of ASCII (ISO-646-GB), so both the pound sign and the number sign (#) are available simultaneously. The ↑ character is the exponentiation operator in the Spectrum's BASIC, just as the ^ that it replaces is used for exponentiation in many other programming languages.
Beyond 0x7F, the Spectrum character set uses the high-bit range 0x80–0xFF for special purposes. Code points 0x80–0x8F contain the same 2×2 block graphics characters that the ZX80 and ZX81 character sets have (at other locations), which are also available in the Block Elements Unicode block. However, the ZX Spectrum does not by default include their 50% dithered 1×2 block graphics characters. Code points 0x90–0xA4 originally contain the 21 User-Defined Graphics (UDG) characters, and 0xA5–0xFF contain BASIC keywords tokenized as single characters. In the 128 BASIC mode introduced later, this was changed to 19 UDG characters ending at 0xA2, followed by the two new tokens SPECTRUM and PLAY. Code points 0xC7–0xC9 are the two-character operators <=, >= and <>, similarly tokenized into single code points. These tokens allow a BASIC command to be stored as a single code point.
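The range layout described above can be sketched as a small classifier. This is only an illustration: the function name `classify`, the `basic128` flag and the returned labels are hypothetical, and only the range boundaries come from the text.

```python
def classify(code: int, basic128: bool = False) -> str:
    """Name the range a Spectrum code point falls in (labels are illustrative)."""
    if not 0 <= code <= 0xFF:
        raise ValueError("code point must fit in one byte")
    # 21 UDGs end at 0xA4 originally; 128 BASIC keeps 19, ending at 0xA2.
    last_udg = 0xA2 if basic128 else 0xA4
    if code < 0x20:
        return "control"
    if code <= 0x7F:
        return "printable"
    if code <= 0x8F:
        return "block graphic"
    if code <= last_udg:
        return "udg"
    return "token"
```

For instance, 0xA4 is classified as a UDG by default and as a token when `basic128=True`, which matches the change described for 128 BASIC mode.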
All non-UDG Spectrum characters can be mapped to Unicode. The three characters that differ from ASCII-1967, namely ↑, £ and ©, are at U+2191, U+00A3 and U+00A9. The 2×2 block graphics characters are in the Block Elements block at U+2580–U+259F, although font support for that block is not universal.
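As a rough sketch, the mapping of the printable range and the block graphics to Unicode might look like the following. The names `to_unicode`, `SUBSTITUTIONS` and `BLOCKS` are hypothetical. The order of the block graphics assumes the usual Spectrum bit layout (bit 0 top right, bit 1 top left, bit 2 bottom right, bit 3 bottom left), which is not stated in the text, and a plain space is used for the empty block at 0x80. Control codes, UDGs and keyword tokens are replaced with U+FFFD here rather than expanded.

```python
# Block graphics 0x80-0x8F, assuming bit 0 = top right, bit 1 = top left,
# bit 2 = bottom right, bit 3 = bottom left (an assumption, see lead-in).
BLOCKS = " ▝▘▀▗▐▚▜▖▞▌▛▄▟▙█"

# The three code points that differ from ASCII-1967.
SUBSTITUTIONS = {0x5E: "\u2191", 0x60: "\u00A3", 0x7F: "\u00A9"}

def to_unicode(data: bytes) -> str:
    """Convert Spectrum text bytes to a Unicode string (sketch only)."""
    out = []
    for b in data:
        if b in SUBSTITUTIONS:
            out.append(SUBSTITUTIONS[b])
        elif 0x20 <= b <= 0x7F:
            out.append(chr(b))           # identical to ASCII
        elif 0x80 <= b <= 0x8F:
            out.append(BLOCKS[b - 0x80])
        else:
            out.append("\uFFFD")         # controls, UDGs, tokens: not handled
    return "".join(out)
```

With these assumptions, the bytes of `2^3` come out as `2↑3`, since 0x5E is the up-arrow on the Spectrum.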
The shapes of the UDG characters are stored in RAM and are initialized to copies of the characters A–U, but they can be redefined arbitrarily, for example using the BASIC command POKE. Like all characters in the system font, they use an 8×8 pixel grid stored in 8 bytes. Redefining them changes their appearance in subsequent printing. The address of a UDG's definition is returned by USR with the character as the argument, e.g. USR "A" for the first one. By default this points to the last 168 (21×8) bytes of RAM, at memory addresses 65368 (0xFF58) to 65535 (0xFFFF) on a 48K Spectrum. The location is held in the system variable UDG, which can also be updated if required.
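The address arithmetic above can be illustrated with a short sketch. The helper names (`udg_address`, `pack_glyph`, `poke_list`) and the `#`/`.` row notation are hypothetical. The code assumes the default 48K base address given in the text and that the leftmost pixel of each row is the most significant bit of its byte.

```python
UDG_DEFAULT = 65368  # 0xFF58, default UDG base on a 48K Spectrum (from the text)
UDG_COUNT = 21       # UDGs "A" to "U"

def udg_address(letter: str, udg_base: int = UDG_DEFAULT) -> int:
    """Address of the first byte of a UDG, mirroring what USR "A" etc. return."""
    if len(letter) != 1:
        raise ValueError("expected a single letter")
    index = ord(letter.upper()) - ord("A")
    if not 0 <= index < UDG_COUNT:
        raise ValueError("UDG letters run from A to U")
    return udg_base + 8 * index

def pack_glyph(rows: list[str]) -> bytes:
    """Pack eight rows of eight '#'/'.' pixels into 8 bytes (leftmost pixel = bit 7)."""
    if len(rows) != 8 or any(len(r) != 8 for r in rows):
        raise ValueError("a glyph is an 8x8 grid")
    return bytes(int("".join("1" if c == "#" else "0" for c in r), 2) for r in rows)

def poke_list(letter: str, rows: list[str], udg_base: int = UDG_DEFAULT):
    """(address, value) pairs that would redefine the UDG, one POKE per row."""
    start = udg_address(letter, udg_base)
    return [(start + i, value) for i, value in enumerate(pack_glyph(rows))]

# Example glyph: an upward-pointing arrow.
ARROW = [
    "...##...",
    "..####..",
    ".######.",
    "########",
    "...##...",
    "...##...",
    "...##...",
    "...##...",
]
```

Each pair from `poke_list("A", ARROW)` corresponds to one `POKE address, value` statement in BASIC. The last UDG, "U", occupies the final eight bytes of RAM under the default base.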
The definition of the main system font (code points 32 (space) to 127 (copyright)) is referenced by the system variable CHARS, which is stored at memory addresses 23606/7 (0x5C36/7). It is defined as 256 bytes lower than the first byte of the space character, which simplifies the formula for locating a character to CHARS+8×codepoint. CHARS defaults to 15360 (0x3C00), with the system font at the end of the Spectrum's ROM at addresses 15616 (0x3D00) to 16383 (0x3FFF). Entire alternative fonts can be loaded into RAM and the CHARS variable re-pointed accordingly.
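The lookup formula can be checked with a few lines of code. This is a minimal sketch: `glyph_address` and `read_chars` are hypothetical helper names, and `read_chars` assumes a 64 KB memory image in which the two-byte variable is stored low byte first, as is usual on the Z80.

```python
CHARS_ADDR = 23606     # 0x5C36, address of the CHARS system variable (from the text)
CHARS_DEFAULT = 15360  # 0x3C00, default value of CHARS (from the text)

def glyph_address(code: int, chars: int = CHARS_DEFAULT) -> int:
    """Address of the first of the 8 bytes defining a character: CHARS + 8 * code."""
    if not 0x20 <= code <= 0x7F:
        raise ValueError("the system font covers code points 32 to 127")
    return chars + 8 * code

def read_chars(memory: bytes) -> int:
    """Read CHARS from a memory image, assuming little-endian storage."""
    return memory[CHARS_ADDR] | (memory[CHARS_ADDR + 1] << 8)
```

With the default value, the space character (32) starts at 15616 and the copyright sign (127) ends at 16383, which agrees with the ROM addresses quoted above.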
In the control codes area (the C0 range), the Spectrum mostly uses proprietary controls, such as INK and PAPER to control foreground and background colour. However, the common BS and CR code points are the same as in ASCII. Cursor-down (0x0A, ASCII Line Feed) can be simulated with 32 spaces printed with OVER 1 (transparent overprint), and cursor-up (0x0B, ASCII Vertical Tabulation) can be simulated with 32 backspaces. The system ROM has a fault which prevents cursor-right at 0x09 (cf. ASCII Horizontal Tabulation) from working.
Control code 0x0E is used to indicate that a floating-point number follows, to accelerate text processing. In a Sinclair BASIC program, numeric constants are stored as ASCII followed by a 0x0E byte and a 5-byte binary floating-point representation. When a BASIC program is listed, only the ASCII part is used, but at runtime only the binary representation is used. Some Spectrum programs exploited this to obfuscate numbers, while others did so to save memory. For example, a BASIC line displayed as GO TO 10 could contain the ASCII characters for the digits 1 and 0 followed by a 0x0E byte and the floating-point representation of 100 instead of 10. Anyone listing that program would see the number 10, but when executed the program would jump to line 100.
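To make the obfuscation trick concrete, the sketch below separates the listed text of a line from the hidden binary values and decodes them. The function names are hypothetical. The 5-byte layout is not given in the text and is assumed here from Sinclair's documentation: a first byte of zero marks a small integer (sign byte, 16-bit little-endian value, zero byte); otherwise the first byte is the exponent plus 128 and the remaining four bytes are a big-endian mantissa in [0.5, 1) whose top bit is replaced by the sign. The scan is naive and does not treat string literals or REM statements specially.

```python
NUMBER_MARKER = 0x0E  # prefix of the hidden 5-byte number (from the text)

def split_numbers(body: bytes) -> tuple[bytes, list[bytes]]:
    """Split a line body into the listed bytes and the hidden 5-byte values."""
    visible = bytearray()
    hidden = []
    i = 0
    while i < len(body):
        if body[i] == NUMBER_MARKER:
            raw = body[i + 1:i + 6]
            if len(raw) != 5:
                raise ValueError("truncated number after 0x0E marker")
            hidden.append(bytes(raw))
            i += 6  # skip the marker and its 5 payload bytes
        else:
            visible.append(body[i])
            i += 1
    return bytes(visible), hidden

def decode_number(raw: bytes) -> float:
    """Decode a 5-byte Sinclair number under the layout assumed in the lead-in."""
    if len(raw) != 5:
        raise ValueError("expected exactly 5 bytes")
    if raw[0] == 0:
        # Small-integer form: 00, sign (00 or FF), low byte, high byte, 00.
        value = raw[2] | (raw[3] << 8)
        return float(value - 65536 if raw[1] == 0xFF else value)
    mantissa = int.from_bytes(raw[1:5], "big")
    sign = -1.0 if mantissa & 0x80000000 else 1.0
    mantissa |= 0x80000000  # restore the implied leading 1 bit
    return sign * (mantissa / 2**32) * 2.0 ** (raw[0] - 128)
```

Applied to the bytes for the digits `10` followed by 0x0E and `00 00 64 00 00`, the listed text is `10` while the decoded runtime value is 100, which is the mismatch described above.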
Code points 0x00–0x05, 0x07, 0x0A–0x0C, 0x0F and 0x18–0x1F are undefined. In most cases, they produce a question mark if printed to the display. However, they may be used to represent their literal numeric values in conjunction with certain control codes: for example, INK followed by 0x07 sets the ink (foreground text) colour to colour number 7 (white).
Spectrum character set
| | 0_ keypress | 0_ character | 1_ | 2_ | 3_ | 4_ | 5_ | 6_ | 7_ | 8_ | 9_ | A_ | B_ | C_ | D_ | E_ | F_ |
| _4 | true video | | INVERSE | $ | 4 | D | T | d | t | ▗ | (E)[b] | (U)[d] | TAN | BIN | CLOSE # | DATA | POKE |
- [a] Different from US-ASCII.
- [b] UDG (User-Defined Graphics) character.
- [c] UDG in 48 BASIC, keyword SPECTRUM in 128 BASIC.
- [d] UDG in 48 BASIC, keyword PLAY in 128 BASIC.
- [e] In the standard ROM, CHR$ 8 fails when backing from line 1 to line zero, and when backing off line zero.
- [f] In the standard ROM, CHR$ 9 does not actually move the text output position.
- [g] Used in BASIC programs as a marker prefixing a 5-byte floating-point number.