WDC 65C02

W65C02S microprocessor in a PDIP-40 package.

The Western Design Center (WDC) 65C02 microprocessor is an enhanced CMOS version of the popular NMOS-based 8-bit MOS Technology 6502. The 65C02 fixed several problems in the original 6502 and added a small number of new instructions. However, its main feature was greatly reduced power consumption, on the order of 10 to 20 times lower than that of the 6502 running at the same speed.[1] This made it useful in portable computer roles and microcontroller systems in industrial settings. It has been used in some home computers, as well as in embedded applications, including medical-grade implanted devices.

Development began in 1981[a] when it was known as the 65802.[2] The first sample versions were released in early 1983.[b] WDC licensed the design to Synertek, NCR, GTE, and Rockwell Semiconductor. Rockwell, whose primary interest was the embedded market, asked for several new instructions to be added to aid in this role. These were later copied back into the baseline version, at which point WDC added two new instructions of its own to create the W65C02. Sanyo later licensed the design as well, and Seiko Epson produced a further modified version as the HuC6280.

Early versions used 40-pin DIP packaging, and were available in 1, 2 and 4 MHz versions. Later versions were produced in PLCC and QFP packages and ran at higher speeds. The latest version from WDC, the W65C02S-14, runs at speeds up to 14 MHz.

Introduction and features

The 65C02 is a low cost, general-purpose 8-bit microprocessor (8-bit registers and data bus) with a 16-bit program counter and address bus. The register set is small, with a single 8-bit accumulator (A), two 8-bit index registers (X and Y), an 8-bit status register (P), and a 16-bit program counter (PC). In addition to the single accumulator, the first 256 bytes of RAM, the "zero page" ($0000 to $00FF), allow faster access through a dedicated addressing mode that requires a single byte of address instead of two. The stack lies in the next 256 bytes, page one ($0100 to $01FF), and cannot be moved or extended. The stack grows downwards with the stack pointer (S) starting at $01FF and decrementing as the stack grows.[3] It has a variable-length instruction set, varying between one and three bytes per instruction.[1]
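The stack behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not WDC code: the class name `Stack` and its methods are hypothetical, and it assumes, as the text states, that S starts at $FF so that the first push lands at $01FF.

```python
STACK_PAGE = 0x0100  # the stack is fixed in page one ($0100-$01FF)


class Stack:
    """Toy model of the 65C02 hardware stack (illustrative, not cycle-accurate)."""

    def __init__(self):
        self.memory = bytearray(0x10000)  # 64 KB address space
        self.s = 0xFF                     # 8-bit stack pointer (assumed start value)

    def push(self, value):
        # Store at $0100 + S, then decrement S, wrapping within 8 bits.
        self.memory[STACK_PAGE + self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF

    def pull(self):
        # Increment S first, then read from $0100 + S.
        self.s = (self.s + 1) & 0xFF
        return self.memory[STACK_PAGE + self.s]


stack = Stack()
stack.push(0x12)
stack.push(0x34)
print(hex(STACK_PAGE + stack.s))  # 0x1fd: next free slot after two pushes
print(hex(stack.pull()))          # 0x34: the last value pushed comes off first
```

Because S is only 8 bits wide, it wraps around within page one, which is why the stack cannot be moved or extended.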

The basic architecture of the 65C02 is identical to that of the original 6502, and the chip can be considered a low-power implementation of that design. At 1 MHz, the most popular speed for the original 6502, the 65C02 requires only 20 mW, while the original uses 450 mW, a reduction of more than twenty times.[4] The manually optimized core and low power use are intended to make the 65C02 well suited for low-power system-on-chip (SoC) designs.[1]

A Verilog hardware description model is available for designing the W65C02S core into an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA).[5] As is common in the semiconductor industry, WDC offers a development system, which includes a developer board, an in-circuit emulator (ICE) and a software development system.[6]

The W65C02S–14 is the production version as of 2019, and is available in PDIP, PLCC and QFP packages. The maximum officially supported ϕ2 (primary) clock speed is 14 MHz, indicated by the –14 part number suffix. The "S" designation indicates that the part has a fully static core, a feature that allows ϕ2 to be slowed down or fully stopped in either the high or low state with no loss of data.[7] Typical microprocessors not implemented in CMOS have dynamic cores and will lose their internal register contents (and thus crash) if they are not continuously clocked at a rate between some minimum and maximum specified values.

65C02 registers

  Register  Width    Name
  A         8 bits   Accumulator
  X         8 bits   X Index Register
  Y         8 bits   Y Index Register
  SP        8 bits   Stack Pointer (upper address byte fixed at 00000001, i.e. page one)
  PC        16 bits  Program Counter
  P         8 bits   Program Status Register, flags N V - B D I Z C (bit 7 down to bit 0)

General logic features

Die photograph of a Sitronix ST2064B microcontroller showing embedded W65C02S core in the upper right

Logic features

  • Vector pull (VPB) output indicates when interrupt vectors are being addressed
  • Memory lock (MLB) output indicates to other bus masters when a read-modify-write instruction is being processed
  • WAit-for-Interrupt (WAI) and SToP (STP) instructions reduce power consumption, decrease interrupt latency and enable synchronization with external events

Electrical features

  • Supply voltage specified at 1.71 V to 5.25 V
  • Current consumption (core) of 0.15 and 1.5 mA per MHz at 1.89 V and 5.25 V respectively
  • Variable-length instruction set, which allows smaller code than fixed-length instruction sets and thereby saves power
  • Fully static circuitry allows stopping the clock to conserve power

Clocking features

The W65C02S may be operated at any convenient supply voltage (VDD) between 1.8 and 5 volts (±5%). The data sheet AC characteristics table lists operational characteristics at 5 V at 14 MHz, 3.3 V or 3 V at 8 MHz, 2.5 V at 4 MHz, and 1.8 V at 2 MHz. This information may be an artifact of an earlier data sheet, as a graph indicates that typical devices are capable of operation at higher speeds than suggested by the AC characteristics table, and that reliable operation at 20 MHz should be readily attainable with VDD at 5 volts, assuming the supporting hardware will allow it.

The W65C02S's support for arbitrary clock rates allows it to use a clock that runs at a rate ideal for some other part of the system, such as 13.5 MHz (digital SDTV luma sampling rate), 14.31818 MHz (NTSC colour carrier frequency × 4), 14.75 MHz (PAL square pixels), 14.7456 MHz (baud rate crystal), etc., as long as VDD is sufficient to support the frequency. Designer Bill Mensch has pointed out that FMAX is affected by off-chip factors, such as the capacitive load on the microprocessor's pins. Minimizing load by using short signal tracks and the fewest possible devices helps raise FMAX. The PLCC and QFP packages have less pin-to-pin capacitance than the PDIP package, and are more economical in their use of printed circuit board space.

WDC has reported that FPGA realizations of the W65C02S have been successfully operated at 200 MHz.

Comparison with the NMOS 6502

Basic architecture

Although the 65C02 can mostly be thought of as a low-power 6502, it also fixes several bugs found in the original and adds new opcodes that can help increase code density. It is estimated that the average 6502 assembly program can be made 10 to 15% smaller on the 65C02 and see a similar improvement in performance through avoided memory accesses.[1]

Undocumented instructions removed

The original 6502 had 56 instructions, which, when combined with different addressing modes, produced a total of 151 opcodes of the possible 256 8-bit patterns. The remaining 105 opcodes were undefined. Opcodes whose low-order 4 bits were 3, 7, B or F were left entirely unused, and those ending in 2 included only a single defined opcode.[8]

The 6502 was famous for the way that some of these leftover codes actually performed actions. Due to the way the 6502's instruction decoder worked, simply setting certain bits in the opcode would cause parts of the instruction processing to take place. Some of these opcodes would immediately crash the processor, while others performed useful functions and were even given unofficial assembler mnemonics by users.[9]

The 65C02 added new opcodes that occupied some of these previously "undocumented instruction" slots; for instance, $FF was now used for the new BBS instruction (see below). Those that remained truly unused were set to perform NOPs. Programs that took advantage of the undocumented codes will not work on the 65C02, but these codes were always documented as non-operational and should not have been used.[1]

Bug fixes

The original 6502 had several bugs when initially launched. Among the most notorious was that the ROR (rotate right) instruction was broken due to a problem in the chip. MOS addressed this by not documenting the instruction. This bug was fixed early in the production run and was generally not an issue for the vast majority of machines using the processor.[10]

In contrast, another bug that remained in the design for its lifetime involved the commonly used jump instruction, JMP, when using indirect addressing. In this mode, the target address of the JMP was looked up in another memory location. For instance, JMP ($1234) would first fetch the values in memory locations $1234 and $1235, and then use those 16 bits as the actual memory location to jump to. However, if the initial address ended in $FF, the boundary of a memory page, the JMP took the most significant byte of the 16-bit address from offset $00 of the original page rather than offset $00 of the next page. So for instance, JMP ($12FF) would get the first byte at $12FF and the second, incorrectly, from $1200 rather than $1300. This was fixed in the 65C02.[1]
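The page-boundary behaviour of indirect JMP can be illustrated with a short Python model. This is a sketch under stated assumptions rather than a description of the actual circuitry: the function name `jmp_indirect_target` and the `cmos` flag are hypothetical, and the byte stored at $1200 is an arbitrary value chosen to make the difference visible.

```python
def jmp_indirect_target(memory, pointer, cmos=True):
    """Return the address that JMP (pointer) would jump to.

    cmos=True models the corrected 65C02 behaviour; cmos=False models the
    NMOS 6502, where the high byte is fetched from the same page.
    """
    lo = memory[pointer]
    if cmos:
        hi_addr = (pointer + 1) & 0xFFFF
    else:
        # Only the low byte of the pointer is incremented, so $xxFF wraps to $xx00.
        hi_addr = (pointer & 0xFF00) | ((pointer + 1) & 0x00FF)
    hi = memory[hi_addr]
    return (hi << 8) | lo


memory = bytearray(0x10000)
memory[0x12FF] = 0x34  # low byte of the intended target $1234
memory[0x1300] = 0x12  # high byte, where the programmer expects it
memory[0x1200] = 0x56  # arbitrary byte the NMOS part picks up by mistake

print(hex(jmp_indirect_target(memory, 0x12FF, cmos=True)))   # 0x1234
print(hex(jmp_indirect_target(memory, 0x12FF, cmos=False)))  # 0x5634
```

For pointers that do not end in $FF, both variants of the model return the same target.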

Another bug in the original 6502 concerned the behaviour of the (D)ecimal flag in the status register, which was left undefined after a reset or interrupt. This meant that programmers were forced to set the flag to a known value, and one finds a CLD instruction (CLear Decimal) in almost all interrupt handlers. The 65C02 automatically clears this flag on reset and, on an interrupt, after pushing the status register onto the stack.[11]

A related problem occurred while operating in decimal mode, where the (N)egative, o(V)erflow and (Z)ero flags were not updated properly, left with the values of the underlying binary calculations, not the decimal values. There were ways to address this in code, but only at the cost of additional instructions. The 65C02 addresses this problem and sets these flags correctly, at the cost of a single clock cycle.[11]

New addressing modes

The 6502 has two indirect addressing modes which dereference through 16-bit addresses stored in page zero:

  • Indexed indirect, e.g. LDA ($10,X), adds the X register to the given page zero address before reading the 16-bit vector, useful when there is an array of pointers in page zero.
  • Indirect indexed, e.g. LDA ($10),Y, adds the Y register to the 16-bit vector read from the given page zero address, performing pointer-offset addressing.

A downside of this model is that if indexing is not needed, one of the index registers must still be set to zero and used. The 65C02 added a non-indexed indirect addressing mode LDA ($10) to all instructions that used indexed indirect and indirect indexed modes, freeing up the index registers.[12]
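The three page-zero indirect modes differ only in where the index is added, which a small Python model can make concrete. The function names are hypothetical, and the sketch assumes that pointer fetches wrap around within page zero.

```python
def read_vector(memory, zp_addr):
    """Read a 16-bit little-endian pointer from page zero (assumed to wrap)."""
    lo = memory[zp_addr & 0xFF]
    hi = memory[(zp_addr + 1) & 0xFF]
    return (hi << 8) | lo


def indexed_indirect(memory, zp_addr, x):
    """Effective address for LDA ($10,X): add X, then read the vector."""
    return read_vector(memory, (zp_addr + x) & 0xFF)


def indirect_indexed(memory, zp_addr, y):
    """Effective address for LDA ($10),Y: read the vector, then add Y."""
    return (read_vector(memory, zp_addr) + y) & 0xFFFF


def zp_indirect(memory, zp_addr):
    """Effective address for the 65C02's LDA ($10): no index involved."""
    return read_vector(memory, zp_addr)


memory = bytearray(0x10000)
memory[0x10], memory[0x11] = 0x00, 0x20  # vector at $10 points to $2000
memory[0x14], memory[0x15] = 0x00, 0x30  # vector at $14 points to $3000

print(hex(indexed_indirect(memory, 0x10, 4)))  # 0x3000
print(hex(indirect_indexed(memory, 0x10, 4)))  # 0x2004
print(hex(zp_indirect(memory, 0x10)))          # 0x2000
```

On the NMOS 6502 the last result could only be obtained by loading X or Y with zero first, which is the overhead the new mode removes.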

The 6502's JMP instruction had a unique (among 6502 instructions) addressing mode known as "absolute indirect" that read a 16-bit value from a given memory address and then jumped to the address in that 16-bit value. For instance, if memory location $A000 holds $34 and $A001 holds $12, JMP ($A000) would read those two bytes, construct the value $1234, and then jump to that location.

One common use for indirect addressing is to build branch tables, lists of entry points for subroutines that can be accessed using an index. For instance, a device driver might list the entry points for OPEN, CLOSE, READ, etc. in a table at $A000. To access the READ function, one would use something similar to JMP ($A004), as it is the 3rd entry in the table (index 2) and each entry is 16 bits. If the driver is updated and the subroutine code moves in memory, any existing code will still work as long as it uses the table to look up the entry points.

The 65C02 added the new "indexed absolute indirect" mode which eased the use of branch tables. This mode added the value of the X register to the absolute address and took the 16-bit address from the resulting location. For instance, to access the READ function from the table above, one would store 4 in X, then JMP ($A000,X). This style of access makes accessing branch tables simpler as a single base address is used in conjunction with an 8-bit value.[12]
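The branch-table lookup performed by JMP ($A000,X) can be modelled as follows. This is a minimal sketch: the function name is hypothetical, and the routine addresses placed in the table ($C100, $C180, $C200) are made up purely for illustration.

```python
def jmp_indexed_absolute_indirect(memory, base, x):
    """Return the target of JMP (base,X): read a 16-bit vector at base + X."""
    pointer = (base + x) & 0xFFFF
    lo = memory[pointer]
    hi = memory[(pointer + 1) & 0xFFFF]
    return (hi << 8) | lo


TABLE_BASE = 0xA000
# Hypothetical entry points for the driver routines named in the text.
entries = {"OPEN": 0xC100, "CLOSE": 0xC180, "READ": 0xC200}

memory = bytearray(0x10000)
for index, target in enumerate(entries.values()):
    memory[TABLE_BASE + 2 * index] = target & 0xFF     # low byte first
    memory[TABLE_BASE + 2 * index + 1] = target >> 8   # then high byte

# READ is entry index 2 and each entry is two bytes, so X = 4.
print(hex(jmp_indexed_absolute_indirect(memory, TABLE_BASE, 4)))  # 0xc200
```

Since X is an 8-bit register and each entry occupies two bytes, a single table addressed this way can hold up to 128 entries.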

New and modified instructions

In addition to the new addressing modes, the "base model" 65C02 also added a set of new instructions.[13]

  • INC and DEC with no parameters now increment or decrement the accumulator. This was an odd oversight in the original instruction set, which only included INX/DEX, INY/DEY and INC addr/DEC addr. Some assemblers use the alternate forms INA/DEA or INC A/DEC A.[13]
  • STZ addr, STore Zero in addr. Replaces the need for LDA #0; STA addr and doesn't change the value in the accumulator. As this task is common in most programs, using STZ can reduce code size, both by eliminating the LDA and by removing any code needed to save the value of the accumulator, typically a PHA/PLA pair.[14]
  • PHX, PLX, PHY, PLY, push and pull the X and Y registers to/from the stack. Previously, only the accumulator and status register had push and pull instructions. X and Y could only be stacked by moving them to the accumulator first with TXA or TYA, thereby corrupting the accumulator contents, then using PHA.[15]
  • BRA, branch always. Operates like a JMP but uses a 1-byte relative address like other branches, saving a byte. It executes in 3 cycles, the same as an absolute JMP, unless the branch crosses a page boundary, which makes the BRA version 1 cycle longer (4 cycles).[12] As the address is relative, it is also useful when writing relocatable code,[14] a common task in the era before memory management units.
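How a relative branch such as BRA resolves its 1-byte operand can be sketched in Python. The function names are hypothetical. The sketch assumes the usual 6502 convention that the signed offset is added to the address of the following instruction, and that the extra cycle applies when the target lies in a different page from that following instruction.

```python
def bra_target(instruction_addr, offset_byte):
    """Return the destination of a BRA located at instruction_addr."""
    next_instr = (instruction_addr + 2) & 0xFFFF  # BRA occupies two bytes
    # Interpret the operand as a signed 8-bit value (-128..+127).
    offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    return (next_instr + offset) & 0xFFFF


def bra_cycles(instruction_addr, offset_byte):
    """Return 3 cycles, or 4 when the branch crosses a page boundary (assumed rule)."""
    next_instr = (instruction_addr + 2) & 0xFFFF
    target = bra_target(instruction_addr, offset_byte)
    crossed = (target & 0xFF00) != (next_instr & 0xFF00)
    return 4 if crossed else 3


print(hex(bra_target(0x2000, 0x10)), bra_cycles(0x2000, 0x10))  # 0x2012 3
print(hex(bra_target(0x2000, 0xFC)), bra_cycles(0x2000, 0xFC))  # 0x1ffe 4
```

Because the operand encodes a distance rather than an absolute address, the same bytes work wherever the code is loaded, which is what makes BRA useful in relocatable code.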

Bit manipulation instructions

The initial design for the 65C02 was modified by Rockwell, who was mostly interested in using the 6502 as the basis for embedded processors. In these roles, it is common for device drivers to communicate with the CPU by encoding status as bits in a single byte, in a fashion similar to the CPU's own status register. This makes the various bit manipulation instructions very common in embedded applications.[14]

Normally, one tests bits by ANDing the desired pattern with the memory location holding the data, and then branching based on the status register's (Z)ero flag. So for instance, if address $1234 was the status register for a device, and bit 3 held the "ready" status, then one could implement a "continue if ready" with LDA $1234; AND #$08; BNE $2345, as $08 is the mask for bit 3. If the bit is set then the result in the accumulator will not be equal to zero, and the branch to the routine at $2345 will be taken.[13]

As this sort of test is common, the original 6502 included a special-purpose BIT addr instruction for automating some of this. BIT did not change the accumulator (unlike AND) and, in addition to setting the (Z)ero flag from the AND of the accumulator and the memory value, copied bits 7 and 6 of the memory value into the (N)egative and o(V)erflow flags respectively. As long as the device drivers used bits 6 and 7 for their most commonly tested flags, using BIT could reduce the number of tests needed. The 65C02 further improved the BIT instruction by adding new addressing modes, including the ability to test a constant against the accumulator instead of the pattern having to be initially stored in memory.[13]
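The flag results of BIT with a memory operand can be modelled in a few lines of Python. This is an illustrative sketch: the function name `bit_test` and the dictionary return value are hypothetical conveniences, not part of any real toolchain.

```python
def bit_test(accumulator, operand):
    """Return the flags produced by BIT with a memory operand.

    The accumulator itself is left unchanged, unlike with AND.
    """
    return {
        "Z": (accumulator & operand) == 0,  # set when no masked bit is set
        "N": bool(operand & 0x80),          # copy of operand bit 7
        "V": bool(operand & 0x40),          # copy of operand bit 6
    }


# Device status byte with bits 7, 6 and 3 set, tested against the bit 3 mask.
print(bit_test(0x08, 0xC8))  # {'Z': False, 'N': True, 'V': True}
# Only bit 6 set: the masked test fails (Z set) but V still reports bit 6.
print(bit_test(0x08, 0x40))  # {'Z': True, 'N': False, 'V': True}
```

A single BIT therefore answers three questions at once, which is why placing the most frequently tested status flags in bits 6 and 7 saves instructions.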

However, Rockwell's changes went far beyond changes to BIT, adding a host of instructions for directly setting and testing any bit, and combining the test, clear and branch into a single opcode. The new instructions were available from the start in Rockwell's R65C00 family,[16] but were not part of the original 65C02 specification and were not found in versions made by WDC or its other licensees. These were later copied back into the baseline design, and were available in later WDC versions, denoted with a leading "W", the W65C02.

The new instructions include:

  • SMBbit# addr/RMBbit# addr. Set or Reset bit number bit# in zero page byte addr. The bit# is often written as part of the instruction, like SMB1 $12 which sets bit 1 in zero-page address $12. Some assemblers have bit# written as part of the parameters, like SMB 1,$12, which has the advantage of allowing it to be replaced by a variable name or calculated number.[14]
  • TSB and TRB, Test and Set Bits, Test and Reset Bits. A bitmask is first stored in the accumulator with LDA and then TSB/TRB is called. The mask is ANDed with the memory value to set the processor's (Z)ero flag, and then the masked bits in memory are Set or Reset. This operates similarly to the BIT instruction, but where BIT additionally reports bits 6 and 7 in the V and N flags, these instructions test only against the mask and then change the value in memory without a separate STA being needed. This pattern is very common when a status bit needs to be tested and then reset to indicate the condition has been handled.[14]
  • BBR bit#,offset,addr and BBS bit#,offset,addr, Branch on Bit Reset and Branch on Bit Set. These use the same zero-page addressing as SMB/RMB, but branch to addr if the given bit is reset (BBR) or set (BBS). This combines the three instructions of LDA/AND/BEQ into a single instruction.[14]
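The effect of these instructions on memory and on the Z flag can be sketched with a small Python model. The function names mirror the mnemonics, but the signatures are hypothetical: `tsb` and `trb` return the resulting Z flag, and `bbs` and `bbr` simply return whether the branch would be taken.

```python
def smb(memory, bit, zp_addr):
    """SMB: set one bit in a zero-page byte."""
    memory[zp_addr] |= 1 << bit


def rmb(memory, bit, zp_addr):
    """RMB: reset (clear) one bit in a zero-page byte."""
    memory[zp_addr] &= ~(1 << bit) & 0xFF


def tsb(memory, accumulator, addr):
    """TSB: Z from A AND memory, then set the masked bits. Returns Z."""
    z = (accumulator & memory[addr]) == 0
    memory[addr] |= accumulator
    return z


def trb(memory, accumulator, addr):
    """TRB: Z from A AND memory, then clear the masked bits. Returns Z."""
    z = (accumulator & memory[addr]) == 0
    memory[addr] &= ~accumulator & 0xFF
    return z


def bbs(memory, bit, zp_addr):
    """BBS: True when the branch would be taken (bit is set)."""
    return bool(memory[zp_addr] & (1 << bit))


def bbr(memory, bit, zp_addr):
    """BBR: True when the branch would be taken (bit is reset)."""
    return not bbs(memory, bit, zp_addr)


memory = bytearray(256)         # page zero only, for brevity
smb(memory, 1, 0x12)            # like SMB1 $12
print(hex(memory[0x12]))        # 0x2
print(tsb(memory, 0x08, 0x12))  # True: bit 3 was clear before being set
print(hex(memory[0x12]))        # 0xa
print(bbs(memory, 3, 0x12))     # True: bit 3 is now set
```

Note that the test in TSB and TRB reflects the state of memory before the write, so the Z flag tells the program whether the condition was already flagged.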

Low-power modes

In addition to the new commands above, WDC also added the STP and WAI instructions for supporting low-power modes.

STP, STop the Processor, halted all processing until a hardware reset was issued. This could be used to put a system to "sleep" and then rapidly wake it with a reset. However, this required some external system to maintain memory, and it was not widely used.

WAI, WAit for Interrupt, had a similar effect, entering low-power mode, but this instruction woke the processor up again on the reception of an interrupt. Previously, handling an interrupt generally involved running a loop to check whether an interrupt had been received, sometimes known as "spinning", checking the type when one was received, and then jumping to the processing code. This meant the processor was running during the entire process.

In contrast, in the 65C02, interrupt code could be written by having a WAI followed immediately by a JSR or JMP to the handler. When the WAI was encountered, processing stopped and the processor went into low-power mode. When the interrupt was received, it immediately processed the JSR and handled the request.

This had the added advantage of slightly improving performance. In the spinning case, the interrupt might arrive in the middle of one of the loop's instructions, and to allow it to restart after returning from the handler, the processor spends one cycle to save its location. With WAI, the processor enters the low-power state in a known location where all instructions are guaranteed to be complete, so when the interrupt arrives it cannot possibly interrupt an instruction, and the processor can safely continue without spending a cycle saving state.

65SC02

The 65SC02 is a variant of the WDC 65C02 without bit instructions.[17]

Notable uses of the 65C02

Home computers

Video game consoles

Other products

  • TurboMaster accelerator cartridge for the Commodore 64 home computer (65C02 @ 4.09 MHz)
  • Many dedicated chess computers, e.g. Mephisto MMV, Novag Super Constellation, Fidelity Elite and many more (4–20 MHz)

Notes

  a. Some sources, including prior versions of this article, claim 1978. This was the date that Bill Mensch, the primary designer, formed WDC. Mensch specifically states 1981 when talking about the design in 1984.
  b. Wagner's June 1983 article mentions it being available for "several months". Given typical publication delays at that point, this may date it to late 1982.

References

Citations

  1. Wagner 1983, p. 204.
  2. McGeever, Christine (5 November 1984). "16-bit Apple II chips due". InfoWorld. pp. 21–22.
  3. Koehn, Philipp (2 March 2018). "6502 Stack" (PDF).
  4. Taylor & Watford 1984, p. 174.
  5. "6502 CPU Projects in HDL (for FPGA)".
  6. "W65C02DB Developer Board".
  7. "W65C02S-14".
  8. Parker, Neil. "The 6502/65C02/65C816 Instruction Set Decoded". Neil Parker's Apple II page.
  9. Vardy, Adam (22 August 1995). "Extra Instructions Of The 65XX Series CPU".
  10. Steil, Michael (28 September 2010). "Measuring the ROR Bug in the Early MOS 6502".
  11. "Differences between NMOS 6502 and CMOS 65c02". Retrieved 27 February 2018. N, V, and Z flags were incorrect after decimal operation (but C was ok).
  12. Clark, Bruce. "65C02 Opcodes".
  13. Wagner 1983, p. 200.
  14. Wagner 1983, p. 203.
  15. Wagner 1983, pp. 200–201.
  16. Wagner 1983, p. 199.
  17. Zaks, Rodnay. Programming the 6502. p. 348.
  18. http://archaicpixels.com/HuC6280

Bibliography

  • Wagner, Robert (June 1983). "Assembly Lines". Softtalk. pp. 199–204.
  • Taylor, Simon; Watford, Bob (July 1984). "6502 revival". Personal Computer World. pp. 174–175.
