POWER4
General information | |
---|---|
Launched | 2001 |
Designed by | IBM |
Performance | |
Max. CPU clock rate | 1.1 GHz to 1.9 GHz |
Cache | |
L1 cache | 64+32 kB/core |
L2 cache | 1.41 MB/chip |
L3 cache | 32 MB off chip |
Architecture and classification | |
Technology node | 180 nm to 130 nm |
Instruction set | PowerPC (PowerPC v.2.00/01) |
Physical specifications | |
Cores | 2 |
History | |
Predecessors | POWER3, RS64 |
Successor | POWER5 |
The POWER4 is a microprocessor developed by International Business Machines (IBM) that implemented the 64-bit PowerPC and PowerPC AS instruction set architectures. Released in 2001, the POWER4 succeeded the POWER3 and RS64 microprocessors and was used in RS/6000 and AS/400 computers, ending the separate development of PowerPC microprocessors for the AS/400. The POWER4 was a multicore microprocessor with two cores on a single die, the first non-embedded microprocessor to do so.[1] It was also the first commercially available multiprocessor chip.[2] The original POWER4 ran at clock speeds of 1.1 and 1.3 GHz, while an enhanced version, the POWER4+, reached 1.9 GHz. The PowerPC 970 is a derivative of the POWER4.
Functional layout
The POWER4 has a unified L2 cache, divided into three equal parts. Each part has its own independent L2 controller, which can feed 32 bytes of data per cycle. The Core Interface Unit (CIU) connects each L2 controller to either the data cache or the instruction cache in either of the two processors. The Non-Cacheable (NC) Unit handles instruction-serializing functions and performs any noncacheable operations in the storage topology. There is an L3 cache controller on the chip, but the L3 memory itself is off-chip. The GX bus controller manages I/O device communications over two 4-byte-wide GX buses, one incoming and one outgoing. The Fabric Controller is the master controller for the network of buses; it controls communications for both L1/L2 controllers, communications between POWER4 chips (in 4-way, 8-way, 16-way, and 32-way configurations), and communications between POWER4 MCMs. A Trace-and-Debug unit, used for First Failure Data Capture, is provided. There is also a built-in self-test (BIST) function and a Performance Monitoring Unit (PMU). Power-on reset (POR) is supported.
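As a rough illustration of the L2 figures above, the following Python sketch computes an upper bound on aggregate L2 bandwidth. It assumes, which the text does not state, that all three controllers deliver their full 32 bytes on every core clock cycle; the function and constant names are hypothetical.

```python
# Figures from the text: three L2 controllers, each feeding 32 bytes per cycle.
L2_CONTROLLERS = 3
BYTES_PER_CYCLE_PER_CONTROLLER = 32


def l2_peak_bandwidth_gb_per_s(clock_ghz: float) -> float:
    """Upper bound on aggregate L2 bandwidth in GB/s (10**9 bytes per second).

    Assumption (not from the article): every controller sustains its full
    32 bytes on each core clock cycle, with no conflicts or stalls.
    """
    bytes_per_cycle = L2_CONTROLLERS * BYTES_PER_CYCLE_PER_CONTROLLER
    return bytes_per_cycle * clock_ghz


if __name__ == "__main__":
    # Clock speeds of the original POWER4 as given in the article.
    for clock in (1.1, 1.3):
        print(f"{clock} GHz: {l2_peak_bandwidth_gb_per_s(clock):.1f} GB/s peak")
```

This is a ceiling rather than a measurement; sustained bandwidth depends on access patterns and contention between the two cores.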
Execution units
The POWER4 implements a superscalar microarchitecture through high-frequency speculative out-of-order execution using eight independent execution units. They are: two floating-point units (FP1-2), two load-store units (LD1-2), two fixed-point units (FX1-2), a branch unit (BR), and a condition-register unit (CR). These execution units can complete up to eight operations per clock (not including the BR and CR units):
- each floating-point unit can complete one fused multiply–add per clock (two operations),
- each load–store unit can complete one instruction per clock,
- each fixed-point unit can complete one instruction per clock.
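The eight-operations-per-clock figure follows directly from the list above. A minimal Python sketch of the arithmetic, using hypothetical names and counting a fused multiply-add as two operations as the text does:

```python
# Per-core execution units from the text: name -> (unit count, operations per clock each).
# A fused multiply-add counts as two operations. BR and CR are excluded, as in the text.
UNITS = {
    "FP": (2, 2),  # floating-point units
    "LD": (2, 1),  # load-store units
    "FX": (2, 1),  # fixed-point units
}


def peak_ops_per_clock(units=UNITS) -> int:
    """Sum of operations all listed units can complete in one clock."""
    return sum(count * ops for count, ops in units.values())


def peak_gflops(clock_ghz: float, cores: int = 1) -> float:
    """Theoretical peak floating-point rate, counting only the FP units."""
    fp_count, fp_ops = UNITS["FP"]
    return fp_count * fp_ops * clock_ghz * cores


if __name__ == "__main__":
    print("peak operations per clock per core:", peak_ops_per_clock())
    print("peak GFLOPS, one core at 1.3 GHz:", round(peak_gflops(1.3), 2))
    print("peak GFLOPS, two cores at 1.3 GHz:", round(peak_gflops(1.3, cores=2), 2))
```

These are theoretical peaks; they assume every unit is busy on every cycle, which real code rarely achieves.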
The pipeline stages are:
- Branch Prediction
- Instruction Fetch
- Decode, Crack and Group Formation
- Group Dispatch and Instruction Issue
- Load–Store Unit Operation
- Load Hit Store
- Store Hit Load
- Load Hit Load
- Instruction Execution Pipeline
Multi-chip configuration
The POWER4 also came in a configuration using a multi-chip module (MCM) containing four POWER4 dies in a single package, with up to 128 MB of shared L3 ECC cache per MCM.
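The MCM totals can be checked against the per-chip figures in the infobox (two cores, 1.41 MB of L2, 32 MB of off-chip L3 per chip). The sketch below is only an illustration; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Power4Chip:
    """Per-chip figures as listed in the article's infobox."""
    cores: int = 2
    l2_mb: float = 1.41
    l3_mb: int = 32


def mcm_totals(dies: int = 4, chip: Power4Chip = Power4Chip()) -> dict:
    """Aggregate resources of a multi-chip module built from identical dies."""
    return {
        "cores": dies * chip.cores,
        "l2_mb": round(dies * chip.l2_mb, 2),
        "l3_mb": dies * chip.l3_mb,
    }


if __name__ == "__main__":
    # Four dies per MCM, as stated in the text.
    print(mcm_totals())
```

With four dies, the L3 total comes to 128 MB, matching the figure quoted for the MCM, and the module provides eight cores.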
Parametrics
Clock | 1.3 GHz | |
---|---|---|
Power | 115 W | 1.5 V @ 1.1 GHz |
Transistors | 174 million | |
Gate L | 90 nm | |
Gate oxide | 2.3 nm | |
Metal-layer | pitch | thickness |
M1 | 500 nm | 310 nm |
M2 | 630 nm | 310 nm |
M3-M5 | 630 nm | 420 nm |
M6(MQ) | 1260 nm | 920 nm |
M7(LM) | 1260 nm | 920 nm |
Dielectric | ~4.2 | |
Vdd | 1.6 V | |
POWER4+
The POWER4+ was an improved version of the POWER4 that ran at up to 1.9 GHz. It contained 184 million transistors, measured 267 mm², and was fabricated in a 0.13 µm SOI CMOS process with eight layers of copper interconnect.
Notes
- ^ "IBM's Server Processors: The RS64 and the POWER". The CPU Shack Museum. 2011-01-24. Retrieved 2015-04-17.
- ^ William Stallings. Computer Organization and Architecture (Seventh ed.). p. 44.
References
- "Power4 Focuses on Memory Bandwidth". (6 October 1999). Microprocessor Report.
- "IBM's Power4 Unveiling Continues". (20 November 2000). Microprocessor Report.
- "POWER4 System Microarchitecture" (PDF). IBM. Retrieved 2012-06-07.
- J. M. Tendler, J. S. Dodson, J. S. Fields, Jr., H. Le, and B. Sinharoy (2002). "POWER4 system microarchitecture". IBM Journal of Research and Development. 46 (1): 5–26. doi:10.1147/rd.461.0005. ISSN 0018-8646. Retrieved 2006-07-21.
- J. D. Warnock, J. M. Keaty, J. Petrovick, J. G. Clabes, C. J. Kircher, B. L. Krauter, P. J. Restle, B. A. Zoric, and C. J. Anderson (2002). "The circuit and physical design of the POWER4 microprocessor". IBM Journal of Research and Development. 46 (1): 27–52. doi:10.1147/rd.461.0027. ISSN 0018-8646. Retrieved 2006-07-21.