SPARC64 V

SPARC64 V
Produced: 2001 to 2003
Designed by: Fujitsu
Max. CPU clock rate: 1.10 GHz to 1.35 GHz
Instruction set: SPARC V9
Cores: 1

SPARC64 V refers to two different microprocessors: the SPARC64 V "Zeus" developed by Fujitsu,[1] and an earlier design developed by HAL Computer Systems that never reached production. The HAL design was canceled in mid-2001 when HAL, a subsidiary of Fujitsu, was closed. The SPARC64 V developed by Fujitsu replaced the HAL design.

History

The first SPARC64 V microprocessors were fabricated in December 2001.[2] They operated at 1.1 to 1.35 GHz. Fujitsu's 2003 SPARC64 roadmap showed that the company planned a 1.62 GHz version for release in late 2003 or early 2004, but it was canceled in favor of the SPARC64 V+.[3] The SPARC64 V was used by Fujitsu in their PRIMEPOWER servers.

The SPARC64 V was presented at Microprocessor Forum 2002 by Aiichiro Inoue, the director of the Processor Development Division of the Development Department at Fujitsu.[4] At introduction, it had the highest clock frequency of any SPARC implementation and of any 64-bit server microprocessor in production, and the highest SPEC rating of any SPARC implementation.[4]

Description

The SPARC64 V is a four-issue superscalar microprocessor with out-of-order execution. It was based on the Fujitsu GS8900 mainframe microprocessor.[5]

Pipeline

The SPARC64 V fetches up to eight instructions from the instruction cache during the first stage and places them into a 48-entry instruction buffer. In the next stage, four instructions are taken from this buffer, decoded and issued to the appropriate reservation stations. The SPARC64 V has six reservation stations: two that serve the integer units, one for the address generators, two for the floating-point units, and one for branch instructions. Each integer, address generator and floating-point unit has an eight-entry reservation station. Each reservation station can dispatch an instruction to its execution unit. Which instruction is dispatched depends first on operand availability and then on age: among instructions whose operands are available, older instructions are given higher priority than newer ones. The reservation stations can also dispatch instructions speculatively (speculative dispatch). That is, instructions can be dispatched to the execution units even when their operands are not yet available, provided the operands will be available by the time execution begins. During stage six, up to six instructions are dispatched.
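The selection rule described above (operand availability first, then age) can be sketched in a few lines. This is an illustrative model only: the `Entry` class, its fields and the `select_for_dispatch` function are hypothetical names, the age encoding is an assumption, and speculative dispatch is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    age: int              # assumed encoding: smaller value = older instruction
    operands_ready: bool  # True once all source operands are available


def select_for_dispatch(station):
    """Return the oldest entry whose operands are ready, or None if none is ready."""
    ready = [e for e in station if e.operands_ready]
    return min(ready, key=lambda e: e.age) if ready else None


station = [
    Entry(age=3, operands_ready=True),
    Entry(age=1, operands_ready=False),  # oldest, but still waiting on operands
    Entry(age=2, operands_ready=True),
]
picked = select_for_dispatch(station)
print(picked.age)  # 2
```

The oldest entry is skipped because its operands are not ready, so the next-oldest ready entry is chosen.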

Register read

The register files are read during stage seven. The SPARC architecture has separate register files for integer and floating-point instructions. The integer register file has eight register windows and is read through the joint work register (JWR). The JWR contains 64 entries and has eight read ports and two write ports. It holds a subset of the eight register windows: the previous, current and next register windows. Its purpose is to reduce the size of the register file that must be accessed so that the microprocessor can operate at higher clock frequencies. The floating-point register file contains 64 entries and has six read ports and two write ports.
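As a rough check on the 64-entry figure, the sketch below counts the registers spanned by three adjacent SPARC V9 register windows. In SPARC V9 each window has eight in, eight local and eight out registers, and the outs of one window are the ins of its neighbour. That the total equals 64 once the eight global registers are included is an observation; the actual internal layout of the JWR is an assumption here, and the function name is hypothetical.

```python
GROUP = 8  # registers per group (in, local, out or global) in SPARC V9


def registers_spanned(windows):
    """Distinct windowed registers covered by a chain of adjacent windows.

    Each window adds its 8 locals and 8 outs; its ins are shared with a
    neighbour's outs, so only one extra group of ins closes the chain.
    """
    return windows * 2 * GROUP + GROUP


windowed = registers_spanned(3)       # previous, current and next windows
with_globals = windowed + GROUP       # assumption: globals are held alongside
print(windowed, with_globals)  # 56 64
```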

Execution

Execution begins during stage nine. There are six execution units, two for integer, two for loads and stores, and two for floating-point.[6] The two integer execution units are designated EXA and EXB. Both have an arithmetic logic unit (ALU) and a shift unit, but only EXA has multiply and divide units. Loads and stores are executed by two address generators (AGs) designated AGA and AGB. These are simple ALUs used to calculate virtual addresses.

The two floating-point units (FPUs) are designated FLA and FLB. Each FPU contains an adder and a multiplier, but only FLA has a graphics unit attached. They execute add, subtract, multiply, divide, square root and multiply–add instructions. The multiply–add (FMA) instruction is implemented by reading three operands from the operand register, multiplying two of the operands, forwarding the product and the third operand to the adder, and adding them to produce the final result. Unlike its successor, the SPARC64 VI, the SPARC64 V therefore performs the multiply–add as separate multiplication and addition operations, with up to two rounding errors.[7] The graphics unit executes Visual Instruction Set (VIS) instructions, a set of single instruction, multiple data (SIMD) instructions. All instructions are pipelined except for divide and square root, which are executed using iterative algorithms.
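The difference between a multiply–add performed as two separately rounded operations and one rounded only once can be shown with ordinary double-precision arithmetic. The sketch below is not a model of the FLA/FLB hardware; the operand values are chosen for illustration, and the single-rounding result is emulated with exact rational arithmetic.

```python
from fractions import Fraction

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Separate multiply, then add: the product 1 - 2**-60 is rounded to 1.0
# before the addition, so the small term is lost.
unfused = a * b + c

# Single rounding: compute a*b + c exactly, then round once to a double.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(unfused)              # 0.0
print(fused == -2.0**-60)   # True
```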

Results from the execution units and loads are not written directly to the register file. To maintain program order, they are written to update buffers, where they reside until committed. The SPARC64 V has separate update buffers for the integer and floating-point units, each with 32 entries. The integer update buffer has eight read ports and four write ports. Half of the write ports are used for results from the integer execution units and the other half for data returned by loads. The floating-point update buffer has six read ports and four write ports.

Commit takes place during stage ten at the earliest. The SPARC64 V can commit up to four instructions per cycle. During stage eleven, results are written to the register file, where they become visible to software.[8]
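The in-order commit described above, with results held back until all older instructions have finished, can be sketched as follows. This is a simplified illustration: the `commit_cycle` function and its data structures are hypothetical, and only the four-per-cycle commit width is taken from the text.

```python
from collections import deque

COMMIT_WIDTH = 4  # instructions committed per cycle, per the text


def commit_cycle(window, completed):
    """Commit up to COMMIT_WIDTH of the oldest instructions in one cycle.

    `window` is a deque of instruction ids in program order; `completed` is
    the set of ids whose results are waiting in an update buffer. Commit
    stops at the first instruction that has not finished executing.
    """
    committed = []
    while window and len(committed) < COMMIT_WIDTH and window[0] in completed:
        committed.append(window.popleft())
    return committed


window = deque([0, 1, 2, 3, 4, 5])
completed = {0, 1, 3, 4, 5}               # instruction 2 is still executing
first = commit_cycle(window, completed)   # blocked behind instruction 2
completed.add(2)
second = commit_cycle(window, completed)
print(first, second)  # [0, 1] [2, 3, 4, 5]
```

Instructions 3, 4 and 5 finish early but cannot commit until instruction 2 does, which is what keeps the architectural state in program order.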

Cache

The SPARC64 V has a two-level cache hierarchy. The first level consists of two caches, an instruction cache and a data cache. The second level consists of an on-die unified cache.

The level 1 (L1) caches each have a capacity of 128 KB. They are both two-way set associative and have a 64-byte line size. They are virtually indexed and physically tagged. The instruction cache is accessed via a 256-bit bus. The data cache is accessed with two 128-bit buses. The data cache consists of eight banks separated by 32-bit boundaries. It uses a write-back policy. The data cache writes to the L2 cache over its own 128-bit unidirectional bus.
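The L1 parameters above determine the number of sets and how an address splits into offset and index bits. A minimal sketch, assuming 1 KB = 1024 bytes and power-of-two sizes (the function name is hypothetical):

```python
def cache_geometry(capacity_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache."""
    sets = capacity_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2, valid for powers of two
    index_bits = sets.bit_length() - 1
    return sets, offset_bits, index_bits


sets, offset_bits, index_bits = cache_geometry(128 * 1024, ways=2, line_bytes=64)
print(sets, offset_bits, index_bits)  # 1024 6 10
```

Under these assumptions each L1 cache has 1,024 sets, selected by 10 index bits above a 6-bit line offset.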

The second level cache has a capacity of 1 or 2 MB and the set associativity depends on the capacity.

System bus

The microprocessor has a 128-bit system bus that operates at 260 MHz. The bus can operate in two modes, single data rate (SDR) or double data rate (DDR), yielding a peak bandwidth of 4.16 or 8.32 GB/s, respectively.
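The peak bandwidth figures follow directly from the bus width, clock frequency and number of transfers per clock. A short worked calculation, taking 1 GB/s as 10^9 bytes per second (the function name is illustrative):

```python
def peak_bandwidth_gb_s(width_bits, clock_mhz, transfers_per_clock):
    """Peak bandwidth in GB/s (10**9 bytes per second)."""
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9


sdr = peak_bandwidth_gb_s(128, 260, transfers_per_clock=1)
ddr = peak_bandwidth_gb_s(128, 260, transfers_per_clock=2)
print(round(sdr, 2), round(ddr, 2))  # 4.16 8.32
```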

Physical

The SPARC64 V consisted of 191 million transistors, of which 19 million were contained in logic circuits.[9] It was fabricated by an unnamed foundry[10] in a 0.13 µm, eight-layer copper metallization, complementary metal–oxide–semiconductor (CMOS) silicon on insulator (SOI) process. The die measured 18.14 mm by 15.99 mm for a die area of 290 mm².[11]
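The die area and the share of transistors in logic circuits can be checked from the figures above. This is plain arithmetic on the quoted numbers; the variable names are illustrative.

```python
width_mm, height_mm = 18.14, 15.99
area_mm2 = width_mm * height_mm

logic_share = 19 / 191  # logic transistors as a fraction of the total

print(round(area_mm2, 2))            # 290.06
print(round(logic_share * 100, 1))   # 9.9
```

Roughly 10% of the transistors are in logic circuits; the text does not break down the remainder.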

Electrical

At 1.3 GHz, the SPARC64 V has a power dissipation of 34.7 W.[9] The Fujitsu PrimePower servers that use the SPARC64 V supply a slightly higher voltage to the microprocessor to enable it to operate at 1.35 GHz. The increased power supply voltage and operating frequency raise the power dissipation to about 45 W.[12]
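For a rough sense of scale, the two operating points can be compared under the textbook dynamic-power model P = k · f · V². This model is an assumption, not something stated in the sources: it ignores leakage, which also grows with voltage, so the ratio below probably overstates the actual voltage increase.

```python
import math

p_low, f_low = 34.7, 1.30     # W, GHz (from the text)
p_high, f_high = 45.0, 1.35   # W, GHz (the text gives roughly 45 W)

# Assumed model: P = k * f * V**2, with k unchanged between the two points.
voltage_ratio = math.sqrt((p_high / p_low) / (f_high / f_low))
print(round(voltage_ratio, 2))  # 1.12
```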

SPARC64 V+

SPARC64 V+
Produced: 2004
Designed by: Fujitsu
Max. CPU clock rate: 1.65 GHz to 2.16 GHz
Instruction set: SPARC V9
Cores: 1

The SPARC64 V+, code-named "Olympus-B", is a further development of the SPARC64 V. Improvements over the SPARC64 V included higher clock frequencies of 1.65 to 2.16 GHz and a larger secondary cache with a capacity of 3 or 4 MB.[1]

The first SPARC64 V+, a 1.89 GHz version, was shipped in September 2004 for the Fujitsu PrimePower 650 and 850. In December 2004, a 1.82 GHz version was shipped in the PrimePower 2500. In February 2006, four versions were introduced: 1.65 and 1.98 GHz versions with 3 MB of L2 cache shipped in the PrimePower 250 and 450; and 2.08 and 2.16 GHz versions with 4 MB of L2 cache shipped in mid-range and high-end models.[13]

It contained approximately 400 million transistors on a die with dimensions of 18.46 mm by 15.94 mm for a die area of 294.25 mm². It was fabricated in a 90 nm CMOS process with ten levels of copper interconnect.[5]
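Using the transistor counts and die dimensions quoted for the two chips, the average transistor density can be compared. This is arithmetic on the figures in the text only; the function name is illustrative, and the 400 million figure is approximate, so the result is too.

```python
def density_m_per_mm2(transistors_millions, width_mm, height_mm):
    """Average density in millions of transistors per square millimetre."""
    return transistors_millions / (width_mm * height_mm)


v = density_m_per_mm2(191, 18.14, 15.99)       # SPARC64 V, 0.13 µm
v_plus = density_m_per_mm2(400, 18.46, 15.94)  # SPARC64 V+, 90 nm
print(round(v, 2), round(v_plus, 2))  # 0.66 1.36
```

The density roughly doubles on a die of nearly the same size, which is close to the ideal area scaling of (130/90)², about 2.09, between the two process nodes.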

HAL SPARC64 V

The HAL SPARC64 V was a complex design. It was a very wide superscalar microprocessor with superspeculation, an instruction trace cache, and split L2 caches. Another feature was a very small but very fast primary data cache with a capacity of 8 KB. It consisted of 65 million transistors on a 380 mm² die fabricated by Fujitsu in their CS85 process, a 0.17 µm CMOS process with six levels of copper interconnect.

Notes

  1. "Fujitsu Draws Sparc64 Roadmap Past 2010".
  2. "Microarchitecture and Performance Analysis of a SPARC-V9 Microprocessor for Enterprise Server Systems".
  3. "Fujitsu-Siemens upgrades PrimePower Unix servers".
  4. "Fujitsu's SPARC64 V Is Real Deal", p. 1.
  5. "SPARC64 V Processor For UNIX Server".
  6. "Fujitsu's SPARC64 V Is Real Deal", p. 2.
  7. "SPARC64 VI Extensions", p. 56, Fujitsu Limited, Release 1.3, 27 March 2007.
  8. "Microarchitecture and Performance Analysis of a SPARC-V9 Microprocessor for Enterprise Server Systems", p. 4.
  9. "A 1.3GHz Fifth Generation SPARC64 Microprocessor", p. 702.
  10. "Fujitsu's SPARC64 V Is Real Deal", p. 3.
  11. "A 1.3GHz Fifth Generation SPARC64 Microprocessor", p. 702.
  12. "A 1.3GHz Fifth Generation SPARC64 Microprocessor", p. 705.
  13. "Fujitsu-Siemens Cranks the Clock on Sparc V Chips for PrimePowers".

References