From Wikipedia, the free encyclopedia

The Cray-3 was a vector supercomputer intended to be Cray Research's successor to the Cray-2. The system was to be the first major application of gallium arsenide (GaAs) semiconductors in computing. The project was not considered a success, and the parent company in Minneapolis decided to end work on the Cray-3 in favour of their own design, the Cray C90. The Cray-3 project was spun off to the newly formed Cray Computer Corporation, but only one Cray-3 was delivered, and it was never paid for. Seymour Cray moved on to the Cray-4 design, but the company went bankrupt before the project was completed.



Cray generally set himself the goal of producing new machines with ten times the performance of the previous models. Although the machines did not always meet this goal,[1] this was a useful technique in defining the project and clarifying what sort of process improvements would be needed to meet it. Cray had always attacked the problem of increased speed with three simultaneous advances: more functional units to give the system higher parallelism, tighter packaging to decrease signal delays, and faster components to allow for a higher clock speed. Of the three, Cray was normally least aggressive on the last; his designs tended to use only components that were already in widespread use, as opposed to leading-edge designs.

For the Cray-3, he decided to set an even higher performance improvement goal, an increase of 12x over the Cray-2.[2] The Cray-2 had introduced a novel 3D-packaging system for its integrated circuits to allow higher densities,[3] and it appeared that there was some room for improvement in this process. But for a 12x performance increase, packaging alone would not be enough. The Cray-2 appeared to be pushing the limits of speed of silicon-based transistors at 4.1 ns (244 MHz), and it did not appear that anything more than another 2x would be possible. If the goal of 12x was to be met, more radical changes would be needed, and a "high tech" approach would have to be used.[2]
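The arithmetic behind these figures can be sketched as follows. This is only an illustration: the helper name `period_ns_to_mhz` is hypothetical, and treating the overall speedup as a simple product of a clock factor and everything else is an assumption made here, not something the sources state.

```python
def period_ns_to_mhz(period_ns: float) -> float:
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1_000.0 / period_ns


# Cray-2 cycle time quoted in the text.
cray2_mhz = period_ns_to_mhz(4.1)

# The text suggests silicon could offer at most "another 2x" in clock speed.
silicon_ceiling_mhz = 2 * cray2_mhz

# Assumption for illustration: speedups multiply, so if the clock contributes
# at most 2x, the rest of the 12x goal must come from other changes.
remaining_factor = 12 / 2

print(f"Cray-2 clock:        {cray2_mhz:.0f} MHz")
print(f"Silicon ceiling:     {silicon_ceiling_mhz:.0f} MHz")
print(f"Still needed for 12x: {remaining_factor:.0f}x")
```

Under that simplifying assumption, roughly a sixfold gain would have had to come from parallelism and packaging if the design had stayed with silicon, which helps explain why faster components were considered.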

Cray had intended to use gallium arsenide circuitry in the Cray-2, which would not only have offered much higher switching speeds, but would also have used less energy and thus run cooler. At the time the Cray-2 was being designed, however, the state of GaAs manufacturing simply wasn't up to the task of supplying a supercomputer.[3] By the mid-1980s, things had changed and Cray decided it was the only way forward.[4] Given a lack of investment on the part of large chip makers, Cray decided the only solution was to invest in a GaAs chipmaking startup, GigaBit Logic, and use them as an internal supplier.[5]


Typical module layout, with a 4x4 arrangement of "submodules", stacked 4-deep. The metal connectors on the bottom are power connections.

Development of the Cray-3 started in 1988, with delivery originally slated for 1991.[6] This was during a time when growth in the supercomputer market was slowing rapidly, from 50% annually in 1980 to 10% in 1988.[4]

During 1989 the company was in the process of developing both the Cray-3 and C90, two machines of roughly similar power. The Cray-3 was designed to be compatible with the Cray-2, while the C90 was compatible with the Cray Y-MP and earlier machines.[7] With only 25 Cray-2s sold, management decided that the Cray-3 should be put on "low priority" development. This was not the first time this had happened, and, as in the past, Cray decided to form his own company to continue development of his design.[8] The result was Cray Computer Corporation, which Cray had no equity stake in, and worked under contract.[9]

By 1991, development was behind schedule.[10] Development slowed even more when Lawrence Livermore National Laboratory cancelled its order for the first machine,[11] in favor of the C90. Several executives, including the CEO, left the company. The company then announced they would be looking for a customer that needed a smaller version of the machine, with four to eight processors.[12]

The first (and only) customer system (serial number S5, named Graywolf) was not delivered to NCAR until May 1993. NCAR's model was configured with 4 processors and a 128 MWord (64-bit words, 1 GB) common memory.[13] Once in production it was learned that the square root code contained a bug, and one of its four CPUs was not running reliably. Replacements to fix both problems were developed. NCAR had not yet paid for the machine when CCC folded in March 1995, after burning through about $300m of financing. NCAR's machine was officially decommissioned the next day. In practice, two of the processors were removed and the machine was used unofficially for some time after that.

Seven system cabinets, or "tanks" (serial numbers S1 to S7), were built for Cray-3 machines (most for smaller two-CPU machines), but NCAR's was the only one ever delivered. Three of the smaller tanks were used on the Cray-4 project, essentially a Cray-3 with 64 faster CPUs running at 1 ns (1 GHz). Another was used for the Cray-3/SSS project.

The failure of the Cray-3 seems to have little to do with the machine itself, and everything to do with the changing political and technical climate. The machine was being designed during the collapse of the Warsaw Pact and the end of the Cold War, which led to a massive downsizing in "large machine" supercomputer purchases.[14][12] At the same time, the market was increasingly investing in massively parallel (MPP) designs. Cray was critical of this approach, and was quoted by the Wall Street Journal as saying that MPP systems had not yet proven their supremacy over vector computers, noting the difficulty many users had had programming for large parallel machines: "I don't think they'll ever be universally successful, at least not in my lifetime."[14]


Logical design

The Cray-3 system architecture comprised a foreground processing system, up to 16 background processors and up to 2 gigawords (16 GB) of common memory.[15] The foreground system was dedicated to input/output and system management. It included a 32-bit processor and four synchronous data channels for mass storage and network devices, primarily via HiPPI channels.

Each background processor consisted of a computation section, a control section and local memory. The computation section performed 64-bit scalar, floating point and vector arithmetic. The control section provided instruction buffers, memory management functions, and a real-time clock. 16 kwords (128 kbytes) of high-speed local memory was incorporated into each background processor for use as temporary scratch memory.[15]

Common memory consisted of silicon CMOS SRAM, organized into octants of 64 banks each, with up to eight octants possible. The word size was 64 bits plus eight error-correction bits, and total memory bandwidth was rated at 128 gigabytes per second.[15]
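The memory figures in this section can be cross-checked with a few lines of arithmetic. The sketch below assumes binary prefixes (1 kword = 2^10 words, 1 gigaword = 2^30 words), which is what makes the quoted byte counts come out exactly; the variable and function names are illustrative only, and the even split of bandwidth across processors is an assumption rather than a documented property.

```python
WORD_BYTES = 8   # 64-bit data word
ECC_BITS = 8     # error-correction bits stored alongside each word
GIB = 2**30


def words_to_gib(words: int) -> float:
    """Data capacity in GiB for a given number of 64-bit words."""
    return words * WORD_BYTES / GIB


# "Up to 2 gigawords" of common memory (binary giga assumed).
max_common_words = 2 * 2**30
# "16 kwords" of local memory per background processor.
local_bytes = 16 * 2**10 * WORD_BYTES

# Eight octants of 64 banks each.
max_banks = 8 * 64

# Each stored word carries 64 data bits plus the ECC bits.
stored_bits_per_word = 64 + ECC_BITS
ecc_overhead = ECC_BITS / 64

# Assumption: the rated 128 GB/s shared evenly by 16 background processors.
per_cpu_bandwidth_gbs = 128 / 16

print(f"Max common memory: {words_to_gib(max_common_words):.0f} GB")
print(f"Local memory:      {local_bytes // 1024} KB per processor")
print(f"Max banks:         {max_banks}")
print(f"Stored word width: {stored_bits_per_word} bits "
      f"({ecc_overhead:.1%} ECC overhead)")
print(f"Bandwidth share:   {per_cpu_bandwidth_gbs:.0f} GB/s per processor")
```

The even-split figure happens to agree with the per-processor burst rate quoted in the CPU design section, though that agreement is not by itself evidence of how the memory system arbitrated between processors.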

CPU design

Complete processor "brick". The modules are visible inside, mounted vertically.

As with previous designs, the core of the Cray-3 consisted of a number of "modules", each containing several circuit boards packed with parts. In order to increase density, the individual GaAs chips were not "packaged", and instead several were mounted directly with ultrasonic gold bonding to a board approximately 1 inch square. The boards were then turned over and mated to a second board carrying the electrical wiring, with wires on this card running through holes to the "bottom" (opposite the chips) side of the chip carrier where they were bonded, hence sandwiching the chip between the two layers of board. These "submodules" were then stacked four-deep and, as in the Cray-2, wired to each other to make a 3D circuit.[13]

Unlike the Cray-2, the Cray-3 modules also included edge connectors. Sixteen such submodules were connected together in a 4×4 array to make a single module measuring 121 × 107 × 7 mm (approximately 4 inches square by 0.25 inch deep). Despite this advanced packaging, the circuit density was low even by 1990s standards, at about 96,000 gates per cubic inch.[15] By comparison, CPUs of the 2010s offered gate counts of millions per square inch, while the move to 3D circuits was still only being considered as of 2011.[16]

Thirty-two such modules were then stacked and wired together with a mass of twisted-pair wires into a single processor. The basic cycle time was 2.11 ns, or 474 MHz, allowing each processor to reach about 0.948 GFLOPS, and a 16-processor machine a theoretical 15.17 GFLOPS. Key to the high performance was the high-speed access to main memory, which allowed each processor to burst up to 8 GB/s.[17]
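The peak figures above follow from the cycle time. The sketch below reproduces them under one inferred assumption: that each processor completes two floating-point results per clock cycle. That value is not stated in the sources; it is simply what the quoted 0.948 GFLOPS divided by the clock rate works out to. The function name `peak_gflops` is hypothetical.

```python
CYCLE_NS = 2.11               # basic cycle time from the text
QUOTED_GFLOPS_PER_CPU = 0.948  # per-processor peak from the text

# Clock rate in GHz is the reciprocal of the period in nanoseconds.
clock_ghz = 1.0 / CYCLE_NS

# Floating-point results per cycle implied by the quoted figures.
implied_flops_per_cycle = QUOTED_GFLOPS_PER_CPU / clock_ghz


def peak_gflops(cpus: int, flops_per_cycle: int = 2) -> float:
    """Theoretical peak, assuming every CPU retires flops_per_cycle each cycle."""
    return cpus * flops_per_cycle * clock_ghz


print(f"Clock rate:             {clock_ghz * 1000:.0f} MHz")
print(f"Implied flops per cycle: {implied_flops_per_cycle:.2f}")
for cpus in (1, 4, 16):
    print(f"{cpus:>2} CPU(s): {peak_gflops(cpus):.2f} GFLOPS peak")
```

These are theoretical peaks only; sustained performance on real workloads would depend on vector length, memory access patterns and how well the code vectorized.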

Mechanical design

The modules were held together in an aluminum chassis known as a "brick". The bricks were immersed in liquid Fluorinert for cooling, as in the Cray-2. A four-processor system with 64 memory modules dissipated about 88 kW of power.[13] The entire four-processor system was about 20 inches tall and 20 inches front-to-back, and a little over two feet wide.

For systems with up to four processors, the processor assembly sat under a translucent bronzed acrylic cover at the top of a cabinet 42 inches (1.1 m) wide, 28 inches (0.71 m) deep and 50 inches (1.3 m) high,[15] with the memory below it, and then the power supplies and cooling systems on the bottom. Eight- and 16-processor systems would have been housed in a larger octagonal cabinet. All in all, the Cray-3 was considerably smaller than the Cray-2, itself relatively small compared to other supercomputers.

In addition to the system cabinet, a Cray-3 system also needed one or two (depending on number of processors) system control pods (or "C-Pods"), 52.5 inches (1.33 m) square and 55.3 inches (1.40 m) high, containing power and cooling control equipment.[15]

System configurations

The following possible Cray-3 configurations were officially specified:[15]

Name            CPUs  Memory (Mwords)  I/O modules
Cray-3/1-256       1              256            1
Cray-3/2-256       2              256            1
Cray-3/4-512       4              512            3
Cray-3/4-1024      4             1024            3
Cray-3/4-2048      4             2048            3
Cray-3/8-1024      8             1024            7
Cray-3/8-2048      8             2048            7
Cray-3/16-2048    16             2048           15
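The table can also be read as structured data, which makes the naming scheme and memory sizes easy to check. The sketch below is illustrative: the `Cray3Config` class and its field names are hypothetical, and the conversion assumes 64-bit words with binary prefixes (1 Mword = 2^20 words), consistent with the 128 Mword = 1 GB figure given for the NCAR machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cray3Config:
    """One row of the configuration table (field names are illustrative)."""
    name: str
    cpus: int
    memory_mwords: int
    io_modules: int

    @property
    def memory_gb(self) -> float:
        # 64-bit words, binary prefixes assumed: Mwords * 2^20 * 8 bytes / 2^30.
        return self.memory_mwords * 2**20 * 8 / 2**30

    @property
    def expected_name(self) -> str:
        # The model names appear to follow "Cray-3/<CPUs>-<Mwords>".
        return f"Cray-3/{self.cpus}-{self.memory_mwords}"


CONFIGS = [
    Cray3Config("Cray-3/1-256", 1, 256, 1),
    Cray3Config("Cray-3/2-256", 2, 256, 1),
    Cray3Config("Cray-3/4-512", 4, 512, 3),
    Cray3Config("Cray-3/4-1024", 4, 1024, 3),
    Cray3Config("Cray-3/4-2048", 4, 2048, 3),
    Cray3Config("Cray-3/8-1024", 8, 1024, 7),
    Cray3Config("Cray-3/8-2048", 8, 2048, 7),
    Cray3Config("Cray-3/16-2048", 16, 2048, 15),
]

for cfg in CONFIGS:
    assert cfg.name == cfg.expected_name
    print(f"{cfg.name:<15} {cfg.cpus:>2} CPUs  {cfg.memory_gb:>4.0f} GB  "
          f"{cfg.io_modules:>2} I/O modules")
```

Read this way, the largest configurations top out at 2048 Mwords, which corresponds to the 16 GB common-memory ceiling described in the logical design section.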


The Cray-3 ran the Colorado Springs Operating System (CSOS), which was based upon Cray Research's UNICOS operating system version 5.0. A major difference between CSOS and UNICOS was that CSOS was ported to standard C, with all PCC extensions that were used in UNICOS removed.[18] Much of the software available for the Cray-3 was derived from Cray Research and included, for instance, the X Window System, vectorizing FORTRAN and C compilers, NFS and a TCP/IP stack.[15][18]



  1. ^ MacKenzie, pg. 141
  2. ^ a b MacKenzie, pg. 153-154
  3. ^ a b Hill, pg. 9
  4. ^ a b MacKenzie, pg. 154
  5. ^ James Peltz, "GigaBit Logic Negotiating Sale With Cray Computers", LA Times, 23 January 1990
  6. ^ "Cray Computer Corp. 8-K to Nov 95", EDGAR
  7. ^ Donald A. MacKenzie. "Knowing Machines: Essays on Technical Change". 1998. p. 154-155.
  8. ^ Hill, pg. 10
  9. ^ "Chief Executive Quits At Cray Computer", The New York Times, 17 April 1992
  10. ^ "Cray Computer Is Behind Schedule", The New York Times, 17 December 1991
  11. ^ "Cray Loses Only Order For Product", The New York Times, 24 December 1991
  12. ^ a b "Cold War's End Hits Cray Computer", The New York Times, 21 February 1992
  13. ^ a b c Lynda Lester, "The making of a CRAY-3", Ninth SCD User Conference, June 1993
  14. ^ a b Michael Allen, "Pushing Big Iron: Seymour Cray's Woes Reflect Tough Times for Supercomputers", Wall Street Journal, 1998
  15. ^ a b c d e f g h CRAY-3 Supercomputer Systems (brochure). Colorado Springs, CO: Cray Computer Corporation. 1993. 
  16. ^ Jared Newman, "Intel's 3D Transistor: Why It Matters", PCWorld, 5 May 2011
  17. ^ Aad van der Steen, "Short Description of Architectures in the TOP500: The Cray Computer Corporation Cray-3", TOP500, 14 November 1995
  18. ^ a b "CRAY-3 Software Introduction Manual" (PDF). 

