
Larrabee (microarchitecture)

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Dila (talk | contribs) at 01:46, 9 August 2008. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


The Larrabee GPU architecture, unveiled at the SIGGRAPH conference in August 2008.

Larrabee is a many-core processor designed by Intel Corporation to compete in the high-performance parallel computing market, alongside ATI and NVIDIA. Some sources[1][2][3] report that the original architecture was inherited by Intel from the Pentagon.

It is a general purpose processor, with an architecture similar to that of the Cell Broadband Engine, but with the capacity to support many more processing cores, in order to facilitate the implementation of complex parallel applications, such as image processing, physical simulation, and financial analysis.

Comparison with current GPUs and CPUs

While Intel already manufactures graphics accelerators, they are low-performance devices that support the majority of desktop applications but do not provide the power needed for graphics-intensive applications such as video gaming.

Larrabee will differ from other GPUs currently on the market such as the GeForce 200 Series and the Radeon 4000 series in that it will use the popular x86 instruction set for its CPU cores instead of a proprietary graphics-focused instruction set, and will feature cache coherency across all its cores like a multi-core CPU (and unlike current GPUs).[4] Larrabee is thus expected to be more flexible than current GPUs.

Larrabee has a fully programmable pipeline, in contrast to current generation graphics cards which are only partially programmable.

Larrabee also differs from a traditional GPU in that its cores communicate with one another as in a multi-core processor, rather than operating as a traditional linear stream-processing pipeline. In the 2009-2010 timeframe, Larrabee is likely to compete with AMD's Bulldozer and in the mainstream graphics card market. However, these moves by AMD and Intel do not in themselves mark a transition for graphics cards, as new hardware components still need to be developed, such as the graphics card power management seen in Tegra LongRun2.

The relationship between graphics standards and hardware design also remains unsettled: some companies may believe that the best approach is a dedicated graphics accelerator such as the ATI FirePro V5700, while proponents of DirectX 11 believe that the API can replace the need for a graphics accelerator.[5] In addition to graphics acceleration, Intel is also positioning Larrabee for the GPGPU and high-performance computing markets. Intel plans to have engineering samples of Larrabee ready by the end of 2008, with public release in late 2009 or 2010.[6]

Larrabee is being designed explicitly for general purpose GPU (GPGPU) or stream processing tasks, in addition to traditional rasterized 3D graphics (DirectX/OpenGL) for games. For example, Larrabee might perform ray tracing or physics processing,[7] in real time for games or offline for scientific research as a component of a supercomputer.[8] Video encoding, image processing, and in general any task which scales easily to multicore CPUs today will likely be well-suited to run on Larrabee.

Larrabee is not being designed by Intel's graphics division, which is responsible for Intel's current integrated graphics products. Instead it is being worked on by Intel's Hillsboro, Oregon design team, whose last major design was the Pentium 4.

The x86 processor cores in Larrabee will be different in several ways from the cores in current Intel CPUs such as the Core 2 Duo, taking lessons from Intel's ongoing "Tera-scale" research, as exemplified by their "Polaris" 80-core processor demonstrator. Larrabee's x86 cores will be much simpler than those on a Core 2 processor, not using out-of-order execution. This will allow them to be much smaller, so more can fit on a single chip. Other differences include the addition of a new set of extended SIMD instructions similar to SSE but more focused on graphics applications, and 4-way simultaneous multithreading for each core.

Larrabee will not be Intel's first discrete GPU. In the late 1990s, Intel subsidiary Real3D created an AGP and PCI graphics accelerator, the Intel740. However, Intel's participation in the graphics hardware market has subsequently been limited to integrated graphics chips under the Intel GMA brand. Although the low cost and power consumption of Intel GMA chips make them suitable for small laptops and less demanding tasks, they lack the 3D graphics processing power to compete with NVIDIA and AMD/ATI for a share of the high-end gaming computer market, the HPC market, or a place in popular home games consoles, which Larrabee aims to have.

Preliminary specifications

2006

Ed Davis (an ex-Intel employee at the time who never worked on Larrabee) made a presentation about Larrabee on March 7, 2006 and the slides were posted on the web, but later redacted to purge Larrabee information.[9][10] These specifications are undoubtedly outdated by now, but they represent the most detailed concrete information about Larrabee available. According to this presentation, Larrabee will be mounted in a 49.5mm×49.5mm package, will be available on a PCI Express 2.0 card, and will have a TDP greater than 150 W.

The presentation contained a comparison table with the following information:

  • 1.7-2.5 GHz clock speed
  • 16-24 in-order x86 cores
  • Without SSE: 3.4-5.0 DP GFLOPS/core (2 DP FP/clock), 54.4-120 DP GFLOPS/chip
  • With SSE: 13.6-40 DP GFLOPS/core (16 DP FP/clock), 217.6-960 DP GFLOPS/chip
  • 32 KB L1 cache/core (1 clock latency)
  • 512 KB L2 cache/core (8-12 MB total) (10 clocks latency)
    • Each core has 256 KB of the L2 cache that is read-write for it, but read-only by the other cores
  • 64 bytes cache line width
  • 256 bytes/cycle Ring bus bandwidth
  • 1-2 GB GDDR RAM
  • 128 GB/s memory bandwidth
  • 17 GB/s memory bandwidth per QuickPath link with 50 ns latency
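
The per-core and per-chip figures in the table follow from multiplying clock speed, floating-point operations per clock, and core count. The sketch below reproduces the "Without SSE" row and the L2 cache total from the listed inputs; the function names are illustrative and not taken from the presentation. Note that, as listed, the lower bounds of the "With SSE" row (13.6 and 217.6) correspond to 8 operations per clock at 1.7 GHz, while the upper bounds (40 and 960) correspond to the stated 16; the sketch does not attempt to resolve this.

```python
# Peak throughput = clock (GHz) x floating-point operations per clock x cores.
# Inputs are the figures listed above; the function names are illustrative.

def gflops_per_core(clock_ghz, flops_per_clock):
    return clock_ghz * flops_per_clock

def gflops_per_chip(clock_ghz, flops_per_clock, cores):
    return gflops_per_core(clock_ghz, flops_per_clock) * cores

def l2_total_mb(kb_per_core, cores):
    return kb_per_core * cores / 1024

LOW = {"clock_ghz": 1.7, "cores": 16}   # low end of the listed ranges
HIGH = {"clock_ghz": 2.5, "cores": 24}  # high end of the listed ranges

for label, cfg in (("low", LOW), ("high", HIGH)):
    per_core = gflops_per_core(cfg["clock_ghz"], 2)  # 2 DP FP/clock without SSE
    per_chip = gflops_per_chip(cfg["clock_ghz"], 2, cfg["cores"])
    print(f"{label}: {per_core:.1f} DP GFLOPS/core, {per_chip:.1f} DP GFLOPS/chip, "
          f"{l2_total_mb(512, cfg['cores']):.0f} MB L2")
```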

2007

A June 2007 PC Watch article suggests that the first Larrabee chips will feature 32 x86 processor cores and come out in 2009, fabricated on a 45 nanometer process. Chips with a few defective cores due to yield issues will be sold as a 24-core version. Later in 2010 Larrabee will be shrunk for a 32 nanometer fabrication process which will enable a 48 core version.[11]

Larrabee will have an extra-wide 512-bit vector processing unit for each core, much wider than SSE (128 bits) and also wider than AVX (256 bits). It is unknown whether Larrabee will use a variant of the AVX instruction set and retain SSE compatibility, or use a new and incompatible set of extended instructions.[12]
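
To make the width comparison concrete, the following sketch counts how many single-precision (32-bit) and double-precision (64-bit) values fit in one vector register of each width. The register widths are those given above; the helper name `lanes` is illustrative, and the sketch says nothing about which instructions Larrabee will actually provide.

```python
# Number of elements that fit in one vector register of a given width.
VECTOR_BITS = {"SSE": 128, "AVX": 256, "Larrabee": 512}

def lanes(vector_bits, element_bits):
    return vector_bits // element_bits

for name, bits in VECTOR_BITS.items():
    print(f"{name}: {lanes(bits, 32)} single-precision lanes, "
          f"{lanes(bits, 64)} double-precision lanes")
```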

Larrabee will probably be available in a server-oriented version which will sit directly in motherboard sockets using Intel's QuickPath interconnect, competitor to AMD's HyperTransport; this may open the possibility of creating a Larrabee-only computer without a companion traditional x86 processor such as the Core 2 Duo.[13]

Fudzilla has posted several short articles about Larrabee, claiming that Larrabee may have a TDP as large as 300W,[14] that Larrabee will use a 12-layer PCB and has a cooling system that "is meant to look similar to what you can find on high-end Nvidia cards today,"[15] that Larrabee will use GDDR5 memory, that it is targeted to have 2 SP teraflops of computing power,[16] and that it doesn't have to use DirectX, but uses a direct mode.[17] Fudzilla also claims a Summer 2009 release date.[18]

2008

In a July 2008 interview, Intel's Pat Gelsinger stated that Larrabee's x86 cores will be based on Intel's P54C architecture, which was last seen in the original Pentium chips, such as the Pentium 75, in the early 1990s.[19] Intel later said that Gelsinger had not revealed any details about the number or type of cores in Larrabee. However, other sources have confirmed the news, adding that the P54C has been adapted by the Pentagon for rad-hard applications, and this revised P54C was in turn adapted for Larrabee.[20] Those sources have also claimed a 4 MB coherent L2 cache and three-operand instruction capability.

On August 3, 2008, Intel presented more information to analysts and journalists, from which several notable features were identified.[21][22]

On August 12, 2008, Intel will present a paper describing Larrabee at SIGGRAPH.[23][24] The paper is said to contain a comparison of performance between Larrabee and Core 2 Duo, which reveals that the single-threaded performance of one of Larrabee's cores is roughly half that of a "Core 2" core, while the overall performance per watt of a Larrabee chip is 20× better than a Core 2 Duo chip.[20]

Software-based scheduling

The advantage of software-based scheduling is that performance can be dynamically allocated based on performance requirements and the number of cores available. This could enable future chips to put unused cores into an idle power mode in order to improve power efficiency. Evidence of this can be found in Intel's Teraflops Research Chip presentation.[25] The information provided shows that individual cores can be switched off and that processes can be relocated to other cores within the same processor to lower overall temperature. This ability could enable future chips to cool down more efficiently.
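
As a rough illustration of allocating work according to the number of cores available, the sketch below uses greedy list scheduling: each task goes to whichever active core currently has the least work. This is a generic textbook technique chosen for illustration, not Intel's scheduler; the task costs and the `schedule` function are hypothetical.

```python
import heapq

def schedule(task_costs, active_cores):
    """Greedy list scheduling: give each task to the least-loaded active core.

    Returns (assignment, makespan), where assignment maps core -> task ids
    and makespan is the load of the busiest core.
    """
    heap = [(0, core) for core in range(active_cores)]  # (current load, core id)
    heapq.heapify(heap)
    assignment = {core: [] for core in range(active_cores)}
    for task_id, cost in enumerate(task_costs):
        load, core = heapq.heappop(heap)
        assignment[core].append(task_id)
        heapq.heappush(heap, (load + cost, core))
    makespan = max(load for load, _ in heap)
    return assignment, makespan

tasks = [4, 3, 3, 2, 2, 2]  # hypothetical task costs in arbitrary time units
for cores in (4, 2, 1):
    _, makespan = schedule(tasks, cores)
    print(f"{cores} active core(s): finishes after {makespan} units")
```

Running the same task list with fewer active cores simply lengthens the finishing time, which is the trade-off a software scheduler can make when cores are idled to save power.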

Execution threads

A typical hyper-threaded Intel processor presents two logical processors per physical core; Larrabee presents four. This keeps the transistors devoted to the x86 architecture in use as much of the time as possible, reducing the penalty of using x86 cores mainly for floating-point calculations.
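
The effect of running several hardware threads on one in-order core can be sketched with a toy cycle-level model: each thread issues for a few cycles, then stalls (for example, waiting on memory), and the core issues from the first thread that is ready. The cycle counts below are hypothetical and chosen only to make the arithmetic obvious; they are not measurements of Larrabee or of any Intel processor.

```python
def core_utilization(threads, compute, stall, cycles):
    """Fraction of cycles in which a single-issue in-order core does useful work.

    Each thread alternates `compute` busy cycles with `stall` waiting cycles.
    On every cycle the core issues from the lowest-numbered ready thread.
    """
    ready_at = [0] * threads          # first cycle at which each thread may issue
    remaining = [compute] * threads   # busy cycles left before the next stall
    busy = 0
    for cycle in range(cycles):
        for t in range(threads):
            if ready_at[t] <= cycle:
                busy += 1
                remaining[t] -= 1
                if remaining[t] == 0:
                    remaining[t] = compute
                    ready_at[t] = cycle + 1 + stall
                break
    return busy / cycles

# Hypothetical workload: 1 cycle of work followed by a 3-cycle stall.
for n in (1, 2, 4):
    print(f"{n} thread(s): {core_utilization(n, 1, 3, 400):.0%} of cycles busy")
```

In this toy model, four threads are exactly enough to hide a three-cycle stall behind one cycle of work; real workloads have less regular stall patterns.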

Floating point calculations

Larrabee will fully support the IEEE standards for single- and double-precision floating-point arithmetic. This may be required by advanced ray tracing, and is important for scientific calculations.
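
The practical difference between the two precisions can be seen by rounding a double-precision value to single precision. The sketch below uses Python's standard `struct` module for the conversion and assumes, as is the case on mainstream platforms, that Python floats are IEEE 754 doubles; the helper name `to_single` is illustrative.

```python
import struct

def to_single(x):
    """Round a Python float (IEEE 754 double) to the nearest IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

big = 16777217.0  # 2**24 + 1: exact in double precision, not in single
print(to_single(big) == big)       # single precision cannot hold this integer
print(to_single(big))              # nearest single-precision value
print(to_single(0.1) == 0.1)       # 0.1 is rounded differently in each format
```

Single precision carries a 24-bit significand, so integers above 2^24 start to lose exactness, whereas double precision carries 53 bits.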

Ring network

AMD/ATI's Radeon R700 abandoned the ring-bus architecture in favour of a more traditional crossbar and internal hub because of the ring's complexity. It therefore remains to be seen how the ring network will perform within Larrabee.
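
The basic trade-off between the two interconnects can be sketched numerically: a ring needs only one link per node but messages may take several hops, while a crossbar connects every pair directly at the cost of a number of crosspoints that grows with the square of the node count. The sketch assumes a bidirectional ring and uses 32 nodes as an example figure (matching the 32 cores reported in 2007); neither assumption is confirmed for Larrabee, and the function names are illustrative.

```python
def ring_hops(src, dst, nodes):
    """Hops between two nodes on a bidirectional ring (shorter direction)."""
    d = abs(src - dst) % nodes
    return min(d, nodes - d)

def average_ring_hops(nodes):
    """Mean hop count from one node to every other node on the ring."""
    total = sum(ring_hops(0, dst, nodes) for dst in range(1, nodes))
    return total / (nodes - 1)

def crossbar_crosspoints(nodes):
    """Crosspoints in a full crossbar connecting every node to every node."""
    return nodes * nodes

NODES = 32  # example node count; an assumption, not a confirmed specification
print(f"ring: {NODES} links, worst case {ring_hops(0, NODES // 2, NODES)} hops, "
      f"average {average_ring_hops(NODES):.2f} hops")
print(f"crossbar: {crossbar_crosspoints(NODES)} crosspoints, always 1 hop")
```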

References

  1. ^ Stokes, Jon (2005-04-30). "Intel's Larrabee GPU based on secret Pentagon tech". Encyclopedia of things. Ars Technica. Retrieved 2008-08-08.
  2. ^ Pentagon used Intel Larabee
  3. ^ Intel Larrabee based on a Pentagon Pentium chip
  4. ^ "Intel Corporation's Multicore Architecture Briefing". Intel. Retrieved 2008-08-06.
  5. ^ AMD announces support DirectX11, paragraph 2
  6. ^ "Larrabee: Samples in Late 08, Products in 2H09/1H10". beyond3d.com. Retrieved 2008-01-17.
  7. ^ Stokes, Jon. "Intel picks up gaming physics engine for forthcoming GPU product". Ars Technica. Retrieved 2007-09-17.
  8. ^ Stokes, Jon. "Clearing up the confusion over Intel's Larrabee". Ars Technica. Retrieved 2007-06-01.
  9. ^ "Clearing up the confusion over Intel's Larrabee, part II". Ars Technica. Retrieved 2008-08-06.
  10. ^ "Tera Tera Tera" (PDF). Michigan State University. Retrieved 2008-08-06.
  11. ^ "Intel is promoting the 32 core CPU "Larrabee"". pc.watch.impress.co.jp. Retrieved 2008-08-06.
  12. ^ "Clearing up the confusion over Intel's Larrabee". Ars Technica. Retrieved 2008-08-06.
  13. ^ Stokes, Jon. "Clearing up the confusion over Intel's Larrabee, part II". Ars Technica. Retrieved 2008-01-16.
  14. ^ "Larrabee to launch at 300W TDP". fudzilla.com. Retrieved 2008-08-06.
  15. ^ "Larrabee will use a 12-layer PCB". fudzilla.com. Retrieved 2008-08-06.
  16. ^ "Larrabee will use GDDR5 memory". fudzilla.com. Retrieved 2008-08-06.
  17. ^ "Larrabee doesn't need DirectX". fudzilla.com. Retrieved 2008-08-06.
  18. ^ "Larrabee set to launch in summer 2009". fudzilla.com. Retrieved 2008-08-06.
  19. ^ "UPDATED: Rumour control: Larrabee based on 32 original Pentium cores". custompc.co.uk. Retrieved 2008-08-06.
  20. ^ a b "Intel's Larrabee GPU based on secret Pentagon tech, sorta [Updated]". Ars Technica. Retrieved 2008-08-06.
  21. ^ "Intel details future 'Larrabee' graphics chip". CNET Networks. Retrieved 2008-08-04.
  22. ^ "Intel teases new Larrabee details". tgdaily.com. Retrieved 2008-08-06.
  23. ^ "Larrabee: A Many-Core x86 Architecture for Visual Computing". SIGGRAPH. Retrieved 2008-08-06.
  24. ^ "Larrabee: a many-core x86 architecture for visual computing". Association for Computing Machinery. Retrieved 2008-08-06.
  25. ^ "Clocks and Power Management". AnandTech. Retrieved 2008-08-04.