Nvidia Tesla

From Wikipedia, the free encyclopedia

The Tesla graphics processing unit (GPU) is Nvidia's third brand of GPUs. It is based on high-end GPUs from the G80 onward, as well as on the Quadro lineup. Tesla is Nvidia's first dedicated general-purpose GPU. The Tesla series takes its name from the pioneering Serbian electrical engineer Nikola Tesla.

Tesla overview

Because of their very high computational power (measured in floating-point operations per second, or FLOPS) compared with earlier microprocessors, Tesla products target the high-performance computing market.[1] The main difference between Tesla products and the consumer-level GeForce and professional-level Quadro cards was that Tesla products could not output images to a display,[2] but the latest Tesla C-class products include one dual-link DVI port.[3] For equivalent single-precision output, Fermi-based Nvidia GeForce cards deliver one quarter of the double-precision performance. Tesla products are primarily used:[4]

  • in simulations and in large scale calculations (especially floating-point calculations)
  • for high-end image generation for applications in professional and scientific fields
  • with the use of OpenCL or CUDA.

As of 2011, Nvidia Tesla GPUs power the second-fastest supercomputer in the world, Tianhe-1A, in Tianjin, China.

Specifications and configurations

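Configuration and shaders

| Model | Configuration | # of GPUs | Core clock (MHz, each) | Thread processors (total) | Shader clock (MHz, each) |
|---|---|---|---|---|---|
| C870 | GPU Computing Processor (note 1) | 1 | 600 | 128 | 1350 |
| D870 | Deskside Supercomputer (note 1) | 2 | 600 | 2 × 128 (256) | 1350 |
| S870 | GPU Computing Server (note 1) | 4 | 600 | 4 × 128 (512) | 1350 |
| C1060 | C1060 Computing Processor (note 2) | 1 | 602 | 240 | 1300 |
| S1070 | S1075 1U[6] GPU Computing Server (notes 3, 4) | 4 | 602 | 4 × 240 (960) | 1440 |
| C2050/C2070/C2075 | C2050/C2070/C2075 GPU Computing Processor | 1 | 575 | 448 | 1150 |
| M2050 | M2050 GPU Computing Module | 1 | 575 | 448 | 1150 |
| M2070/M2070Q | M2070/M2070Q[7] GPU Computing Module | 1 | 575 | 448 | 1150 |
| M2090 | M2090[8][9][10] GPU Computing Module | 1 | 650 | 512 | 1300 |
| S2050 | S2050 1U GPU Computing System | 4 | 575 | 4 × 448 (1792) | 1150 |

Memory

| Model | Bandwidth max (GB/s) | Bus type | Bus width (bit, each GPU) | Total size (MiB) | Clock (MHz) |
|---|---|---|---|---|---|
| C870 | 76.8 | GDDR3 | 384 | 1536 | 1600 |
| D870 | 153.6 | GDDR3 | 384 | 3072 | 1600 |
| S870 | 307.2 | GDDR3 | 384 | 6144 | 1600 |
| C1060 | 102.4 | GDDR3 | 512 | 4096 | 1600 |
| S1070 | 409.6 | GDDR3 | 512 | 16384 | 1600 |
| C2050/C2070/C2075 | 144 | GDDR5 | 384 | 3072/6144 (note 5) | 3000 |
| M2050 | 148.4 | GDDR5 | 384 | 3072 (note 5) | 1546 |
| M2070/M2070Q | 150.336 | GDDR5 | 384 | 6144 (note 5) | 1566 |
| M2090 | 177 | GDDR5 | 384 | 6144 (note 5) | 1850 |
| S2050 | 4 × 148.4 (593.6) | GDDR5 | 384 | 12288 (note 5) | 3092 |

Processing power (peak, in GFLOPS[5]), compute capability, TDP and form factor

| Model | Single precision (SP) total, MUL+ADD+SF | Single precision (SP) MAD, MUL+ADD | Double precision (DP) FMA | Compute capability (note 4) | TDP (watts) | Form factor and features |
|---|---|---|---|---|---|---|
| C870 | 518.4 | 345.6 | 0 | 1.0 | 170.9 | Full-height video card |
| D870 | 1036.8 | 691.2 | 0 | 1.0 | 520 | Deskside system or rack unit |
| S870 | 2073.6 | 1382.4 | 0 | 1.0 | | 1U rack |
| C1060 | 933.12 | 622.08 | 77.76 | 1.3 | 187.8 | 2-slot video card |
| S1070 | 4147.2 | 2764.8 | 345.6 | 1.3 | | 1U rack; IEEE 754-2008 capabilities |
| C2050/C2070/C2075 | 1288 | 1030.4 (note 6) | 515.2 | 2.0 | 238/247/225 | Full-height video card; IEEE 754-2008 FMA capabilities |
| M2050 | 1288 | 1030.4 (note 6) | 515.2 | 2.0 | 225 | Computing module; IEEE 754-2008 FMA capabilities |
| M2070/M2070Q | 1288 | 1030.4 (note 6) | 515.2 | 2.0 | 225 | Computing module; IEEE 754-2008 FMA capabilities |
| M2090 | 1331 | ? | 665 | 2.0 | 225 | Computing module; IEEE 754-2008 FMA capabilities |
| S2050 | 5152 | 4121.6 (note 6) | 2060.8 | 2.0 | 900 | 1U rack; IEEE 754-2008 FMA capabilities |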
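
Several of the peak figures in the table can be reproduced from the listed unit counts and clocks. The following Python sketch does so under assumptions that the table itself does not state: each thread processor performs 2 floating-point operations per shader clock for MAD or FMA (3 when the extra special-function MUL is counted on the G80 and GT200 parts), the Fermi double-precision rate is half the single-precision FMA rate, and memory bandwidth is the bus width in bytes multiplied by the effective memory data rate. The function names are illustrative only.

```python
def peak_gflops(thread_processors, shader_clock_mhz, flops_per_clock):
    """Peak throughput in GFLOPS: units x clock (MHz) x operations per clock / 1000."""
    return thread_processors * shader_clock_mhz * flops_per_clock / 1000.0


def memory_bandwidth_gbs(bus_width_bits, effective_rate_mhz):
    """Peak bandwidth in GB/s: bytes per transfer x million transfers per second / 1000."""
    return (bus_width_bits / 8) * effective_rate_mhz / 1000.0


# C870 row: 128 thread processors at 1350 MHz, 384-bit bus, 1600 MHz memory.
c870_mad = peak_gflops(128, 1350, 2)      # MUL+ADD column
c870_total = peak_gflops(128, 1350, 3)    # MUL+ADD+SF column
c870_bw = memory_bandwidth_gbs(384, 1600)

# C2050 row: 448 thread processors at 1150 MHz, 384-bit bus, 3000 MHz memory.
c2050_sp = peak_gflops(448, 1150, 2)      # single-precision FMA
c2050_dp = c2050_sp / 2                   # assumed half-rate double precision
c2050_bw = memory_bandwidth_gbs(384, 3000)

print(f"C870:  {c870_mad:.1f} / {c870_total:.1f} GFLOPS, {c870_bw:.1f} GB/s")
print(f"C2050: {c2050_sp:.1f} SP / {c2050_dp:.1f} DP GFLOPS, {c2050_bw:.1f} GB/s")
```

Not every row fits these formulas with the clocks as printed. The M2050 bandwidth of 148.4 GB/s matches a 3092 MHz effective rate (twice the listed 1546 MHz), and the C1060 figure of 933.12 GFLOPS corresponds to a 1296 MHz shader clock rather than the listed 1300 MHz, so the table appears to mix rounded and effective clock values.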

Notes

  • 1 Specifications not specified by NVIDIA are assumed to be based on the GeForce 8800GTX
  • 2 Specifications not specified by NVIDIA are assumed to be based on the GeForce GTX 285
  • 3 A host system or server is required, connected to the 1U GPU computing server through a PCI Express card (a set-up similar to the Nvidia Quadro Plex)
  • 4 Core architecture version according to the CUDA programming guide.
  • 5 With ECC on, a portion of the dedicated memory is used for ECC bits, so the memory available to the user is reduced by 12.5% (e.g. 3 GB of total memory yields 2.625 GB of user-available memory).
  • 6 Fermi implements the new fused multiply-add (FMA) instruction for both 32-bit single-precision and 64-bit double-precision floating-point numbers (GT200 supported FMA only in double precision). FMA improves on multiply-add by retaining full precision in the intermediate stage.[11]
  • For the basic specifications of Tesla, refer to the GPU Computing Processor specifications.
  • Performance figures are for single-precision except where noted.
  • NVIDIA Tesla supercomputers with up to eight Fermi GPUs are also available from manufacturers.
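
Notes 5 and 6 can both be checked numerically. The Python sketch below is a CPU-side illustration, not GPU code. It applies the 12.5% ECC overhead from note 5, then contrasts an ordinary multiply-add, where the product is rounded before the addition, with an emulated fused multiply-add that rounds only once. The emulation uses exact rational arithmetic from the standard `fractions` module as a stand-in for a hardware FMA instruction; the function names and test values are illustrative assumptions.

```python
from fractions import Fraction


def usable_memory_with_ecc(total_gb, ecc_fraction=0.125):
    """Memory left for user data when ECC bits occupy ecc_fraction of the total."""
    return total_gb * (1 - ecc_fraction)


def multiply_add(a, b, c):
    """Unfused: the product a * b is rounded to a double before c is added."""
    return a * b + c


def emulated_fma(a, b, c):
    """Emulated fused multiply-add: exact intermediate value, one final rounding."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


print(usable_memory_with_ecc(3.0))  # 2.625, as in note 5

# The exact product a * b is 1 - 2**-60, which rounds to 1.0 as a double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(multiply_add(a, b, c))  # 0.0: the low-order bits were lost in the product
print(emulated_fma(a, b, c))  # -2**-60: the intermediate precision was retained
```

The inputs are chosen so that the product falls just below 1.0. The unfused version loses that difference entirely, while the single-rounding version preserves it.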

