GeForce 900 series

GeForce 900 Series
Release date: September 2014
Codename: Maxwell
Models: GeForce Series
  • GeForce GT Series
  • GeForce GTX Series
Cards:
  • Mid-range: GeForce GTX 960
  • High-end: GeForce GTX 970
  • Enthusiast: GeForce GTX 980
Rendering support:
  • Direct3D: DirectX 12 (feature levels 11_3 and 12_0)[1][2][3][4]
  • OpenCL: 1.2[5]
  • OpenGL: 4.5
History:
  • Predecessor: GeForce 700 series
  • Successor: GeForce 1000 series

The GeForce 900 Series is a family of graphics processing units developed by Nvidia, used in desktop and laptop PCs. It serves as the high-end introduction for the Maxwell architecture (GM-codenamed chips), named after the Scottish theoretical physicist James Clerk Maxwell.

The Maxwell microarchitecture, the successor to the Kepler microarchitecture, was announced as the first Nvidia GPU architecture to feature an integrated ARM CPU of its own.[6] According to Nvidia's CEO Jen-Hsun Huang, this would make Maxwell GPUs more independent of the main CPU.[7] Nvidia expects three major things from the Maxwell architecture: improved graphics capabilities, simplified programming, and better energy efficiency compared to the GeForce 700 Series and GeForce 600 Series.[8]

Maxwell was announced in September 2010.[9] The first GeForce consumer-class products based on the Maxwell architecture were released in early 2014.[10] Nvidia is expected to release Maxwell-powered Tesla accelerator cards and Quadro professional graphics cards in late 2014. Eventually, the Maxwell architecture will be used for mobile application processors in the Erista family of Tegra SoCs.

First generation Maxwell (GM10x)

First generation Maxwell (GM107/GM108) provides few additional consumer-facing features; Nvidia instead focused on power efficiency. Nvidia increased the L2 cache from 256 KiB on GK107 to 2 MiB on GM107, reducing the memory bandwidth needed. Accordingly, Nvidia cut the memory bus from 192-bit on GK106 to 128-bit on GM107, further saving power.[11]

Nvidia also changed the streaming multiprocessor design from that of Kepler (SMX), naming the new design SMM. The structure of the warp scheduler is inherited from Kepler: each scheduler can issue up to two instructions that are independent of each other and in order from the same warp. The layout of SMM units is partitioned so that each of the 4 warp schedulers in an SMM controls 1 set of 32 FP32 CUDA cores, 1 set of 8 load/store units, and 1 set of 8 special function units. In Kepler, by contrast, each SMX has 4 schedulers that schedule to a shared pool of 6 sets of 32 FP32 CUDA cores, 2 sets of 16 load/store units, and 2 sets of 16 special function units.[12] In Kepler these units are connected by a crossbar, which consumes power, so that the resources can be shared.[12] This crossbar is removed in Maxwell.[12] Texture units and FP64 CUDA cores are still shared.[11]

SMM allows finer-grained allocation of resources than SMX, saving power when the workload isn't optimal for shared resources. Nvidia claims that a 128 CUDA core SMM has 90% of the performance of a 192 CUDA core SMX.[11] In addition, each Graphics Processing Cluster (GPC) contains up to 4 SMX units in Kepler and up to 5 SMM units in first generation Maxwell.[11]
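
The unit counts and the 90% claim above can be cross-checked with a little arithmetic. The following Python sketch is only an illustration: it uses the per-scheduler and pooled unit counts quoted in this section, and the names `SMM`, `SMX` and `per_core_speedup` are made up for the example.

```python
# Per-multiprocessor unit totals, built from the counts quoted in the text.
# SMM: each of 4 warp schedulers owns 32 FP32 cores, 8 load/store, 8 SFUs.
SMM = {"fp32": 4 * 32, "load_store": 4 * 8, "sfu": 4 * 8}
# SMX: 4 schedulers share 6x32 FP32 cores, 2x16 load/store, 2x16 SFUs.
SMX = {"fp32": 6 * 32, "load_store": 2 * 16, "sfu": 2 * 16}


def per_core_speedup(smm_cores: int, smx_cores: int, relative_perf: float) -> float:
    """Throughput per CUDA core of an SMM relative to an SMX.

    relative_perf is the SMM's total performance as a fraction of the SMX's
    (0.90 according to Nvidia's claim cited above).
    """
    return relative_perf * smx_cores / smm_cores


speedup = per_core_speedup(SMM["fp32"], SMX["fp32"], relative_perf=0.90)
print(f"FP32 cores per SMM / SMX: {SMM['fp32']} / {SMX['fp32']}")
print(f"Implied per-core throughput gain: {speedup:.2f}x")
```

Under these figures an SMM has two thirds of the FP32 cores of an SMX but the same number of load/store and special function units, and the 90% claim corresponds to roughly 1.35 times the throughput per CUDA core.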

GM107 supports CUDA Compute Capability 5.0, compared to 3.5 on GK110/GK208 GPUs and 3.0 on GK10x GPUs. Dynamic Parallelism and HyperQ, two features of GK110/GK208 GPUs, are also supported across the entire Maxwell product line.

Maxwell provides native shared memory atomic operations for 32-bit integers and native shared memory 32-bit and 64-bit compare-and-swap (CAS), which can be used to implement other atomic functions.
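
How compare-and-swap can serve as a building block for other atomic functions is easiest to see in a retry loop. The following Python sketch is a host-side simulation, not GPU code: `SharedWord` is a hypothetical stand-in for one word of shared memory, its lock merely models the atomicity that the hardware provides, and `atomic_mul` is an illustrative operation chosen for the example.

```python
import threading


class SharedWord:
    """Hypothetical stand-in for a single word of shared memory."""

    def __init__(self, value: int = 0) -> None:
        self._value = value
        self._lock = threading.Lock()  # models the hardware's atomicity guarantee

    def load(self) -> int:
        return self._value

    def compare_and_swap(self, expected: int, new: int) -> int:
        """Store `new` only if the word equals `expected`; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old


def atomic_mul(word: SharedWord, factor: int) -> int:
    """Atomic multiply built from a CAS retry loop; returns the previous value."""
    while True:
        old = word.load()
        # The swap succeeds only if no other thread changed the word meanwhile.
        if word.compare_and_swap(old, old * factor) == old:
            return old


word = SharedWord(1)
threads = [threading.Thread(target=atomic_mul, args=(word, f)) for f in (2, 3, 5, 7)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(word.load())
```

Whatever order the threads run in, every multiplication is applied exactly once, because a thread whose swap fails simply re-reads the word and tries again.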

NVENC

Main article: Nvidia NVENC

Maxwell-based GPUs also contain the NVENC SIP block introduced with Kepler. On Maxwell, Nvidia's video encoder is 1.5 to 2 times faster than on Kepler-based GPUs, meaning it can encode video at 6 to 8 times playback speed.[11]
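
To put these multiples in concrete terms, the short Python sketch below converts them into encoding times and back-calculates the Kepler speed they imply. Pairing the low end of one range with the low end of the other is an assumption made for illustration, and the 60-minute clip is a made-up example.

```python
def encode_minutes(video_minutes: float, speed_multiple: float) -> float:
    """Wall-clock time to encode a clip at a given multiple of playback speed."""
    return video_minutes / speed_multiple


maxwell_speed = (6.0, 8.0)   # multiples of playback speed, from the text
gain_over_kepler = (1.5, 2.0)  # Maxwell speed-up over Kepler, from the text

# ASSUMPTION: low end pairs with low end, high end with high end.
implied_kepler_speed = tuple(m / g for m, g in zip(maxwell_speed, gain_over_kepler))
encode_times = tuple(encode_minutes(60, s) for s in maxwell_speed)

print("Implied Kepler speed (x playback):", implied_kepler_speed)
print("Minutes to encode a 60-minute clip on Maxwell:", encode_times)
```

Under that pairing both ends of the range point to a Kepler encoder running at about 4 times playback speed, and a 60-minute clip would take between 7.5 and 10 minutes to encode on Maxwell.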

PureVideo

Main article: Nvidia PureVideo

Nvidia also claims an 8 to 10 times performance increase in PureVideo Feature Set E video decoding, due to the video decoder cache paired with increases in memory efficiency. However, H.265 is not fully decoded in hardware; decoding relies on a mix of hardware and software.[11] When decoding video, Maxwell GPUs use a new low-power state, "GC5", to conserve power.[11]

Second generation Maxwell (GM20x)

Second generation Maxwell introduced several new technologies: Dynamic Super Resolution,[13] Third Generation Delta Color Compression,[14] Multi-Pixel Programmable Sampling,[15] Nvidia VXGI (Real-Time Voxel Global Illumination),[16] VR Direct,[17][18][19] Multi-Projection Acceleration,[14] and Multi-Frame Sampled Anti-Aliasing (MFAA).[20] However, support for CSAA was removed.[21] HDMI 2.0 support was also added.[22][23]

Second generation Maxwell also changed the ROP to memory controller ratio from 8:1 to 16:1.[24] However, some of the ROPs are generally idle in the GTX 970 because there are not enough enabled SMMs to give them work to do, which reduces its maximum fill rate.[25]
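
The GTX 970 bottleneck can be sketched numerically. The Python example below is a rough illustration under two assumptions that are not stated in this article: that second generation SMMs keep 128 CUDA cores each, and that each SMM can pass at most 4 pixels per clock to the ROPs. Core counts, ROP counts and base clocks are taken from the product table in this article; all function names are hypothetical.

```python
CORES_PER_SMM = 128            # ASSUMPTION: same SMM size as first generation
PIXELS_PER_CLOCK_PER_SMM = 4   # ASSUMPTION: not stated in this article
ROPS_PER_MEMORY_CONTROLLER = 16  # 16:1 ratio from the text


def smm_count(cuda_cores: int) -> int:
    return cuda_cores // CORES_PER_SMM


def rop_limited_fillrate(rops: int, base_mhz: int) -> float:
    """Pixel fill rate in GP/s if every ROP is kept busy."""
    return rops * base_mhz / 1000


def smm_limited_fillrate(cuda_cores: int, base_mhz: int) -> float:
    """Pixel fill rate in GP/s if the SMMs are the limiting factor."""
    return smm_count(cuda_cores) * PIXELS_PER_CLOCK_PER_SMM * base_mhz / 1000


def effective_fillrate(cuda_cores: int, rops: int, base_mhz: int) -> float:
    return min(rop_limited_fillrate(rops, base_mhz),
               smm_limited_fillrate(cuda_cores, base_mhz))


memory_controllers = 64 // ROPS_PER_MEMORY_CONTROLLER  # 64 ROPs on GM204
gtx970 = effective_fillrate(cuda_cores=1664, rops=64, base_mhz=1050)
gtx980 = effective_fillrate(cuda_cores=2048, rops=64, base_mhz=1126)
print(f"Memory controllers implied by 64 ROPs at 16:1: {memory_controllers}")
print(f"GTX 970: {smm_count(1664)} SMMs, about {gtx970:.1f} GP/s effective")
print(f"GTX 980: {smm_count(2048)} SMMs, about {gtx980:.1f} GP/s effective")
```

If the assumptions hold, the GTX 970's 13 SMMs can feed only 52 of its 64 ROPs per clock, while the GTX 980's 16 SMMs can feed all 64.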

Second generation Maxwell also has up to 4 SMM units per GPC, compared to up to 5 SMM units per GPC in first generation Maxwell.[26]

GM204 supports CUDA Compute Capability 5.2 compared to 5.0 on GM107/GM108 GPUs, 3.5 on GK110/GK208 GPUs and 3.0 on GK10x GPUs.[24][14][27]

Second generation Maxwell (GM20x) GPUs have an upgraded NVENC that supports HEVC encoding and adds H.264 encoding at 1440p/60 FPS and 4K/60 FPS; NVENC on first generation Maxwell (GM10x) GPUs supported H.264 encoding only up to 1080p/60 FPS.[19]
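
The jump in encoder throughput that these modes imply can be estimated from pixel rates. The Python sketch below assumes the usual 16:9 frame sizes for each label (1920×1080, 2560×1440 and 3840×2160); the article itself does not specify them.

```python
# ASSUMPTION: conventional 16:9 frame sizes for each resolution label.
FRAME_SIZES = {
    "1080p": (1920, 1080),
    "1440p": (2560, 1440),
    "4K": (3840, 2160),
}


def pixel_rate(mode: str, fps: int = 60) -> int:
    """Pixels per second the encoder must process for a given mode."""
    width, height = FRAME_SIZES[mode]
    return width * height * fps


baseline = pixel_rate("1080p")
for mode in FRAME_SIZES:
    ratio = pixel_rate(mode) / baseline
    print(f"{mode}/60 FPS: {pixel_rate(mode):,} pixels/s ({ratio:.2f}x 1080p/60)")
```

On these assumptions, 4K at 60 FPS requires four times the pixel throughput of 1080p at 60 FPS.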

Future

After Maxwell, the next architecture is code-named Pascal.[28] Nvidia has announced that the Pascal GPU will feature stacked DRAM, Unified Memory, and NVLink.[28]

Products

GeForce 900 (9xx) series

  • 1 Shader processors : Texture mapping units : Render output units.
  • 2 Pixel fillrate is calculated as the number of ROPs multiplied by the base core clock speed.
  • 3 Texture fillrate is calculated as the number of TMUs multiplied by the base core clock speed.
  • 4 Single precision performance is calculated as 2 times the number of shaders multiplied by the base core clock speed.
  • 5 Double precision performance of both the GTX 980 and the GTX 970 is 1/32 of single precision performance.[29]
  • 6 SLI support indicates how many identical GPU cards can be connected, up to 4 for a 4-way SLI configuration. Cards that support 4-way SLI also support 3-way and 2-way SLI. A dual-GPU card is already a 2-way SLI configuration internally, so it supports 4-way SLI with an identical dual-GPU card but does not support 3-way SLI.
Model                               GTX 960 [citation needed]  GTX 970 [30]               GTX 980 [32]
Launch                              -                          Sep 18, 2014               Sep 18, 2014
Code name                           GM204                      GM204                      GM204
Fab (nm)                            28                         28                         28
Transistors (million)               5200                       5200                       5200
Die size (mm²)                      398                        398                        398
GPU count                           1                          1                          1
Bus interface                       PCIe 3.0 x16               PCIe 3.0 x16               PCIe 3.0 x16
Memory (MiB)                        4096                       4096                       4096
Core config (note 1)                ????:???:??                1664:104:64                2048:128:64
Base core clock (MHz)               -                          1050                       1126
Boost core clock (MHz)              -                          1178                       1216
Memory clock (MT/s)                 6008                       7010                       7010
Pixel fillrate (GP/s) (note 2)      -                          67.2                       72.1
Texture fillrate (GT/s) (note 3)    -                          109.2                      144
Memory bandwidth (GB/s)             192                        224                        224
Memory bus type                     GDDR5                      GDDR5                      GDDR5
Memory bus width (bit)              256                        256                        256
DirectX                             12.0[3][4]                 12.0[3][4]                 12.0[3][4]
OpenGL                              4.5                        4.5                        4.5
OpenCL                              1.2                        1.2[31]                    1.2[5]
Single precision (GFLOPS) (note 4)  -                          3494                       4612
Double precision (GFLOPS) (note 5)  -                          109                        144
Single precision GFLOPS/W           -                          24.1                       28.0
TDP (watts)                         -                          145                        165
SLI support (note 6)                2-way                      3-way                      4-way
Release price (USD)                 $2??                       $329                       $549
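
The derived columns of this table follow directly from the formulas in the notes above it. As a rough check, the Python sketch below recomputes them for the GTX 970 and GTX 980 from the core configuration, base clock and TDP listed in the table; the `Card` class and its method names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    shaders: int
    tmus: int
    rops: int
    base_mhz: int
    tdp_w: int

    def pixel_fillrate_gps(self) -> float:
        # Note 2: ROPs x base core clock
        return self.rops * self.base_mhz / 1000

    def texture_fillrate_gts(self) -> float:
        # Note 3: TMUs x base core clock
        return self.tmus * self.base_mhz / 1000

    def single_precision_gflops(self) -> float:
        # Note 4: 2 x shaders x base core clock
        return 2 * self.shaders * self.base_mhz / 1000

    def double_precision_gflops(self) -> float:
        # Note 5: 1/32 of single precision on the GTX 970 and GTX 980
        return self.single_precision_gflops() / 32

    def gflops_per_watt(self) -> float:
        return self.single_precision_gflops() / self.tdp_w


CARDS = [
    Card("GeForce GTX 970", shaders=1664, tmus=104, rops=64, base_mhz=1050, tdp_w=145),
    Card("GeForce GTX 980", shaders=2048, tmus=128, rops=64, base_mhz=1126, tdp_w=165),
]

for card in CARDS:
    print(
        f"{card.name}: "
        f"{card.pixel_fillrate_gps():.1f} GP/s, "
        f"{card.texture_fillrate_gts():.1f} GT/s, "
        f"{card.single_precision_gflops():.0f} SP GFLOPS, "
        f"{card.double_precision_gflops():.1f} DP GFLOPS, "
        f"{card.gflops_per_watt():.1f} GFLOPS/W"
    )
```

The results agree with the table once rounded, which suggests that every derived figure there is based on the base core clock and not on the boost clock.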

GeForce 900M (9xxM) series

Some implementations may use different specifications.

Model                               GTX 970M [33]       GTX 980M [34]
Launch                              Oct 07, 2014        Oct 07, 2014
Code name                           GM204               GM204
Fab (nm)                            28                  28
Transistors (million)               5200                5200
Die size (mm²)                      398                 398
GPU count                           1                   1
Bus interface                       PCIe 3.0 x16        PCIe 3.0 x16
Memory (MiB)                        3072 / 6144         4096 / 8192
Core config (note 1)                1280:80:48          1536:96:64
Base core clock (MHz)               924                 1038
Boost core clock (MHz)              993                 1127
Memory clock (MT/s)                 5012                5012
Pixel fillrate (GP/s) (note 2)      44.4                66.4
Texture fillrate (GT/s) (note 3)    73.9                99.6
Memory bandwidth (GB/s)             120                 160
Memory bus type                     GDDR5               GDDR5
Memory bus width (bit)              192                 256
DirectX                             12.0[3][4]          12.0[3][4]
OpenGL                              4.5                 4.5
OpenCL                              1.2[5]              1.2[5]
Single precision (GFLOPS) (note 4)  2365                3189
Double precision (GFLOPS) (note 5)  73.9                99.6
Single precision GFLOPS/W           Unknown             Unknown
TDP (watts)                         Unknown             Unknown
SLI support (note 6)                Yes                 Yes
Release price (USD)                 Unknown             Unknown
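
The memory bandwidth figures in both product tables are consistent with multiplying the bus width in bytes by the memory transfer rate. The article does not state this formula, so the Python sketch below should be read as an assumption that happens to reproduce the listed values; the function name is illustrative.

```python
def memory_bandwidth_gbs(bus_width_bits: int, memory_mts: int) -> float:
    """Peak bandwidth in GB/s: bytes per transfer x million transfers/s / 1000."""
    return (bus_width_bits / 8) * memory_mts / 1000


# (bus width in bits, memory rate in MT/s, bandwidth listed in the tables)
LISTED = {
    "GeForce GTX 970M": (192, 5012, 120),
    "GeForce GTX 980M": (256, 5012, 160),
    "GeForce GTX 970": (256, 7010, 224),
    "GeForce GTX 980": (256, 7010, 224),
}

for name, (bus_bits, mts, listed) in LISTED.items():
    computed = memory_bandwidth_gbs(bus_bits, mts)
    print(f"{name}: computed {computed:.1f} GB/s, listed {listed} GB/s")
```

The same calculation gives about 192 GB/s for a 256-bit bus at 6008 MT/s, matching the two memory figures listed for the GTX 960.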

References

  1. ^ http://blogs.nvidia.com/blog/2014/09/19/maxwell-and-dx12-delivered/
  2. ^ http://blogs.msdn.com/b/directx/archive/2014/09/18/directx-12-lights-up-nvidia-s-maxwell-editor-s-day.aspx
  3. ^ a b c d e f http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review/4
  4. ^ a b c d e f http://www.anandtech.com/show/8544/microsoft-details-direct3d-113-12-new-features
  5. ^ a b c d http://www.techpowerup.com/gpudb/2621/geforce-gtx-980.html
  6. ^ Nvidia Maxwell to be first GPU with ARM CPU in 2013, Guru3d.com
  7. ^ Nvidia Maxwell Graphics Processors to Have Integrated ARM General-Purpose Cores., xbitlabs.com
  8. ^ Nvidia: Next-Generation Maxwell Architecture Will Break New Grounds.., xbitlabs.com
  9. ^ http://www.anandtech.com/show/3939/gtc-2010-reporters-notebook-day-1-nvidia-announces-future-gpu-families-for-2011-and-2013
  10. ^ http://www.geforce.com/whats-new/articles/introducing-the-geforce-gtx-750-class
  11. ^ a b c d e f g Smith, Ryan; T S, Ganesh (18 February 2014). "The NVIDIA GeForce GTX 750 Ti and GTX 750 Review: Maxwell Makes Its Move". AnandTech. Archived from the original on 18 February 2014. Retrieved 18 February 2014. 
  12. ^ a b c http://www.anandtech.com/show/7764/the-nvidia-geforce-gtx-750-ti-and-gtx-750-review-maxwell/3
  13. ^ http://www.geforce.com/whats-new/articles/dynamic-super-resolution-instantly-improves-your-games-with-4k-quality-graphics
  14. ^ a b c http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF
  15. ^ http://www.geforce.com/hardware/technology/mfaa/technology
  16. ^ http://www.geforce.com/whats-new/articles/maxwells-voxel-global-illumination-technology-introduces-gamers-to-the-next-generation-of-graphics
  17. ^ http://www.geforce.com/whats-new/articles/maxwell-architecture-gpus-the-only-choice-for-virtual-reality-gaming
  18. ^ http://blogs.nvidia.com/blog/2014/09/18/maxwell-virtual-reality/
  19. ^ a b http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review/5
  20. ^ http://www.geforce.com/whats-new/articles/multi-frame-sampled-anti-aliasing-delivers-better-performance-and-superior-image-quality
  21. ^ http://forums.realhardwarereviews.com/news/new-nvidia-maxwell-chips-do-not-support-fast-csaa/
  22. ^ http://www.geforce.com/whats-new/articles/maxwell-architecture-gtx-980-970
  23. ^ http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review
  24. ^ a b http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review/3
  25. ^ http://techreport.com/blog/27143/here-another-reason-the-geforce-gtx-970-is-slower-than-the-gtx-980
  26. ^ http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review/3
  27. ^ http://devblogs.nvidia.com/parallelforall/maxwell-most-advanced-cuda-gpu-ever-made/
  28. ^ a b http://blogs.nvidia.com/blog/2014/03/25/gpu-roadmap-pascal/
  29. ^ Smith, Ryan (September 18, 2014). "The NVIDIA GeForce GTX 980 Review: Maxwell Mark 2". AnandTech. p. 1. Retrieved September 19, 2014. 
  30. ^ GeForce GTX 970 | Specifications | GeForce
  31. ^ http://www.techpowerup.com/gpudb/2620/geforce-gtx-970.html
  32. ^ GeForce GTX 980 | Specifications | GeForce
  33. ^ GeForce GTX 970M | Specifications | GeForce
  34. ^ GeForce GTX 980M | Specifications | GeForce
