Performance per watt
In computing, performance per watt is a measure of the energy efficiency of a particular computer architecture or computer hardware. Literally, it measures the rate of computation that can be delivered by a computer for every watt of power consumed.
System designers building parallel computers, such as Google's hardware, pick CPUs based on their performance per watt, because the cost of powering the CPU outweighs the cost of the CPU itself.
The performance and power consumption metrics used depend on the definition; reasonable measures of performance are FLOPS, MIPS, or the score for any performance benchmark. Several measures of power usage may be employed, depending on the purposes of the metric; for example, a metric might only consider the electrical power delivered to a machine directly, while another might include all power necessary to run a computer, such as cooling and monitoring systems. The power measurement is often the average power used while running the benchmark, but other measures of power usage may be employed (e.g. peak power, idle power).
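How much the choice of power measurement matters can be seen in a small sketch. The power readings and the benchmark score below are made-up values, and `performance_per_watt` is a hypothetical helper, not part of any benchmark suite.

```python
from statistics import mean


def performance_per_watt(score, power_watts):
    """Return a benchmark score divided by a power figure in watts."""
    if power_watts <= 0:
        raise ValueError("power must be positive")
    return score / power_watts


# Hypothetical power readings (watts) sampled while a benchmark runs.
power_samples = [180.0, 220.0, 240.0, 200.0, 160.0]
benchmark_score = 1000.0  # hypothetical score, e.g. in MFLOPS

average_power = mean(power_samples)
peak_power = max(power_samples)

# The same run yields different ratings depending on the power measure.
per_watt_average = performance_per_watt(benchmark_score, average_power)
per_watt_peak = performance_per_watt(benchmark_score, peak_power)

print(f"score per watt (average power): {per_watt_average:.2f}")
print(f"score per watt (peak power):    {per_watt_peak:.2f}")
```

Because peak power is never lower than average power, a rating based on peak power is never higher than one based on average power for the same run, so the measure used should be stated alongside the result.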
For example, the early UNIVAC I computer performed approximately 0.015 operations per watt-second: it carried out 1,905 operations per second (OPS) while consuming 125 kW. The Fujitsu FR-V VLIW/vector processor system on a chip, in its variant with four FR550 cores released in 2005, performs 51 giga-OPS while consuming 3 watts, or 17 billion operations per watt-second. This is an improvement of over a trillion times in 54 years.
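The arithmetic behind this comparison can be checked with a few lines of Python. The figures are the ones quoted above; `ops_per_joule` is a name chosen here for illustration.

```python
def ops_per_joule(ops_per_second, power_watts):
    """Operations per second divided by watts gives operations per joule."""
    return ops_per_second / power_watts


# Figures quoted in the text above.
univac = ops_per_joule(1_905, 125_000)  # UNIVAC I: 1,905 OPS at 125 kW
frv = ops_per_joule(51e9, 3)            # Fujitsu FR-V: 51 giga-OPS at 3 W

improvement = frv / univac

print(f"UNIVAC I:     {univac:.4f} operations per joule")
print(f"Fujitsu FR-V: {frv:.3g} operations per joule")
print(f"Improvement:  {improvement:.3g} times")
```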
Most of the power a computer uses is converted into heat, so a system that takes fewer watts to do a job will require less cooling to maintain a given operating temperature. Reduced cooling demands make it easier to quiet a computer. Lower energy consumption can also make it less costly to run, and reduce the environmental impact of powering the computer (see green computing). If installed where there is limited climate control, a lower-power computer will operate at a lower temperature, which may make it more reliable. In a climate-controlled environment, reductions in direct power use may also create savings in climate control energy.
Computing energy consumption is sometimes also measured by reporting the energy required to run a particular benchmark, for instance EEMBC EnergyBench. Energy consumption figures for a standard workload may make it easier to judge the effect of an improvement in energy efficiency.
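As an illustration of the energy-per-workload view, the sketch below integrates a sampled power trace over time to estimate the energy used by one run. The trace and the operation count are invented for the example, and the trapezoidal rule is just one simple way to approximate the integral; it is not a description of how EnergyBench itself measures energy.

```python
def energy_joules(timestamps_s, power_watts):
    """Estimate energy as the trapezoidal integral of power over time."""
    if len(timestamps_s) != len(power_watts):
        raise ValueError("need one power reading per timestamp")
    energy = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        energy += 0.5 * (power_watts[i] + power_watts[i - 1]) * dt
    return energy


# Hypothetical trace: one reading per second during a 4-second workload.
timestamps = [0.0, 1.0, 2.0, 3.0, 4.0]
power = [100.0, 150.0, 150.0, 150.0, 100.0]

total_energy = energy_joules(timestamps, power)  # joules for the whole run
operations_completed = 1.1e9                     # hypothetical workload size

print(f"energy for the workload: {total_energy:.0f} J")
print(f"operations per joule:    {operations_completed / total_energy:.3g}")
```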
Performance (in operations/second) per watt can also be written as operations/watt-second, or operations/joule, since 1 watt = 1 joule/second.
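Written out as a unit calculation, the equivalence is:

```latex
\[
\frac{\text{operations}/\text{s}}{\text{W}}
  = \frac{\text{operations}/\text{s}}{\text{J}/\text{s}}
  = \frac{\text{operations}}{\text{J}}
  = \frac{\text{operations}}{\text{W}\cdot\text{s}}
\]
```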
FLOPS per watt
FLOPS (Floating Point Operations Per Second) per watt is a common measure. Like the FLOPS it is based on, the metric is usually applied to scientific computing and simulations involving many floating point calculations.
In early 2014, NVIDIA released the Tegra K1 mobile SoC, containing a GPU with a peak performance of over 326 GFLOPS at roughly 10 watts. This was reported as over 50,000 MFLOPS/watt, roughly 25 times the efficiency of the Blue Gene/Q supercomputer, although 326 GFLOPS at 10 watts works out to about 32,600 MFLOPS/watt.
Kalray has developed a 256-core VLIW CPU that achieves 25 GFLOPS/watt. The next generation is expected to achieve 75 GFLOPS/watt.
In the November 2014 Green500 list, the L-CSC supercomputer of the Helmholtz Association at the GSI in Darmstadt, Germany ranked first with 5271 MFLOPS/W and was the first cluster to surpass an efficiency of 5 GFLOPS/W. It runs on Intel Xeon E5-2690 processors with the Intel Ivy Bridge architecture and AMD FirePro S9150 GPU accelerators. It uses in-rack water cooling and cooling towers to reduce the energy required for cooling.
In the June 2013 Green500 list, the Eurotech supercomputer Eurora at Cineca ranked first with 3208 LINPACK MFLOPS/W. The Cineca Eurora supercomputer is equipped with two Intel Xeon E5-2687W CPUs and two PCIe-connected NVIDIA Tesla K20 accelerators per node. Water cooling and electronics design allow very high densities to be reached, with a peak performance of 350 TFLOPS per rack.
In the November 2012 Green500 list, an Appro International, Inc. Xtreme-X supercomputer (Beacon) ranked first with 2499 LINPACK MFLOPS/W. Beacon is deployed by NICS at the University of Tennessee and is a GreenBlade GB824M system based on eight-core (8C) 2.6 GHz Xeon E5-2670 processors, with InfiniBand FDR interconnect and Intel Xeon Phi 5110P coprocessors.
Graphics processing units (GPUs) have continued to increase in energy usage, while CPU designers have recently focused on improving performance per watt. High-performance GPUs may now be the largest power consumer in a system. Measures like 3DMark2006 score per watt can help identify more efficient GPUs. However, such measures may not adequately reflect efficiency in typical use, where much time is spent doing less demanding tasks.
With modern GPUs, energy usage is an important constraint on the achievable performance. GPU designs are usually highly scalable, allowing the manufacturer to put multiple chips on the same video card, or to use multiple video cards that work in parallel. Peak performance of any system is essentially limited by the amount of power it can draw and the amount of heat it can dissipate. Consequently, performance per watt of a GPU design translates directly into peak performance of a system that uses that design.
Since GPUs may also be used for some general purpose computation, sometimes their performance is measured in terms also applied to CPUs, such as FLOPS per watt.
While performance per watt is useful, absolute power requirements are also important. Claims of improved performance per watt may be used to mask increasing power demands. For instance, though newer generation GPU architectures may provide better performance per watt, continued performance increases can negate the gains in efficiency, and the GPUs continue to consume large amounts of power.
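A short numerical sketch makes the point concrete. The two GPU generations below are hypothetical and their figures are in arbitrary units, chosen only to show efficiency and absolute power moving in different directions.

```python
# Hypothetical figures for two GPU generations (arbitrary performance units).
old_gpu = {"performance": 100.0, "power_watts": 100.0}
new_gpu = {"performance": 300.0, "power_watts": 200.0}


def efficiency(gpu):
    """Performance per watt for one GPU record."""
    return gpu["performance"] / gpu["power_watts"]


efficiency_gain = efficiency(new_gpu) / efficiency(old_gpu)
power_increase = new_gpu["power_watts"] / old_gpu["power_watts"]

# Efficiency improves by 50%, yet the card draws twice the power.
print(f"performance per watt: {efficiency_gain:.1f}x")
print(f"absolute power draw:  {power_increase:.1f}x")
```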
Benchmarks that measure power under heavy load may not adequately reflect typical efficiency. For instance, 3DMark stresses the 3D performance of a GPU, but many computers spend most of their time doing less intense display tasks (idle, 2D tasks, displaying video). The 2D or idle efficiency of the graphics system may therefore be at least as significant for overall energy efficiency. Likewise, systems that spend much of their time in standby or soft off are not adequately characterized by efficiency under load alone. To help address this, some benchmarks, like SPECpower, include measurements at a series of load levels.
The efficiency of some electrical components, such as voltage regulators, decreases with increasing temperature, so the power used may increase with temperature. Power supplies, motherboards, and some video cards are among the subsystems affected. Because their power draw may depend on temperature, the temperature or temperature dependence should be noted when measuring.
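The effect of measuring at several load levels can be sketched as follows. The load levels and readings are invented, and the aggregation (total operations divided by total power across all levels, idle included) is only loosely modeled on multi-level benchmarks; it does not reproduce the actual SPECpower procedure.

```python
# Hypothetical measurements: (load fraction, operations/second, average watts).
load_levels = [
    (1.00, 1000.0, 200.0),
    (0.50, 500.0, 140.0),
    (0.10, 100.0, 100.0),
    (0.00, 0.0, 60.0),  # idle: no useful work, but power is still drawn
]


def overall_ops_per_watt(levels):
    """Sum of throughput over all levels divided by sum of power."""
    total_ops = sum(ops for _, ops, _ in levels)
    total_power = sum(watts for _, _, watts in levels)
    return total_ops / total_power


_, full_ops, full_watts = load_levels[0]
full_load_only = full_ops / full_watts
overall = overall_ops_per_watt(load_levels)

print(f"full load only:  {full_load_only:.2f} ops/s per watt")
print(f"all load levels: {overall:.2f} ops/s per watt")
```

In this made-up data set the system looks noticeably less efficient once partial load and idle are counted, because its power draw falls more slowly than its throughput.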
Performance per watt also typically does not include full life-cycle costs. Since computer manufacturing is energy intensive, and computers often have a relatively short lifespan, energy and materials involved in production, distribution, disposal and recycling often make up significant portions of their cost, energy use, and environmental impact.
Energy required for climate control of the computer's surroundings is often not counted in the wattage calculation, but can be significant.
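One rough way to account for this is to scale the computer's own power draw by an overhead factor for cooling and other infrastructure. In the sketch below, `overhead_factor` is a hypothetical parameter that plays a role similar to a PUE-style ratio of total power to computing power, and all numbers are illustrative.

```python
def facility_performance_per_watt(performance, it_power_watts, overhead_factor):
    """Performance per watt with cooling and infrastructure power included.

    overhead_factor is total power divided by the computer's own power,
    so it cannot be smaller than 1.0.
    """
    if overhead_factor < 1.0:
        raise ValueError("overhead_factor must be at least 1.0")
    return performance / (it_power_watts * overhead_factor)


performance = 1000.0   # hypothetical benchmark score
it_power = 200.0       # watts drawn by the computer itself
overhead_factor = 1.5  # assumed: 0.5 W of overhead per watt of computing

machine_only = performance / it_power
with_overhead = facility_performance_per_watt(performance, it_power, overhead_factor)

print(f"machine only:          {machine_only:.2f} per watt")
print(f"with climate control:  {with_overhead:.2f} per watt")
```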
Other energy efficiency measures
SWaP (space, wattage and performance) is a Sun Microsystems metric for data centers, incorporating energy and space.
SWaP = Performance / (Space × Power)
Here performance is measured by any appropriate benchmark, and space is the size of the computer.
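The formula is simple to apply in code. The sketch below assumes space is expressed in rack units and power in watts; the two servers and their figures are hypothetical.

```python
def swap_score(performance, space_rack_units, power_watts):
    """SWaP = Performance / (Space x Power)."""
    if space_rack_units <= 0 or power_watts <= 0:
        raise ValueError("space and power must be positive")
    return performance / (space_rack_units * power_watts)


# Hypothetical servers: benchmark score, height in rack units, watts.
server_a = swap_score(performance=1000.0, space_rack_units=2, power_watts=400.0)
server_b = swap_score(performance=800.0, space_rack_units=1, power_watts=300.0)

print(f"server A: {server_a:.2f}")
print(f"server B: {server_b:.2f}")
```

Server B has the lower raw benchmark score but the higher SWaP rating, because it delivers its performance in half the space and with less power.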
- Energy efficiency benchmarks
- Average CPU power (ACP) – Measure of power consumption when running several standard benchmarks
- EEMBC – EnergyBench
- SPECpower – Benchmark for web servers running Java (Server Side Java Operations per Joule)
- Data center infrastructure efficiency (DCIE)
- GeForce 9 Series – GPU list, has energy use and theoretical FLOPS
- IT energy management
- Koomey's law
- Landauer's principle
- Low-power electronics
- Power usage effectiveness (PUE)