Energy proportional computing
Energy proportionality is a measure of the relationship between the power consumed in a computer system and the rate at which useful work is done (its utilization, which is one measure of performance). If the overall power consumption is proportional to the computer's utilization, then the machine is said to be energy proportional. Equivalently, for an idealized energy proportional computer, the overall energy per operation (a measure of energy efficiency) is constant for all possible workloads and operating conditions. The concept was first proposed in 2007 by Google engineers Luiz André Barroso and Urs Hölzle, who urged computer architects to design servers that would be much more energy efficient in the datacenter setting. Energy proportional computing is an area of active research and has been highlighted as an important design goal for cloud computing. Many technical challenges remain in the design of energy proportional computers. Furthermore, the concept of energy proportionality is not inherently restricted to computing. Although many energy efficiency advances have been made in non-computing disciplines, they have not been evaluated rigorously in terms of their energy proportionality.
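The equivalence between proportional power and constant energy per operation can be written out briefly. This is a sketch under assumed notation that does not come from the original proposal: u is the utilization (between 0 and 1), P_peak is the power at full load, and R_peak is the rate of useful work at full load.

```latex
P_{\text{ideal}}(u) = u \, P_{\text{peak}},
\qquad
E_{\text{op}}(u) = \frac{P_{\text{ideal}}(u)}{u \, R_{\text{peak}}}
                 = \frac{P_{\text{peak}}}{R_{\text{peak}}}
\quad \text{for } 0 < u \le 1 .
```

Under these assumptions the utilization cancels, so the energy per operation does not depend on load.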
Background in energy sustainability
Sustainable energy is the ideal, advocated by various organizations, governments, and individuals, that society should serve its energy needs without negatively impacting future generations. To meet this ideal, efficiency improvements are required in three aspects of the energy ecosystem: generation, storage, and consumption.
Since our needs for energy generation and storage are driven by our demand, more efficient ways of consuming energy can drive large improvements in energy sustainability. Efforts in sustainable energy consumption can be classified at a high level into the following three categories:
- Recycle: Capture and recover wasted energy, which would otherwise be lost as heat, to do more work.
- Reuse: Amortize the cost of energy generation, storage, and delivery by sharing energy and its infrastructure among different loads.
- Reduce: Reduce demand for energy by doing more work with less energy (improve consumption efficiency), or not doing the work at all by changing behavior.
Many efforts to make energy consumption more sustainable focus on the "reduce" theme for unpredictable and dynamic workloads (which are commonly encountered in computing). This can be considered power management. These efforts fall into two general approaches, which are not specific to computing but are commonly applied in that domain:
- Idle power-down: This technique exploits gaps in workload demand to shut off components that are idle. When shut down, components cannot do any useful work. The problems unique to this approach are: (1) it costs time and energy to transition between active and idle power-down states, (2) no work can be done in the off state, so power-up must be done to handle a request, and (3) predicting idle periods and adapting appropriately by choosing the right power state at any moment is difficult.
- Active performance scaling: Unlike idle power-down, this approach allows work to be done in any state. All states are considered active, but they offer different power/performance tradeoffs. Usually, slower modes consume less power. The problems unique to this approach are: (1) it is difficult to determine which combination of states is the most energy efficient for an application, and (2) the energy efficiency improvements are usually smaller than those from idle power-down modes.
In practice, both types of approaches are used commonly and mixed together.
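The transition cost named for idle power-down implies a break-even idle time: powering down only saves energy if the idle gap is long enough to repay the transition. The following is a minimal sketch of that arithmetic. The function names, the parameter names, and the numeric values for the example component are all hypothetical and chosen only for illustration.

```python
def break_even_time(transition_energy_j, transition_time_s,
                    idle_power_w, sleep_power_w):
    """Shortest idle gap (in seconds) for which powering down saves energy.

    Staying idle for a gap of length t costs idle_power_w * t.
    Powering down costs transition_energy_j (the power-down and power-up
    transitions combined) plus sleep_power_w for the rest of the gap.
    """
    if sleep_power_w >= idle_power_w:
        raise ValueError("the sleep state must draw less power than the idle state")
    crossover = ((transition_energy_j - sleep_power_w * transition_time_s)
                 / (idle_power_w - sleep_power_w))
    # The gap must also be at least long enough to complete both transitions.
    return max(transition_time_s, crossover)


def should_power_down(predicted_gap_s, **component):
    """True if the predicted idle gap exceeds the break-even time."""
    return predicted_gap_s > break_even_time(**component)


# Hypothetical component: 20 J and 2 s to power down and back up,
# 5 W when idle, 1 W when powered down.
example = dict(transition_energy_j=20.0, transition_time_s=2.0,
               idle_power_w=5.0, sleep_power_w=1.0)

print(break_even_time(**example))         # 4.5
print(should_power_down(3.0, **example))  # False
print(should_power_down(10.0, **example)) # True
```

The sketch assumes the idle gap is known in advance. In practice the gap has to be predicted, which is the third difficulty listed for idle power-down.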
Motivation for energy proportionality
Until about 2010, computers were far from energy proportional for two primary reasons. The first is high static power, which means that the computer consumes significant energy even when it is idle. High static power is common in servers owing to architectural, circuit, and manufacturing optimizations that favor very high performance over low power. High static power relative to the maximum loaded power results in low dynamic range, poor energy proportionality, and thus very low efficiency at low to medium utilizations. This can be acceptable for traditional high-performance computing systems and workloads, which try to extract the maximum possible utilization out of the machines, where they are most efficient. However, in modern datacenters that run popular and large-scale cloud computing applications, servers spend most of their time around 30% utilization and rarely run under maximum load. This is a very energy-inefficient operating point for typical servers.
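The effect of static power on efficiency at 30% utilization can be illustrated with a simple linear power model. This is a sketch, not a measured server profile: the linear shape, the function names, and the wattages (a server that idles at half of its peak power) are assumptions made for illustration.

```python
def power(utilization, idle_power_w, peak_power_w):
    """Power draw under an assumed linear model between idle and peak."""
    return idle_power_w + (peak_power_w - idle_power_w) * utilization


def relative_efficiency(utilization, idle_power_w, peak_power_w):
    """Work per joule at this utilization, normalized to work per joule at peak."""
    return utilization * peak_power_w / power(utilization, idle_power_w, peak_power_w)


# Hypothetical server: 100 W static (idle) power, 200 W at full load.
conventional = relative_efficiency(0.3, idle_power_w=100.0, peak_power_w=200.0)

# Idealized energy proportional server: no static power.
proportional = relative_efficiency(0.3, idle_power_w=0.0, peak_power_w=200.0)

print(f"conventional: {conventional:.2f}, proportional: {proportional:.2f}")
```

With these assumed numbers the conventional server does less than half as much work per joule at 30% utilization as it does at peak, while the idealized proportional server is equally efficient at both points.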
The second major reason is that the various hardware operating states for power management can be difficult to use effectively. This is because deeper low power states tend to have larger transition latency and energy costs than lighter low power states. For workloads that have frequent and intermittent bursts of activity, such as web search queries, this prevents the use of deep low power states without incurring significant latency penalties, which may be unacceptable for the application.
Energy proportional computer hardware could solve this problem by being efficient at mid-utilization levels, in addition to being efficient at peak performance and in idle states (which can afford to use deep low power sleep modes). However, achieving this goal will require many innovations in computer architecture, microarchitecture, and perhaps circuits and manufacturing technology. The ultimate benefit would be improved energy efficiency, which would lower the costs of computer hardware, datacenter provisioning, and power utilities, and thus the overall total cost of ownership (TCO).
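The tension between deep low power states and latency-sensitive workloads can be sketched as a selection problem: pick the lowest-power state whose wake-up latency still fits the application's latency budget. The state names, power figures, and latencies below are hypothetical and are not taken from any real processor.

```python
from typing import NamedTuple, Optional, Sequence


class SleepState(NamedTuple):
    name: str
    power_w: float         # power drawn while in this state
    wake_latency_s: float  # time needed to return to the active state


def deepest_allowed_state(states: Sequence[SleepState],
                          latency_budget_s: float) -> Optional[SleepState]:
    """Lowest-power state that can wake up within the latency budget, if any."""
    allowed = [s for s in states if s.wake_latency_s <= latency_budget_s]
    return min(allowed, key=lambda s: s.power_w) if allowed else None


# Hypothetical states: deeper states draw less power but wake up more slowly.
STATES = [
    SleepState("light", power_w=40.0, wake_latency_s=1e-5),
    SleepState("medium", power_w=20.0, wake_latency_s=1e-3),
    SleepState("deep", power_w=5.0, wake_latency_s=1.0),
]

# A request that tolerates 10 ms of added latency rules out the deep state.
print(deepest_allowed_state(STATES, latency_budget_s=0.01).name)  # medium
```

With a tight enough budget no state qualifies and the function returns None, which corresponds to the component having to stay fully active.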
Research in energy proportional computing
CPU
The CPU was the first and most obvious place for researchers to focus on for energy efficiency and low power, because it has traditionally been the biggest consumer of power in computers. Owing to many innovations in low power technology, devices, circuits, microarchitecture, and electronic design automation, CPUs are now much more energy efficient. As a result, CPUs no longer dominate energy consumption in a computer.
Some of the better-known examples of the many innovations in CPU energy efficiency include the following:
- Clock gating: The clock distribution to entire functional units in the processor is blocked, thus saving dynamic power from the capacitive charging and discharging of synchronous gates and wires.
- Power gating: Entire functional units of the processor are disconnected from the power supply, thus consuming effectively zero power.
- Multiple voltage domains: Different portions of the chip are supplied from different voltage regulators, such that each can be individually controlled for scaling or gating of the power supply.
- Multi-threshold voltage designs: Different transistors in the design use different threshold voltages in order to optimize delay and/or power.
- Dynamic frequency scaling (DFS): The clock frequency of the processor is adjusted statically or dynamically to achieve different power/performance tradeoffs.
- Dynamic voltage scaling (DVS): The supply voltage of the processor is adjusted statically or dynamically to achieve different power/reliability/performance tradeoffs.
- Dynamic voltage/frequency scaling (DVFS): Both voltage and frequency are varied dynamically to achieve better power/performance tradeoffs than either DFS or DVS alone can provide.
Note that all of the above innovations for CPU power consumption preceded Barroso and Hölzle's paper on energy proportionality. However, most of them have contributed some combination of the two broad types of power management mentioned above, namely, idle power-down and active performance scaling. These innovations have made CPUs scale their power relatively well in relation to their utilization, making them the most energy-proportional of computer hardware components. Unlike CPUs, most other computer hardware components lack power management controls, especially those that enable active performance scaling. CPUs are touted as a good example of energy-proportional computer engineering that other components should strive to emulate.
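The advantage of DVFS over frequency scaling alone follows from the standard first-order model of CMOS dynamic power, which is proportional to switching activity, switched capacitance, the square of the supply voltage, and the clock frequency. The sketch below applies that model. The numeric values are hypothetical, and static (leakage) power is ignored.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """First-order CMOS dynamic power: activity * C * V^2 * f (in watts)."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def dynamic_energy_per_cycle(activity, capacitance_f, voltage_v):
    """Dynamic energy spent per clock cycle (in joules); independent of frequency."""
    return activity * capacitance_f * voltage_v ** 2


# Hypothetical operating points for the same chip.
ACTIVITY, CAP = 0.5, 1e-9
baseline = dynamic_power(ACTIVITY, CAP, voltage_v=1.0, frequency_hz=2e9)
dfs_only = dynamic_power(ACTIVITY, CAP, voltage_v=1.0, frequency_hz=1e9)
dvfs = dynamic_power(ACTIVITY, CAP, voltage_v=0.8, frequency_hz=1e9)

print(f"DFS power relative to baseline:  {dfs_only / baseline:.2f}")
print(f"DVFS power relative to baseline: {dvfs / baseline:.2f}")
```

Under this model, halving the frequency alone halves the power but leaves the energy per cycle unchanged, so the same work costs the same dynamic energy and simply takes longer. Lowering the voltage as well reduces the energy per cycle quadratically, which is why combined scaling gives a better tradeoff.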
Memory
Memory has been cited as one of the major system components that has traditionally been very energy disproportional. Memory tends to have relatively high static power due to extremely high transistor counts and densities. Furthermore, because memory is often left idle either due to cache-friendly workloads or low CPU utilization, a large proportion of energy use is due to the static power component.
Traditionally, dynamic voltage and frequency scaling on main memory DRAM has not been possible due to limitations in the DDR JEDEC standards. These limitations exist because the conventional wisdom in memory design is that large design margins are needed for good yield under worst-case manufacturing process variations, voltage fluctuations, and temperature changes. Thus, scaling voltage and frequency, which is commonly done in CPUs, is considered difficult, impractical, or too prone to data corruption to apply to memories.
Nevertheless, in 2011 two research groups independently proposed DVFS for the DDR3 memory bus interface to scale memory power with throughput. Because the memory bus voltage and frequency are independent of internal DRAM timings and voltages, scaling this interface should have no effect on memory cell integrity. Furthermore, David et al. claim their approach improves energy proportionality because the memory bus consumes a lot of static power that is independent of the bus utilization.
Another research group proposed trading off memory bandwidth for lower energy per bit and lower power idle modes in servers by using mobile-class LPDDR2 DRAMs. This would increase memory energy proportionality without affecting performance for datacenter workloads that are not sensitive to memory bandwidth. The same group also proposed redesigning the DDR3 interface to better support energy proportional server memory without sacrificing peak bandwidth.
Networks
Networks are emphasized as a key component that is very energy disproportional and contributes to poor cluster and datacenter-level energy proportionality, especially as other components in a server and datacenter become more energy proportional. The main reason is that networking elements are conventionally always on, owing to the way routing protocols are designed and the unpredictability of message traffic. Links cannot simply be shut down when not in use because of the adverse impact on routing algorithms (the links would be seen as faulty or missing, causing bandwidth and load balancing issues in the larger network). Furthermore, the latency and energy penalties typically incurred when switching hardware to low power modes would likely degrade overall network performance and could even increase energy use. Thus, as in other systems, energy proportionality of networks will require active performance scaling features that do not depend on idle power-down states to save energy when utilization is low.
In recent years, efforts in green networking have targeted energy-efficient Ethernet (including the IEEE 802.3az standard) and many other wired and wireless technologies. A common theme is overall power reduction through low idle power and low peak power, but evaluation of these techniques in terms of energy proportionality at the link, switch, router, cluster, and system levels is more limited. Adaptive link rate is a popular method for energy-aware network links.
Some authors have proposed that to make datacenter networks more energy proportional, the routing elements need greater power dynamic range. They proposed using the flattened butterfly topology instead of the folded Clos network commonly used in datacenters (also known as the fat tree) to improve overall power efficiency, and using adaptive link rates to adjust link power in relation to utilization. They also proposed predicting future link utilization to scale data rates in anticipation.
Nevertheless, to make networks more energy proportional, improvements need to be made at several layers of abstraction.
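Adaptive link rate can be sketched as a simple policy: run each link at the lowest supported rate that still covers the predicted demand. This is an illustrative sketch rather than the method of any cited work. The function name, the headroom factor, and the set of supported rates are assumptions.

```python
def choose_link_rate(predicted_demand_gbps, supported_rates_gbps, headroom=1.25):
    """Lowest supported rate that covers predicted demand plus headroom.

    Falls back to the fastest supported rate if demand exceeds every option.
    """
    required = predicted_demand_gbps * headroom
    for rate in sorted(supported_rates_gbps):
        if rate >= required:
            return rate
    return max(supported_rates_gbps)


# Hypothetical set of link rates, in Gb/s.
RATES = [2.5, 5.0, 10.0, 20.0, 40.0]

print(choose_link_rate(3.0, RATES))   # 5.0
print(choose_link_rate(0.0, RATES))   # 2.5
print(choose_link_rate(50.0, RATES))  # 40.0
```

The headroom factor is one way to absorb prediction error. A link that stays at its lowest rate rather than shutting down remains visible to routing, which avoids the problem described above for powered-down links.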
Storage and databases
Data storage is another category of hardware that has traditionally been very energy disproportional. Although storage technologies are non-volatile, meaning that no power is required to retain data, the interfaces on storage devices are typically kept powered up for access on demand. For example, in hard drives, although the data is stored in a non-volatile magnetic state, the disk is typically kept spinning at constant RPM, which requires considerable power. This is in addition to the solid-state electronics that maintain communications with the rest of the computer system, such as the Serial ATA interface commonly found in computers.
A common emerging technique for energy-aware and energy proportional data storage is that of consolidation, namely, that data should be aggregated to fewer storage nodes when throughput demands are low. However, this is not a trivial task, and it does not solve the fundamental issue of energy disproportionality within a single server. For this, hardware design innovations are needed at the individual storage unit level. Even modern solid state drives (SSDs) made with flash memory have shown signs of energy disproportionality.
Databases are a common type of workload for datacenters, and they have unique requirements that make use of idle low-power states difficult. However, for "share-nothing" databases, some have proposed dynamic scaling of the databases as "wimpy nodes" are powered up and down on demand. Fortunately, researchers have claimed that for these share-nothing databases, the most energy efficient architecture is also the highest-performing one. However, this approach does not address the fundamental need for energy proportionality at the individual component level, but approximates energy proportionality at the aggregate level.
Datacenter infrastructure: Power supplies and cooling
Power supplies are a critical component of a computer, and historically have been very power inefficient. Modern server-level power supplies achieve over 80% power efficiency across a wide range of loads, although they tend to be least efficient at low utilizations. Because workloads in datacenters tend to utilize servers in the low to medium range, servers typically operate in a region that is inefficient for server power supplies and datacenter-scale uninterruptible power supplies (UPSes). Innovations are needed to make these supplies much more efficient in the typical region of operation.
Like power supplies, datacenter and server-level cooling tends to be most efficient at high loads. Coordinating server power management of the traditional components along with active cooling is critical to improving overall efficiency.
System and datacenter-level
Perhaps the most effort in energy proportionality has been targeted at the system, cluster, and datacenter scale. This is because improvements in aggregate energy proportionality can be accomplished largely with software reorganization, requiring minimal changes to the underlying hardware. However, this relies on the assumption that a workload can scale up and down across multiple nodes dynamically based on aggregate demand. Many workloads cannot achieve this easily due to the way data may be distributed across individual nodes, or the need for data sharing and communication among many nodes even to serve a single request. Note that aggregate energy proportionality can be achieved with this scheme even if individual nodes are not energy proportional.
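The claim that a cluster can approximate energy proportionality with disproportional nodes can be checked with a small model. This is a sketch under strong assumptions: every node follows a linear power model, powered-off nodes draw no power, and migration and wake-up costs are ignored. The function name and all numbers are hypothetical.

```python
import math


def cluster_power(load, node_capacity, total_nodes,
                  idle_power_w, peak_power_w, consolidate=True):
    """Total cluster power for a given aggregate load.

    With consolidation, only as many nodes as the load requires stay on.
    Without it, every node stays on and pays its idle power.
    """
    if not 0 <= load <= node_capacity * total_nodes:
        raise ValueError("load must fit within the cluster's total capacity")
    active = math.ceil(load / node_capacity) if consolidate else total_nodes
    static = active * idle_power_w
    dynamic = (peak_power_w - idle_power_w) * load / node_capacity
    return static + dynamic


# Hypothetical cluster: 10 nodes, each 100 W idle and 200 W at full load,
# serving a load equal to 30% of total capacity.
args = dict(load=300.0, node_capacity=100.0, total_nodes=10,
            idle_power_w=100.0, peak_power_w=200.0)

print(cluster_power(**args, consolidate=False))  # 1300.0
print(cluster_power(**args, consolidate=True))   # 600.0
```

With these numbers the consolidated cluster draws 30% of its 2000 W peak at 30% load, matching an ideal proportional system at that point, even though each node idles at half of its peak power.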
Various application, middleware, OS, and other types of software load balancing approaches have been proposed to enable aggregate energy proportionality. For instance, if individual workloads are contained entirely within virtual machines (VMs), then the VMs can be migrated over the network to other nodes at runtime as consolidation and load balancing are performed. However, this can incur significant delay and energy costs, so the frequency of VM migration cannot be too high.
Researchers have proposed improving the low power idle states of servers, and the wake-up/shutdown latencies between active and idle modes, because this is an easier optimization goal than active performance scaling. If servers could wake up and shut down at a very fine time granularity, then the server would become energy proportional, even if active power is constant at all utilizations.
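The reasoning behind this claim can be written as a time-weighted average. As a sketch with assumed notation: the server spends a fraction u of its time active at constant power P_active and the remainder in an idle state at power P_nap, and transitions are taken to be instantaneous and free.

```latex
\bar{P}(u) = u \, P_{\text{active}} + (1 - u) \, P_{\text{nap}}
\;\longrightarrow\; u \, P_{\text{active}}
\quad \text{as } P_{\text{nap}} \to 0 .
```

Under these idealized assumptions the average power is proportional to utilization. Any nonzero transition time or idle power adds a term that does not scale with u.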
Others have proposed hybrid datacenters, such as KnightShift, such that workloads are migrated dynamically between high-performance hardware and low-power hardware based on the utilization. However, there are many hardware and software technical challenges to this approach. These can include the hardware and software support for heterogeneous computing, shared data and power infrastructure, and more.
A study from 2011 argues that energy proportional hardware is better at mitigating the energy inefficiencies of software bloat, a prevalent phenomenon in computing. This is because the particular hardware components that bottleneck overall application performance depend on the application characteristics, i.e., which parts are bloated. If non-bottlenecked components are very energy disproportional, then the overall impact of software bloat can make the system less efficient. For this reason, energy proportionality can be important across a wide range of hardware and software applications, not just in datacenter settings.
References
- Barroso, L. A.; Hölzle, U. (2007). "The Case for Energy-Proportional Computing". Computer. 40 (12): 33–37. doi:10.1109/mc.2007.443.
- Armbrust, M.; Stoica, I.; Zaharia, M.; Fox, A.; Griffith, R.; Joseph, A. D.; Katz, R.; Konwinski, A.; Lee, G.; Patterson, D.; Rabkin, A. (2010). "A view of cloud computing". Communications of the ACM. 53 (4): 50. doi:10.1145/1721654.1721672.
- Barroso, Luiz André; Clidaras, Jimmy; Hölzle, Urs (2013). The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second edition. Morgan Claypool. p. 80. doi:10.2200/S00516ED2V01Y201306CAC024. ISBN 9781627050098. S2CID 26474390.
- Barroso, L. A.; Hölzle, U. (2009). "The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines". Synthesis Lectures on Computer Architecture. 4 (1): 1–108. doi:10.2200/s00193ed1v01y200905cac006.
- V. Tiwari, D. Singh, S. Rajgopal, G. Mehta, R. Patel, and F. Baez, "Reducing power in high-performance microprocessors," in Proceedings of the 35th annual conference on Design automation conference - DAC ’98. New York, New York, USA: ACM Press, May 1998, pp. 732–737. [Online]. Available: http://dl.acm.org/citation.cfm?id=277044.277227
- Q. Wu, M. Pedram, and X. Wu, "Clock-gating and its application to low power design of sequential circuits," IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 47, no. 3, pp. 415–420, Mar. 2000. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=841927
- N. H. E. Weste and D. M. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, 4th ed. Addison-Wesley, 2011.
- Z. Hu, A. Buyuktosunoglu, V. Srinivasan, V. Zyuban, H. Jacobson, and P. Bose, "Microarchitectural techniques for power gating of execution units," in Proceedings of the 2004 international symposium on Low power electronics and design - ISLPED ’04. New York, New York, USA: ACM Press, Aug. 2004, p. 32. [Online]. Available: http://dl.acm.org/citation.cfm?id=1013235.1013249
- "A Survey Of Architectural Techniques for Near-Threshold Computing", S. Mittal, ACM JETC, 2015
- S. Herbert and D. Marculescu, "Analysis of dynamic voltage/frequency scaling in chip-multiprocessors," in Proceedings of the 2007 international symposium on Low power electronics and design - ISLPED ’07. New York, New York, USA: ACM Press, 2007, pp. 38–43. [Online]. Available: http://portal.acm.org/citation.cfm?doid=1283780.1283790
- Gupta, P.; Agarwal, Y.; Dolecek, L.; Dutt, N.; Gupta, R. K.; Kumar, R.; Mitra, S.; Nicolau, A.; Rosing, T. S.; Srivastava, M. B.; Swanson, S.; Sylvester, D. (2013). "Underdesigned and Opportunistic Computing in Presence of Hardware Variability". IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 32 (1): 8–23. CiteSeerX 10.1.1.353.6564. doi:10.1109/tcad.2012.2223467.
- Q. Deng, D. Meisner, L. Ramos, T. F. Wenisch, and R. Bianchini, "MemScale: active low-power modes for main memory," ACM SIGPLAN Notices, vol. 46, no. 3, pp. 225–238, Feb. 2011. [Online]. Available: http://doi.acm.org/10.1145/1961296.1950392
- H. David, C. Fallin, E. Gorbatov, U. R. Hanebutte, and O. Mutlu, "Memory power management via dynamic voltage/frequency scaling," in Proceedings of the 8th ACM international conference on Autonomic computing - ICAC ’11. New York, New York, USA: ACM Press, Jun. 2011, p. 31. [Online]. Available: http://dl.acm.org/citation.cfm?id=1998582.1998590
- Malladi, K. T.; Lee, B. C.; Nothaft, F. A.; Kozyrakis, C.; Periyathambi, K.; Horowitz, M. (2012). "Towards energy-proportional datacenter memory with mobile DRAM". ACM SIGARCH Computer Architecture News. 40 (3): 37. CiteSeerX 10.1.1.365.2176. doi:10.1145/2366231.2337164.
- K. T. Malladi, I. Shaeffer, L. Gopalakrishnan, D. Lo, B. C. Lee, and M. Horowitz, "Rethinking DRAM Power Modes for Energy Proportionality," in 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE, Dec. 2012, pp. 131–142. [Online]. Available: http://dl.acm.org/citation.cfm?id=2457472.2457492
- Abts, D.; Marty, M. R.; Wells, P. M.; Klausler, P.; Liu, H. (2010). "Energy proportional datacenter networks". ACM SIGARCH Computer Architecture News. 38 (3): 338. CiteSeerX 10.1.1.308.136. doi:10.1145/1816038.1816004.
- Bianzino, A. P.; Chaudet, C.; Rossi, D.; Rougier, J.-L. (2012). "A Survey of Green Networking Research". IEEE Communications Surveys & Tutorials. 14 (1): 3–20. arXiv:1010.3880. doi:10.1109/surv.2011.113010.00106.
- Shu, G.; Choi, W. S.; Saxena, S.; Kim, S. J.; Talegaonkar, M.; Nandwana, R.; Elkholy, A.; Wei, D.; Nandi, T. (2016-01-01). 23.1 A 16Mb/s-to-8Gb/s 14.1-to-5.9pJ/b source synchronous transceiver using DVFS and rapid on/off in 65nm CMOS. 2016 IEEE International Solid-State Circuits Conference (ISSCC). pp. 398–399. doi:10.1109/ISSCC.2016.7418075. ISBN 978-1-4673-9466-6.
- H. Amur, J. Cipar, V. Gupta, G. R. Ganger, M. A. Kozuch, and K. Schwan, "Robust and flexible power-proportional storage," in Proceedings of the 1st ACM symposium on Cloud computing - SoCC ’10. New York, New York, USA: ACM Press, Jun. 2010, p. 217. [Online]. Available: http://dl.acm.org/citation.cfm?id=1807128.1807164
- A. Verma, R. Koller, L. Useche, and R. Rangaswami, "SRCMap: energy proportional storage using dynamic consolidation," in FAST’10 Proceedings of the 8th USENIX conference on File and storage technologies. USENIX Association, Feb. 2010, p. 20. [Online]. Available: http://dl.acm.org/citation.cfm?id=1855511.1855531
- T. Härder, V. Hudlet, Y. Ou, and D. Schall, "Energy Efficiency Is Not Enough, Energy Proportionality Is Needed!" in DASFAA Workshops, ser. Lecture Notes in Computer Science, J. Xu, G. Yu, S. Zhou, and R. Unland, Eds., vol. 6637. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 226–239. [Online]. Available: https://doi.org/10.1007%2F978-3-642-20244-5
- D. Tsirogiannis, S. Harizopoulos, and M. A. Shah, "Analyzing the energy efficiency of a database server," in Proceedings of the 2010 international conference on Management of data - SIGMOD ’10. New York, New York, USA: ACM Press, Jun. 2010, p. 231. [Online]. Available: http://dl.acm.org/citation.cfm?id=1807167.1807194
- Meisner, D.; Gold, B. T.; Wenisch, T. F. (2009). "PowerNap: Eliminating Server Idle Power". ACM SIGARCH Computer Architecture News. 37 (1): 205. doi:10.1145/2528521.1508269.
- S. Greenberg, E. Mills, and B. Schudi, "Best Practices for Data Centers: Lessons Learned From Benchmarking 22 Data Centers," Lawrence Berkeley National Laboratory, Tech. Rep., 2006.
- N. Tolia, Z. Wang, M. Marwah, C. Bash, P. Ranganathan, and X. Zhu, "Delivering Energy Proportionality with Non Energy-Proportional Systems – Optimizing the Ensemble," 2008. [Online]. Available: https://www.usenix.org/legacy/event/hotpower08/tech/full_papers/tolia/tolia_html/
- X. Zheng and Y. Cai, "Achieving Energy Proportionality in Server Clusters," International Journal of Computer Networks (IJCN), vol. 1, no. 2, pp. 21–35, 2010.
- Chun, B.-G.; Iannaccone, G.; Iannaccone, G.; Katz, R.; Lee, G.; Niccolini, L. (2010). "An energy case for hybrid datacenters". ACM SIGOPS Operating Systems Review. 44 (1): 76. CiteSeerX 10.1.1.588.2938. doi:10.1145/1740390.1740408.
- D. Wong and M. Annavaram, "KnightShift: Scaling the Energy Proportionality Wall through Server-Level Heterogeneity," in 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE, Dec. 2012, pp. 119–130. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6493613
- S. Bhattacharya, K. Rajamani, K. Gopinath, and M. Gupta, "The interplay of software bloat, hardware energy proportionality and system bottlenecks," in Proceedings of the 4th Workshop on Power-Aware Computing and Systems - HotPower ’11. New York, New York, USA: ACM Press, Oct. 2011, pp. 1–5. [Online]. Available: http://dl.acm.org/citation.cfm?id=2039252.2039253