The earliest attempts to reduce the amount of power required by an electronic device were related to the development of the wristwatch. As of 2013, processors designed specifically for wristwatches were the lowest-power processors manufactured, often 4-bit designs clocked at 32 kHz.
The density and speed of integrated-circuit computing elements have increased exponentially for several decades, following a trend described by Moore's law. While it is generally accepted that this exponential improvement will end, it is unclear exactly how dense and fast integrated circuits will be by the time that point is reached. Working devices have been demonstrated that were fabricated with a MOSFET channel length of 6.3 nanometres using conventional semiconductor materials, and devices have been built that used carbon nanotubes as MOSFET gates, giving a channel length of approximately one nanometre. The density and computing power of integrated circuits are limited primarily by power-dissipation concerns.
The overall power consumption of a new personal computer has been increasing at about 22% per year. This increase has occurred even though the energy consumed by a single CMOS logic gate to change state has fallen exponentially as Moore's law has shrunk the process feature size.
An integrated-circuit chip contains many capacitive loads, formed both intentionally (as with gate-to-channel capacitance) and unintentionally (between conductors that are near each other but not electrically connected). Changing the state of the circuit changes the voltage across these capacitances, and therefore the amount of energy stored in them. As the capacitive loads are charged and discharged through resistive devices, an amount of energy comparable to that stored in the capacitor is dissipated as heat:
$E = \frac{1}{2} C V^2$
where $C$ is the capacitance and $V$ is the change in voltage across it.
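As a rough illustration of how this per-transition energy adds up, the following Python sketch computes the energy stored in a capacitive load and the resulting switching power. The function names, the `activity` parameter, and the relation P = activity * C * V^2 * f are a common textbook approximation introduced here for illustration; they are not taken from the text above.

```python
def switching_energy(capacitance_f, voltage_v):
    """Energy in joules stored in a capacitor charged to voltage_v: E = 1/2 * C * V^2."""
    return 0.5 * capacitance_f * voltage_v ** 2


def dynamic_power(capacitance_f, voltage_v, frequency_hz, activity=1.0):
    """Approximate switching power in watts.

    Assumption: one full charge/discharge cycle dissipates about C * V^2
    (half while charging, half while discharging), and `activity` is the
    hypothetical fraction of clock cycles in which the node actually toggles.
    """
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


if __name__ == "__main__":
    # Illustrative values only: 1 nF of total switched capacitance at 1 GHz.
    for supply in (1.2, 1.0, 0.8):
        watts = dynamic_power(1e-9, supply, 1e9, activity=0.1)
        print(f"{supply:.1f} V -> {watts:.3f} W")
```

Because the voltage enters as a square, halving the supply voltage cuts the switching power to a quarter in this model, which is why voltage reduction features so prominently among low-power techniques.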
Because every state change dissipates heat, the amount of computation that can be performed on a given power budget is limited. While device shrinkage can reduce some parasitic capacitances, the number of devices on an integrated-circuit chip has increased more than enough to offset the reduced capacitance of each individual device. Some circuits (dynamic logic, for example) require a minimum clock rate in order to function properly, wasting "dynamic power" even when they have nothing to do. Other circuits (most famously the RCA 1802, but also many later chips such as the WDC 65C02, the 80C85, the Freescale 68HC11 and some other CMOS chips) use "fully static logic" that has no minimum clock rate, so they can "stop the clock" and hold their state indefinitely. When the clock is stopped, such circuits use no dynamic power, but they still have a small static power consumption caused by leakage current.
As circuits shrink, subthreshold leakage current becomes more important. This leakage current results in power consumption even when no switching is taking place (static power consumption); in modern chips, leakage frequently accounts for more than 50% of the power used by the IC.
Reducing power loss
Loss from subthreshold leakage can be reduced by raising the threshold voltage and lowering the supply voltage. Both of these changes slow the circuit down significantly, so some modern low-power circuits use dual supply voltages to provide speed on critical parts of the circuit and lower power on non-critical paths. Some circuits even use different transistors (with different threshold voltages) in different parts of the circuit, in an attempt to further reduce power consumption without significant performance loss.
Another method used to reduce static power consumption is power gating: the use of sleep transistors to disable entire blocks when they are not in use. Systems that are dormant for long periods and "wake up" to perform a periodic activity are often located in isolated places, monitoring some activity. These systems are generally battery- or solar-powered, so power consumption is a key design factor. By shutting down a functional but leaky block until it is needed, leakage current can be reduced significantly. For some embedded systems that function for only short periods at a time, this can dramatically reduce power consumption.
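The benefit for a mostly dormant system can be estimated with a simple duty-cycle calculation. The sketch below is illustrative only: the function names and all numeric values (a sensor node that is active for 0.1 s out of every 60 s) are hypothetical and are not drawn from any particular device.

```python
def average_power(active_w, sleep_w, active_s, period_s):
    """Time-weighted average power in watts for a system that is active for
    active_s seconds out of every period_s seconds and asleep otherwise."""
    if not 0 <= active_s <= period_s:
        raise ValueError("active_s must lie between 0 and period_s")
    duty = active_s / period_s
    return duty * active_w + (1.0 - duty) * sleep_w


def battery_life_hours(capacity_mah, voltage_v, avg_power_w):
    """Idealised battery life in hours; ignores self-discharge and
    converter losses, which matter in real designs."""
    energy_wh = capacity_mah / 1000.0 * voltage_v
    return energy_wh / avg_power_w


if __name__ == "__main__":
    # Hypothetical node: 30 mW when active, compared at two sleep power levels.
    for label, sleep_w in (("leaky block left powered", 300e-6),
                           ("block power-gated", 3e-6)):
        p = average_power(0.030, sleep_w, active_s=0.1, period_s=60.0)
        hours = battery_life_hours(220, 3.0, p)
        print(f"{label}: {p * 1e6:.1f} uW average, about {hours:.0f} h")
```

When the active fraction is this small, the sleep-state power dominates the average, which is why gating off leaky blocks matters more for such systems than optimising the active phase.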
Two other approaches exist for lowering the power cost of state changes. One is to reduce the operating voltage of the circuit, as in a dual-voltage CPU, or to reduce the voltage swing involved in a state change, so that a state change alters the node voltage by only a fraction of the supply voltage (low-voltage differential signaling, for example). This approach is limited by thermal noise within the circuit. There is a characteristic voltage, proportional to the device temperature and to the Boltzmann constant, which the state-switching voltage must exceed for the circuit to be resistant to noise. This is typically on the order of 50–100 mV for devices rated to 100 degrees Celsius external temperature (about 4 kT/q, where T is the device's internal temperature in kelvins, k is the Boltzmann constant, and q is the elementary charge).
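To get a feel for the magnitudes involved, the following sketch evaluates the thermal voltage kT/q and a small multiple of it. Expressing the characteristic voltage as a multiple of kT/q is an assumption made here for illustration, and the function names and the default multiple of 4 are hypothetical; the physical constants are the exact SI values.

```python
BOLTZMANN_J_PER_K = 1.380649e-23        # Boltzmann constant k (exact SI value)
ELEMENTARY_CHARGE_C = 1.602176634e-19   # elementary charge q (exact SI value)


def thermal_voltage(temp_kelvin):
    """Thermal voltage kT/q in volts at the given absolute temperature."""
    return BOLTZMANN_J_PER_K * temp_kelvin / ELEMENTARY_CHARGE_C


def noise_floor_voltage(temp_celsius, multiple=4.0):
    """Hypothetical lower bound on the switching voltage, taken here as a
    fixed multiple of the thermal voltage at the device temperature."""
    return multiple * thermal_voltage(temp_celsius + 273.15)


if __name__ == "__main__":
    for temp_c in (25.0, 100.0):
        vt_mv = thermal_voltage(temp_c + 273.15) * 1e3
        floor_mv = noise_floor_voltage(temp_c) * 1e3
        print(f"{temp_c:5.1f} C: kT/q = {vt_mv:.1f} mV, 4 kT/q = {floor_mv:.1f} mV")
```

Since the thermal voltage rises linearly with absolute temperature, the usable voltage swing has to be chosen for the hottest operating condition rather than for room temperature.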
The second approach is to attempt to provide charge to the capacitive loads through paths that are not primarily resistive. This is the principle behind adiabatic circuits. The charge is supplied either from a variable-voltage inductive power supply or by other elements in a reversible-logic circuit. In both cases, the charge transfer must be regulated primarily by the non-resistive load. As a practical rule of thumb, this means that a signal must change more slowly than the rate dictated by the RC time constant of the circuit being driven. In other words, the price of reduced power consumption per unit of computation is a reduced absolute speed of computation. In practice, although adiabatic circuits have been built, it has proved difficult to use them to reduce computation power substantially in practical circuits.
Finally, there are several techniques used to reduce the number of state changes associated with a given computation. For clocked-logic circuits, clock gating is used to avoid changing the state of functional blocks that are not required for a given operation. As a more extreme alternative, the asynchronous logic approach implements circuits in such a way that a specific externally supplied clock is not required. While both of these techniques are used to varying extents in integrated-circuit design, the limit of practical applicability for each appears to have been reached.
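The speed-for-energy trade in adiabatic charging can be sketched numerically. The example below compares the conventional loss of (1/2)CV^2 with the first-order estimate (RC/T)CV^2 for charging through a resistance R with a linear voltage ramp of duration T. That estimate is a standard idealisation that holds only when T is much longer than RC; the function names and component values are hypothetical.

```python
def conventional_loss(capacitance_f, voltage_v):
    """Energy in joules dissipated when a capacitor is charged abruptly
    from a fixed supply through a resistor: 1/2 * C * V^2."""
    return 0.5 * capacitance_f * voltage_v ** 2


def adiabatic_loss(resistance_ohm, capacitance_f, voltage_v, ramp_s):
    """First-order estimate of the energy dissipated when the supply ramps
    linearly to voltage_v over ramp_s seconds: (RC / T) * C * V^2.

    Assumption: the estimate is only meaningful for ramp_s >> R * C, so
    shorter ramps are rejected rather than silently extrapolated.
    """
    tau = resistance_ohm * capacitance_f
    if ramp_s <= 2.0 * tau:
        raise ValueError("ramp_s must be much longer than the RC time constant")
    return (tau / ramp_s) * capacitance_f * voltage_v ** 2


if __name__ == "__main__":
    r, c, v = 1e3, 1e-12, 1.0  # hypothetical 1 kOhm, 1 pF, 1 V (RC = 1 ns)
    base = conventional_loss(c, v)
    for ramp in (10e-9, 100e-9, 1e-6):
        saved = 1.0 - adiabatic_loss(r, c, v, ramp) / base
        print(f"ramp {ramp * 1e9:7.0f} ns: about {saved * 100:.1f}% less energy lost")
```

In this model the loss falls in proportion to RC/T, so each tenfold reduction in dissipation costs a tenfold slower transition, which matches the rule of thumb stated above.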
Wireless communication elements
The power consumed by wireless communication elements can be reduced by using power-aware protocols and joint power control systems.
If current trends continue, "Energy costs, now about 10% of the average IT budget, could rise to 50% ... by 2010".
The weight and cost of power supply and cooling systems generally depend on the maximum possible power that could be used at some instant. There are two ways to prevent a system from being permanently damaged by excessive heat. Most desktop computers design their power and cooling systems around the worst-case CPU power dissipation at the maximum frequency, maximum workload, and worst-case environment. To reduce weight and cost, many laptop computer systems instead use a much lighter, lower-cost cooling system designed around a much lower Thermal Design Power, set somewhat above the power expected at maximum frequency under a typical workload in a typical environment. Such systems typically reduce (throttle) the clock rate when the CPU die temperature gets too high, reducing the power dissipated to a level that the cooling system can handle.
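The throttling behaviour can be sketched as a simple control step. This is a minimal illustration, not how any particular CPU or firmware implements it: the function name, the temperature thresholds, the step size, and the frequency limits are all hypothetical.

```python
def next_clock_mhz(die_temp_c, clock_mhz, *, limit_c=95.0, resume_c=85.0,
                   step_mhz=100, min_mhz=800, max_mhz=3000):
    """Return the clock rate to use for the next control interval.

    Hypothetical policy: step the clock down while the die is at or above
    limit_c, step it back up once the die has cooled to resume_c or below,
    and hold it steady in between (hysteresis avoids rapid oscillation).
    """
    if die_temp_c >= limit_c:
        return max(min_mhz, clock_mhz - step_mhz)
    if die_temp_c <= resume_c:
        return min(max_mhz, clock_mhz + step_mhz)
    return clock_mhz


if __name__ == "__main__":
    clock = 3000
    # Made-up temperature readings, one per control interval.
    for temp in (80.0, 96.0, 98.0, 97.0, 90.0, 84.0, 70.0):
        clock = next_clock_mhz(temp, clock)
        print(f"{temp:5.1f} C -> {clock} MHz")
```

The gap between the two thresholds is a design choice: a wider gap gives steadier clock rates at the cost of running slower for longer after a hot spell.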
- Acorn RISC Machine
- AMULET microprocessor
- Microchip nanoWatt XLP Microcontrollers
- TI's MSP430 Microcontrollers
- EFM32 Microcontrollers
- STMicroelectronics' STM32 Microcontrollers
- CPU power dissipation
- Cool Chips (symposium)
- Common Power Format
- Performance per watt
- Power management
- Green computing
- Dynamic frequency scaling
- Dynamic voltage scaling
- Operand isolation
- "The Incredible Shrinking CPU: Peril of Proliferating Power" by Paul DeMone, 2004.
- "How to use optional wireless power-save protocols to dramatically reduce power consumption" by Bill McFarland, 2008.
- "Averting the IT Energy Crunch" by Rachael King.
- "High-level design synthesis of a low power, VLIW processor for the IS-54 VSELP Speech Encoder" by Russell Henning and Chaitali Chakrabarti (implies that, in general, if you know exactly what algorithm you want to run, hardware designed specifically to run that algorithm will use less power than general-purpose hardware running that algorithm at the same speed).
- "CRISP: A Scalable VLIW Processor for Low Power Multimedia Systems" by Francisco Barat, 2005.
- "A Loop Accelerator for Low Power Embedded VLIW Processors" by Binu Mathew and Al Davis.
- "Ultra-Low Power Design" by Jack Ganssle.