For properly sized and ratioed gates, the contribution of $P_{shortcircuit}$ to the overall
dynamic power is of the order of 10-20%.
2. Switching power dissipation: This is the power consumed due to charging and discharging
of capacitive loads when the circuit has some activity due to changes in inputs. The
capacitive load at different circuit gates depends upon the fanout of the gate, output
capacitance, and wiring capacitances. It may be noted that a node with load capacitance
might not switch when the clock is switching. To take care of this, a quantity called
switching activity ($\alpha$) is often used. It determines how often switching occurs on a node
with load capacitance. If $V_{DD}$ is the supply voltage, $V_{swing}$ is the change in voltage level
of the switched capacitance, $C$ is the capacitance being switched and $f$ is the frequency
of operation, the switching power is given by,
$P_{switching} = C \times V_{DD} \times V_{swing} \times \alpha \times f$
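The switching power expression above can be evaluated directly. The following is a minimal sketch; the function name and all numeric values are hypothetical and chosen only for illustration, not taken from the text.

```python
def switching_power(c_load, v_dd, v_swing, alpha, freq):
    """Return C * V_DD * V_swing * alpha * f in watts.

    c_load  : switched capacitance in farads
    v_dd    : supply voltage in volts
    v_swing : voltage swing on the switched node in volts
    alpha   : switching activity (fraction of clock cycles in which the node switches)
    freq    : operating frequency in hertz
    """
    return c_load * v_dd * v_swing * alpha * freq


# Hypothetical operating point: 10 pF, 1.2 V supply, full swing, 100 MHz.
full_activity = switching_power(10e-12, 1.2, 1.2, alpha=0.10, freq=100e6)
half_activity = switching_power(10e-12, 1.2, 1.2, alpha=0.05, freq=100e6)

print(f"alpha = 0.10: {full_activity * 1e6:.1f} uW")
print(f"alpha = 0.05: {half_activity * 1e6:.1f} uW")
```

Because the expression is linear in the switching activity, halving the activity of a node halves its switching power when the other quantities are unchanged.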
Subthreshold leakage: It is caused by the subthreshold conduction current that flows between
source and drain when the gate voltage is below the threshold voltage but very close to it.
202 Low Power Embedded System Design
Fig. 11.1 Leakage components in a transistor. (Figure labels: reverse-biased junction BTBT, bulk.)
11.2.1 Algorithmic Power Minimization
It mainly focuses on reducing the number of operations requiring larger power in a target
implementation. For example, in many processors, the cost of an addition/subtraction operation
may be different from a logical operation. Thus, to check "whether x is equal to y", one
may first perform a subtraction operation followed by checking the status register for the
zero-bit. On the other hand, if the logical operation takes lesser power, x and y may be
directly compared using a comparison instruction. The following are some of the important
issues to be judged for selecting a particular algorithm from alternatives:
1. Memory reference: This is very important as memory is normally off-chip from the
processor. A large number of accesses to the memory means a good amount of activity on
the address/data bus lines. The memory access pattern is also important. If the access
pattern is sequential, only the least significant bits of the address bus change, whereas for
random access through the memory, most of the address bits will switch, thus creating
higher power dissipation.
2. Presence of cache memory: The presence and structure of cache memory plays an
important role. Cache can be fruitfully utilized to reduce both execution time and power
of an implementation if the underlying algorithm has got locality in its behaviour. The
locality may be both temporal and spatial in nature. While temporal locality refers
to the fact that a memory location accessed at some time is also likely to be accessed
in near future, spatial locality means if a memory location is accessed at some time, its
neighbouring locations are also likely to be accessed in near future. Thus, caching them
inside the CPU cache saves not only the memory access time, but also the bus energy
consumption is reduced.
3. Recomputation vs. memory load/store: Normal power minimization techniques at
algorithm level attempt to reduce the number of arithmetic operations. However, it may so
happen that to reduce the number of operations, some repeatedly performed computation
is done only once and stored at a memory location. Later, as and when necessary,
it is reloaded from the memory. This may lead to increased power consumption due to
extra memory accesses. If the operands are already available in CPU registers or on-chip
cache, it may be better to recompute the value, instead of loading it from memory, from
power consumption point of view.
4. Compiler optimization technique: The typical techniques used by an optimizing compiler
can be used to reduce power consumption of a piece of code. The strategies involve
strength reduction, common subexpression elimination, minimizing memory traffic, etc.
Loop unrolling is also often beneficial as it reduces loop overhead.
5. Number representation: This is another area for algorithmic power trade-off. The
following points may be noted:
   • Fixed vs. floating point representation: Fixed point operations are much simpler
     than floating point ones. Thus, it normally leads to power saving, though accuracy
     may suffer.
   • Sign-magnitude vs. 2's complement: Selection of sign-magnitude representation may
     have significant power saving over 2's complement, if input samples are uncorrelated
     and range is minimized.
   • Precision of operations: This is important, since having lower precision allows one
     to reduce the size of space needed to store the values. A typical example of this
     is to reduce the number of bits in the mantissa portion in several signal processing
     applications, including speech and image, to improve circuit delay and power.
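Two of the points in the list above, the memory access pattern and the choice between sign-magnitude and 2's complement, can be illustrated by counting how many bus bits toggle between consecutive words. The sketch below uses the toggle count as a rough proxy for switching activity. The helper names, the 16-bit address width, the 8-bit data width and the sample sequences are all assumptions made for illustration.

```python
import random


def toggles(prev, curr):
    """Number of bit positions that differ between two consecutive bus words."""
    return bin(prev ^ curr).count("1")


def total_toggles(words):
    """Total bit toggles over a sequence of bus words."""
    return sum(toggles(a, b) for a, b in zip(words, words[1:]))


# Memory access pattern: sequential vs. scattered addresses.
ADDR_BITS = 16                       # hypothetical address bus width
sequential = list(range(1024))       # addresses 0, 1, 2, ...
rng = random.Random(0)               # fixed seed so the run is repeatable
scattered = [rng.randrange(1 << ADDR_BITS) for _ in range(1024)]

seq_toggles = total_toggles(sequential)
rand_toggles = total_toggles(scattered)


# Number representation: 8-bit 2's complement vs. sign-magnitude encodings.
def twos_complement(value, bits):
    """Encode a signed integer as an unsigned 2's complement word."""
    return value & ((1 << bits) - 1)


def sign_magnitude(value, bits):
    """Encode a signed integer with the top bit as sign and the rest as magnitude."""
    sign = (1 << (bits - 1)) if value < 0 else 0
    return sign | abs(value)


# Hypothetical small-amplitude samples that change sign often.
samples = [1, -1, 2, -2, 1, -1]
tc_toggles = total_toggles([twos_complement(v, 8) for v in samples])
sm_toggles = total_toggles([sign_magnitude(v, 8) for v in samples])

print("sequential addresses:", seq_toggles, "toggles")
print("scattered addresses: ", rand_toggles, "toggles")
print("2's complement data: ", tc_toggles, "toggles")
print("sign-magnitude data: ", sm_toggles, "toggles")
```

For these inputs the sequential address stream toggles far fewer bits than the scattered one, and the sign-magnitude encoding toggles fewer bits than 2's complement because a sign change flips only the sign bit. Other sample sequences can reverse the second comparison, so it should be read as an illustration of the condition stated in the list, not as a general result.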
of supply voltage). Since all such systems are operating simultaneously, total power saving is
1/n of the original power. This has been shown in Fig. 11.2(b). A problem with the scheme
is that the hardware is duplicated with other necessary multiplexing and demultiplexing logic.
Another possible architectural modification often suggested is pipelining. In this scheme, the
functional block of Fig. 11.2(a) is divided into a sequence of sub-blocks, each of approximately
the same delay. Thus, if the number of sub-blocks is n, from the pipelining principle, the overall
system can produce output at a rate of about $n \times f$. Now, if the supply voltage of individual
stages is reduced by a factor of n, power reduces by a factor of 1/n. However, we need to
accommodate extra latches between the stages for proper synchronization between them. This
introduces some overhead in terms of area, performance and power as well. The scheme has
been shown in Fig. 11.2(c).
[Fig. 11.2, panels (a) to (c). Figure labels: original block with input and supply voltage; Copy 0, Copy 1, ..., Copy (n-1) with mod-n counter and latch, each at V/n; Sub-block 0, Sub-block 1, ..., Sub-block (n-1) separated by latches, each at V/n.]
1. Static vs. dynamic logic families: CMOS logic can be realized as static or dynamic
circuits. In a dynamic circuit, output is always precharged to 1. Thus, power will be
consumed whenever output is zero. Hence, the probability of a power consuming transition is
0.25, which is higher than a static gate. However, a dynamic gate has lower input
capacitance (almost by a factor of 2 to 3) compared to a static gate, as the p-network
is absent. Hence, the effective capacitance that a dynamic gate sees is much lower.
But, the power consumed in distributing the precharging signal also needs to be considered.
2. Glitches and hazards: This is another potential source of power consumption, particularly
in static CMOS circuits. A glitch at the output of a gate can come due to the differences
in arrival times of input signals. A typical example of it is the AND-OR-INVERT based
implementation of the function f = ab + ac. The circuit is shown in Fig. 11.3(a).
Fig. 11.3 (a) Circuit with hazard, (b) Circuit without hazard.
3. Technology mapping: The logic synthesis library often contains different implementations
of the same logic module. They normally differ in terms of area, delay, power, etc. A logic
synthesis procedure targeted to power minimization may choose implementations that
require higher area or delay, but score better in terms of power. For example, consider
a four-input AND function. Two possible implementations are shown in Fig. 11.4(a)
and Fig. 11.4(b), respectively. The ON-probabilities of the gates are also shown. Total
... + 0.9375 × 0.0625 = 0.3555. Thus, the first implementation consumes more power than
the second one. In this case, though there is no area penalty, the second implementation
has one gate delay more than the first one.
Fig. 11.4 Two different implementations of 4-input AND. (Panels (a) and (b); node ON-probability labels include 0.5, 0.25, 0.125 and 0.0625.)
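The comparison of the two 4-input AND implementations can be checked numerically. The sketch below assumes independent inputs with ON-probability 0.5 and takes p(1 - p) as the switching-activity measure of a node whose ON-probability is p, which agrees with the 0.9375 × 0.0625 term quoted in the text. The tree and chain structures are an assumption inferred from the probabilities and the gate-delay remark; the function names are illustrative.

```python
def and_probability(*input_probs):
    """ON-probability of an AND gate output, assuming independent inputs."""
    p = 1.0
    for q in input_probs:
        p *= q
    return p


def activity(p_on):
    """Switching-activity measure p * (1 - p) for a node with ON-probability p_on."""
    return p_on * (1.0 - p_on)


P_IN = 0.5  # assumed ON-probability of every primary input

# Tree: two 2-input ANDs feed a third 2-input AND (two gate levels).
t1 = and_probability(P_IN, P_IN)       # 0.25
t2 = and_probability(P_IN, P_IN)       # 0.25
t_out = and_probability(t1, t2)        # 0.0625
tree_total = activity(t1) + activity(t2) + activity(t_out)

# Chain: three 2-input ANDs in series (three gate levels).
c1 = and_probability(P_IN, P_IN)       # 0.25
c2 = and_probability(c1, P_IN)         # 0.125
c_out = and_probability(c2, P_IN)      # 0.0625
chain_total = activity(c1) + activity(c2) + activity(c_out)

print(f"tree  total activity: {tree_total:.4f}")
print(f"chain total activity: {chain_total:.4f}")
```

Under these assumptions the chain gives a total of about 0.3555, the figure quoted in the text, while the tree gives about 0.4336. The chain uses the same number of gates but has one more gate level, which is consistent with the delay penalty noted above.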
Fig. 11.5 Clock gating of FSM. (Figure labels: state register, enable, clock.)
[Figure labels: $V_{DD}$, circuit, sleep transistor, active.]
going to a low power state takes time. The longer the duration for which we want to
shut down a system, higher is the time taken during reactivation. Avoiding a power-down
mode will cost unnecessary power, while frequent power-down will affect system performance.
A naive approach may be to power-down a system whenever there is no request. This
will definitely affect performance severely. A more sophisticated method is predictive
shutdown. In this approach, the goal is to predict the next arrival of service request and wake
up the system just before that. Prediction can be made in several different ways as follows.
1. Fixed times: If the system does not receive any service request during an interval of
length $T_{ON}$, it shuts down for a fixed period of time $T_{OFF}$. Choice of $T_{ON}$ and $T_{OFF}$
may be made experimentally by studying system behaviour.
2. Analysing system state: In this approach, there is a constant monitoring of the service
requests. The monitoring is done via a power manager that observes the system
[Figure labels: power manager, power management commands, status, service provider, queue, service requestor.]
Kernel Power
ACPI driver management
AML interpreter
Device
drivers ACPI
ACPI tables
ACPI registers
ACPI BIOOs
Hardware platform
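The fixed-times policy described earlier (shut down for a fixed period after an idle interval of a given length) can be explored with a small simulation. The following is only a sketch: the function name, the zero service time, the treatment of requests that arrive during shutdown as delayed, and the example arrival times are all assumptions, not part of the text.

```python
def simulate_fixed_timeout(arrivals, t_on, t_off, horizon):
    """Simulate the fixed-times shutdown policy.

    arrivals : request arrival times (non-negative numbers)
    t_on     : idle interval after which the system shuts down
    t_off    : fixed duration of each shutdown
    horizon  : total simulated time

    Returns (time_off, delayed), where time_off is the total time spent shut
    down and delayed is the number of requests that arrived during a shutdown.
    Service time is assumed to be zero.
    """
    arrivals = sorted(arrivals)
    time_off = 0.0
    delayed = 0
    idle_start = 0.0   # start of the current idle interval
    i = 0              # index of the next unprocessed request

    while idle_start < horizon:
        next_arrival = arrivals[i] if i < len(arrivals) else float("inf")
        if next_arrival < idle_start + t_on:
            # A request arrives before the timeout: stay on, restart the idle interval.
            idle_start = next_arrival
            i += 1
        else:
            # No request for t_on: shut down for t_off (clipped at the horizon).
            off_start = idle_start + t_on
            if off_start >= horizon:
                break
            off_end = min(off_start + t_off, horizon)
            time_off += off_end - off_start
            # Requests arriving while the system is off have to wait.
            while i < len(arrivals) and arrivals[i] < off_end:
                delayed += 1
                i += 1
            idle_start = off_end

    return time_off, delayed


# Hypothetical workloads with t_on = 5 and t_off = 10 time units.
bursty = simulate_fixed_timeout([1, 2, 3, 20], t_on=5, t_off=10, horizon=40)
unlucky = simulate_fixed_timeout([1, 10], t_on=5, t_off=10, horizon=30)
print("bursty :", bursty)
print("unlucky:", unlucky)
```

Running such a simulation over recorded request traces for several candidate pairs of values is one way to make the experimental choice of the two intervals that the text mentions: longer shutdowns save more power but delay more requests.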