The ALU derives its name from the fact that it performs arithmetic and logical operations.
A simple ALU is constructed from combinational circuits. ALUs that
perform multiplication and division are designed around the circuits
developed for these operations, implementing the desired algorithm.
More complex ALUs are designed for executing floating-point, decimal
and other complex numerical operations. These are called
coprocessors and work in tandem with the main processor.
The design specifications of the ALU are derived from the Instruction Set
Architecture (ISA): the ALU must be capable of executing the instructions of
the ISA. A CPU executes an instruction by moving the data associated with
that instruction, and this movement of data is facilitated by the datapath.
For example, a LOAD instruction brings data from a memory location and
writes it into a GPR; the movement of data over the datapath accomplishes
the execution of the LOAD instruction. We discuss the datapath in more
detail in the next chapter on Control Unit Design. Trade-offs in ALU design
are driven by factors such as speed of execution, hardware cost, and the
width of the ALU.
Combinational ALU
A primitive ALU supporting the three functions AND, OR and ADD is shown in
figure 11.1. The ALU has two inputs, A and B, which are fed to an AND
gate, an OR gate and a full adder. The full adder also takes a CARRY IN as input.
The combinational outputs for A and B are statically available at the outputs
of the AND gate, the OR gate and the full adder. The desired output is chosen
by the select function, which in turn is decoded from the instruction under
execution: the multiplexer passes one of its inputs to the output based on this
select function. The select function essentially reflects the operation to be
carried out on the operands A and B. Thus the functions A AND B, A OR B
and A + B are supported by this ALU. When the ALU is to be extended to more
bits, the logic is duplicated for as many bits and the necessary cascading is
done. The AND and OR logic are part of the logical unit, while the adder is
part of the arithmetic unit.
Figure 11.1 A Primitive ALU supporting AND, OR and ADD function
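The mux-based selection of figure 11.1 can be sketched in software (an illustrative model, not a hardware description; the function names and select encoding are ours): each one-bit slice computes all three outputs, a select code picks one, and slices are cascaded with a ripple carry to build a wider ALU.

```python
# Model of the primitive 1-bit ALU: AND, OR and full-adder outputs are
# all computed statically, and select chooses one, like the multiplexer.
def alu_1bit(a, b, carry_in, select):
    and_out = a & b
    or_out = a | b
    sum_out = a ^ b ^ carry_in                    # full-adder sum
    carry_out = (a & b) | (carry_in & (a ^ b))    # full-adder carry
    outputs = {0: and_out, 1: or_out, 2: sum_out} # select decodes the opcode
    return outputs[select], carry_out

# Cascade n 1-bit slices; the carry ripples between slices.
def alu(a, b, select, n=8):
    result, carry = 0, 0
    for i in range(n):
        bit, carry = alu_1bit((a >> i) & 1, (b >> i) & 1, carry, select)
        result |= bit << i
    return result
```

For example, with A = 12 (1100) and B = 10 (1010), select 0 yields 8 (AND), select 1 yields 14 (OR), and select 2 yields 22 (ADD).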
Even the simplest practical ALU has more functions than this, since more are
essential to support the ISA of the CPU. Therefore the ALU combines 2's
complement, adder and subtractor functions as part of the arithmetic unit,
while the logical unit generates logical functions of the form f(x,y) such as
AND, OR, NOT and XOR. Such a combination covers most of a CPU's fixed-point
data processing instructions.
Figure 11.2
ALU Symbol
So far what we have seen is a primitive ALU. An ALU can be as complex as the
variety of functions it must carry out. Powerful modern CPUs have powerful
and versatile ALUs, and many include multiple ALUs to improve efficiency.
In principle, engineers can design the ALU to perform any type of operation.
However, the ALU becomes more costly as the operations become more complex,
because a more complex ALU dissipates more heat and takes up more space in the
CPU. Designers must therefore balance how complex and powerful the ALU is
against its cost; this is one reason faster CPUs are more expensive, consume
more power and dissipate more heat. The ALU handles the calculations needed
by the CPU, most of which are logical in nature; in addition to arithmetic
and logic operations, it also performs bit-shifting operations.
Although the ALU is a major component of the processor, its design and function
may differ between processors. For example, some ALUs are designed to perform
only integer calculations, while others also handle floating-point operations.
Some processors contain a single ALU, while others contain numerous ALUs to
complete calculations in parallel. The operations performed by the ALU are:
o Logical Operations: these consist of AND, OR, NOT, NAND, NOR, XOR, and more.
o Bit-Shifting Operations: displacing the bits to the left or right by a
certain number of places. A left shift by k places multiplies the operand
by 2^k, and a right shift divides it by 2^k.
o Arithmetic Operations: primarily bit addition and subtraction. Multiplication
and division can also be performed, but they are more costly to build in
hardware; repeated addition can substitute for multiplication, and repeated
subtraction for division.
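The shift/multiply relationship in the list above can be checked directly (plain integers here, purely as an illustration of the arithmetic, not of any particular hardware):

```python
# A left shift by k places multiplies by 2**k; a right shift divides
# by 2**k with truncation. This is why simple ALUs use cheap shifters
# in place of costly multiply/divide hardware for powers of two.
x = 13
assert x << 3 == x * 2**3   # 13 << 3 == 104
assert x >> 2 == x // 2**2  # 13 >> 2 == 3
```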
The ALU receives input signals from external circuits and, in response,
delivers output signals to the external electronics.
Data: The ALU is connected to three parallel buses: two for the input
operands and one for the output operand. All three buses carry the same
number of signals.
Opcode: The operation selection code tells the ALU what type of operation
it is to perform, an arithmetic or a logic operation.
Status Outputs: The status outputs report supplemental information about
the result of an ALU operation as a set of signals. General-purpose ALUs
typically provide status signals such as overflow, zero, carry out and
negative. When the ALU completes an operation, the status output signals
are stored in external registers, which makes them available for future
ALU operations.
Status Inputs: The status inputs allow the ALU to access further information
needed to complete an operation successfully, such as a single "carry-in"
bit that is the stored carry-out from a previous ALU operation.
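The status outputs described above can be modelled for an adder (a sketch under the usual two's-complement conventions; the function and flag names are ours, and inputs are assumed to already fit the given width):

```python
# Illustrative width-bit add that produces the usual ALU status outputs.
def add_with_flags(a, b, carry_in=0, width=8):
    mask = (1 << width) - 1
    raw = (a & mask) + (b & mask) + carry_in
    result = raw & mask
    flags = {
        "carry": raw >> width,              # carry out of the top bit
        "zero": int(result == 0),
        "negative": result >> (width - 1),  # sign bit of the result
        # signed overflow: both operands share a sign the result lacks
        "overflow": ((a ^ result) & (b ^ result)) >> (width - 1) & 1,
    }
    return result, flags
```

For example, adding 100 + 100 in 8 bits gives 200 with the overflow flag set (the signed result would exceed +127), while 200 + 100 gives 44 with carry set.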
Configurations of the ALU
A description of how the ALU interacts with the processor is given below.
Every arithmetic logic unit uses one of the following configurations:
Accumulator
The accumulator holds the intermediate result of every operation. This keeps
the Instruction Set Architecture (ISA) simple, because each instruction needs
to specify only one operand; the other operand and the destination are
implicitly the accumulator.
Accumulator machines are generally fast and simple, but additional code must
be written to load the accumulator with the proper values. Unfortunately, with
a single accumulator it is very difficult to exploit parallelism, since every
operation serializes through it. A desktop calculator is an example of an
accumulator machine.
Stack
A stack machine stores operands and the results of the latest operations on a
stack, a small region of memory holding values in last-in, first-out order
with the most recent entry on top. When new values are added, they push the
older entries down the stack.
Register-Register Architecture
This configuration provides places for two source operands and one
destination, and is also known as a 3-register operation machine. Its
instructions must be longer, in order to encode all three operands (two
sources and one destination). After an operation completes, the result must
be written back to the register file, which lengthens the instruction word
and can cause synchronization issues if a strict write-back rule is enforced.
AC and the appended bit Qn+1 are initially cleared to 0 and the
sequence counter SC is set to a number n equal to the number of bits in the
multiplier. The two bits of the multiplier in Qn and Qn+1 are
inspected. If the two bits are equal to 10, it means that the first 1 in
a string of 1's has been encountered. This requires subtraction of the
multiplicand from the partial product in AC. If the two bits are equal to
01, it means that the first 0 in a string of 0's has been encountered.
This requires addition of the multiplicand to the partial product
in AC. When the two bits are equal, the partial product does not
change. An overflow cannot occur, because the additions and
subtractions of the multiplicand alternate. As a consequence,
the two numbers that are added always have opposite signs, a
condition that excludes an overflow. The next step is to shift the
partial product and the multiplier (including Qn+1) to the right. This is an
arithmetic shift right (ashr) operation, which shifts AC and QR to the right
and leaves the sign bit in AC unchanged. The sequence counter is
decremented and the computational loop is repeated n times.
The product of negative numbers is handled naturally. When multiplying
negative numbers we take the 2's complement of a number to change its sign,
because it is easier to add the 2's complement than to perform binary
subtraction. The product of two negative numbers is demonstrated below
along with the 2's complements involved.
Example – A numerical example of Booth's algorithm is shown
below for n = 4. It shows the step-by-step multiplication of -5 and -7.
BR = -5 = 1011
BR' = 0100 <-- 1's complement (change 0 to 1 and 1 to 0)
BR' + 1 = 0101 <-- 2's complement (add 1 to the value obtained after the 1's complement)
QR = -7 = 1001 <-- 2's complement of 0111 (7 = 0111 in binary)
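Using the register names above (AC, QR, Qn+1, BR, SC), the loop can be sketched in Python. This is an illustrative model rather than the text's own code; running it on the example reproduces -5 × -7 = 35.

```python
# Booth's algorithm: inspect (Qn, Qn+1), add or subtract BR into AC,
# then arithmetic-shift the combined AC:QR:Qn+1 register to the right.
def booth_multiply(multiplicand, multiplier, n=4):
    mask = (1 << n) - 1
    br = multiplicand & mask           # BR register
    br_neg = (-multiplicand) & mask    # BR' + 1 (2's complement of BR)
    ac, qr, qn1 = 0, multiplier & mask, 0
    for _ in range(n):                 # SC counts down from n
        pair = (qr & 1, qn1)
        if pair == (1, 0):             # first 1 in a string: AC <- AC - BR
            ac = (ac + br_neg) & mask
        elif pair == (0, 1):           # first 0 in a string: AC <- AC + BR
            ac = (ac + br) & mask
        # ashr over AC:QR:Qn+1, leaving the sign bit of AC unchanged
        qn1 = qr & 1
        qr = ((qr >> 1) | ((ac & 1) << (n - 1))) & mask
        ac = (ac >> 1) | (ac & (1 << (n - 1)))
    product = (ac << n) | qr           # 2n-bit 2's-complement product
    if product & (1 << (2 * n - 1)):   # convert back to a signed value
        product -= 1 << (2 * n)
    return product
```

For example, booth_multiply(-5, -7) returns 35, matching the worked values BR = 1011 and QR = 1001 above.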
The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point
computation which was established in 1985 by the Institute of Electrical and Electronics Engineers
(IEEE). The standard addressed many problems found in the diverse floating point implementations that
made them difficult to use reliably and reduced their portability. IEEE Standard 754 floating point is the
most common representation today for real numbers on computers, including Intel-based PCs, Macs,
and most Unix platforms.
There are several ways to represent floating point numbers, but IEEE 754 is the most efficient in
most cases. An IEEE 754 number has 3 basic components: the sign, the biased exponent and the
normalised mantissa.
IEEE 754 numbers are divided into two formats based on these three components: single precision
and double precision.
TYPES              SIGN    BIASED EXPONENT    NORMALISED MANTISSA    BIAS
Single precision   1 bit   8 bits             23 bits                127
Double precision   1 bit   11 bits            52 bits                1023
Example – 85.125
85 = 1010101
0.125 = 001
85.125 = 1010101.001
= 1.010101001 x 2^6
sign = 0
1. Single precision:
biased exponent = 127 + 6 = 133 = 10000101
normalised mantissa = 010101001, padded to 23 bits
= 0 10000101 01010100100000000000000
2. Double precision:
biased exponent = 1023 + 6 = 1029 = 10000000101
normalised mantissa = 010101001, padded to 52 bits
= 0 10000000101 0101010010000000000000000000000000000000000000000000
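The single-precision result above can be checked with Python's struct module, which packs floats in the platform's IEEE 754 formats (the variable names here are ours):

```python
import struct

# Pack 85.125 as a big-endian IEEE 754 single and inspect the bits.
bits = struct.unpack(">I", struct.pack(">f", 85.125))[0]
pattern = f"{bits:032b}"
sign, exponent, mantissa = pattern[0], pattern[1:9], pattern[9:]
# sign     -> '0'
# exponent -> '10000101' (133)
# mantissa -> '01010100100000000000000'
```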
Special Values: IEEE 754 reserves some bit patterns for special values to avoid ambiguity.
Zero –
Zero is a special value denoted with an exponent and mantissa of 0. -0 and +0 are distinct
values, though they both are equal.
Denormalised –
If the exponent is all zeros but the mantissa is not, then the value is a denormalized number.
This means this number does not have an assumed leading one before the binary point.
Infinity –
The values +infinity and -infinity are denoted with an exponent of all ones and a mantissa of all
zeros. The sign bit distinguishes between negative infinity and positive infinity. Operations with
infinite values are well defined in IEEE.
Exponent   Mantissa   Value
0          0          exact 0
255        0          Infinity
0          not 0      denormalised
255        not 0      NaN
The table is similar for double precision (just replacing 255 by 2047). Ranges of floating point numbers:
The range of positive floating point numbers can be split into normalized numbers and denormalized
numbers, which use only a portion of the fraction's precision. Since every floating-point number has a
corresponding negated value, these ranges are symmetric around zero.
There are five distinct numerical ranges that single-precision floating-point numbers are not able to
represent with the scheme presented so far:
1. Negative numbers less than -(2 - 2^-23) × 2^127 (negative overflow)
2. Negative numbers greater than -2^-149 (negative underflow)
3. Zero
4. Positive numbers less than 2^-149 (positive underflow)
5. Positive numbers greater than (2 - 2^-23) × 2^127 (positive overflow)
Overflow generally means that values have grown too large to be represented. Underflow is a less
serious problem because it just denotes a loss of precision, which is guaranteed to be closely
approximated by zero.
Table of the total effective range of finite IEEE floating-point numbers is shown below:
          Binary                     Decimal
Single    ±(2 - 2^-23) × 2^127       approx. ±3.4 × 10^38
Double    ±(2 - 2^-52) × 2^1023      approx. ±1.8 × 10^308
Special Operations –
Operation                 Result
n ÷ ±Infinity             0
±nonZero ÷ ±0             ±Infinity
Infinity + Infinity       +Infinity
Infinity - (-Infinity)    +Infinity
-Infinity - Infinity      -Infinity
-Infinity + (-Infinity)   -Infinity
±0 ÷ ±0                   NaN
±Infinity × 0             NaN
To encode the number 32.75 in single precision IEEE 754 representation, follow these steps:
1. Sign Bit:
o Since 32.75 is positive, the sign bit is 0.
2. Exponent:
o 32.75 = 100000.11 in binary = 1.0000011 × 2^5, so the unbiased exponent is 5.
o The exponent (5) needs to be biased. For single precision, the bias is 127:
5 + 127 = 132 = 10000100
3. Mantissa:
o The bits after the leading 1 are 0000011, padded to 23 bits: 00000110000000000000000.
Now we can combine the sign bit, biased exponent, and mantissa.
Putting it all together, the single precision IEEE 754 representation of 32.75 is:
0 10000100 00000110000000000000000