Part 5 Floating Point Add Sub Mul
Floating Point Representation
(IEEE 754 Standard)
• A binary floating-point number can be represented by:
• A sign for the number
• Some significant bits
• A signed scale factor exponent for an implied base of 2
• The basic IEEE format is a 32-bit representation.
• The leftmost bit represents the sign, S, for the number.
• The next 8 bits, E´, represent the signed exponent of the scale factor (with an implied base of 2),
and the remaining 23 bits, M, are the fractional part of the significant bits. The full 24-bit string,
B, of significant bits, called the mantissa, always has a leading 1 for normal values, so that bit is implied rather than stored.
• When the binary point is placed to the right of the first significant bit, the number is said to be
normalized.
IEEE 754 Standard (Single Precision)
• Instead of the actual signed exponent, E, the value stored in the
exponent field is an unsigned integer E´ = E + 127.
• This is called the excess-127 format. Thus, E´ is in the range 0 ≤ E´ ≤
255.
• The end values of this range, 0 and 255, are used to represent special
values.
• Therefore, the range of E´ for normal values is 1 ≤ E´ ≤ 254.
• This means that the actual exponent, E, is in the range −126 ≤ E ≤ 127.
The use of the excess-127 representation for exponents simplifies
comparison of the relative sizes of two floating-point numbers.
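The field layout and the excess-127 encoding described above can be checked with a short Python sketch. The helper names decode_single and raw_bits are hypothetical, chosen only for this illustration; the sketch relies on the standard struct module to obtain the 32-bit pattern.

```python
import struct

def decode_single(x):
    """Split x, rounded to IEEE 754 single precision, into (S, E', M)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # leftmost bit S
    e_stored = (bits >> 23) & 0xFF    # 8-bit excess-127 exponent E'
    fraction = bits & 0x7FFFFF        # 23-bit fractional part M
    return sign, e_stored, fraction

def raw_bits(x):
    """Return the 32-bit pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

s, e_stored, m = decode_single(1.0)
# Prints S, E', the actual exponent E = E' - 127, and M.
print(s, e_stored, e_stored - 127, m)   # 0 127 0 0

# For positive values, a larger E' means a larger number, so the raw
# bit patterns compare in the same order as the values themselves.
print(raw_bits(0.5) < raw_bits(1.0) < raw_bits(2.0))   # True
```

The last line illustrates the comparison property for positive values only; negative values need the sign bit handled separately.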
IEEE 754 Standard (Double Precision)
• The double-precision format has increased exponent and mantissa
ranges.
• The 11-bit excess-1023 exponent E´ has the range 1 ≤ E´ ≤ 2046 for
normal values, with 0 and 2047 used to indicate special values, as
before.
• Thus, the actual exponent E is in the range −1022 ≤ E ≤ 1023,
providing scale factors of 2^−1022 to 2^1023 (approximately 10^±308).
• The 53-bit mantissa provides a precision equivalent to about 16
decimal digits.
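The double-precision ranges above can be inspected from Python, whose float type is assumed here to be an IEEE 754 double (true on common platforms, but an assumption nonetheless). The helper name decode_double is hypothetical.

```python
import struct
import sys

def decode_double(x):
    """Split a double into (S, E', M) with an 11-bit E' and a 52-bit M."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    e_stored = (bits >> 52) & 0x7FF        # excess-1023 exponent E'
    fraction = bits & ((1 << 52) - 1)      # 52 stored fraction bits
    return sign, e_stored, fraction

largest = sys.float_info.max     # largest normal value
smallest = sys.float_info.min    # smallest positive normal value

print(decode_double(largest)[1])    # E' at the top of the normal range
print(decode_double(smallest)[1])   # E' at the bottom of the normal range
print(sys.float_info.mant_dig)      # mantissa bits, including the implied 1
```

On an IEEE 754 platform the three printed values should be 2046, 1, and 53, matching the ranges stated above.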
Normalized Value
Why normalized form?
• Simplifies the exchange of data that includes floating-point numbers
• Simplifies the arithmetic algorithms to know that the numbers will
always be in this form
• Increases the precision of the numbers that can be stored in a word,
since each unnecessary leading 0 is replaced by another significant
bit to the right of the binary point
Floating Point Arithmetic
• Add/Subtract Rule
1. Choose the number with the smaller exponent and shift its mantissa right a
number of steps equal to the difference in exponents.
2. Set the exponent of the result equal to the larger exponent.
3. Perform addition/subtraction on the mantissas and determine the sign of the
result.
4. Normalize the resulting value, if necessary
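The four steps can be sketched in Python as follows. This is a minimal illustration, not a hardware-accurate model: operands are assumed to be (sign, exponent, mantissa) triples with an unbiased exponent and a 24-bit integer mantissa that includes the leading 1, bits shifted out are simply truncated (real hardware uses guard bits and rounding), and special values are ignored. The names fp_add and to_float are hypothetical.

```python
MANT_BITS = 24  # 1 leading bit + 23 fraction bits

def fp_add(a, b):
    """Add two (sign, exponent, mantissa) triples.

    The value of a triple is (-1)**sign * mantissa * 2**(exponent - 23).
    """
    (sa, ea, ma), (sb, eb, mb) = a, b
    # Step 1: shift the mantissa of the operand with the smaller exponent.
    if ea < eb:
        ma >>= (eb - ea)
    else:
        mb >>= (ea - eb)
    # Step 2: the result takes the larger exponent.
    e = max(ea, eb)
    # Step 3: add or subtract the mantissas and determine the sign.
    total = (-ma if sa else ma) + (-mb if sb else mb)
    sign = 1 if total < 0 else 0
    m = abs(total)
    if m == 0:
        return (0, 0, 0)  # placeholder triple for zero (assumption)
    # Step 4: normalize so the leading 1 sits in bit 23.
    while m >= (1 << MANT_BITS):
        m >>= 1
        e += 1
    while m < (1 << (MANT_BITS - 1)):
        m <<= 1
        e -= 1
    return (sign, e, m)

def to_float(t):
    """Convert a triple back to a Python float for checking."""
    s, e, m = t
    return (-1) ** s * m * 2.0 ** (e - (MANT_BITS - 1))

x = (0, 0, 0xC00000)    # 1.5  = 1.1 (binary) x 2^0
y = (0, -1, 0xC00000)   # 0.75 = 1.1 (binary) x 2^-1
print(to_float(fp_add(x, y)))   # 2.25
```

Subtraction is handled by the same routine by flipping the sign of the second operand before calling it.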
Floating point addition in Decimal
• Add 2.9400 × 10^2 to 4.3100 × 10^4.
• We rewrite 2.9400 × 10^2 as 0.0294 × 10^4.
• Perform addition of the mantissas to get 4.3394 × 10^4.
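The same alignment can be reproduced with Python's decimal module. The function add_decimal is a hypothetical helper written for this example; it only aligns exponents and adds, and it omits the final normalization step because this particular result is already in normalized form.

```python
from decimal import Decimal

def add_decimal(m1, e1, m2, e2):
    """Add m1 x 10^e1 and m2 x 10^e2, returning (mantissa, exponent)."""
    m1, m2 = Decimal(m1), Decimal(m2)
    if e1 < e2:
        m1 = m1.scaleb(e1 - e2)   # shift right by the exponent difference
        e = e2
    else:
        m2 = m2.scaleb(e2 - e1)
        e = e1
    return m1 + m2, e

mantissa, exponent = add_decimal("2.9400", 2, "4.3100", 4)
print(mantissa, exponent)   # mantissa equals 4.3394, exponent is 4
```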
Floating Point Binary Representation
• Example: 85.125
• 85 = 1010101 (binary)
• 0.125 = 0.001 (binary)
• 85.125 = 1010101.001
• = 1.010101001 x 2^6
• Sign = 0
• Single precision:
• Biased exponent: 127 + 6 = 133
• 133 = 10000101
• Normalized mantissa (fraction bits) = 010101001
• We append 0s to fill the 23 bits
• The IEEE 754 single-precision representation is:
• = 0 10000101 01010100100000000000000
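The hand conversion can be cross-checked in Python. This sketch uses the standard struct module to pack 85.125 as a single-precision value and then slices the 32-bit pattern into its three fields; the variable names are illustrative only.

```python
import struct

(bits,) = struct.unpack(">I", struct.pack(">f", 85.125))
pattern = f"{bits:032b}"

sign = pattern[0]        # 1 sign bit
exponent = pattern[1:9]  # 8-bit biased exponent E'
fraction = pattern[9:]   # 23 fraction bits

print(sign, exponent, fraction)
print(int(exponent, 2) - 127)   # actual exponent: 6
```

Because 85.125 is exactly representable, the packed pattern should agree bit for bit with the result worked out by hand.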
Floating Point Binary Representation
• Double precision:
• 85.125
• 85 = 1010101 (binary)
• 0.125 = 0.001 (binary)
• 85.125 = 1010101.001 = 1.010101001 x 2^6
• Sign = 0
• Biased exponent: 1023 + 6 = 1029
• 1029 = 10000000101
• Normalized mantissa (fraction bits) = 010101001
• We append 0s to fill the 52 bits
• The IEEE 754 double-precision representation is: 0 10000000101
0101010010000000000000000000000000000000000000000000
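As a check on the double-precision result, the sketch below builds the 64-bit pattern by following the same steps and compares it with the pattern Python produces for 85.125. It assumes Python's float is an IEEE 754 double; the variable names by_hand and machine are illustrative.

```python
import struct

exponent = 6                                # from 1.010101001 x 2^6
biased = exponent + 1023                    # excess-1023 encoding
fraction_bits = "010101001".ljust(52, "0")  # pad the fraction to 52 bits

by_hand = "0" + f"{biased:011b}" + fraction_bits
machine = f"{struct.unpack('>Q', struct.pack('>d', 85.125))[0]:064b}"

print(f"{biased:011b}")
print(by_hand == machine)
```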
Example
• Perform the following arithmetic operations using floating-point arithmetic. In
each case, show how the numbers would be stored in IEEE single-precision
format.
Example 2
Multiply Rule
1. Add the stored (biased) exponents and subtract 127 to maintain the excess-127
representation.
2. Multiply the mantissas and determine the sign of the result.
3. Normalize the resulting value, if necessary.
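The multiply rule can be sketched in Python on (S, E´, mantissa) triples. This is an illustration under stated assumptions: both operands are normal single-precision values, the product is truncated rather than rounded, and overflow, underflow, and special values are not handled. The helper names to_fields, from_fields, and fp_multiply are hypothetical.

```python
import struct

BIAS = 127

def to_fields(x):
    """Return (S, E', 24-bit mantissa) for a normal single-precision value."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, (bits & 0x7FFFFF) | 0x800000

def from_fields(sign, e_stored, mantissa):
    """Rebuild a float from (S, E', 24-bit mantissa)."""
    bits = (sign << 31) | (e_stored << 23) | (mantissa & 0x7FFFFF)
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def fp_multiply(a, b):
    (sa, ea, ma), (sb, eb, mb) = a, b
    # Step 1: add the biased exponents and subtract the bias once.
    e = ea + eb - BIAS
    # Step 2: multiply the mantissas and determine the sign.
    product = ma * mb      # up to 48 bits, with 46 fraction bits
    sign = sa ^ sb
    # Step 3: the product of two values in [1, 2) lies in [1, 4),
    # so at most one right shift is needed to normalize.
    if product >= (1 << 47):
        product >>= 1
        e += 1
    mantissa = product >> 23   # keep 24 bits, truncating the rest
    return sign, e, mantissa

print(from_fields(*fp_multiply(to_fields(-3.0), to_fields(2.5))))   # -7.5
```

Subtracting the bias in step 1 is needed because each stored exponent already contains one copy of 127, and their sum would otherwise contain two.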
Multiplication
• Add the biased exponents
Divide Rule
1. Subtract the stored (biased) exponents and add 127 to maintain the excess-127
representation.
2. Divide the mantissas and determine the sign of the result.
3. Normalize the resulting value, if necessary.
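A matching sketch for the divide rule follows. As with any such illustration, it rests on assumptions: operands are normal single-precision values, the divisor is nonzero, the quotient is truncated rather than rounded, and exponent overflow or underflow is not checked. The helper names to_fields, from_fields, and fp_divide are hypothetical.

```python
import struct

BIAS = 127

def to_fields(x):
    """Return (S, E', 24-bit mantissa) for a normal single-precision value."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, (bits & 0x7FFFFF) | 0x800000

def from_fields(sign, e_stored, mantissa):
    """Rebuild a float from (S, E', 24-bit mantissa)."""
    bits = (sign << 31) | (e_stored << 23) | (mantissa & 0x7FFFFF)
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def fp_divide(a, b):
    (sa, ea, ma), (sb, eb, mb) = a, b
    # Step 1: subtract the biased exponents and add the bias back.
    e = ea - eb + BIAS
    # Step 2: divide the mantissas and determine the sign.
    sign = sa ^ sb
    if ma >= mb:
        # Quotient lies in [1, 2): already normalized.
        quotient = (ma << 23) // mb
    else:
        # Step 3: quotient lies in (0.5, 1); scale by 2 and
        # decrement the exponent to normalize.
        quotient = (ma << 24) // mb
        e -= 1
    return sign, e, quotient

print(from_fields(*fp_divide(to_fields(7.5), to_fields(2.5))))   # 3.0
```

Adding 127 in step 1 restores the bias that cancels when one stored exponent is subtracted from the other.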