Aula Ch3

This document provides an overview of computer arithmetic, including integer and floating point representations and operations. It discusses multiplication and division algorithms, parallelism techniques like SIMD, and optimizations for graphics and multimedia workloads. Key topics covered include floating point number formats and operations, parallel hardware designs for faster multiplication and division, and issues that can arise like overflow, underflow, and lack of associativity in floating point addition. The document concludes by noting limitations of finite-precision binary representations and importance of the ISA in defining number interpretations and arithmetic capabilities.

EEL580 - Arquitetura de Computadores: Arithmetic for Computers

Instructor: Diego L. C. Dutra


Outline

● Introduction
● Multiplication
● Division
● Floating Point
● Parallelism and Computer Arithmetic
● Real Stuff
● Going Faster
● Fallacies and Pitfalls
● Concluding Remarks
Introduction

● Operations on integers
○ Addition and subtraction
■ Overflow if result out of range
○ Multiplication and division
○ Dealing with overflow
● Floating-point real numbers
○ Representation and operations
● Arithmetic for Multimedia
○ Graphics and media processing operates on vectors of 8-bit and 16-bit data
■ Use 64-bit adder, with partitioned carry chain
■ Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
○ SIMD (single-instruction, multiple-data)
■ Saturating operations
■ On overflow, result is largest representable value
● cf. 2s-complement modulo (wraparound) arithmetic
■ E.g., clipping in audio, saturation in video
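The saturating-versus-wraparound distinction above can be illustrated with a minimal C sketch on 8-bit unsigned values (function names are illustrative, not from any library):

```c
#include <assert.h>
#include <stdint.h>

/* 2s-complement / unsigned modulo arithmetic: on overflow the result
   simply wraps around, e.g. 200 + 100 becomes 44 in 8 bits. */
uint8_t add_wrap(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* Saturating arithmetic, as in media SIMD instruction sets: on
   overflow the result clamps to the largest representable value. */
uint8_t add_sat(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}
```

For audio samples or pixel values, the clamped result (full brightness, full volume) is far less jarring than a wrapped one.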
Multiplication: Basic
Multiplication: Optimized Multiplier

● Faster: uses multiple adders
○ Cost/performance tradeoff
● Perform steps in parallel: add/shift
● One cycle per partial-product addition
○ That's OK if the frequency of multiplications is low
● Can be pipelined
○ Several multiplications performed in parallel
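The add/shift loop can be sketched behaviorally in C: one partial-product addition per iteration, matching the one-cycle-per-step description above (a software model of the datapath, not the hardware itself):

```c
#include <assert.h>
#include <stdint.h>

/* Behavioral model of the basic shift-and-add multiplier: examine one
   multiplier bit per step; if it is 1, add the (shifted) multiplicand
   into the running product. 32 steps for 32-bit operands. */
uint64_t shift_add_mul(uint32_t multiplicand, uint32_t multiplier) {
    uint64_t product = 0;
    uint64_t mcand = multiplicand;      /* shifted left each step */
    for (int step = 0; step < 32; ++step) {
        if (multiplier & 1u)
            product += mcand;           /* one partial-product addition */
        mcand <<= 1;
        multiplier >>= 1;
    }
    return product;
}
```

The optimized hardware performs these additions with multiple adders (or a tree of them) instead of iterating, trading area for cycles.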
Division: Basic
Division: Optimized Divider

● Faster division
○ Can't use parallel hardware as in the multiplier
■ Subtraction is conditional on the sign of the remainder
■ We can still use parallel hardware, but with diminishing returns
○ Faster dividers (e.g. Newton-Raphson) generate multiple quotient bits per step
■ Still require multiple steps
● One cycle per partial-remainder subtraction
● Looks a lot like a multiplier!
○ Same hardware can be used for both
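The one-subtraction-per-cycle behavior can be modeled with a restoring-division loop in C (a behavioral sketch; the conditional restore is exactly the data dependence that blocks the multiplier-style parallel trick):

```c
#include <assert.h>
#include <stdint.h>

/* Behavioral model of restoring division: each step shifts the partial
   remainder left by one bit, tries a subtraction, and "restores"
   (keeps the old remainder) if the result would go negative. That
   sign-dependent decision makes each step depend on the previous one. */
void restoring_div(uint32_t dividend, uint32_t divisor,
                   uint32_t *quotient, uint32_t *remainder) {
    uint64_t rem = 0;
    uint32_t quo = 0;
    for (int step = 31; step >= 0; --step) {
        rem = (rem << 1) | ((dividend >> step) & 1u);
        if (rem >= divisor) {           /* subtraction succeeds */
            rem -= divisor;
            quo |= 1u << step;
        }                               /* else: restore, keep rem */
    }
    *quotient = quo;
    *remainder = (uint32_t)rem;
}
```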
RISC-V Multiplication & Division

● RISC-V division: overflow and division-by-zero don't produce errors
○ Just return defined results
○ Faster for the common case of no error
Floating Point

● Representation for non-integral numbers
○ Including very small and very large numbers
● Types float and double in C
● Defined by IEEE Std 754-1985
● Developed in response to divergence of representations
○ Portability issues for scientific code
● Now almost universally adopted
● Two representations
○ Single precision (32-bit)
○ Double precision (64-bit)
● IEEE Floating-Point Format: S | Exponent | Fraction
○ Single: 8-bit exponent, 23-bit fraction; Double: 11-bit exponent, 52-bit fraction
○ S: sign bit (0 ⇒ non-negative, 1 ⇒ negative)
○ Normalized significand: 1.0 ≤ |significand| < 2.0
■ Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
■ Significand is the Fraction with the "1." restored
○ Exponent: excess representation: actual exponent + Bias
■ Ensures the exponent field is unsigned
■ Single: Bias = 127; Double: Bias = 1023
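The S/Exponent/Fraction split can be checked by pulling the fields out of a float's bit pattern (a sketch using memcpy for the bit-level copy; the field widths and the bias of 127 are the single-precision values from the slide):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode an IEEE 754 single: 1 sign bit, 8 exponent bits (bias 127),
   23 fraction bits. For normalized values the significand is this
   fraction with the hidden leading 1 restored. */
void decode_float(float f, int *sign, int *exponent, uint32_t *fraction) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);     /* reinterpret, no conversion */
    *sign     = (int)(bits >> 31);
    *exponent = (int)((bits >> 23) & 0xFFu) - 127;  /* remove bias */
    *fraction = bits & 0x7FFFFFu;
}
```

For example, -0.75 = -1.1₂ × 2⁻¹, so its sign is 1, its unbiased exponent is -1, and its fraction bits encode the ".1".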
Floating Point

● Relative precision
○ All fraction bits are significant
○ Single: approx 2^–23
■ Equivalent to 23 × log10(2) ≈ 23 × 0.3 ≈ 6 decimal digits of precision
○ Double: approx 2^–52
■ Equivalent to 52 × log10(2) ≈ 52 × 0.3 ≈ 16 decimal digits of precision
● Infinities and NaNs
○ Exponent = 111...1, Fraction = 000...0
■ ±Infinity
■ Can be used in subsequent calculations, avoiding the need for an overflow check
○ Exponent = 111...1, Fraction ≠ 000...0
■ Not-a-Number (NaN)
■ Indicates an illegal or undefined result, e.g. 0.0 / 0.0
■ Can be used in subsequent calculations
● Denormal numbers
○ Exponent = 000...0 ⇒ hidden bit is 0
○ Smaller than normal numbers
■ Allow gradual underflow, with diminishing precision
○ Denormal with Fraction = 000...0 represents ±0.0
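These special cases are directly observable from C on IEEE 754 hardware (a small sketch; FLT_MIN is the smallest normalized single, so halving it lands in the denormal range — note that ISO C only guarantees these division results on implementations with IEEE semantics, which covers essentially all modern platforms):

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* 0.0/0.0 yields NaN; a nonzero value over zero yields infinity; and
   below FLT_MIN lie the denormals, which underflow gradually toward
   zero instead of snapping straight to it. */
int gives_nan(float x, float y) { return isnan(x / y); }
int gives_inf(float x, float y) { return isinf(x / y); }
float half_of(float x)          { return x / 2.0f; }
```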
Floating Point: Adder HW

Step 1: Compare the exponents; shift the smaller number's significand right until its exponent matches the larger

Step 2: Add the significands

Step 3: Normalize the sum, checking for exponent overflow or underflow

Step 4: Round the significand; renormalize if necessary
Floating Point: HW

● FP Adder Hardware
○ Much more complex than an integer adder
○ Doing it in one clock cycle would take too long
■ Much longer than integer operations
■ Slower clock would penalize all instructions
○ FP adder usually takes several cycles
■ Can be pipelined
● FP Arithmetic Hardware & Accurate Arithmetic
○ FP multiplier is of similar complexity to FP adder
■ But uses a multiplier for significands instead of an adder
○ FP arithmetic hardware usually does
■ Addition, subtraction, multiplication, division, reciprocal, square-root
■ FP ↔ integer conversion
○ Operations usually take several cycles
■ Can be pipelined
○ IEEE Std 754 specifies additional rounding control
■ Extra bits of precision (guard, round, sticky)
■ Choice of rounding modes
■ Allows programmer to fine-tune numerical behavior of a computation
■ Not all FP units implement all options: Trade-off between hardware complexity,
performance, and market requirements
Floating Point Instructions in RISC-V
Floating Point Example: °F to °C

● C code:

● Compiled RISC-V code:
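The code listings on this slide are figures that did not survive extraction; as a sketch, the classic textbook conversion in C (single-precision throughout, so a compiler would emit single-precision RISC-V FP instructions such as fsub.s, fmul.s, and fdiv.s):

```c
#include <assert.h>

/* Convert Fahrenheit to Celsius: celsius = (5/9) * (fahr - 32).
   Float literals keep the whole computation in single precision. */
float f2c(float fahr) {
    return (5.0f / 9.0f) * (fahr - 32.0f);
}
```

Note that 5.0f/9.0f cannot be represented exactly, so f2c(212.0f) is extremely close to, but not exactly, 100.0f — a first taste of the precision issues discussed later.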


Floating Point Example: Array Multiplication

● C = C + A × B
○ All 32 × 32 matrices, with 64-bit double-precision elements
● C code:
○ Addresses of c, a, b in x10, x11, x12; i, j, k in x5, x6, x7

● RISC-V code:
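The code figures are not reproduced here; a scalar C sketch of the C = C + A × B loop nest (row-major one-dimensional arrays; the slides fix n = 32, here it is a parameter):

```c
#include <assert.h>

/* DGEMM-style kernel: C = C + A * B for n x n matrices of doubles,
   stored row-major in flat arrays. The innermost loop accumulates
   one dot product per output element. */
void dgemm(int n, const double *a, const double *b, double *c) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            double cij = c[i * n + j];
            for (int k = 0; k < n; ++k)
                cij += a[i * n + k] * b[k * n + j];
            c[i * n + j] = cij;
        }
}
```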
Real Stuff: Parallelism in Computer Arithmetic and SIMD

● Graphics and audio applications can take advantage of performing simultaneous operations on short vectors
○ Example: 128-bit adder:
■ Sixteen 8-bit adds
■ Eight 16-bit adds
■ Four 32-bit adds
● Also called data-level parallelism, vector parallelism, or Single Instruction,
Multiple Data (SIMD)
● Streaming SIMD Extension 2 (SSE2) on x86_64
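The "partitioned carry chain" idea mentioned earlier can be sketched portably in C: add four 16-bit lanes packed into one 64-bit word while masking so that no carry crosses a lane boundary (a SWAR trick standing in for the real partitioned adder; the function name is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Add four 16-bit lanes packed into 64-bit words without letting a
   carry propagate between lanes: add the low 15 bits of each lane
   separately, then XOR the lane-top bits back in. Each lane wraps
   modulo 2^16, just like independent 16-bit adders. */
uint64_t add4x16(uint64_t x, uint64_t y) {
    const uint64_t H = 0x8000800080008000ull; /* top bit of each lane */
    uint64_t low = (x & ~H) + (y & ~H);       /* carries stop below bit 15 */
    return low ^ ((x ^ y) & H);               /* restore lane-top sum bits */
}
```

Hardware SIMD units implement the same partitioning in the adder itself, so one 128-bit add performs sixteen 8-bit, eight 16-bit, or four 32-bit adds in a single cycle.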
Going Faster: Subword Parallelism and Matrix Multiply

● Unoptimized C code:

● Optimized C code:
Fallacies and Pitfalls

● Fallacy: Right Shift and Division
○ Left shift by i places multiplies an integer by 2^i, and right shift divides by 2^i?
■ Only for unsigned integers
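A two-line C example shows where the fallacy breaks for signed values (the behavior of >> on negative values is implementation-defined in ISO C, but essentially all compilers use an arithmetic shift on two's-complement machines):

```c
#include <assert.h>

/* C integer division truncates toward zero, while an arithmetic right
   shift rounds toward minus infinity, so for negative operands the two
   disagree: -5 / 2 == -2, but -5 >> 1 == -3. */
int div2(int x)   { return x / 2; }
int shift2(int x) { return x >> 1; }
```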
● Pitfall: Floating-point addition is not associative.
○ Parallel programs may interleave operations in unexpected orders
■ Assumptions of associativity may fail
○ Need to validate parallel programs under varying degrees of parallelism
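The non-associativity is easy to trigger: a huge value absorbs a small one, so the grouping of additions changes the answer (a standard demonstration with single-precision values):

```c
#include <assert.h>

/* With x = -1.5e38, y = 1.5e38, z = 1.0:
   (x + y) + z  computes 0.0 + 1.0 = 1.0, but
   x + (y + z)  computes x + 1.5e38 = 0.0, because adding 1.0 to
   1.5e38 is absorbed by rounding. */
float left_first(float x, float y, float z)  { return (x + y) + z; }
float right_first(float x, float y, float z) { return x + (y + z); }
```

A parallel reduction that regroups the sum can therefore produce a different result on every run — correct behavior, but surprising if associativity was assumed.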

● Fallacy: Parallel execution strategies that work for integer data types also work for floating-point data types.
Fallacies and Pitfalls

● Fallacy: Only theoretical mathematicians care about floating-point accuracy.
○ Important for scientific code
○ But for everyday consumer use? "My bank balance is out by 0.0002¢!" ☹
○ The Intel Pentium FDIV bug: the market expects accuracy
■ The recall cost $500 million
■ See Colwell, The Pentium Chronicles
Concluding Remarks

● Bits have no inherent meaning
○ Interpretation depends on the instructions applied
● Computer representations of numbers
○ Finite range and precision
○ Need to account for this in programs
(Figure: the frequency of the RISC-V instructions for the SPEC CPU2006 benchmarks.)
● ISAs support arithmetic
○ Signed and unsigned integers
○ Floating-point approximation to reals
● Bounded range and precision
○ Operations can overflow and underflow
Questions
