CSA Performance
Performance
(CST 363 -2)
Lecture - 03
Objective
• Von Neumann Architecture
• How programs are translated into the machine
language
– And how the hardware executes them
• The hardware/software interface
5/18/2023 2
Why Computer Architecture
• Improve Performance
• Build a New Computer
• Buy a New Computer
• Find solutions to problems
Von Neumann Architecture –
Hardware components
• The hardware for a Von Neumann machine
consists of three principal components:
processor, memory, and I/O facilities.
Integrated Circuit
• Integrated circuit: also called a chip.
Transistor
Integrated Circuit
Computer Revolution
• Server computers
– Network based
– High capacity, performance, reliability
– Range from small servers to building-sized installations
• Supercomputers
– High-end scientific and engineering calculations
– Highest capability but represent a small fraction of the overall
computer market
• Embedded computers
– Hidden as components of systems
– Stringent power/performance/cost constraints
Post PC
• Personal Mobile Device (PMD)
– Battery operated
– Connects to the Internet
– Hundreds of dollars
– Smart phones, tablets, electronic glasses
• Cloud computing
– Warehouse Scale Computers (WSC)
– Software as a Service (SaaS)
– A portion of the software runs on a PMD and a portion runs in
the cloud
– Amazon and Google
Performance
• Algorithm
Ideas
• Design for Moore’s Law
• Use abstraction to simplify design
• Make the common case fast
• Performance via parallelism
• Performance via pipelining
• Performance via prediction
• Hierarchy of memories
Software Abstraction
How your Program works
• Application software
– Written in high-level language
• System software
– Compiler: translates HLL code to machine code
– Operating System: service code
• Handling input/output
• Managing memory and storage
• Scheduling tasks & sharing resources
• Hardware
– Processor, memory, I/O controllers
How your Program Works
• High-level language
– Level of abstraction closer to problem
domain
– Provides for productivity and portability
• Assembly language
– Textual representation of instructions
• Hardware representation
– Binary digits (bits)
– Encoded instructions and data
Hardware Abstraction
Computer Hardware
Processor
Processor
• Data path: performs operations on data
• Control: sequences data path, memory, ...
• Cache memory
– Small fast SRAM memory for immediate access
to data
Processor
Performance
• Response time
– How long it takes to do a task
• Throughput
– Number of tasks completed per unit time
Performance
• Define Performance = 1/Execution Time
• “X is n times faster than Y” means
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
• Elapsed time
– Total response time, including all aspects
• Processing, I/O, OS overhead, idle time
– Determines system performance
• CPU time
– Time spent processing a given job
• Discounts I/O time, other jobs’ shares
– Comprises user CPU time and system CPU time
– Different programs are affected differently by CPU
and system performance
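The definition Performance = 1/Execution Time and the "n times faster" ratio can be sketched in a few lines. The function names and the timings (10 s and 15 s) are hypothetical, chosen only for illustration.

```python
def performance(execution_time_s: float) -> float:
    """Performance is defined as the reciprocal of execution time."""
    return 1.0 / execution_time_s


def speedup(time_x_s: float, time_y_s: float) -> float:
    """Return n such that X is n times faster than Y."""
    return performance(time_x_s) / performance(time_y_s)


# Hypothetical timings: X runs the program in 10 s, Y in 15 s.
n = speedup(10.0, 15.0)
print(f"X is {n:.2f} times faster than Y")  # X is 1.50 times faster than Y
```

Because performance is the reciprocal of time, the ratio of performances equals the inverse ratio of execution times.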
• Response time, also called execution time, is
the total time required for the computer to
complete a task, including disk accesses,
memory accesses, I/O activities, operating
system overhead, CPU execution time, and so
on.
• Throughput, also called bandwidth, is another
measure of performance: the number of
tasks completed per unit time.
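The two measures can move independently. The sketch below assumes a hypothetical workload of independent tasks that each take 10 s; adding a second processor doubles throughput while the response time of any single task stays the same.

```python
def throughput_tasks_per_s(tasks_completed: int, elapsed_s: float) -> float:
    """Throughput = number of tasks completed per unit time."""
    return tasks_completed / elapsed_s


# Hypothetical workload: each independent task takes 10 s on one processor.
response_time_s = 10.0

# One processor finishes 1 task in 10 s; two processors finish 2 tasks in 10 s.
one_processor = throughput_tasks_per_s(1, response_time_s)
two_processors = throughput_tasks_per_s(2, response_time_s)

print(one_processor, two_processors)  # 0.1 0.2
```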
CPU Execution Time
• Also called CPU time.
• The actual time the CPU spends computing for
a specific task.
• Does not include time spent waiting for I/O or
running other programs.
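The difference between elapsed time and CPU time can be observed with Python's standard `time` module: `time.perf_counter()` measures wall-clock time, while `time.process_time()` counts only user plus system CPU time of the current process and excludes time spent sleeping. The helper `measure` and the two sample tasks are hypothetical; the sleep merely stands in for waiting on I/O.

```python
import time


def measure(task):
    """Return (elapsed_seconds, cpu_seconds) for one call to task()."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    cpu_start = time.process_time()    # user + system CPU time of this process
    task()
    cpu_s = time.process_time() - cpu_start
    elapsed_s = time.perf_counter() - wall_start
    return elapsed_s, cpu_s


def mostly_waiting():
    time.sleep(0.2)  # stand-in for I/O wait; uses almost no CPU


def mostly_computing():
    sum(i * i for i in range(200_000))  # keeps the CPU busy


for task in (mostly_waiting, mostly_computing):
    elapsed_s, cpu_s = measure(task)
    print(f"{task.__name__}: elapsed={elapsed_s:.3f}s cpu={cpu_s:.3f}s")
```

For the waiting task the elapsed time is dominated by the sleep while the CPU time stays near zero; for the computing task the two values are typically close.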
Clock Cycles
CPU Clock
• Clock period: duration of a clock cycle
– e.g., 250 ps = 0.25 ns = 250×10⁻¹² s
• Clock frequency (rate): cycles per second
– e.g., 4.0 GHz = 4000 MHz = 4.0×10⁹ Hz
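Clock period and clock rate are reciprocals of each other, so the two example values above describe the same clock. A minimal sketch (the function names are hypothetical):

```python
def period_to_rate_hz(period_s: float) -> float:
    """Clock rate (Hz) is the reciprocal of the clock period (s)."""
    return 1.0 / period_s


def rate_to_period_s(rate_hz: float) -> float:
    """Clock period (s) is the reciprocal of the clock rate (Hz)."""
    return 1.0 / rate_hz


rate_hz = period_to_rate_hz(250e-12)   # 250 ps period
period_s = rate_to_period_s(4.0e9)     # 4.0 GHz rate

print(f"{rate_hz / 1e9:.1f} GHz")      # 4.0 GHz
print(f"{period_s * 1e12:.0f} ps")     # 250 ps
```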
System Bus, Data Bus
CPU Time
• Performance improved by
– Reducing number of clock cycles
– Increasing clock rate
– Hardware designer must often trade off clock rate
against cycle count
$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$
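Both ways of improving performance follow from the standard relation CPU time = clock cycles / clock rate (equivalently, clock cycles × clock period). The sketch below uses hypothetical cycle counts and clock rates, then repeats the A versus B comparison using the 500 ps and 600 ps per-instruction figures from the slide; the instruction count I is an arbitrary assumed value, since it cancels in the ratio.

```python
def cpu_time_s(clock_cycles: int, clock_rate_hz: float) -> float:
    """CPU time = clock cycles / clock rate (= clock cycles x clock period)."""
    return clock_cycles / clock_rate_hz


# Hypothetical baseline: 10 billion cycles on a 4 GHz clock.
baseline_s = cpu_time_s(10_000_000_000, 4.0e9)
fewer_cycles_s = cpu_time_s(8_000_000_000, 4.0e9)    # reduce the cycle count
faster_clock_s = cpu_time_s(10_000_000_000, 5.0e9)   # increase the clock rate
print(baseline_s, fewer_cycles_s, faster_clock_s)    # 2.5 2.0 2.0

# A versus B: CPU time is I x 500 ps on A and I x 600 ps on B.
INSTRUCTION_COUNT = 1_000_000  # hypothetical value for I; it cancels out
cpu_time_a_ps = INSTRUCTION_COUNT * 500
cpu_time_b_ps = INSTRUCTION_COUNT * 600
print(cpu_time_b_ps / cpu_time_a_ps)                 # 1.2
```

In the first part, cutting cycles by 20% and raising the clock rate by 25% give the same CPU time, which is the trade-off a hardware designer has to weigh.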
Clock Cycles Per Instruction
• The average number of clock cycles
each instruction takes to execute.
• Often abbreviated as CPI.
Instruction Count
• Instruction Count is the number of
instructions executed by the program.
CPI
• If different instruction classes take different
numbers of cycles
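$$\text{Clock Cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{Instruction Count}_i \right)$$
$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)$$
– The ratio Instruction Count_i / Instruction Count is the relative frequency of class i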
CPI Example
• Alternative compiled code sequences using
instructions in classes A, B, C
Class               A   B   C
CPI for class       1   2   3
IC in sequence 1    2   1   2
IC in sequence 2    4   1   1

• Sequence 1: IC = 5
– Clock Cycles = 2×1 + 1×2 + 2×3 = 10
– Avg. CPI = 10/5 = 2.0
• Sequence 2: IC = 6
– Clock Cycles = 4×1 + 1×2 + 1×3 = 9
– Avg. CPI = 9/6 = 1.5
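The example can be checked with a short script. The per-class CPI values and instruction counts come from the example above; the function and variable names are hypothetical.

```python
# CPI for each instruction class, from the example above.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(instruction_counts: dict) -> int:
    """Sum of CPI_i x Instruction Count_i over all classes."""
    return sum(CPI_BY_CLASS[cls] * count for cls, count in instruction_counts.items())


def average_cpi(instruction_counts: dict) -> float:
    """Total clock cycles divided by total instruction count."""
    return clock_cycles(instruction_counts) / sum(instruction_counts.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

for name, seq in (("sequence 1", sequence_1), ("sequence 2", sequence_2)):
    print(f"{name}: cycles={clock_cycles(seq)}, avg CPI={average_cpi(seq):.1f}")
# sequence 1: cycles=10, avg CPI=2.0
# sequence 2: cycles=9, avg CPI=1.5
```

Sequence 2 executes more instructions but needs fewer clock cycles, so instruction count alone does not determine which sequence is faster.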
Summary
• Performance depends on
– Algorithm: affects IC, possibly CPI
– Programming language: affects IC, CPI
– Compiler: affects IC, CPI
– Instruction set architecture: affects IC, CPI, Tc (clock cycle time)
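These factors combine in the standard CPU performance equation. Reading IC as instruction count, CPI as average clock cycles per instruction, and T_c as the clock cycle time is an assumption based on conventional notation:

$$\text{CPU Time} = \text{IC} \times \text{CPI} \times T_c = \frac{\text{IC} \times \text{CPI}}{\text{Clock Rate}}$$

A change that lowers one factor only helps if it does not raise the others by more.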