
CSA Performance

Uploaded by

Karai Nava
Copyright
© All Rights Reserved
Computer System Architecture

Performance
(CST 363 -2)
Lecture - 03
Objective
• Von Neumann Architecture
• How programs are translated into machine language
– And how the hardware executes them
• The hardware/software interface
• What determines program performance
– And how it can be improved
• How hardware designers improve performance
• What is parallel processing

5/18/2023 2
Why Computer Architecture
• Improve Performance
• Build a New Computer
• Buy a New Computer
• Find solutions to problems

Von Neumann Architecture – Hardware Components
• The hardware for a Von Neumann machine consists of three principal components: processor, memory, and I/O facilities.
• Both programs and data are stored in the memory.
Computer
• A computer is a programmable, multitasking electronic device.
• Computer architecture is the plan of the overall functionality and design of a computer.

Integrated Circuit
• An integrated circuit is also called a chip.
• A device combining dozens to millions of transistors.

[Figure: a transistor and an integrated circuit]

Computer Revolution
• Progress in computer technology
– Underpinned by Moore’s Law
• Makes novel applications feasible
– Computers in automobiles
– Cell phones
– Human genome project
– World Wide Web
– Search Engines
• Computers are pervasive
Classes
• Personal computers
– General purpose, variety of software
– Subject to cost/performance tradeoff

• Server computers
– Network based
– High capacity, performance, reliability
– Range from small servers to building sized

• Supercomputers
– High-end scientific and engineering calculations
– Highest capability but represent a small fraction of the overall
computer market

• Embedded computers
– Hidden as components of systems
– Stringent power/performance/cost constraints

Post PC
• Personal Mobile Device (PMD)
– Battery operated
– Connects to the Internet
– Hundreds of dollars
– Smart phones, tablets, electronic glasses
• Cloud computing
– Warehouse Scale Computers (WSC)
– Software as a Service (SaaS)
– Portion of software run on a PMD and a portion run in
the Cloud
– Amazon and Google

Performance
• Algorithm
– Determines number of operations executed
• Programming language, compiler, architecture
– Determine number of machine instructions executed per operation
• Processor and memory system
– Determine how fast instructions are executed
• I/O system (including OS)
– Determines how fast I/O operations are executed

Ideas
• Design for Moore’s Law
• Use abstraction to simplify design
• Make the common case fast
• Performance via parallelism
• Performance via pipelining
• Performance via prediction
• Hierarchy of memories
• Dependability via redundancy

Software Abstraction

How your Program works
• Application software
– Written in high-level language
• System software
– Compiler: translates HLL code to machine code
– Operating System: service code
• Handling input/output
• Managing memory and storage
• Scheduling tasks & sharing resources
• Hardware
– Processor, memory, I/O controllers

How your Program Works
• High-level language
– Level of abstraction closer to problem
domain
– Provides for productivity and portability
• Assembly language
– Textual representation of instructions
• Hardware representation
– Binary digits (bits)
– Encoded instructions and data

Hardware Abstraction

Computer Hardware

Processor

Processor
• Data path: performs operations on data
• Control: sequences data path, memory, ...
• Cache memory
– Small fast SRAM memory for immediate access
to data

Processor

Performance
• Response time
– How long it takes to do a task
• Throughput
– Total work done per unit time
• e.g., tasks/transactions/… per hour
• How are response time and throughput affected by
– Replacing the processor with a faster version?
– Adding more processors?
• We’ll focus on response time for now…

Performance
• Define Performance = 1/Execution Time
• “X is n times faster than Y” means

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$

• Example: time taken to run a program
– 10s on A, 15s on B
– Execution Time_B / Execution Time_A = 15s / 10s = 1.5
– So A is 1.5 times faster than B

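
The relative-performance definition can be checked with a few lines of Python. This is a minimal sketch; the function name `times_faster` is a hypothetical helper chosen for illustration, and the inputs are the 10s and 15s figures from the slide.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Performance is 1 / execution time, so
    Performance_X / Performance_Y = exec_time_y / exec_time_x.
    """
    return exec_time_y / exec_time_x


# Slide example: the program takes 10s on A and 15s on B.
n = times_faster(exec_time_x=10.0, exec_time_y=15.0)
print(f"A is {n} times faster than B")  # A is 1.5 times faster than B
```

Note that the slower machine's time goes in the numerator, which is why the ratio of execution times is inverted relative to the ratio of performances.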
Execution Time
• Elapsed time
– Total response time, including all aspects
• Processing, I/O, OS overhead, idle time
– Determines system performance
• CPU time
– Time spent processing a given job
• Discounts I/O time, other jobs’ shares
– Comprises user CPU time and system CPU time
– Different programs are affected differently by CPU and system performance

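
The difference between elapsed time and CPU time can be observed directly. The sketch below uses Python's standard `time.perf_counter()` (a wall-clock timer) and `time.process_time()` (user plus system CPU time of the current process). The helper `measure` and the workload `sample_task` are hypothetical names introduced for illustration, and the `time.sleep` call is only a stand-in for an I/O wait.

```python
import time


def measure(task):
    """Return (elapsed_seconds, cpu_seconds) for one call of task()."""
    wall_start = time.perf_counter()  # wall clock: counts everything, including waiting
    cpu_start = time.process_time()   # user + system CPU time of this process only
    task()
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return elapsed, cpu


def sample_task():
    # Hypothetical workload: a little computation plus a simulated I/O wait.
    sum(i * i for i in range(100_000))
    time.sleep(0.05)  # stands in for I/O; the process is idle during this time


elapsed, cpu = measure(sample_task)
print(f"elapsed: {elapsed:.3f}s, CPU: {cpu:.3f}s")
```

The elapsed time includes the 0.05s wait, while the CPU time covers only the computation, so the elapsed figure should come out noticeably larger. Exact values depend on the machine.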
• Response time, also called execution time, is the total time required for the computer to complete a task, including disk accesses, memory accesses, I/O activities, operating system overhead, CPU execution time, and so on.
• Throughput, also called bandwidth, is another measure of performance: the number of tasks completed per unit time.

CPU Execution Time
• Also called CPU time.
• The actual time the CPU spends computing for
a specific task.
• Does not include time spent waiting for I/O or
running other programs.

Clock Cycles
• Also referred to as ticks, clock ticks, clock periods, clocks, or cycles.
• A clock cycle is the time for one clock period, usually of the processor clock, which runs at a constant rate.
• Clock period: the length of each clock cycle.

CPU Clock
• Clock period: duration of a clock cycle
– e.g., 250ps = 0.25ns = $250 \times 10^{-12}$ s
• Clock frequency (rate): cycles per second
– e.g., 4.0GHz = 4000MHz = $4.0 \times 10^{9}$ Hz

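
Clock period and clock rate are reciprocals of each other, so the two examples on the slide describe the same clock. A minimal sketch of the conversion follows; the function names are hypothetical helpers, not part of any library.

```python
def period_to_rate_hz(period_s: float) -> float:
    """Clock rate in Hz for a clock period given in seconds."""
    return 1.0 / period_s


def rate_to_period_s(rate_hz: float) -> float:
    """Clock period in seconds for a clock rate given in Hz."""
    return 1.0 / rate_hz


rate = period_to_rate_hz(250e-12)  # 250 ps period
period = rate_to_period_s(4.0e9)   # 4.0 GHz clock

print(f"{rate / 1e9:.1f} GHz")    # 4.0 GHz
print(f"{period * 1e12:.0f} ps")  # 250 ps
```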
System Bus, Data Bus

CPU Time
• Performance improved by
– Reducing number of clock cycles
– Increasing clock rate
– Hardware designer must often trade off clock rate against cycle count

$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$

CPU Time Example
• Computer A: 2GHz clock, 10s CPU time
• Designing Computer B
– Aim for 6s CPU time
– Can do faster clock, but causes 1.2 × clock cycles
• How fast must Computer B clock be?

$$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}$$

$$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}$$

$$\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}$$

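
The same calculation can be written out step by step in Python. This is a sketch of the arithmetic only, using the relation CPU Time = Clock Cycles / Clock Rate; the function and variable names are hypothetical and chosen for illustration.

```python
def required_clock_rate_hz(clock_cycles: float, target_cpu_time_s: float) -> float:
    """Clock rate needed to run clock_cycles in target_cpu_time_s.

    Rearranged from CPU Time = Clock Cycles / Clock Rate.
    """
    return clock_cycles / target_cpu_time_s


# Computer A: 2 GHz clock, 10 s CPU time.
cycles_a = 10.0 * 2e9  # Clock Cycles_A = CPU Time_A x Clock Rate_A

# Computer B needs 1.2 times as many cycles and must finish in 6 s.
cycles_b = 1.2 * cycles_a
rate_b = required_clock_rate_hz(cycles_b, target_cpu_time_s=6.0)

print(f"Clock Rate B = {rate_b / 1e9:.1f} GHz")  # Clock Rate B = 4.0 GHz
```

Computer B has to double the clock rate to cut CPU time from 10s to 6s, because the faster design also costs 20% more cycles.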
Instruction Count & CPI
• Instruction Count for a program
– Determined by program, ISA and compiler
• Average cycles per instruction
– Determined by CPU hardware
– If different instructions have different CPI
• Average CPI affected by instruction mix

$$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$

$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$

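Cycles Per Instruction
• Computer A: Cycle Time = 250ps, CPI = 2.0
• Computer B: Cycle Time = 500ps, CPI = 1.2
• Same ISA
• Which is faster, and by how much?

$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$

$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$

$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$
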
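
The comparison of Computers A and B can be reproduced with the formula CPU Time = Instruction Count × CPI × Cycle Time. In the sketch below the instruction count of one million is an assumption made purely for illustration: both machines share the same ISA, so the count cancels in the ratio. The function name `cpu_time_s` is also hypothetical.

```python
def cpu_time_s(instruction_count: int, cpi: float, cycle_time_s: float) -> float:
    """CPU Time = Instruction Count x CPI x Clock Cycle Time."""
    return instruction_count * cpi * cycle_time_s


IC = 1_000_000  # assumed value; any positive count gives the same ratio

time_a = cpu_time_s(IC, cpi=2.0, cycle_time_s=250e-12)  # I x 500 ps
time_b = cpu_time_s(IC, cpi=1.2, cycle_time_s=500e-12)  # I x 600 ps

ratio = time_b / time_a
print(f"CPU Time B / CPU Time A = {ratio:.2f}")  # CPU Time B / CPU Time A = 1.20
```

Computer A is faster, by a factor of 1.2, even though it has the higher CPI: its shorter cycle time more than compensates.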
Clock Cycles Per Instruction
• The average number of clock cycles each instruction takes to execute.
• Often abbreviated as CPI.

Instruction Count
• Instruction Count is the number of
instructions executed by the program.

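CPI
• If different instruction classes take different numbers of cycles

$$\text{Clock Cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{Instruction Count}_i \right)$$

$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)$$

where $\text{Instruction Count}_i / \text{Instruction Count}$ is the relative frequency of class $i$.
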
CPI Example
• Alternative compiled code sequences using instructions in classes A, B, C

Class               A   B   C
CPI for class       1   2   3
IC in sequence 1    2   1   2
IC in sequence 2    4   1   1

• Sequence 1: IC = 5
– Clock Cycles = 2×1 + 1×2 + 2×3 = 10
– Avg. CPI = 10/5 = 2.0
• Sequence 2: IC = 6
– Clock Cycles = 4×1 + 1×2 + 1×3 = 9
– Avg. CPI = 9/6 = 1.5

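
The two code sequences can be compared with a short Python sketch of the per-class sum. The dictionary layout and the function names are assumptions made for illustration; the per-class CPI values and instruction counts are the ones in the table.

```python
# CPI of each instruction class, from the table.
CPI_PER_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(instruction_mix: dict) -> int:
    """Sum of CPI_i x Instruction Count_i over all classes."""
    return sum(CPI_PER_CLASS[cls] * count for cls, count in instruction_mix.items())


def average_cpi(instruction_mix: dict) -> float:
    """Total clock cycles divided by total instruction count."""
    return clock_cycles(instruction_mix) / sum(instruction_mix.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}  # IC = 5
sequence_2 = {"A": 4, "B": 1, "C": 1}  # IC = 6

print(clock_cycles(sequence_1), average_cpi(sequence_1))  # 10 2.0
print(clock_cycles(sequence_2), average_cpi(sequence_2))  # 9 1.5
```

Sequence 2 executes more instructions but needs fewer clock cycles, because more of its instructions come from the cheapest class.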
Summary
• Performance depends on
– Algorithm: affects IC, possibly CPI
– Programming language: affects IC, CPI
– Compiler: affects IC, CPI
– Instruction set architecture: affects IC, CPI, Tc (clock cycle time)

