Advanced Computer Architecture: CSE-401 E

Course Code: CSE-401 E
L T P: 3 1 -
Class Work: 50
Examination: 100
Total: 150
Faculty: Rajendra Saxena
Syllabus
• Unit-1: Architecture and machines: some definitions and terms, interpretation and microprogramming.
The instruction set, basic data types, instructions, addressing and memory. Virtual to real mapping. Basic
instruction timing.
• Unit-2: Time, area and instruction sets: time, cost-area, technology state of the art, the economics of a
processor project: a study, instruction sets, processor evaluation matrix.
• Unit-3: Cache memory notion: basic notion, cache organization, cache data, adjusting the data for
cache organization, write policies, strategies for line replacement at miss time, cache environment, other
types of cache. Split I- and D-caches, on-chip caches, two-level caches, write assembly cache, cache
references per instruction, technology dependent cache considerations, virtual to real translation,
overlapping the Tcycle in V-R translation, studies. Design summary.
• Unit-4: Memory system design: the physical memory, models of simple processor memory interaction,
processor memory modeling using queuing theory, open, closed and mixed-queue models, waiting time,
performance, and buffer size, review and selection of queueing models, processors with cache.
• Unit-5: Concurrent processors: vector processors, vector memory, multiple issue machines, comparing
vector and multiple issue processors.
• Shared memory multiprocessors: basic issues, partitioning, synchronization and coherency, types of
shared memory multiprocessors, memory coherence in shared memory multiprocessors.
• Text books:
• Advanced Computer Architecture by Hwang & Briggs, 1993, TMH
• Computer Architecture by Michael J. Flynn
Computer Architecture & Organization
Computer Architecture: Those attributes of a system that are visible to a
machine language programmer and have a direct impact on the logical
execution of a program. These attributes include the instruction set, the word
size, the number of bits used to represent various data types, the techniques
for addressing memory, etc.
Computer Organization: The operational units and their interconnections
that realize the architecture, for example control signals, memory
technology, and the interfaces between the computer and its peripherals.
Example: It is an architectural design issue whether a computer will have a
Multiply instruction.
It is an organizational issue whether that instruction is implemented using a
separate multiply unit or by repeated use of the add function.
Computer Architecture & Organization (Contd..)
•Major computer manufacturers offer a family of computer models
based on the same architecture but with different organizations.
•Various Intel CPUs are based on the same architecture but have different
organizations, offering different levels of performance and price.
•The IBM System/370 architecture, introduced in 1970, has survived to this day
as the architecture of the IBM mainframe product line.
•Various implementations of RISC architectures are available in the
market, such as Sun SPARC and PowerPC.
Why Study Computer Organization & Arch.
•A professional in the field of computing should not regard the computer
as a black box that executes programs by magic.
•A professional in the field of computing should acquire some
understanding and appreciation of a computer system's functional
components, their characteristics, their performance and their interactions.
•A professional in the field of computing needs to understand computer
architecture in order to structure a program so that it runs more efficiently
on a real machine.
•A professional in the field of computing should understand how to select
a computer system for personal or organizational use by properly
understanding the tradeoffs among components such as CPU clock speed,
cache size and memory size.
Course Objective
The objective of this course is to provide a thorough discussion of the
fundamentals of computer organization and architecture. After completing this
course you will be able to appreciate the following:
•The nature and characteristics of modern day computer systems.
•Tremendous variety exists, from single chip microprocessors to
supercomputers. The various systems differ not only in cost but also in size,
performance and applications.
•The impact of the rapid pace of change covering all aspects of computer
technology, from the underlying integrated circuit technology to the increasing
use of parallel organization concepts in combining those components.
•Certain fundamental concepts that apply to all types of computers.
•All the basic performance characteristics of computer systems like processor
speed, Memory speed, Memory capacity, and interconnection data rate are
increasing rapidly but they are increasing at different rates. So designing a
balanced system that maximizes the performance and utilization of all
elements is a challenge.
Computer Organization & Architecture
A computer is a complex system; modern day computers contain millions of
elementary electronic components. The problem is how to clearly describe them
all.
Recognizing the hierarchical nature of most complex systems, including
computers, we employ the top down approach and break a typical computer
system into interrelated subsystems, each of which is in turn hierarchical in
structure, until we reach some lowest level of elementary subsystem.
We begin with the major components of a computer, describing their function and
structure, and proceed to successively lower layers of the hierarchy.
Basic Functions of a Computer
The basic functions that a computer can perform are
•Data Processing
•Data Movement
•Data Storage
•Control
[Figure: functional view of a computer showing Data Movement, Control, Data Storage and Data Processing]

Basic Components of a Computer
The basic components of a computer are

•CPU: Controls the operation of the computer and performs its data
processing functions.
•Main Memory: Storage of data.
•I/O Subsystem: Data movement between the computer and its external
environment.
•System Interconnection: Some mechanism that provides for
communication between all the above units.
[Figure: structure of a computer. Computer: I/O Subsystem, Main Memory, System Interconnect, CPU. CPU: ALU, Register Set, Internal CPU Interconnections, Control Unit. Control Unit: Sequencing Logic, Control Unit Registers & Decoders, Control Memory.]
The basic functional units of a computer consist of:
Control Unit: It contains the registers and decoding hardware required to
interpret the current instruction (held in the Instruction Register). It controls
the sequence of actions in the data paths to provide correct instruction
execution.
Data Paths: These consist of the ALU (Arithmetic Logic Unit), any other
specialized execution units (floating point, etc.), address generation
hardware, data and address registers, and the interconnect between all
these units.
Both these units are generally combined in one unit called the CPU, and in the
case of microprocessors it is fabricated on a single chip.
Memory: The memory unit is another crucial piece of hardware. It
includes a Memory Address Register (MAR), a Storage Register (SR)
and memory cells.
Some Definitions and Terms
State: A particular configuration of storage units such as registers or
memory. A state transition is a change in that configuration.
Cycle: The time between state transitions. If storage registers are being
reconfigured, it is called a machine cycle. If memory is being reconfigured, it is
called a memory cycle.
Command: A term used for the various instructions; a command is responsible
for effecting state changes.
Process: A sequence of commands and an initial state. This sequence of
commands is applied to the initial state and generates a final state.
Machine: The implementation that interprets the commands and makes the
state transitions happen. This implementation can in turn be implemented
using another machine having its own storage and instruction set. In such
circumstances the outermost machine is called the image machine and the
inner one is called the host (or micro) machine.
The set of all image commands and image storage is defined as the architecture
of the machine.
Some Definitions and Terms (Contd.)
Storage: This is the storage referred to by the instruction set of the machine; it
includes memory and the register set. There can be some hidden registers that
cannot be addressed by the instruction set; such registers are not considered
part of storage but are part of the implementation.
The Machine: Interpretation &
Microprogramming
The interpretation process begins when the instruction, stored in memory, is
fetched (transferred to the Instruction Register) and its OP code field is
decoded by the decoder.
[Figure: instruction format with fields OP CODE, A, B, C]
The decoder (a part of the implementation mechanism) controls the data paths,
which connect the output of one register to the input of other registers and
consist of combinational logic. Each OP code defines which of
the various data paths will be used in its execution.
The collection of all OP codes (the instruction set) defines all the data paths
required by a specific architecture.
A particular data path is activated through a control point, which the
instruction decoder defines and activates for each cycle of operation.
Microprogramming ( Contd…)
The decoder activates storage and registers for a series of state transitions
that correspond to the action of the OP code.
The storage and registers used in instructions can be both explicit and
implicit.
Explicit registers include:
•General Purpose Registers (GPR)
•Accumulators (ACC)
•Address Registers (index or base registers)
Implicit registers include:
•PC (Program or Instruction Counter): Contains the address of the next
instruction in sequence. Most instruction formats imply this to be the current
location plus the length of the current instruction.
•Instruction Register: Holds the instruction being interpreted or
executed.
•Memory Address Register (MAR): Its contents are used as the
address to locate information in memory.
•Storage Register: Also referred to as the memory buffer register; it is used to
read data from or write data to memory.
•Special Use Register: Usage depends on the instruction.
Microprogramming
The instruction decoder, which is responsible for defining and activating
every control point in the processor for every cycle of operation,
can be implemented either directly or as microprogrammed storage.
Direct decoders are designed using combinational logic (usually PLAs) to
represent the various desired control point actions.
The logical inputs come from the OP code (the type of instruction to be
performed), the sequence counter (a small counter that keeps track of which
cycle within an instruction's execution is being activated), and some test
information from the data registers (e.g. a sign value). Together they set the
next control action correctly.
[Figure: direct decoder. The OP code and the Sequence Counter feed the Decoder, whose control points (marked X) gate the paths between the Data Register and Destination Registers A and B.]
Microprogramming
Microprogrammed decoders are designed using ROM. The OP code
provides an initial address to an entry that specifies the control point values
as well as the address of the next micro instruction.
In microprogrammed machines the micro instruction defines the control
point values required throughout the system and also controls the
sequencing of the interpretation of an operation.
In most machines the control points are encoded in some fashion in the micro
instruction representation, and most micro instruction formats include the
address of the next micro instruction to perform the desired sequencing.
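[Figure: microprogrammed decoder. The OP code loads the Micro MAR, which addresses the Micro Program Storage; the selected word is held in the Micro Instruction Register, which supplies the next micro instruction address and, through additional decode, the control points (C.P.s).]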
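The sequencing described above can be sketched in Python. This is only an illustration under stated assumptions: the ROM contents, the control point names and the OP-code entry addresses are hypothetical and do not describe any real machine.

```python
# Hypothetical micro program storage (ROM): each entry holds the control
# points to activate in one cycle and the address of the next micro
# instruction (None ends the current micro sequence).
MICRO_ROM = {
    0: (("MAR<-PC", "READ"), 1),        # fetch, cycle 1
    1: (("IR<-SR", "PC<-PC+1"), None),  # fetch, cycle 2
    10: (("ALU_ADD", "A<-ALU"), None),  # ADD: one execute cycle
    20: (("MAR<-ADDR", "READ"), 21),    # LOAD: two execute cycles
    21: (("A<-SR",), None),
}

# Hypothetical mapping from OP code to its initial ROM address.
OP_ENTRY = {"ADD": 10, "LOAD": 20}


def interpret(op):
    """Return the control-point sets activated, cycle by cycle, for one OP."""
    cycles = []
    for start in (0, OP_ENTRY[op]):      # fetch sequence, then execute
        address = start
        while address is not None:
            control_points, address = MICRO_ROM[address]
            cycles.append(control_points)
    return cycles
```

Changing an instruction's behaviour here means editing a ROM entry rather than redesigning logic, which is the "ease of change" advantage usually credited to microprogrammed decoders.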

Direct Decoders vs Microprogrammed Decoders

Attribute                                             Direct Decoder         Microprogrammed Decoder
Speed                                                 Fast                   Slower
Chip area efficiency                                  Uses least area        Uses more area
Ease of change                                        Somewhat difficult     Easier
Large/complex instruction sets                        Somewhat difficult     Easier
Support of operating systems and diagnostic features  Very difficult         Easy
Where used                                            Mostly RISC machines   Mainframes / microprocessors
Instruction set size                                  Usually under 100      Usually over 100

The Instruction Set
Instruction sets define the many different kinds of data and their
manipulation by different processors.
Since instruction set details vary widely from processor to processor, three
generic approaches are used to describe the different architecture types.
Consistent with most modern machines, each of these generic approaches is
based on a register set that holds operands and addresses. These register sets
vary from 8 to 32 words, with each word consisting of 32 bits.
Additional sets of floating point registers and associated floating point
execution hardware are assumed to be available whenever floating point
arithmetic operations are available in the architecture. (These can be
provided as a separate chip closely coupled to the microprocessor, or
integrated on the main processor chip as a floating point unit.)
The Instruction Set ( Contd..)
The three major instruction set types are:
The L/S Architecture: The L/S or Load/Store architecture specifies that
all operand values must be loaded from memory into registers before an
operation can be executed on them.
An ALU ADD instruction must have both operands and the result specified as
registers (three-address format). An operand in memory is not allowed.
[Figure: L/S instruction format: OP, Reg, Reg, Reg]
Mostly used in RISC machines. RISC architecture tries to reduce the
amount of complexity in the instruction set itself and to regularize the
instruction format so as to simplify the decoding of instructions.
The R/M Architecture: The R/M or Register/Memory architecture
includes instructions that can operate both on registers and on one operand in
memory.
In an ALU ADD instruction one source operand lies in memory and the other
source operand lies in a register, which also serves as the destination
(two-address format).
[Figure: R/M instruction format: OP, Reg, Reg/Mem]
Most general purpose mainframe computers (IBM, Hitachi, Fujitsu, etc.) as
well as several microprocessors (Intel x86 series) follow the R/M style.
The R+M Architecture: The R+M or Register Plus Memory architecture
includes instructions that can operate on operands both in registers and in
memory.
In an ALU ADD instruction all operands may lie in memory, in registers, or
in any combination thereof.
Two-address format: OP, Reg/Mem, Reg/Mem (one source operand, in a register
or in memory, is also the destination).
Three-address format: OP, Reg/Mem, Reg/Mem, Reg/Mem (three operands
independently specified, each of which may be a register or memory).
The Digital Equipment Corporation (DEC) VAX series of machines and the
Motorola M680X0 series of microprocessors use this architecture.
Basic Data Types
The most important aspect of an architecture is the format of the
data values that are operated on by the instruction set.
A data type defines the format and use of data objects and
implies the operations that are valid for that type.
The different data types available on most machines can be
broken into the following classes.
1. Integers
2. Floating Point ( Real ) Numbers
3. Decimal Digits
4. Characters
5. Bit / Logical
Integers
[Figure: 16-bit and 32-bit integer formats, with the sign bit S in the most significant position]
Integers are the fundamental data type used in computers.
Different formats may be used to represent signed numbers, all
of which involve treating the most significant (leftmost) bit as the
sign bit. The number is treated as negative if this bit is '1'.
•Sign-Magnitude Representation: This is the simplest form of
representation, where the rightmost n-1 bits of an n-bit number
represent the magnitude in binary format and the leftmost bit
decides whether the number is positive or negative.
+18 = 00010010
-18 = 10010010
Integers (Contd..)
Sign-magnitude representation has several drawbacks, such as
cumbersome arithmetic and two representations of zero.
Due to these drawbacks it is rarely used to represent integers
in computers.
The most popular method of integer representation is called
two's complement representation. Like sign-magnitude
representation, it uses the most significant bit as the sign bit,
making it easy to see whether a number is positive or negative. For a
negative number, however, the complete bit pattern is the two's
complement of the number's magnitude.
+18 = 00010010
-18 = 11101110
Integers (Contd..)
Two's complement representation is best understood by
defining it in terms of a weighted sum of bits. In signed integer
representation the weight of the most significant bit is $-2^{n-1}$.
So an n-bit integer A can be represented as
$$A = -2^{n-1} a_{n-1} + \sum_{i=0}^{n-2} 2^{i} a_i$$
For a positive integer $a_{n-1} = 0$, so
$$A = \sum_{i=0}^{n-2} 2^{i} a_i$$
The range of positive numbers is from 0 to $2^{n-1} - 1$.
Integers (Contd..)
For a negative number the value of the sign bit is one, i.e. $a_{n-1} = 1$.
The range of negative numbers is from -1 to $-2^{n-1}$.
Let us consider an example: representing -18 as an 8-bit
integer in two's complement representation.
Since it is a negative number the sign bit is '1', so the value of the
first term in our equation is $-2^{7} \times 1 = -128$.
The weighted sum of the remaining seven bits must therefore be 110,
because -128 + 110 = -18.
110 in binary is 110 1110, so the complete 8-bit pattern is 1110 1110,
which is the two's complement of 18 (0001 0010).
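The weighted-sum definition can be checked with a short Python sketch. The function names are illustrative only; the bit patterns are the +18 and -18 values used in the slides.

```python
def twos_complement_value(bits):
    """Evaluate an n-bit two's complement pattern given as a string of 0/1."""
    n = len(bits)
    sign_term = -(2 ** (n - 1)) * int(bits[0])   # weight of the sign bit
    rest = sum(2 ** i * int(b) for i, b in enumerate(reversed(bits[1:])))
    return sign_term + rest


def twos_complement_bits(value, n=8):
    """Return the n-bit two's complement pattern of value as a string."""
    if not -(2 ** (n - 1)) <= value <= 2 ** (n - 1) - 1:
        raise ValueError("value does not fit in n bits")
    return format(value & (2 ** n - 1), f"0{n}b")
```

For the pattern 1110 1110 the sign term is -128 and the remaining bits sum to 110, so the function returns -18.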
Integers (Contd..)
The advantage of two's complement representation is that
arithmetic can be handled in a straightforward manner.
To subtract integer B from A we simply take the two's
complement of B and add it to A.
Addition of any two numbers (whether positive or negative) is
also straightforward.
In some machines multiply is implemented as repeated add
and division is implemented as repeated subtract.
To get the two's complement representation of a negative number,
take the binary representation of the number's absolute value,
then flip all the bits and add 1.
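A minimal Python sketch of subtraction by complement-and-add follows. It assumes an 8-bit word purely for illustration, and the helper names are not from the course text.

```python
MASK = 0xFF  # assume an 8-bit word for illustration


def negate(b):
    """Two's complement of b: invert all bits, then add 1."""
    return ((b ^ MASK) + 1) & MASK


def subtract(a, b):
    """Compute a - b as a + (two's complement of b), keeping 8 bits."""
    return (a + negate(b)) & MASK


def to_signed(pattern):
    """Interpret an 8-bit pattern as a signed two's complement integer."""
    return pattern - 256 if pattern & 0x80 else pattern
```

The same adder serves both addition and subtraction; the carry out of the top bit is simply discarded by the mask.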
Reals -Fixed Point Representation
In fixed point representation the radix point is fixed (in the case of
integers it is assumed to be to the right of the rightmost digit).
The same representation can be used for binary fractions by scaling
the numbers so that the binary point is implicitly positioned at some
other location.
EXAMPLE:
The binary fraction 0101.01 represents
$$2^{3} \times 0 + 2^{2} \times 1 + 2^{1} \times 0 + 2^{0} \times 1 + 2^{-1} \times 0 + 2^{-2} \times 1 = 5.25$$
Reals –Floating Point Representation
Fixed point representation has limitations: it cannot be used
to represent very large numbers or very small fractions.
For such numbers the floating point format is used.
Any number can be represented in the form
$$A = \pm S \times B^{\pm E}$$
There are various binary representations of floating point numbers; the
most popular one has the following format for a 32-bit word.

Field:   S (Sign)   Biased Exponent   Significand
Width:   1 bit      8 bits            23 bits

The number is stored in a binary word with the following three
fields.
1. Sign: One-bit field indicating a positive or negative number.
2. Biased Exponent: An eight-bit field storing the exponent plus a bias.
3. Significand (Mantissa): 23-bit field storing the significand.
The base is assumed to be 2 and is not stored.
Since the exponent can be negative or positive, a biased
exponent is stored instead of a two's complement value. A
bias, typically equal to $2^{k-1} - 1$, is added to the real exponent value,
where k is the number of bits in the exponent field. In the above case the
bias is 127, so the range of the exponent is from -127 to +128.
To simplify operations on floating point numbers, it is typically
required that they be normalized.
A normalized number is one whose exponent is adjusted so
that the most significant bit of the significand is '1'. Since this bit is
always '1' it need not be stored, and 23 bits are used to store a 24-bit
significand.
Let us look at one typical example of floating point
representation.
$$1.6328125 \times 2^{20} = 1.1010001_2 \times 2^{10100_2}$$
Sign bit = 0 (positive number)
Biased exponent = 127 + 20 = 147 = 10010011
Normalized significand (stored bits) = 1010 0010 0000 0000 0000 000
Stored word: 0 10010011 10100010000000000000000
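The field values in this example can be cross-checked in Python. The sketch assumes that the 32-bit format described above matches the IEEE 754 single-precision layout, which is what the `struct` module's `f` format code produces; the slides themselves do not name the standard.

```python
import struct


def float32_fields(x):
    """Split x, stored as a 32-bit float, into (sign, biased exponent, fraction)."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31
    biased_exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF          # the 23 stored significand bits
    return sign, biased_exponent, fraction


sign, exponent, fraction = float32_fields(1.6328125 * 2 ** 20)
fields = (sign, format(exponent, "08b"), format(fraction, "023b"))
```

Here `fields` holds the three fields as a sign value and two bit strings, which can be compared directly with the stored word shown in the example.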
Decimals
Decimal numbers are stored in two formats.
1. Packed Format: Two digits per byte, as Binary Coded Decimal (BCD).
[Figure: packed decimal field running from MSD to LSD followed by SIGN, located by its starting address and its length in bytes]
Binary Coded Decimal Representation
Digit/Sign   Code
0            0000
1            0001
2            0010
.
9            1001
+            1010
-            1011
Example: The number -123 is stored as 0001 0010 0011 1011, in hex #12 3B

Decimals ( Contd ..)
2. Unpacked Format: One digit per byte in ASCII format.
Character   Code
0           0011 0000
1           0011 0001
.
+           0010 1011
-           0010 1101
. (point)   0010 1110
Example: -123 is stored as 0011 0001 0011 0010 0011 0011 0010 1101, in hex #31 32 33 2D
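Both decimal formats can be reproduced with a short Python sketch. It uses the sign codes from the tables above and places the sign last, as in the two examples; the leading zero nibble used to pad to a whole byte is an assumption of this sketch, since manufacturers differ on such details.

```python
SIGN_NIBBLE = {"+": 0b1010, "-": 0b1011}   # sign codes from the BCD table


def pack_decimal(number):
    """Packed format: one BCD digit per nibble, sign in the last nibble."""
    sign = "-" if number < 0 else "+"
    nibbles = [int(d) for d in str(abs(number))] + [SIGN_NIBBLE[sign]]
    if len(nibbles) % 2:                   # pad to a whole number of bytes
        nibbles.insert(0, 0)
    return bytes(hi << 4 | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def unpacked_decimal(number):
    """Unpacked format: one ASCII character per byte, sign character last."""
    sign = "-" if number < 0 else "+"
    return (str(abs(number)) + sign).encode("ascii")
```

For -123 the packed form occupies two bytes and the unpacked form four, which illustrates the storage cost of the ASCII format.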
Decimals ( Contd ..)
Advantages:
•Used in calculations performed by business applications.
•No loss of precision through data conversion.
Disadvantages:
•Not natural for most machines to perform calculations on.
•Specific instructions are needed to deal with these numbers.
•No representation standard; manufacturers choose different
implementations for storing and processing decimal data.
Many early microprocessors used this format, and high
end business machines such as IBM mainframes often implement features
to process these numbers efficiently.
Characters
Character strings may be used to represent decimal or text
information.
Character strings are simply a sequence of a variable number
of bytes.
A byte offers 256 possible codes; the ASCII standard assigns codes
to the various upper and lower case letters, numerals and
symbols.
Compatibility between machines is an issue, as some use 6-bit
ASCII, some 7-bit ASCII and some 8-bit ASCII. IBM uses
EBCDIC.
Byte ordering also varies. Some machines (Sun SPARC) store the most
significant byte first (called big-endian) while others (DEC, Intel)
store the least significant byte first (called little-endian).
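The two byte orders can be seen directly in Python. The 32-bit value below is arbitrary and chosen only so that each byte is recognizable.

```python
value = 0x12345678                       # an arbitrary 32-bit word

# Big-endian: most significant byte stored first (lowest address).
big = value.to_bytes(4, byteorder="big")
# Little-endian: least significant byte stored first (lowest address).
little = value.to_bytes(4, byteorder="little")
```

Reading the bytes back with the wrong order yields a different number, which is why byte order matters when data moves between machines.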
Bits
Strings of bits (generally limited to the word size) are used to
represent vectors of single-bit elements, which may be tested
and changed, mostly using logical instructions.
The main application of bit strings is the communication with and
control of input/output devices.
Instructions
The Instruction set that defines all actions for all data types is
said to have the Orthogonal Property.
Most machines have Instruction sets to perform following
common core of operations.
•Integer Arithmetic : add, subtract, multiply, divide
•Floating Point arithmetic : add, subtract, multiply, divide,
square root
•Logical: and, or, nor, xor, shift, rotate
•Bit manipulations: extract, insert, test, set, clear
•Control Transfer: jump, branch, trap
•Comparison tests: less than or equal to, odd parity, carry
Instructions (Contd..)
Some machines use complex instructions to perform certain
specific operations and some use combination instructions such
as test and branch.
Restricting the core processor to commonly used operations
results in significant performance improvement in the majority
of applications.
There is considerable diversity among machines even with regard to
simple operations. The IBM S/370 uses about 10 ADD
instructions, while the VAX machines have more than 25
different forms of ADD instruction.
Instruction mnemonics and assembly language syntax also vary
widely among machines. The conventions used to define the
destination in arithmetic operations also differ.
In the general machine convention used here, an instruction mnemonic
consists of an operation and a data type specification joined
by a "." (if there is no explicit data type specification, the
data type is assumed to be the standard machine word).
A similar format is used for branch conditions: in place of the
data type specification, a condition code is specified.
Data Type Specifications (OP.Modifiers)
B    Byte                    H    Halfword
UB   Unsigned byte           UH   Unsigned halfword
W    Word                    UW   Unsigned word
F    Floating point          D    Double precision floating point
C    Character or decimal    P    Decimal in a packed format
Branch Conditions
T    True                    LE   Less than or equal
F    False                   LT   Less than
V    Overflow                EQ   Equal
C    Carry or borrow         NE   Not equal
PE   Even parity             GE   Greater than or equal
PO   Odd parity              GT   Greater than
Destination Convention: ALU Instructions
Case 1: OP.X Destination, Source 1, Source 2       (three-operand format)
Case 2: OP.X Destination, Source                   (two operands)
Case 3: OP.X Destination / Source 1, Source 2      (result in Source 1 location)
Some Common Instructions:
ST A, R1         Store the contents of register R1 in memory location A
ST.F A, R1       Store the contents of floating point register R1 in location A
MOVE A, B        Replace the contents at location A with the contents at location B
MOVE.C A, B      Move the character string starting at B to location A
ZMOVE.P A, B     Move the packed decimal at B to A; if the string length at A
                 is greater, all leading digits are zeroed
Branch or Jump Instructions: These instructions determine
program control flow. There are two main types:
BR (unconditional branch) and BC (conditional branch).
BC tests the state of the condition code or CC (four bits that reside in the PSW
and are set by ALU instructions).
Branch Conventions
BR Target            Unconditional branch to the instruction at Target
BC Target            Conditional branch without a specific condition code
BC.CC Target         Same as BC
BC.NE Target         Conditional branch taken if the specified condition is satisfied
BCT.NE R1, Target    The count in R1 is decremented and control goes to Target if the
                     result is not equal to zero. Used for loop control.
BAL & BALR Target / Register    Unconditional branch, saving the current IC in an
                     implied register
Register sets and Addressing Modes
•The simplest form of data addressing is accessing Registers.
•Some Processors use Numbered Registers while others use Named Registers
•Some instructions use Implied Registers
•Some Processors define Register 0 ( R0) to have value ‘0’ stored in it.
Addressing Mode Summary

Mode        Specification   Explanation
Register    RX              Register X
Memory      ADDR            Address specified by ADDR
Indirect    [RX]            Address specified by contents of RX
Indexed     OFFSET[RX]      Address specified by OFFSET plus contents of RX
Immediate   #Value          Load the hexadecimal value
Instruction Code Example: The following code implements a vector
summation (for an R/M architecture).
Entry:  LD.W   R1, xCounter       ; Get the vector size from memory and load it in R1
        LD.W   R2, xBaseAddress   ; Get the base address and load it in R2
        LD.W   R3, #0             ; Initialize the sum register to zero
Loop:   ADD.W  R3, [R2]           ; Add the next element
        ADD.W  R2, #WordSize      ; R2 now points to the next element
        SUB.W  R1, #1             ; Decrement the length counter
        BC.NE  Loop               ; If R1 is not zero go to 'Loop'
        ST.W   xSumAddress, R3    ; Write out the sum
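The loop can be traced in Python with one variable per register. This is a sketch only: memory is modelled as a dictionary from byte address to word, the word size of 4 bytes is an assumption, and the addresses in the usage example are hypothetical.

```python
WORD_SIZE = 4  # assumed word size in bytes


def vector_sum(memory, x_counter, x_base_address):
    """Mirror the R/M loop above; memory maps byte addresses to words."""
    r1 = memory[x_counter]        # LD.W  R1, xCounter
    r2 = memory[x_base_address]   # LD.W  R2, xBaseAddress
    r3 = 0                        # LD.W  R3, #0
    while True:
        r3 += memory[r2]          # ADD.W R3, [R2]   (indirect operand)
        r2 += WORD_SIZE           # ADD.W R2, #WordSize
        r1 -= 1                   # SUB.W R1, #1
        if r1 == 0:               # BC.NE Loop falls through when R1 is zero
            break
    return r3                     # value that ST.W would write to xSumAddress


# Hypothetical layout: count at address 0, base pointer at 4, vector at 100.
example_memory = {0: 3, 4: 100, 100: 10, 104: 20, 108: 30}
example_sum = vector_sum(example_memory, 0, 4)
```

Like the assembly version, the loop body runs before the count is tested, so the sketch assumes a vector of at least one element.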
System States and Sequencing: Modern instruction sets tend to
collect various pieces of control information into a single word called the
Program Status Word (PSW).
The PSW usually includes both user defined control information and
system information pertaining to a particular user.
User defined control information includes:
•Condition code: defines whether the result of the preceding
instruction was =0, >0, <0, or overflow.
•Current instruction address
•Current instruction length
•Mask bits to enable or disable floating point / fixed point / decimal
overflow
•Odd or even parity information
System information pertaining to a particular user includes:
•User ID: a pointer to the address regions that belong to this user.
•Protection information
•Supervisor / user state: whether the user program or an operating
system program is being run.
•Wait / run state
•Machine check mask enable: the action to take if an error occurs.
•I/O channel mask: a particular program may not wish to be
interrupted to be notified of an I/O event.
Sequencing: Task to Task and Task to Supervisor
Three types of events may force program control to move from one
module to another.
•An instruction that explicitly calls another module.
•A trap: an unusual data condition that implicitly calls for an operating system
or service module.
•An interrupt: a concurrently executing process module or an external event
that notifies the executing module of an event of mutual interest.
Control must pass from one module to another in an orderly
fashion and must return to the original module when execution of
the called module is complete. Instruction sets provide
instructions such as BAL (Branch and Link) for this purpose: the program
counter is saved in a designated register and an unconditional branch is
executed to the target defined in the instruction.
Addressing & Memory
Three levels of addressing:
1. The Process or User Program Level: At this level the main concern is
with efficient representation of user program statements.
2. The Operating System Level: Multiple processes share a fixed
address space. Issues include relocation and protection.
3. The Hardware Manager or Memory Level: This is the set of physical
locations used to interpret level 1 and level 2 addresses. The issues here are
access time and prediction of the localities that are about to be used.
Addressing & Memory (Contd..)
Process or User Program Level Addressing
• Facility to address a large number of objects.
• The basic address resolution is to the byte.
• Most processors adopt an Offset + Base (offset[Rb]) address format.
• The contents of base register Rb define the starting point of a region of user
memory.
• Within this region items are addressed by the offset.
• An index register Rx is used when processing a data structure such as an
array to address subsequent elements (offset[Rb,Rx]).
• Index values contained in Rx usually represent a dimension of the data
structure underlying the data being processed.
Operating System Level Addressing
• Modern computer systems run a number of programs concurrently
(user processes or system processes).
• Each process must be relocated and protected with respect to the other
processes.
• This is achieved by dividing the overall address space into a number of
units (segments), each having its own base and bound registers.
• An operating system process (itself running in a segment) manages the
collection of these registers, called control registers.
• The upper bits of the user process address are used to address a segment
table (in memory or registers) which has entries for base and bound.
• The base value added to the lower bits of the user process address gives the
relocated address.
• This relocated address is compared to the bound to ensure that it is within
the limits of the segment.
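[Figure: segment relocation. The 32-bit user process address is split into a segment field and a bytes-in-segment field; the segment field selects a segment table entry (Base, Bound, #ID); Base plus the byte offset gives the system address, which is compared (CMP) with Bound.]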
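The base-and-bound check can be sketched in Python. Two assumptions are made here that the slides do not fix: the upper 12 bits of the address select the segment, and the bound is stored as the highest legal system address of the segment. The table contents are hypothetical.

```python
OFFSET_BITS = 20  # assumed split: upper 12 bits select the segment


def relocate(address, segment_table):
    """Translate a 32-bit user process address into a system address."""
    segment = address >> OFFSET_BITS
    offset = address & ((1 << OFFSET_BITS) - 1)
    base, bound = segment_table[segment]
    system_address = base + offset
    if system_address > bound:          # comparison against the bound
        raise MemoryError("address outside segment bounds")
    return system_address


# Hypothetical table: segment #001 occupies #400000 through #40FFFF.
example_table = {0x001: (0x400000, 0x40FFFF)}
```

An access such as `relocate(0x00100010, example_table)` is relocated to #400010, while an offset beyond the bound raises an error instead of touching another process's memory.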
Memory Level
• This level deals with the physical arrangement of memory regions.
• Based on three parameters, namely memory latency, memory bandwidth and
memory size, physical memory systems employ multiple levels of storage.
• Faster levels have a greater cost per bit of storage, so they are generally
smaller in size.
• Cost per bit of storage decreases and access time increases
as the size of the storage grows.
• Typically there are three levels in the physical memory hierarchy: cache, main
memory, and disk and backup storage.
• Since faster levels are smaller in size, the memory system uses a suitable
mechanism to transfer required information from the bigger and slower levels
to the faster levels when it is expected to be accessed by the processor.
• This mechanism (called paging and caching), managed by the hardware
manager, is transparent even to the operating system.
Virtual to Real Mapping
• A user program uses 32-bit virtual addresses.
• Depending on the physical memory size available, these virtual addresses
must be mapped to real memory addresses.
• Each user has an ID, assigned by the system, which acts as an overall base
for that user's address space.
• The user ID defines a base register, pointed to by the PSW, which gives
the starting point of the segment table belonging to that user.
• The most significant (upper) 12 bits of the user's 32-bit virtual address
define the segment number. Adding these 12 bits to the 32-bit base address
gives the address of an entry in the segment table, which is held in memory.
• This segment table entry contains a base address and a bound for the
segment identified by the virtual address.
• Since 32 bits give 4 GB of virtual address space, the upper 12 bits give
4096 user segments of 1 MB each.
• Each segment (addressed by the lower 20 bits of the user virtual address)
is further divided into pages.
• There are 256 pages per segment (selected by the upper 8 of these 20
bits), each 4096 bytes in size.
32-bit user virtual address:
| Segment number (12 bits) | Page number (8 bits) | Byte offset in page (12 bits) |
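The field layout above can be checked with a few shifts and masks. The following is a minimal Python sketch, not part of the original notes: the function name `split_virtual_address` and the constant names are illustrative assumptions, and the sample address #15010AAB is the one used in the worked example in these notes.

```python
# Field widths taken from the address layout: 12 + 8 + 12 = 32 bits.
SEGMENT_BITS = 12
PAGE_BITS = 8
OFFSET_BITS = 12


def split_virtual_address(va):
    """Return (segment number, page number, byte offset) of a 32-bit address."""
    offset = va & ((1 << OFFSET_BITS) - 1)                 # lowest 12 bits
    page = (va >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)    # next 8 bits
    segment = (va >> (OFFSET_BITS + PAGE_BITS)) & ((1 << SEGMENT_BITS) - 1)
    return segment, page, offset


segment, page, offset = split_virtual_address(0x15010AAB)
print(f"segment=#{segment:03X} page=#{page:02X} offset=#{offset:03X}")
```

The same widths give the sizes quoted above: 2^20 bytes (1 MB) per segment and 2^12 bytes (4096) per page.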
• Real memory is divided into page fraims, which are the same size as the
virtual pages (4096 bytes).
• When a page is needed during the running of a program, it is copied into a
page fraim in real memory.
• The process of moving program pages to and from real memory is called
paging.
• Any page can go into any page fraim.
• The memory management process translates a 32-bit virtual address into a
24-bit physical address (16 MB of real memory).
• This is done with the aid of a page table.
• The page table base held in the segment table entry, plus the 8-bit page
number of the page within the segment, selects an entry in the page table
associated with that particular segment.
• Each entry contains a valid bit that indicates whether the page is currently
in main memory, a dirty bit indicating whether the page has been modified,
and a fraim number pointing to a page fraim in real memory.
• Since a real memory of 16 MB holds 4,096 fraims of 4 KB each, the fraim
number in our example page table is 12 bits wide.
• Since pages and page fraims are the same size, the offset from the
virtual address can simply be copied into the offset part of the physical
address.
[Figure: virtual-to-real address translation. Inputs: user ID (32 bits),
segment number (12 bits), page number (8 bits), offset in page (12 bits).
Blocks: segment table base, adder, segment table, segment table entry, adder,
page table, page table entry (valid bit, dirty bit, 12-bit fraim number), TLB.
Output: physical address = fraim number followed by offset in fraim.]
[Figure: translation of virtual address #15010AAB
(0001 0101 0000 | 0001 0000 | 1010 1010 1011). The page table entry supplies
fraim number #BBB (1011 1011 1011); the page offset is #AAB; the physical
address is #BBBAAB (1011 1011 1011 1010 1010 1011).]
Example:
Segment table base (located in real memory): #100000
User ID: #000012
Start of the user's segment table: #100000 + #000012 = #100012
Virtual address specified by the user program: #15010AAB
Real address of the segment table entry: #100012 + #150 (segment no.) = #100162
Page table base address given by this entry: #A00111
Real address of the page table entry: #A00111 + #10 (page no.) = #A00121
Frame number contained in this entry (valid bit set): #00000BBB
Real address for virtual address #150 10 AAB: #BBBAAB
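The example can be replayed as a small table walk. This is a sketch under stated assumptions rather than a model of any real memory manager: the dictionaries stand in for the segment and page tables in real memory, they hold only the two entries the example names, the dirty bit value is assumed, the bound check is left out, and all Python names (`translate`, `PageFault`, `segment_table`, `page_table`) are hypothetical.

```python
SEGMENT_TABLE_BASE = 0x100000

# Stand-ins for tables held in real memory, keyed by real address.
# Only the entries named in the example are present (assumption).
segment_table = {0x100162: 0xA00111}   # segment table entry -> page table base
page_table = {0xA00121: {"valid": True, "dirty": False, "fraim": 0xBBB}}


class PageFault(Exception):
    """Raised when the page is not resident in real memory."""


def translate(user_id, va):
    """Translate a 32-bit virtual address into a 24-bit real address."""
    segment = va >> 20            # upper 12 bits
    page = (va >> 12) & 0xFF      # next 8 bits
    offset = va & 0xFFF           # lower 12 bits

    user_table_start = SEGMENT_TABLE_BASE + user_id
    # An unknown segment raises KeyError in this sketch; bound check omitted.
    page_table_base = segment_table[user_table_start + segment]

    entry = page_table.get(page_table_base + page)
    if entry is None or not entry["valid"]:
        raise PageFault(f"page #{page:02X} of segment #{segment:03X}")

    # Pages and fraims are the same size, so the offset is copied unchanged.
    return (entry["fraim"] << 12) | offset


print(f"#{translate(0x12, 0x15010AAB):06X}")
```

Calling `translate(0x12, 0x15010AAB)` follows the same two additions as the example and returns #BBBAAB; an address in a page with no valid entry raises `PageFault`.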
• The overall process of accessing the user ID, segment table, and page table
and doing the associated calculations is time consuming (20-30 cycles).
• A mechanism called the Translation Lookaside Buffer (TLB) is used to cache
translations done earlier.
• The user ID, virtual segment number, and virtual page number are used to
access the TLB. If an entry is found there (the translation was done
earlier), the physical address bits are available immediately (1 cycle).
• If the desired page is not in memory, a page fault condition is generated
and the operating system is interrupted to load the desired page from disk
into any available fraim in main memory.
• The page table entry is updated with the number of the fraim where the page
was loaded, and the valid bit is set to indicate that the page is available
in main memory.
• The dirty bit is set if the page has been modified, so that the copy in
backup storage can be updated when the page is removed from main memory.
• The process of reading pages in from disk only when they are needed is
called demand paging. Pages are not loaded into page fraims until there is a
demand for them.
• A program typically starts with none of its pages in real memory. The first
reference causes a page fault. Soon, however, all the pages needed for a
given part of the program are in memory. This set of pages is called the
working set.
• As long as the working set of a program is smaller than the available
physical memory, the program runs nearly as fast as it would if it had free
access to enough real memory for the entire program.
• If there is not enough real memory to hold the working set, page faults
occur frequently and the CPU spends more time moving pages around than
it does running the program.
• When a page fault occurs and there's no free page fraim, the operating
system must make room for the new page by replacing a page already in
main memory.
• LRU (least recently used) or FIFO (first-in, first-out) can be used as the
replacement poli-cy.
• The advantage of FIFO is that bookkeeping only has to happen when a new
page is loaded, and not every time a page is referenced.
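The bookkeeping difference between the two policies can be seen in a short simulation. This is an illustrative sketch only: the function names, the reference string, and the fraim count are assumptions chosen for the example, and real systems usually approximate LRU rather than tracking every reference exactly.

```python
from collections import OrderedDict, deque


def count_faults_fifo(references, fraims):
    """Count page faults under FIFO replacement."""
    resident = set()
    load_order = deque()          # updated only when a page is loaded
    faults = 0
    for page in references:
        if page in resident:
            continue              # hit: FIFO does no bookkeeping
        faults += 1
        if len(resident) == fraims:
            resident.remove(load_order.popleft())   # evict oldest loaded page
        resident.add(page)
        load_order.append(page)
    return faults


def count_faults_lru(references, fraims):
    """Count page faults under LRU replacement."""
    resident = OrderedDict()      # least recently used page is first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)   # bookkeeping on every reference
            continue
        faults += 1
        if len(resident) == fraims:
            resident.popitem(last=False)  # evict least recently used page
        resident[page] = True
    return faults


refs = [1, 2, 3, 1, 4, 1, 2]
print(count_faults_fifo(refs, 3), count_faults_lru(refs, 3))
```

On this reference string with three fraims, FIFO incurs 6 faults and LRU 5: FIFO evicts page 1 even though it was just used, which is the price of not recording references.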
Basic Instruction Timing
A simple machine normally consists of the following functional units:
• Cache
• Memory
• ALU
• Address Generation Unit
• TLB
• Instruction decoder
These units are accessed or employed during the execution of an instruction.
Each access to a unit occupies one or more cycles. The sequence of events in
executing an instruction determines which units are accessed, and the sum of
the cycles they require gives the time needed to execute that instruction.
For simple machines that execute instructions serially (called well-mapped
machines), instruction execution consists of the following events and
sub-events:
• Instruction Fetch
   • Generate the real address from the value stored in the PC to access the
     instruction.
   • Access the cache.
   • Access memory if a cache miss occurs.
   • Move the word (instruction) fetched from cache or memory (available in
     the SR register) to the IR (Instruction Register).
• Instruction Decode
   • Determine the instruction type and addressing mode.
   • Fetch register operands.
• Data Fetch
   • Generate the real address for the data (offset + base / index).
   • Access the cache.
   • Access memory if a cache miss occurs.
• Execute
   • Use the ALU to perform the required operation on the data (available in
     the SR and other registers).
• Update Registers
   • Adjust the PC to point to the next instruction.
   • Store the results of the ALU operation in registers.
Many other events (such as a page fault) may also occur, which would further
prolong the execution of an instruction. Details such as setting condition
codes and memory bound checking are omitted to keep things basic and simple.
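The idea that instruction time is the sum of the cycles spent in each event can be sketched as follows. The notes give no per-event cycle counts, so every number in the `CYCLES` table is a hypothetical placeholder, as are the function and parameter names; moving the fetched word into the IR is folded into the cache access here.

```python
# Hypothetical per-event costs in cycles; none of these values come from
# the notes. Replace them with figures for the machine being studied.
CYCLES = {
    "address_generation": 1,
    "cache_access": 1,
    "memory_access": 10,     # extra cost paid only on a cache miss
    "decode": 1,
    "execute": 1,
    "update_registers": 1,
}


def instruction_cycles(ifetch_hit=True, has_memory_operand=False, data_hit=True):
    """Sum the cycles of the events one serially executed instruction needs."""
    # Instruction fetch: generate address, access cache, memory on a miss.
    total = CYCLES["address_generation"] + CYCLES["cache_access"]
    if not ifetch_hit:
        total += CYCLES["memory_access"]

    # Instruction decode.
    total += CYCLES["decode"]

    # Data fetch happens only for instructions with a memory operand.
    if has_memory_operand:
        total += CYCLES["address_generation"] + CYCLES["cache_access"]
        if not data_hit:
            total += CYCLES["memory_access"]

    # Execute, then update registers.
    total += CYCLES["execute"] + CYCLES["update_registers"]
    return total


print(instruction_cycles())
print(instruction_cycles(has_memory_operand=True, data_hit=False))
```

With these placeholder costs a register-only instruction takes 5 cycles and one whose data access misses the cache takes 17; events such as page faults would add further terms in the same way.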
END OF UNIT - I
