Computer Organisation and Architecture (1)

The document outlines the syllabus for a Computer Organization and Architecture (COA) course, detailing the chapters and their respective topics, including components of computers, instruction formats, ALU, pipelining, cache memory, and I/O interfaces. It also discusses the evolution of computer generations and the significance of studying COA. Additionally, it covers the instruction cycle, memory concepts, and types of registers within a CPU.
Monday, July 8, 2024 4:32 AM

weightage : 9 - 12 marks

chapter - 1 (introduction to COA)

(i) Introduction
(ii) Components of computer
(iii) types of registers
(iv) instruction cycle
(v) memory concept
(vi) byte & word addressable
(vii) system bus
(viii)byte ordering

chapter - 2 (instruction format and addressing modes)

(i) Instruction concept


(ii) machine instruction
(iii) instruction format
(iv) expanding opcode technique
(v) addressing modes concept
(vi) types of addressing modes
(vii) instruction set architecture

chapter - 3 (ALU, data path and control unit)

(i) Data path


(ii) micro instruction
(iii) micro program
(iv) control unit design

chapter - 4 (pipelining)

(i) Pipeline concept


(ii) Pipeline types
(iii) Performance evaluation
(iv) Dependencies in pipeline
- structural dependency
- data dependency
- control dependency
(v) pipeline hazards

chapter - 5 (cache memory)

(i) Memory concept
(ii) types of memory organization
(iii) cache memory
- cache organization
- mapping technique
- replacement algorithm
- updating technique and multi level cache

chapter - 6 (secondary memory and I/O interface)

(i) disk concept


(ii) disk structure
(iii) disk access time
(iv) I/O interface and its types

topic - 1 (introduction)
Monday, July 8, 2024 4:45 AM

Introduction of COA

Topic : Computer generation

Generation   Ist                      IInd         IIIrd                                   IVth and Vth
Period       1942-1955                1955-1964    1965-1974                               1974-present
Component    Vacuum tube              Transistor   Integrated chip                         VLSI/ULSI (very large scale integration)
Language     Machine (M/C) language   Assembly     High-level language (H.L.L) and OOP's   OOPS and RDBMS, AI

Note :
The first general-purpose electronic digital computer was ENIAC (Electronic Numerical Integrator and Computer); work on it began in 1943.

Topic : Computer organisation and architecture

Why study COA?

To know how a computer works and how it is designed (built).

COA has two parts :

Computer architecture
- the attributes of the system that are visible to the programmer
- represents the internal design
- instruction format
- addressing mode
- ALU
- number of bits required to represent the data

Computer organisation
- deals with how the features will be implemented
- how the various memory and I/O units interact and communicate with the system

Note :
The Intel x86 processors share the same architecture, but their organisation is different (how the features are implemented differs from one processor to the next).

Intel x86 family : 80186, 80286, 80386, 80486, 80586



topic - 2 (components of computer)
Monday, July 8, 2024 4:46 AM

Components of the computer

1. CPU
   - ALU (arithmetic logic unit)
   - Registers
   - CU (control unit)
2. Memory
   - primary/main memory
   - secondary/auxiliary memory
3. I/O
   - input device
   - output device

Register : stores data (temporary storage). Registers are present inside the CPU, are the fastest storage, and are made of flip-flops (a flip-flop is a 1-bit storage device).

Size : secondary memory > main memory > cache > register

Speed : register > cache > main memory > secondary memory

8-bit register : an 8-bit storage device; it stores 8 bits of data.

register types in CPU

1. Based on the task assigned to them


(A) General purpose register : not for any specific purpose
(B) Special purpose register : made for specific purpose



topic - 3 (types of registers)
Monday, July 8, 2024 4:46 AM

types of registers

Special purpose register types :

Memory buffer register (MBR/DR/MDR) : holds the instruction or data.
  Why MBR/DR/MDR (data)? Because it is connected to the data lines of the system bus.

Memory address register (MAR/AR) : stores the address of the memory location used for a read/write operation.
  Why MAR/AR? Because it is connected to the address lines of the system bus; it knows the address, i.e. where to go.

Accumulator (AC) : contains the temporary result of the ALU operation, or the first operand of the ALU operation.
  example : ADD [4000]
  AC <- AC + M[4000]

Program counter (PC) : contains the starting address of the next instruction to be executed (fetched). The program is supplied by the programmer, who does not know the actual address.

Instruction register (IR) : contains the instruction which is currently being executed by the CPU.
  Why only the IR and not the MBR? Because the instruction format is pre-defined in the IR.

  PC : address of the next instruction
  IR : instruction currently being executed

Stack pointer (SP register) : contains the address of the top of the stack. In stack [LIFO : last in first out] memory, insert and delete operations are performed at the same end (one end), called the TOS; this TOS address is pointed to by the stack pointer register.

PSW (program status word) / flag register : stores the status of the ALU result.

General purpose register : used to process the data.

Registers that store an address :
1. Memory address register (MAR)
2. I/O address register (I/O AR)
3. Program counter (PC)
4. Stack pointer register (SP)

Registers that store an instruction (or) data :
5. Memory buffer register (MBR/MDR/DR)
6. I/O data/buffer register (I/O BR)
7. Instruction register (IR)
8. Accumulator (AC)

9. Flag register / program status word (PSW)

Program counter : once an instruction has been fetched (fetch cycle executed), the PC holds the starting address of the next instruction.

The PC is incremented by 1, by 4 (1 word = 4 bytes), by 8 (1 word = 8 bytes), or by some other value 'x'.

The PC is also called the instruction address register / instruction pointer register.

2. Based on the information/content they hold

(A) Data register : stores the data
(B) Address register : stores the addresses

Memory is accessed through the MAR and MBR :

1. Memory read  : address in MAR/AR, data arrives in MBR
2. Memory write : address in MAR/AR, data supplied from MBR
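The read and write paths through MAR and MBR can be sketched in a few lines of Python. This is only a toy model: the class name `Memory` and its fields are illustrative assumptions, not part of the notes.

```python
class Memory:
    """Toy memory that is accessed only through MAR and MBR (illustrative names)."""

    def __init__(self, size):
        self.cells = [0] * size   # one list entry per memory cell
        self.mar = 0              # memory address register
        self.mbr = 0              # memory buffer register

    def read(self, address):
        self.mar = address                # the address goes into MAR (address lines)
        self.mbr = self.cells[self.mar]   # the data arrives in MBR (data lines)
        return self.mbr

    def write(self, address, value):
        self.mar = address                # the address goes into MAR
        self.mbr = value                  # the data to store goes into MBR
        self.cells[self.mar] = self.mbr   # the cell selected by MAR receives MBR


mem = Memory(16)
mem.write(5, 42)
value = mem.read(5)
print(value, mem.mar, mem.mbr)
```

In both operations the address travels through MAR and the data through MBR; only the direction of the data transfer differs.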



topic - 4 (instruction cycle)
Monday, July 8, 2024 4:46 AM

Instruction cycle

the process required for the execution of each instruction

(OR)

the phases that we need to go through in order to execute the instructions

(OR)

the instruction cycle describes the execution sequence of the instruction

The instruction cycle contains 2 sub-cycles :

1. Fetch cycle
2. Execute cycle (decode + execute)

1. Fetch cycle : fetches (brings) the instruction from main memory to the CPU. It works like a postman: it does not care what the instruction is. The instruction is stored in memory and executed in the CPU. At the end of the fetch cycle the program counter is incremented.

Example : the instruction I1 is stored at memory location 1000.

   1000   I1 : Load [6000]

   (Load : memory read. AC <- M[6000], i.e. put the content of memory location 6000 into the accumulator.)

Fetch path : PC -> MAR -> memory -> MBR -> IR

- The PC supplies the address 1000 to the MAR, and the address reaches memory over the address lines.
- Memory supplies I1 : Load [6000] to the MBR over the data lines.
- The fetched instruction is loaded into the IR.

2. Execute cycle : the objective of the execute cycle is to execute (process) the fetched instruction. It first decodes the instruction, i.e. analyses it (what the OPCODE is, how many operands there are), and then performs operand address calculation, operand fetch, processing, and result storage.

Execute path for I1 : Load [6000]

IR -> DECODER -> MAR -> memory -> MBR -> AC

- The decoder analyses the instruction that has been passed to it (what the opcode is, how many operands). Decoding I1 tells the CPU what it wants: put the content of memory location 6000 into the accumulator.
- The operand address 6000 goes to the MAR and reaches memory with the help of the system bus.
- Operand fetch : whatever is stored at [6000] comes back through the MBR.
- AC <- M[6000]

Load : memory read
Store : memory write

Step 1 : At the beginning of each instruction cycle the processor fetches an instruction from memory.

Step 2 : The program counter (PC) holds the address of the instruction to be fetched next.

Step 3 : The processor increments the PC after each instruction fetch so that it will fetch the next instruction in sequence.

Step 4 : The fetched instruction is loaded into the instruction register (IR).

Step 5 : The processor interprets the instruction and performs the required action.

Phases of the instruction cycle :

fetch cycle
  (i)   IAC (instruction address calculation)
  (ii)  IF (instruction fetch)

execute cycle
  decode
  (iii) decoding (analysis of the instruction: what the OPCODE is, how many operands, where the operands are available)
  (iv)  OAC (operand address calculation)
  (v)   OF (operand fetch)
  execute
  (vi)  DP (data processing)
  (vii) result storage
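The fetch and execute phases can be traced with a small simulation. This is a minimal sketch of a hypothetical accumulator machine: the opcodes `LOAD`, `ADD`, `STORE`, `HALT`, the tuple encoding of instructions, and the sample data values are assumptions made for illustration only.

```python
def run(memory, pc):
    """Run a toy accumulator machine until HALT; return (AC, final PC)."""
    ac = 0
    while True:
        # Fetch cycle: PC -> MAR, memory -> MBR -> IR, then increment the PC.
        mar = pc
        mbr = memory[mar]
        ir = mbr
        pc += 1

        # Execute cycle: decode the opcode, fetch the operand, process, store.
        opcode, operand_address = ir
        if opcode == "LOAD":
            ac = memory[operand_address]        # AC <- M[address]
        elif opcode == "ADD":
            ac = ac + memory[operand_address]   # AC <- AC + M[address]
        elif opcode == "STORE":
            memory[operand_address] = ac        # M[address] <- AC
        elif opcode == "HALT":
            return ac, pc
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Instructions live at 1000..1003, data at 6000..6002 (illustrative values).
memory = {
    1000: ("LOAD", 6000),
    1001: ("ADD", 6001),
    1002: ("STORE", 6002),
    1003: ("HALT", None),
    6000: 7,
    6001: 5,
}
ac, pc = run(memory, 1000)
print(ac, pc, memory[6002])
```

Note that the PC is incremented inside the fetch step, before the instruction is executed, which is why the PC always points at the next instruction while the current one runs.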

Instruction cycle : the sequence of fetch and execute cycles repeats until a halt breaks (stops) it.

start -> fetch cycle -> execute cycle -> (back to the fetch cycle) ... -> halt

Instruction cycle with interrupt cycle :

When the CPU encounters an interrupt (an unusual event that disturbs the normal flow), the interrupt is serviced only after the current instruction has finished executing. The CPU then pushes the PC (program counter) value onto the stack as the return address, and control transfers to the ISR (interrupt service routine).

fetch cycle -> execute cycle -> check for interrupt
  no  : continue with the next fetch cycle
  yes : push the PC value onto the stack as the return address, then service the interrupt

Why is the PC value pushed onto a stack?
Because a stack works in LIFO order: the value saved last is the first one we get back.

What happens if the return address is stored in a queue?
A queue is FIFO, so the PC value that was inserted (X) is not necessarily the one that is returned; a different address (Y) can come back.

Example :

  I1 : 1000
  I2 : 1001
  I3 : 1002 (interrupt)
  I4 : 1003

- When I3 is fetched, the PC moves on to the next instruction: PC 1002 becomes 1003.
- The interrupt has to be serviced after I3 has executed. The interrupt routine has its own address (where it is serviced), so the PC will be loaded with the address of the interrupt routine.
- Before going to service the interrupt, the CPU therefore saves the address of the next instruction on the stack, so that execution can continue from there afterwards: stack(1003).
- After servicing the interrupt, the PC value is popped from the stack and execution returns to I4.
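The stack-versus-queue point shows up once interrupts nest. The sketch below is only an illustration: it assumes that the first ISR is itself interrupted, and the address 5002 inside that ISR is a made-up value.

```python
from collections import deque


def return_order(saved_pcs, use_stack):
    """Return the order in which saved return addresses come back out."""
    store = deque()
    for pc in saved_pcs:
        store.append(pc)   # save each return address as the interrupts nest
    order = []
    while store:
        # A stack hands back the most recent address first, a queue the oldest.
        order.append(store.pop() if use_stack else store.popleft())
    return order


# The main program is interrupted with PC = 1003; the first ISR is then
# interrupted with PC = 5002 (hypothetical address inside that ISR).
saved = [1003, 5002]
stack_order = return_order(saved, use_stack=True)
queue_order = return_order(saved, use_stack=False)
print(stack_order, queue_order)
```

When the innermost ISR finishes, execution must resume at 5002. The stack returns 5002 first and 1003 second, which is correct; the queue would return 1003 first, i.e. a different address from the one that is needed.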



topic - 5 (memory concept)
Monday, July 8, 2024 4:47 AM

memory

byte addressable             word addressable
1 cell size = 8 bit          1 cell size = 1 word
8 bit = 1 byte               1 word = word size (for example word size = 32 bit, so 1 word = 32 bit)

note :
the word size is given in the question

word to byte and byte to word conversion (1 word = 32 bit) :

1 word = 32 bit [4 byte]         4 byte = 1 word
2 word = 64 bit [8 byte]         8 byte = 2 word
4 word = 128 bit [16 byte]       16 byte = 4 word

Q. A program is stored at starting address 2000 (word size : 32 bit).
   I1 : 1 word
   I2 : 1 word
   The memory is
   (i) word addressable
   (ii) byte addressable
   During the execution of I2, what is the value of the PC?

(i) word addressable :

   address   content
   2000      I1 (32 bit)
   2001      I2 (32 bit)
   2002      next instruction

   starting address : 2000

   While I2 is executing, I2 has already been fetched, so the PC holds the starting address of the next instruction.
   So the value of the PC will be 2002.

(ii) byte addressable :

   1 word = 32 bit (4 byte), 1 cell = 8 bit

   address        content
   2000 - 2003    I1 (32 bit)
   2004 - 2007    I2 (32 bit)
   2008 - 2011    next instruction

   starting address : 2000

   While I2 is executing, I2 has already been fetched, so the PC holds the starting address of the next instruction.
   So the value of the PC will be 2008.
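The two answers follow one rule: the PC equals the start address plus the space occupied by every instruction fetched so far. A minimal sketch follows, assuming (as in the memory layout worked out above) that I1 and I2 each occupy one 32-bit word; the function name and parameters are illustrative.

```python
def pc_during_execution(start, sizes_in_words, index, word_bytes, byte_addressable):
    """PC while instruction `index` (0-based) executes: the next instruction's start address."""
    # In byte addressable memory one word spans word_bytes addresses,
    # in word addressable memory it spans exactly one address.
    addresses_per_word = word_bytes if byte_addressable else 1
    words_fetched = sum(sizes_in_words[: index + 1])
    return start + words_fetched * addresses_per_word


sizes = [1, 1]   # I1 and I2, one word each (assumption)
pc_word = pc_during_execution(2000, sizes, 1, 4, byte_addressable=False)
pc_byte = pc_during_execution(2000, sizes, 1, 4, byte_addressable=True)
print(pc_word, pc_byte)
```

Changing `sizes` lets the same function handle programs whose instructions have different lengths.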

Questions :



After I6 executes (that is, once I6 has been fetched), the PC holds the starting address of the next instruction: PC = 1004.

When the interrupt is encountered, this PC value (the starting address of the next instruction) is pushed onto the stack as the return address: stack(1004).



The number of clock cycles depends on the type of instruction :

- register to/from memory transfer : 4 clock cycles
- MUL with both operands and the result stored in registers : 6 clock cycles
- instruction fetch and decode : 3 cycles per word
- HALT : only fetch and not decode



topic - 6 (byte and word addressable)
Monday, July 8, 2024 4:47 AM

byte and word addressable

- memory is a storage element in the computer which stores instructions and data.
- memory is organised into equal parts; each part is called a cell.
- each cell is identified by a unique number called its address.

byte addressable

- 1 cell size = 8 bit: every cell stores 8 bits of instructions and data.
- every cell is identified by a unique number called its address (with 8-bit addresses, 00000000 to 11111111).
- the operation that needs to be done on a particular cell is indicated by a control signal.

Address : the cell is reached through the address lines.
Data : the data that is read/written is carried through the data lines.
Control line : carries the work (operation) that needs to be performed.

note :
word size = number of data lines = data bus width = data register size (MBR/MDR/DR)
word : the unit of data according to the CPU. The CPU talks in word units.



CPU (registers) <-- word --> cache <-- blocks --> main memory <-- pages --> secondary memory (logical)

Whatever the CPU communicates, it communicates in units of words.

note :
MAR/AR register size = number of address lines = address bus width

ex. 32-bit processor          ex. 24-bit address
word size  = 32 bit           address lines = 24 bit
data lines = 32 bit           address bus   = 24 bit
data bus   = 32 bit           MAR           = 24 bit
MBR        = 32 bit           AR            = 24 bit
DR/MDR     = 32 bit           PC            = 24 bit
ALU        = 32 bit
AC         = 32 bit
register   = 32 bit



2^1 = 2            2^10 = 1K (kilo)
2^2 = 4            2^20 = 1M (mega)
2^3 = 8            2^30 = 1G (giga)
2^4 = 16           2^40 = 1T (tera)
2^5 = 32           2^50 = 1P (peta)
2^6 = 64           2^60 = 1E (exa)
2^7 = 128          2^70 = 1Z (zetta)
2^8 = 256          2^80 = 1Y (yotta)
2^9 = 512
2^10 = 1024

1 byte = 8 bit
1 nibble = 4 bit

Given n bits, how many values N?   N = 2^n

3 bit  : 2^3 = 8
13 bit : 2^13 = 2^3 x 2^10 = 8K (8 x 1K)
22 bit : 2^22 = 2^2 x 2^20 = 4M (4 x 1M)
35 bit : 2^35 = 2^5 x 2^30 = 32G (32 x 1G)

Given N values, how many bits n?
(always look for the closest power of 2 that is greater than or equal to N)

N = 20  : 5 bit
N = 50  : 6 bit
N = 150 : 8 bit (2^8 = 256, not 2^7 = 128, because 128 is less than 150)
N = 200 : 8 bit
N = 100 : 7 bit
N = 500 : 9 bit
N = 512 : 9 bit
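Both conversions are easy to check mechanically. The helpers below are a small sketch; the function names are illustrative, and `with_prefix` only covers exponents up to 89 (the yotta range).

```python
def bits_needed(n_items):
    """Smallest n such that 2**n >= n_items."""
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    return (n_items - 1).bit_length()


def with_prefix(n_bits):
    """Write 2**n_bits with K, M, G, ... prefixes, e.g. 13 -> '8K'."""
    prefixes = ["", "K", "M", "G", "T", "P", "E", "Z", "Y"]
    tens, rest = divmod(n_bits, 10)   # every 10 bits moves one prefix up
    return f"{2 ** rest}{prefixes[tens]}"


for n in (3, 13, 22, 35):
    print(n, "bit ->", with_prefix(n))
for items in (20, 50, 150, 200, 100, 500, 512):
    print(items, "->", bits_needed(items), "bit")
```

`bits_needed` relies on the fact that the largest index, N - 1, must fit in n bits, which is exactly the "closest power of 2 that is greater than or equal to N" rule.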

Memory is represented in the form 2^n x m

n : number of address lines (a.l)
m : number of data lines (d.l)

Address lines : specify the capacity of the memory, i.e. how many cells can be addressed (for example 0000 is a 4-bit address, 0010 0011 is an 8-bit address)
Data lines : specify the capacity of each cell (cell size)

An n-bit address can represent 2^n memory cells (memory locations), numbered 0 to 2^n - 1 because memory addresses start from 0.

In some places the form N x M is used instead :
N : total number of memory cells (memory locations)
M : size of each cell

To represent any one of the 2^n cells, an n-bit address is required.
(or)
An n-bit address can represent 2^n memory cells.

N = 2^n cells, and the size of 1 cell is m bit :

   address        cell
   0              m bit
   1              m bit
   ...            ...
   x              m bit
   ...            ...
   2^n - 1        m bit

- n : address lines; the address of any cell 'x' is written in n bits.
- The last cell is 2^n - 1 because memory starts from 0 (0 to 7 = 8 cells).
- n = address bits, m = cell size, N = number of memory cells.

For example : my memory is 16 byte.
- There are 2^n cells in total; for 16 cells, 4 bits are needed, so n = 4 bit.
- 1 byte means 8 bit (a byte is a group of 8 bits).
- 16 byte = 2^n x m = 2^4 x 8 bit
- 4-bit address line, 8-bit data line

   n = 4 bit (address lines)
   m = 8 bit (data lines)
   N = 2^4 = 16 memory locations/cells

   address   cell size
   0000      8 bit (1 byte)
   0001      8 bit
   0010      8 bit
   ...       ...
   1110      8 bit
   1111      8 bit (1 byte)

A smaller example :

   n = 2 (capacity of memory), address lines
   m = 8 (cell size / cell capacity), data lines
   N = 2^n = 2^2 = 4 memory locations/cells : cell 0, cell 1, cell 2, cell 3 (8 bit each)

Summary :
memory = 2^n x m
each cell size = m bit
total 2^n memory cells
To represent any of the cells among the 2^n, an n-bit address is required.

Questions :
Q.1. Memory : 1K byte

= 2^10 x 8 bit (1K = 2^10 and 1 byte = 8 bit)

10-bit address line (MAR)
8-bit data line (MBR)
2^10 memory cells / locations
A 10-bit address is required to represent any cell (among the 2^10)

n = 10 (address lines)
m = 8 (data lines)
N = 2^10 = 1024 cells

   address (10 bit)    cell         content
   0000000000          0            8 bit
   ...                 1            8 bit
   ...                 x            8 bit
   1111111111          2^10 - 1     8 bit

The address of any cell 'x' is written in 10 bits (n : address lines).

Q.2. Memory : 8 byte

= 2^3 x 8 bit (8 = 2^3 and 1 byte = 8 bit)

3-bit address line (MAR)
8-bit data line (MBR)
2^3 memory cells / locations
A 3-bit address is required to represent any cell (among the 2^3)

n = 3 (address lines)
m = 8 (data lines)
N = 2^3 = 8 cells

   address (3 bit)    hex     cell        content
   000                0x0     0           8 bit
   ...                        1           8 bit
   ...                        x           8 bit
   111                0x7     2^3 - 1     8 bit

The address of any cell 'x' is written in 3 bits (n : address lines).

Q.3. Memory : 64K byte

= 2^6 x 2^10 x 8 bit (2^6 = 64, 2^10 = 1K and 1 byte = 8 bit)
= 2^16 x 8 bit

16-bit address line (MAR)
8-bit data line (MBR)
2^16 memory cells / locations
A 16-bit address is required to represent any cell (among the 2^16)

n = 16 (address lines)
m = 8 (data lines)
N = 2^16 = 65536 cells

   address (16 bit)          hex       cell         content
   0000 0000 0000 0000       0x0000    0            8 bit
   ...                                 1            8 bit
   ...                                 x            8 bit
   1111 1111 1111 1111       0xFFFF    2^16 - 1     8 bit

(1 hex digit = 4 bit)

The address of any cell 'x' is written in 16 bits (n : address lines).

Q.4. Memory : 1G byte

= 2^30 x 8 bit (2^30 = 1G and 1 byte = 8 bit)

30-bit address line (MAR)
8-bit data line (MBR)
2^30 memory cells / locations
A 30-bit address is required to represent any cell (among the 2^30)

n = 30 (address lines)
m = 8 (data lines)
N = 2^30 = 1G cells

   address (30 bit)                hex           cell         content
   00 0000 0000 .... 0000          0x00000000    0            8 bit
   ...                                           1            8 bit
   ...                                           x            8 bit
   11 1111 1111 .... 1111          0x3FFFFFFF    2^30 - 1     8 bit

The address of any cell 'x' is written in 30 bits (n : address lines).

Grouping the 30 bits into fours from the right leaves only 2 bits for the leading hex digit :

   11 1111 1111 1111 1111 1111 1111 1111 = 0x3FFFFFFF
   3  F    F    F    F    F    F    F

   00 0000 0000 0000 0000 0000 0000 0000 = 0x00000000
   0  0    0    0    0    0    0    0
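The four questions above follow the same recipe, which can be sketched as a helper. It assumes a byte addressable memory whose size is a power of two; the function name is illustrative.

```python
def describe_memory(size_bytes):
    """For a byte addressable memory: (address bits n, cells N, first address, last address)."""
    n = (size_bytes - 1).bit_length()     # n such that 2**n == size_bytes
    cells = 2 ** n
    hex_digits = (n + 3) // 4             # 1 hex digit covers 4 address bits
    first = f"0x{0:0{hex_digits}X}"
    last = f"0x{cells - 1:0{hex_digits}X}"
    return n, cells, first, last


K = 2 ** 10
G = 2 ** 30
for size in (1 * K, 8, 64 * K, 1 * G):
    print(size, describe_memory(size))
```

For the 1G byte case the last address comes out as 0x3FFFFFFF, matching the right-to-left grouping shown above.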

hexadecimal
symbol : 0x or ( )16

4 binary bits : 1 hex value

0000 - 0          A - 10
1000 - 8          B - 11
1001 - 9          C - 12
1111 - F          D - 13
                  E - 14
                  F - 15

byte addressable : m = 8 bit (always)
word addressable : m = 16 bit, m = 32 bit, ... (depends on the word size)

Terms and notations :

Digital computers work with 0 and 1; each is called a bit (binary digit).
b : bits (bit : binary digit)
B : bytes (byte : group of 8 bits)
W : words

Word : how many bits the CPU processes at a time.
The term word is used to refer to a group of bytes that is processed simultaneously.

32-bit processor : word size = 32 bit; the CPU performs operations on 32-bit (4 byte) data at a time.
64-bit processor : word size = 64 bit; the CPU performs operations on 64-bit (8 byte) data at a time.

The exact number of bytes that constitute a word depends on the system, for ex.
- Pentium : 1 word = 4 bytes = 32 bits
- Itanium : 1 word = 8 bytes = 64 bits

If the processor is 32 bit, 1 word = 32 bits.
If the processor is 64 bit, 1 word = 64 bits.

Based on the cell size, memory configuration is divided into two types :

Byte addressable memory (default)       Word addressable memory
cell size = 8 bit                       cell size = 1 word (the size of the word is given for the computer)

1. Byte addressable memory : when the cell size is 8 bit, the corresponding address is byte addressable.

4 x 8    = 2-bit address
16 x 8   = 4-bit address
32 x 8   = 5-bit address
64 x 8   = 6-bit address
256 x 8  = 8-bit address
512 x 8  = 9-bit address
1K x 8   = 10-bit address
1M x 8   = 20-bit address
2^n x 8  = n-bit address

These are byte addressable because each cell size is 8 bit.

2. Word addressable memory : when the cell size is given in the form of words (depends on the word length), the corresponding address is word addressable.

4 x w    = 2-bit address
16 x w   = 4-bit address
32 x w   = 5-bit address
64 x w   = 6-bit address
256 x w  = 8-bit address
512 x w  = 9-bit address
1K x w   = 10-bit address
1M x w   = 20-bit address
2^n x w  = n-bit address

The cell size must be a word, which depends upon the word length of the processor.

Microprocessor       memory : Byte [B]      CPU : word (data line)              cells per interface access
                     (data line)            [1 word size = word length,
                                            depends upon the word length
                                            of the processor]
8-bit processor      8 bit                  8 bit                               1
16-bit processor     8 bit                  16 bit                              2
32-bit processor     8 bit                  32 bit                              4
64-bit processor     8 bit                  64 bit                              8
...
n-bit processor      8 bit                  n bit

Byte : unique meaning, no ambiguity.
Word : no unique meaning, hence ambiguity.

memory : 64K

byte addressable : 2^16 x 8 bit = 64K cells x 8 bit
word addressable : 2^16 x w = 64K cells x w (w = ?)


note :
- The default memory configuration (data type in memory) is byte addressable, so in the memory chip data is stored byte-wise.

- But the default data type in the CPU is the word, so operations are performed in the CPU according to the word format.

- To synchronise the memory data type (byte) with the CPU data type (word), the memory interfacing is adjusted by the designer.

- According to the word length of the CPU, multiple bytes are accessed to bring the data from memory to the CPU. Which byte is picked up first is decided by byte ordering.

If my processor works on 32 bits, the processor can perform an operation on 32 bits at a time, but in the memory chip the data is stored in 8-bit format, so we have to pick multiple memory cells.

ex. 16-bit processor

MOV Ax, [6000] (instruction)
word length = 16 bit = 2 byte (multibyte): 2 consecutive bytes are accessed in parallel.
Whatever operation the CPU performs, it performs on 16-bit data.

Ax <- M[6000], M[6001]

byte ordering [endian mechanism] : the memory chip is byte addressable, so byte ordering (little endian and big endian) is used whenever multibyte data is accessed.

ex. 32-bit processor

MOV r1, [6000] (instruction : move to r1)
word length = 32 bit = 4 byte (multibyte): 4 consecutive bytes are accessed in parallel.
Whatever operation the CPU performs, it performs on 32-bit data.

r1 <- M[6000], M[6001], M[6002], M[6003]

Which byte is to be picked up first is decided by byte ordering (the endian mechanism).

Questions :

memory : 4 GB = 2^2 x 2^30 byte = 2^32 byte   [1 giga = 2^30]

generally, 1 word = 4 byte, but in the given question 1 word = 2 bytes

4 Gbyte / 2 byte = 2G words = 2^1 x 2^30 = 2^31 words, so a word address needs 31 bits

Q. Consider a 32-bit hypothetical processor which supports 128 Mbyte of memory. The system is enhanced (new design) with a word addressable memory. How many address lines are required in the new system?

Old design (byte addressable) :
128 Mbyte = 2^7 x 2^20 x 8 bit = 2^27 byte
1 cell size : 8 bit, 2^27 cells
Address = 27 bit

New design (word addressable) :
1 word = 32 bit = 4 byte
The 128 MB has to be regrouped into 4-byte units (1 word) because the memory is word addressable.
128 Mbyte / 4 byte = 32 Mwords (128 MB = 32M x 4 byte)
32 Mwords = 2^5 x 2^20 = 2^25 words   (2^5 = 32, 2^20 = 1M)
1 cell size : 32 bit (4 cells of 8 bit each, i.e. 4 byte), 2^25 cells
Address = 25 bit

note :
8 bit : byte addressable
multiple of 8 : word addressable

Q. Consider a 64-bit hypothetical processor which supports 512 Gbyte of memory. The system is enhanced (new design) with a word addressable memory. How many address lines are required in the new system?

Old design (byte addressable) :
512 Gbyte = 2^9 x 2^30 x 8 bit = 2^39 byte
1 cell size : 8 bit, 2^39 cells
Address = 39 bit

New design (word addressable) :
1 word = 64 bit = 8 byte
The 512 GB has to be regrouped into 8-byte units (1 word) because the memory is word addressable.
512 Gbyte / 8 byte = 64 Gwords (512 GB = 64G x 8 byte)
64 Gwords = (2^9 x 2^30) / 2^3 = 2^36 words
1 cell size : 64 bit (8 cells of 8 bit each, i.e. 8 byte), 2^36 cells
Address = 36 bit

Every group of 8 consecutive 8-bit cells of the old design forms one 64-bit cell of the new design: the first 8 byte cells become cell 0, the next 8 become cell 1, and so on.

Q. Consider a 64-bit hypothetical processor which supports 4 Gwords of memory. The system is enhanced (new design) with a byte addressable memory. How many address lines are required in the new system?

Old design (word addressable) :
4 Gwords = 2^2 x 2^30 = 2^32 words
1 cell size : 64 bit (8 cells of 8 bit each, i.e. 8 byte), 2^32 cells
Address = 32 bit

New design (byte addressable) :
1 byte = 8 bit, 1 word = 64 bit = 8 byte
4 Gwords = 2^32 x 8 byte = 2^35 byte
         = 2^5 x 2^30 byte
         = 32 x 1G
         = 32 Gbyte
1 cell size : 8 bit, 2^35 cells
Address = 35 bit

Every 64-bit cell of the old design is split into 8 consecutive 8-bit cells in the new design.

note :
In the processor design, operations are always performed in word format, so when the word length of the CPU/processor is greater than 8 bit, multiple bytes (cells) have to be accessed in parallel to process the data.
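The byte addressable and word addressable answers above differ only in the cell size used to divide the capacity. A minimal sketch, with illustrative names and assuming power-of-two capacities:

```python
def address_lines(capacity_bytes, cell_bytes):
    """Address lines needed when each addressable cell holds cell_bytes bytes."""
    cells = capacity_bytes // cell_bytes
    return (cells - 1).bit_length()   # smallest n with 2**n >= cells


MB = 2 ** 20
GB = 2 ** 30

# 128 Mbyte on a 32-bit processor: byte addressable vs word addressable (4-byte word)
print(address_lines(128 * MB, 1), address_lines(128 * MB, 4))

# 512 Gbyte on a 64-bit processor: byte addressable vs word addressable (8-byte word)
print(address_lines(512 * GB, 1), address_lines(512 * GB, 8))

# 4 Gwords of 8 bytes each is 32 Gbyte: word addressable vs byte addressable
capacity = 4 * GB * 8
print(address_lines(capacity, 8), address_lines(capacity, 1))
```

Going from byte addressable to word addressable removes log2(bytes per word) address lines; going the other way adds them.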

Q. How is memory filled? (starting from address 1000; 1 word = 4 byte)

1000, 1001, 1002, 1003 : acceptable (addresses increase)
1000, 999, 998, 997    : never acceptable (addresses decrease)

I1 occupies 1000 - 1003 and I2 occupies 1004 - 1007. There are two ways to draw the memory :

lowest address at the top       (OR)      lowest address at the bottom

   1000  I1                                  1007  I2
   1001  I1                                  1006  I2
   1002  I1                                  1005  I2
   1003  I1                                  1004  I2
   1004  I2                                  1003  I1
   1005  I2                                  1002  I1
   1006  I2                                  1001  I1
   1007  I2                                  1000  I1

note :
memory is always filled in increasing order of address.



topic - 7 (system bus)
Monday, July 8, 2024 4:47 AM

system bus

The system bus is a collection of lines/wires (electrical lines, not wireless) which provide the communication (transmission media) between the components of the computer [CPU, I/O, memory, etc].

The system bus is made up of address lines, data lines, and control lines.

note :
one line carries 1 bit of data at any point of time.

The system bus contains 3 types of buses
(i) Address bus (line)
(ii) Data bus (line)
(iii) Control bus (line)

(i) Address bus : collection of address lines.
- Address lines : used to carry the address towards memory and I/O.

note :
Address lines are uni-directional (from the CPU to memory and I/O or other components of the system).

CPU --address--> memory and I/O

(ii) Data bus (line) : collection of data lines.
- Data lines : used to carry the data (binary sequence).

note :
Data lines are bi-directional (from the CPU to memory and I/O and vice-versa).

CPU <--data--> memory and I/O

(iii) Control bus (line) : collection of control lines.
- Control lines : used to carry the control signals.

note :
Control lines are individually uni-directional: each wire carries its signal in one direction only, some from the CPU to memory and I/O and others the opposite way.

CPU --read signal--> memory and I/O     (individual wires, through the control lines)
CPU <--interrupt---- memory and I/O

Control signal

Control signals are generated by the control unit (CU). The control unit is the supervisor of the system and controls every activity.

LD (Load)  : read operation (RD signal)
ST (Store) : write operation (WR signal)

2 types of architecture

Von Neumann architecture               Harvard architecture
- stored program concept
- the computer works on the
  stored program concept

Von Neumann architecture (stored program concept)

- main memory contains the instructions and data
- the ALU operates on binary data
- the control unit interprets the instructions from memory and executes them
- input/output equipment is operated by the control unit with the help of control signals.

computer : a computational device used to process data under the control of a program.

input -> computer (program) -> output

The function of a computer is program execution.

program : a sequence of instructions along with the data

program = instructions + data

instruction : a binary sequence which is designed inside the processor to perform some task.
binary sequence - bound with - operation
[101010101010] (the binary of an instruction : the processor knows it)

data : a binary sequence which is associated with a value [based on the data format].
binary sequence - bound with - value
[101010101010] (the binary of data : we know it)

The computer understands the language of binary; when a program is assigned to it, it works in the 101010 format.



topic 8 (byte ordering)
Monday, July 8, 2024 4:48 AM

byte ordering

MSB : most significant portion of the data [big end of data]
LSB : least significant portion of the data [little end of data]

ex. 16-bit processor

MOV Ax, [6000] (instruction)

- MOV : opcode (mnemonic), the type of operation
- Ax : destination register (16 bit)
- [6000] : source, the address of a memory location

Meaning : put the data of M[6000] and M[6001] into Ax.

memory : byte addressable (multibyte ordering)

note :
16-bit processor = 16-bit data line = 16-bit data register

word length = 16 bit = 2 byte (multibyte)
Whatever operation the CPU performs, it performs on 16-bit data.

Ax <- M[6000], M[6001]

Data in hexadecimal (16-bit data)
Data = 0x 11 21
Data = (11 21)16

Binary (16 bit) : 0001 0001 0010 0001
Hexadecimal     : 1    1    2    1

4 hex digits : 4 x 4 = 16 bit.

The 2 bytes of data can be stored in Ax in 2 ways :
(i) Ax = 11 21
(OR)
(ii) Ax = 21 11
One instruction and 2 possible outputs, hence it creates ambiguity.

ex. 32 bit processor

MOV r1 [6000] (instruction)

- MOV : opcode (mnemonic), the type of operation
- r1 : destination register (32bit)
- [6000] : source, the address of a memory location
- meaning : put the data of M[6000], M[6001], M[6002] and M[6003] into r1

word length = 32bit = 4byte (multibyte)

Whatever operation the CPU performs, it performs on 32bit data.

M[6000], M[6001], M[6002], M[6003] -> r1

Data in hexadecimal (32bit data)
Data = 0x 11 21 51 56
Data = (11 21 51 56)16

(32bit binary)
Data : 0001 0001 0010 0001 0101 0001 0101 0110
Hexa :    1    1    2    1    5    1    5    6

8 HEX digits
8 x 4 = 32 bit.

4 bytes of data stored in r1

2 ways :
(i) r1 = 11 21 51 56
(OR)
(ii) r1 = 56 51 21 11
one instruction and 2 possible outputs, hence it creates ambiguity

- why is the endian mechanism required?

to solve the problem of ambiguity, the endian (byte ordering)
mechanism is required.

note :
- a memory chip is byte addressable, so in the memory chip data is stored byte wise.
- in the processor, operations are performed in word format.
- byte ordering matters when the word size is greater than 8bit [1byte] (i.e. multibyte data).

Q. memory size = 512GByte

then the number of bits in the address if -
(i) byte addressable
(ii) word addressable with 1 word = 16bit
(iii) word addressable with 1 word = 32bit

(i) byte addressable
512GB = 2^9 x 2^30 byte = 2^39 byte
Address = 39bit

(ii) word size = 16bit (2byte)
512GB / 2byte = 256Gword = 2^8 x 2^30 = 2^38 words
Address = 38bit

(iii) word size = 32bit (4byte)
512GB / 4byte = 128Gword = 2^7 x 2^30 = 2^37 words
Address = 37bit

Q. consider a 32bit processor with memory = 128Gword; find the number of address bits if the
memory is byte addressable

1 word = 32bit = 4byte

128Gword = 128G x 4byte = 512Gbyte
= 2^37 x 2^2 byte
= 2^39 byte
Address = 39bit
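
The address width calculations above can be checked with a short Python sketch. The helper name address_bits and its parameters are illustrative assumptions (they do not come from the notes), and the sketch assumes the memory size is an exact multiple of the word size.

```python
def address_bits(memory_bytes, word_bytes=1):
    """Address bits needed when every address names one cell of word_bytes bytes."""
    cells = memory_bytes // word_bytes      # number of addressable cells
    return (cells - 1).bit_length()         # smallest n such that 2**n >= cells


GB = 2 ** 30
for word_bytes in (1, 2, 4):                # byte, 16bit word, 32bit word
    print(word_bytes, address_bits(512 * GB, word_bytes))
```

A larger word means fewer addressable cells for the same memory, so each doubling of the word size saves one address bit.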

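Conversions :

2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
128  64   32   16   8    4    2    1

write 11 in 8 bit :
0 0 0 0 1 0 1 1 = 0x 0B

check (read the bits 1 0 1 1 left to right : double, then add the next bit) :
1 + 1 = 2        (double)
2 + 2 = 4        (next bit is 0, double again)
4 + 1 = 5        (next bit is 1)
5 + 5 = 10       (double)
10 + 1 = 11      (next bit is 1)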

BINARY : HEX
0000 : 0        1000 : 8
0001 : 1        1001 : 9
0010 : 2        1010 = 10 = A
0011 : 3        1011 = 11 = B
0100 : 4        1100 = 12 = C
0101 : 5        1101 = 13 = D
0110 : 6        1110 = 14 = E
0111 : 7        1111 = 15 = F

why 4 digits? because in hexadecimal 1 hex value = 4bit
hex = 0-9 and A to F

Binary -> Hexadecimal

8bit : 0100 1001 = (0x 49) or (49)H

16bit : 0001 0010 0011 0100 = (0x 1234) or (1234)16

32bit : 0001 1010 0010 1011 0011 1100 0100 1101
           1    A    2    B    3    C    4    D
= (0x 1A 2B 3C 4D) or (1A 2B 3C 4D)16

32bit : 0101 1011 1001 1110 0010 0001 1011 0011
           5    B    9    E    2    1    B    3
= (0x 5B 9E 21 B3) or (5B 9E 21 B3)16

32bit : 00000001 00000010 00000011 00000100
              01       02       03       04
= (0x 01 02 03 04) or (01 02 03 04)16



- memory starting from 2000:
32bit [4byte]

m bit 2000 m bit 2003

m bit 2001 m bit 2002

m bit 2002 m bit 2001

m bit 2003 m bit 2000

Q. consider 32bit data 0x (1A 2B 3C 4D) stored at memory location 2000 onwards
then result is ?
solution :
Data : 0x (1A 2B 3C 4D)

two ways of storing data : 32bit data

1A 2B 3C 4D
0001 1010 0010 1011 0011 1100 0100 1101
8bit 8bit 8bit 8bit

(4) (3) (2) (1)

(or)

(1) (2) (3) (4)

one instruction creates 2 possible outputs, which is ambiguous; to solve
the problem of ambiguity there is the endian mechanism

byte ordering : for multibyte data when memory is byte addressable

1 word = 32bit = 4byte

byte  binary     address      byte  binary     address
1A    0001 1010  2000         4D    0100 1101  2003
2B    0010 1011  2001         3C    0011 1100  2002
3C    0011 1100  2002         2B    0010 1011  2001
4D    0100 1101  2003         1A    0001 1010  2000

memory is byte addressable, so each location holds 8 bits

ex.1. 32bit data : 0x (01 02 03 04)

0000 0001 0000 0010 0000 0011 0000 0100

data : 0x 01 02 03 04
- 01 : MSB of the word, or big end, or higher byte of data
- 04 : LSB of the word, or little end, or lower byte of data

ex.2. 32bit data : 0x (9D 8F CD 36)

1001 1101 1000 1111 1100 1101 0011 0110
   9    D    8    F    C    D    3    6

data : 0x 9D 8F CD 36
- 9D : MSB of the word, or big end, or higher byte of data
- 36 : LSB of the word, or little end, or lower byte of data

Topic - byte ordering : when data is multibyte we use byte ordering

types :

1. Little endian : the lower address contains the lower byte and the higher address
contains the higher byte.
(OR)
read the word right to left
(OR)
little endian starts with the little end and stores it at the lower memory
address
(OR)
the little end of the word is stored at the lowest memory address

Q. 32bit data : 0x (9D 8F CD 36) is to be stored at memory locations
starting from 1000; store the data in little endian
format

1001 1101 1000 1111 1100 1101 0011 0110
   9    D    8    F    C    D    3    6

data : 0x 9D 8F CD 36 (9D = MSB of the word or big end, 36 = LSB of the word or little end)

little endian : start from the little end; the little end is
stored at the lowest memory address

byte   binary     address
36 H   0011 0110  1000
CD H   1100 1101  1001
8F H   1000 1111  1002
9D H   1001 1101  1003

memory is byte addressable, so each location holds 8 bits

Q. 32bit data : 0x (01 02 03 04) is to be stored at memory locations
starting from 1000; store the data in little endian
format

data : 0x 01 02 03 04 (01 = MSB of the word or big end, 04 = LSB of the word or little end)

little endian : start from the little end; the little end is
stored at the lowest memory address

byte  binary     address
04    0000 0100  1000
03    0000 0011  1001
02    0000 0010  1002
01    0000 0001  1003

memory is byte addressable, so each location holds 8 bits

Q. 32bit data : 0x (AB CD 11 EF) is to be stored at memory
locations starting from 1000; store the data in little
endian format

data : 0x AB CD 11 EF (AB = MSB of the word or big end, EF = LSB of the word or little end)

little endian : start from the little end; the little end is
stored at the lowest memory address

byte  binary     address
EF    1110 1111  1000
11    0001 0001  1001
CD    1100 1101  1002
AB    1010 1011  1003

memory is byte addressable, so each location holds 8 bits

Q. 64bit data : 0x (11 AB 22 CD 33 EF 44 56) is to be stored at
memory locations starting from 2000; store the data
in little endian format.

data : 0x 11 AB 22 CD 33 EF 44 56 (11 = MSB of the word or big end, 56 = LSB of the word or little end)

little endian : start from the little end; the little end is
stored at the lowest memory address

byte  address
56    2000
44    2001
EF    2002
33    2003
CD    2004
22    2005
AB    2006
11    2007

memory is byte addressable, so each location holds 8 bits
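
The little endian layouts above can be reproduced with Python's built-in int.to_bytes. This is only a sketch: the helper name store and its parameters are assumptions made for illustration.

```python
def store(value, size, start, byteorder):
    """Return {address: hex byte} for a value stored from address `start`."""
    raw = value.to_bytes(size, byteorder)   # byteorder: "little" or "big"
    return {start + i: format(b, "02X") for i, b in enumerate(raw)}


for address, byte in store(0x9D8FCD36, 4, 1000, "little").items():
    print(address, byte)
```

Passing "big" instead of "little" gives the big endian layout of the same value.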

2. Big endian : the lower memory address contains the higher byte and the higher memory
address contains the lower byte.
(OR)
read the word left to right
(OR)
start from the MSB of the word and store it at the lower memory address.
(OR)
the big end of the word is stored at the lowest memory address.

Q. 32bit data : 0x (9D 8F CD 36) is to be stored at memory
locations starting from 1000; store the data in big
endian format

1001 1101 1000 1111 1100 1101 0011 0110
   9    D    8    F    C    D    3    6

data : 0x 9D 8F CD 36 (9D = MSB of the word or big end, 36 = LSB of the word or little end)

big endian : start from the big end; the big end is stored
at the lowest memory address

byte   binary     address
9D H   1001 1101  1000
8F H   1000 1111  1001
CD H   1100 1101  1002
36 H   0011 0110  1003

memory is byte addressable, so each location holds 8 bits

Q. 32bit data : 0x (01 02 03 04) is to be stored at memory locations
starting from 1000; store the data in big endian
format

data : 0x 01 02 03 04 (01 = MSB of the word or big end, 04 = LSB of the word or little end)

big endian : start from the big end; the big end is stored
at the lowest memory address

byte  address
01    1000
02    1001
03    1002
04    1003

memory is byte addressable, so each location holds 8 bits

Q. 32bit data : 0x (AB CD 11 EF) is to be stored at memory
locations starting from 1000; store the data in big
endian format

data : 0x AB CD 11 EF (AB = MSB of the word or big end, EF = LSB of the word or little end)

big endian : start from the big end; the big
end is stored at the lowest memory address

byte  address
AB    1000
CD    1001
11    1002
EF    1003

memory is byte addressable, so each location holds 8 bits

Q. 64bit data : 0x (11 AB 22 CD 33 EF 44 56) is to be stored at
memory locations starting from 2000; store the data
in big endian format.

data : 0x 11 AB 22 CD 33 EF 44 56 (11 = MSB of the word or big end, 56 = LSB of the word or little end)

big endian : start from the big end; the big
end is stored at the lowest memory address

byte  address
11    2000
AB    2001
22    2002
CD    2003
33    2004
EF    2005
44    2006
56    2007

memory is byte addressable, so each location holds 8 bits
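
As a cross-check of both orderings, Python's standard struct module can pack the same 32bit value either way. The start address 1000 is taken from the question above; the variable names are illustrative assumptions.

```python
import struct

value = 0xABCD11EF
big = struct.pack(">I", value)        # ">" = big endian, "I" = 4 byte unsigned
little = struct.pack("<I", value)     # "<" = little endian

# address, byte in big endian layout, byte in little endian layout
for offset in range(4):
    print(1000 + offset, format(big[offset], "02X"), format(little[offset], "02X"))
```

The two layouts hold the same four bytes; only the order of the addresses they occupy is reversed.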

TYPE : 2
when data is already stored in main memory and we have to
write it in little endian and big endian format

little endian : start from the little end; the little end is stored at the
lower address.

big endian : start from the big end; the big end is stored at the lower
address

Q. write in little and big endian format.

data  address
01    2003
02    2002
03    2001
04    2000

step one : write down the data from the lowest address to the highest address
(the addresses above are listed in descending order, so read from the bottom)

data : 0x 04 03 02 01 (04 = MSB of the word or big end, 01 = LSB of the word or little end)

step two : write down in little and big endian

Little endian :
0x 01 02 03 04

big endian :
0x 04 03 02 01

Q. write in little and big endian format.

data  address
BA    2003
09    2002
24    2001
AB    2000

step one : write down the data from the lowest address to the highest address
(the addresses are listed in descending order)

data : 0x AB 24 09 BA (AB = MSB of the word or big end, BA = LSB of the word or little end)

step two : write down in little and big endian

Little endian :
0x BA 09 24 AB

big endian :
0x AB 24 09 BA

Q. write in little and big endian format.

data  address
21    2050
18    2051
17    2052
16    2053

step one : write down the data from the lowest address to the highest address
(the addresses are listed in ascending order)

data : 0x 21 18 17 16 (21 = MSB of the word or big end, 16 = LSB of the word or little end)

step two : write down in little and big endian

Little endian :
0x 16 17 18 21

big endian :
0x 21 18 17 16

Q. Identify the data format.

data  address
21    2050
18    2051
17    2052
16    2053

step one : write down the data (the addresses are listed in ascending order)

data : 0x 21 18 17 16

step two : write down in little and big endian

Little endian :
0x 16 17 18 21

big endian :
0x 21 18 17 16

Q. consider a CPU having memory addresses 0 to 200. 32bit
data is stored at memory location 80 onward.

address   layout 1   layout 2
76
77
78
80        4D         1A
81        3C         2B
82        2B         3C
83        1A         4D
84

TYPE : 3
By looking at the memory layout, how to write the data in little endian
and big endian, step by step.

q. Stored in which mechanism? (big endian or little endian)

address   data
76
77
78
80        1A
81        2B
82        3C
83        4D
84

step one : write down the data from the lowest address to the highest address
(the addresses are listed in ascending order)

data : 0x 1A 2B 3C 4D

step two : write down in little and big endian

Little endian :
0x 4D 3C 2B 1A

big endian :
0x 1A 2B 3C 4D

Conclusion : this data is stored in big endian.

q. Stored in which mechanism? (big endian or little endian)

address   data
76
77
78
80        4D
81        3C
82        2B
83        1A
84

step one : write down the data from the lowest address to the highest address
(the addresses are listed in ascending order)

data : 0x 4D 3C 2B 1A

step two : write down in little and big endian

Little endian :
0x 1A 2B 3C 4D

big endian :
0x 4D 3C 2B 1A

Conclusion : the 32bit data is 0x 1A 2B 3C 4D and its little end (4D) is at the
lowest address, so this data is stored in little endian.
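
The two "stored in which mechanism?" questions can be sketched in Python by reading the bytes from the lowest address upward and interpreting them both ways. The function name detect and the dictionary based memory model are assumptions for illustration only; the known data value 0x1A2B3C4D is the one used in the questions above.

```python
def detect(memory, value):
    """Return 'big', 'little' or None for a {address: byte} layout of `value`."""
    raw = bytes(memory[a] for a in sorted(memory))   # bytes from lowest address up
    for order in ("big", "little"):
        if int.from_bytes(raw, order) == value:
            return order
    return None


print(detect({80: 0x1A, 81: 0x2B, 82: 0x3C, 83: 0x4D}, 0x1A2B3C4D))
print(detect({80: 0x4D, 81: 0x3C, 82: 0x2B, 83: 0x1A}, 0x1A2B3C4D))
```

Without knowing the intended value, a memory layout alone cannot tell the two orderings apart, which is why the value is passed in.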

Practise sheet :

note :
In memory data stored from starting address and in increasing order.
[2000], [2001], [2002]......

Little endian : starts from the little end, and the little end is stored at the lower address
Big endian : starts from the big end, and the big end is stored at the lower address



Points about byte ordering concept :

1. Endianness (little endian or big endian) is decided at processor design time. It
means endianness is a property of the CPU (processor), not a property of main
memory.

2. Endianness does affect (does apply to) the byte ordering of each individual multibyte
data value.

3. Endianness does not affect the ordering of the data items inside a structure
like a string, array or struct. BUT within every individual data item, if it is multibyte
data, the endianness (byte ordering) concept is applied.
- If the data item is 2 bytes or more (multibyte), endianness applies to each item
- If the data item is 1 byte, endianness does not apply
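
Point 3 can be illustrated with a small sketch using Python's standard struct module. The sample record (a 4 character string followed by two 2 byte unsigned integers) is an assumption chosen for illustration.

```python
import struct

text = b"ABCD"                 # 4 single byte characters
numbers = (0x1234, 0x5678)     # two 2 byte unsigned integers

big = struct.pack(">4s2H", text, *numbers)      # big endian record
little = struct.pack("<4s2H", text, *numbers)   # little endian record

print(big.hex(" "))
print(little.hex(" "))
```

In both records the characters and the order of the two integers stay the same; only the two bytes inside each integer are swapped.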

data : 0x 01 02 03 04 (01 = higher byte, 04 = lower byte)

address   big endian   little endian
1000      01           04      <- lower address
1001      02           03
1002      03           02
1003      04           01      <- higher address

big endian : the lower address contains the higher byte
little endian : the lower address contains the lower byte

type : 4
if data is given in little endian then how to write in big endian.

Q. Ox 0001
Ox 6665
Ox 4243
Ox 0100
2byte unsigned integer, stored in little endian format then write
in big endian.

little endian big endian data


1000 1001 1000 1001

66 65 65 66 Ox 66 65

4000 4001 4000 4001

00 01 01 00 Ox 00 01

9000 9001 9000 9001

42 43 43 42 Ox 42 43

11000 11001 11000 11001

01 00 00 01 Ox 01 00

Processors that use

1. little endian
- Intel x86
- VAX
- Alpha

2. Big endian
- IBM 370
- IBM 390
- Motorola 680x0
- Sun SPARC

[Figure: memory layout of one structure on a big endian machine and on a little endian machine, byte addresses 00 onward (hex)]
- 4 byte integer 0x 11 12 13 14 (from address 00) : multibyte, hence the byte ordering changes
- 4 byte integer slot with no data (pad)
- 8 byte double b, 0x 21 22 ... 28 (addresses 08 to 0F) : multibyte, hence the byte ordering changes
- character pointer c, 4 bytes, 0x 31 32 33 34 (from address 10) : multibyte, hence the byte ordering changes
- character array 'A' ... 'G' (addresses 14 to 1A) : each element is 1 byte (not multibyte), hence no change in ordering
- half word e, 16bit = 2 bytes, 0x 51 52 (addresses 1C, 1D; 2 locations are left empty) : multibyte, hence the byte ordering changes
- 4 byte integer 0x 61 62 63 64 (from address 20) : multibyte, hence the byte ordering changes

Q. an array of 2 two-byte integers is stored in a big endian machine at the byte addresses shown
below. what will be its storage pattern in a little endian machine?

Address   Data
0x104     78
0x103     56
0x102     34
0x101     12

it is an array of 2-byte integers

visualization :
element A = 0x 12 34 (addresses 101, 102)
element B = 0x 56 78 (addresses 103, 104)

the data is multibyte, but in 2-byte pairs, so the byte ordering is applied to
A and B individually; the order of A and B in the array does not change

Address   big endian   little endian
104       78           56
103       56           78
102       34           12
101       12           34
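
The per element swap used in this answer can be written as a short Python sketch. The function name swap_elements is an assumption for illustration; the byte values are the ones from the question, listed from address 101 upward.

```python
def swap_elements(raw, element_size):
    """Reverse the bytes inside each element; keep the element order."""
    out = bytearray()
    for i in range(0, len(raw), element_size):
        out += raw[i:i + element_size][::-1]
    return bytes(out)


big = bytes([0x12, 0x34, 0x56, 0x78])        # addresses 101, 102, 103, 104
little = swap_elements(big, 2)
print(little.hex(" "))
```

With element_size = 4 the same bytes would be treated as one 32bit integer and fully reversed, which is why the element size must be known.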

Topic - Pins : the system contains hardware pins to perform operations

(i) active low pin
(ii) active high pin
(iii) time multiplexed pin

(i) active low pin : this pin is enabled when the input is 0, i.e. in the low state

denoted as : the pin name with a bar over it

ex. RD and WR, written with a bar on top

(ii) active high pin : this pin is enabled when the input is 1, i.e. in the high
state
ex. INTR, HLDA, ALE etc.

(iii) time multiplexed pin : this pin carries multiple meanings, but only one
at a time
- address pins are time multiplexed with data pins to carry the address
and the data.

but how to know if it is carrying the address or the data?

- with the help of ALE (address latch enable)
ALE = 1 : the time multiplexed pins carry the address
ALE = 0 : the time multiplexed pins carry the data

advantage : the number of hardware pins is reduced.

1. Address line : address lines are used to carry the address (CPU to memory); they are
unidirectional.

note :
based on the address lines we can determine the capacity of memory [number of main
memory locations/cells]

ex. 8085 processor [40 pin IC]

AD0-AD7, A8-A15
A : address : 16 bits (AD0-AD7 and A8-A15)
D : data : 8 bits (AD0-AD7)
16bit address
2^16 cells/locations = 64K cells
Bonus : the work of 16 pins (8 address + 8 data) is done by the 8 multiplexed pins AD0-AD7

ex. 8086 processor

AD0-AD15, A16-A19
A : address : 20 bits
D : data : 16 bits
20bit address
2^20 cells/locations = 1M cells
Bonus : the work of 32 pins (16 address + 16 data) is done by the 16 multiplexed pins AD0-AD15

2. Data line : data lines are used to carry the data (between CPU and memory); they are
bi-directional

note :
• based on the data lines we can determine the word length of the processor or CPU
• the performance of the processor is measured by the word length of the CPU.

ex. 8085 processor
AD0-AD7, A8-A15
A : address : 16 bits
D : data : 8 bits
word length = 8bit
operations are performed on 8bit data

ex. 8086 processor
AD0-AD15, A16-A19
A : address : 20 bits
D : data : 16 bits
word length = 16bit
operations are performed on 16bit data



chapter - 2 (machine instruction and addressing modes)
Thursday, July 11, 2024 5:02 PM

machine instruction and addressing modes

topic - Instruction : an instruction is a binary code (binary sequence) which is designed inside
the processor to perform some operation.

binary sequence - bind with - operations

an instruction is made up of two things

OPCODE | OPERAND (OR) ADDRESS FIELD [OPERAND REFERENCE]

- OPCODE : type of operation
- OPERAND / ADDRESS FIELD : either we get the operand itself or its address
  - memory address field
  - register address field
  - constant value (immediate)

1. OPCODE : operation code; it tells us the type of operation

- if the opcode is 2bit then 2^2 = 4 operations can be performed

assume 00 ADD [addition]
01 SUB [subtract]
10 MUL [multiply]
11 LOAD [load data from memory]

- if the opcode is 3bit then 2^3 = 8 operations can be performed

assume 000 OR
001 AND
010 NAND
011 XOR
100 ADD
101 LOAD
110 STORE
111 MUL
note :
an n bit opcode can encode 2^n operations

Chapter - 2 (introduction format and addressing Page 69


2. OPERAND : data
(OR)
OPERAND : operand reference (address field), which gives the address of the operand

Address field : an n bit address field can specify 2^n memory cells (locations)
1KB memory then address : log2(1K) = log2(2^10) = 10bit

n bit address line

word addressable : 2^n words, ex. 2^15 words = 32K words
byte addressable : 2^n bytes, ex. 2^15 bytes = 32K bytes

2^10 = 1K (kilo)
2^20 = 1M (mega)
2^30 = 1G (giga)
2^40 = 1T (tera)
2^50 = 1P (peta)
2^60 = 1E (exa)
2^70 = 1Z (zetta)
2^80 = 1Y (yotta)
1 byte = 8 bit
1 Nibble = 4 bit

1. if the OPCODE field size is given, how many operations can be performed?

OPCODE field : operations performed
1bit = 2^1 = 2
2bit = 2^2 = 4
3bit = 2^3 = 8
4bit = 2^4 = 16
5bit = 2^5 = 32
6bit = 2^6 = 64
7bit = 2^7 = 128
8bit = 2^8 = 256
9bit = 2^9 = 512
10bit = 2^10 = 1024

2. if the OPCODE is 6 bit, how many operations can be performed?

6bit = 2^6 = 64

3. if 100 operations are performed, how many bits are in the OPCODE field?

n = 7bits
because 2^6 = 64 is less than 100
and 2^7 = 128 is at least 100
hence, 7bits.

4. if memory is 256K byte, what is the address field?

256KB = 2^8 x 2^10 byte = 2^18 byte
Address field : 18bit

5. if the processor has 30 registers, how many bits are required to represent a register?

2^n >= 30, so n = 5 (2^5 = 32)
register address field = 5bits
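
The rule used in these questions (take the smallest n with 2^n at least the required count) can be sketched in Python. The helper name field_bits is an assumption for illustration.

```python
def field_bits(count):
    """Smallest n with 2**n >= count (bits needed to encode `count` choices)."""
    return (count - 1).bit_length()


# 100 operations, 30 registers, 64 operations, 256K byte memory
for count in (100, 30, 64, 256 * 1024):
    print(count, field_bits(count))
```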

total number of operations / instructions     number of bits in OPCODE field
100                                           7bit
55                                            6bit
210                                           8bit
30                                            5bit
50                                            6bit
90                                            7bit
25 registers                                  5bit
50 registers                                  6bit
110 operations                                7bit
13bit address field                           2^13 = 8K bytes / 8K words
8bit OPCODE                                   2^8 = 256 operations
200 operations                                OPCODE : 8bit
8 operations / instructions                   OPCODE : 3bit

128 byte memory = 2^7 byte = 7bit address field (AF)
instruction size = 16bit

1. for 2AF
---------16bit--------
OPCODE  AF1   AF2
2bit    7bit  7bit
OPCODE = 16 - 14 = 2bit
total operations = 2^2 = 4 operations

2. for 1AF
------16bit-----
OPCODE  AF1
9bit    7bit
OPCODE = 16 - 7 = 9bit
total operations = 2^9 = 512 operations

100 operations : OPCODE = 7bit
memory = 1MB = 2^20 byte : 20bit address field

OPCODE  AF1    AF2    AF3
7bit    20bit  20bit  20bit
instruction length = 7 + 20 + 20 + 20
= 67bits

110 operations / instructions : OPCODE = 7bit
50 registers : register address field = 6bit
memory = 512KB = 2^19 byte : 19bit address field
immediate field = 13bit

OPCODE  reg AF  reg AF  memory AF  immediate field
7bit    6bit    6bit    19bit      13bit
instruction length = 7 + 6 + 6 + 19 + 13
= 51bits = 51/8 = 6.375 bytes, rounded up (ceiling) to 7 bytes
1 instruction size : 7 bytes
program contains 300 instructions
program size = 300 x 7 bytes = 2100 bytes
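
The second calculation above can be reproduced with a short Python sketch. The helper field_bits and the variable names are assumptions for illustration; the numbers (110 operations, 50 registers, 512KB memory, 13bit immediate, 300 instructions) are the ones given in the notes.

```python
import math


def field_bits(count):
    """Smallest n with 2**n >= count."""
    return (count - 1).bit_length()


opcode = field_bits(110)            # 7 bits
register = field_bits(50)           # 6 bits per register field
memory = field_bits(512 * 1024)     # 19 bits for a 512KB byte addressable memory
immediate = 13

bits = opcode + 2 * register + memory + immediate
size_bytes = math.ceil(bits / 8)    # an instruction occupies whole bytes
program_bytes = 300 * size_bytes
print(bits, size_bytes, program_bytes)
```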

16bits

40 operations / instructions : OPCODE = 6bit
24 registers : register address field = 5bit

OPCODE  reg AF  reg AF  immediate field
6bit    5bit    5bit    ?
immediate = 32 - (6+5+5)
= 16bits

Topic : Instruction format / Instruction set architecture

an instruction is made up of two things

OPCODE | OPERAND (OR) ADDRESS FIELD [OPERAND REFERENCE]

the operand reference can be :
- a memory address field
- a register address field (its size depends on the number of registers, i.e. the register file size)
- a constant value (immediate)

example :
-------------------16bits-------------------
OPCODE  OPERAND REFERENCE  OPERAND REFERENCE
4bits   6bits              6bits

total operations : 2^4 = 16 operations/instructions

The instruction set architecture (ISA) is the part of the processor that is visible to the programmer or
compiler writer. The ISA serves as a boundary between software and hardware. Software is
converted to machine instructions using software (a compiler); then the instructions are executed
using hardware.

An ISA contains :
- the functional definition of storage locations (registers, memory) and operations (add,
multiply, branch, load, store, etc.)
- a precise description of how to invoke and access them

An ISA does not contain :
- how operations are implemented
- which operations are fast and which are slow
- which operations take more power and which take less.

ADD : destination, source 1, source 2

1. Number of explicit operands : how many operands are available

format        fields               addresses besides OPCODE   example        meaning
3 AI / 3 AF   OPCODE AF1 AF2 AF3   3                          ADD R1 R2 R3   R1 <- R2 + R3
2 AI / 2 AF   OPCODE AF1 AF2       2                          ADD R1 R2      R1 <- R1 + R2
1 AI / 1 AF   OPCODE AF1           1                          ADD R1         AC <- AC + R1
0 AI / 0 AF   OPCODE               0                          ADD            TOS <- TOS + TOS

2. Location of the operands : where is the data? registers, accumulator, memory or stack?

based on the availability of the ALU operands :

organisation                     format        fields               example        meaning
general register organisation    3 AI / 3 AF   OPCODE AF1 AF2 AF3   ADD R1 R2 R3   R1 <- R2 + R3
general register organisation    2 AI / 2 AF   OPCODE AF1 AF2       ADD R1 R2      R1 <- R1 + R2
accumulator based organisation   1 AI / 1 AF   OPCODE AF1           ADD R1         AC <- AC + R1
stack based organisation         0 AI / 0 AF   OPCODE               ADD            TOS <- TOS + TOS

3. Specification of the operands : the specific location of the operand (same table as above)

4. sizes of operand supported : sizes the operand support


- byte (8-bits)
- half-word (16bits)
- word (32-bits)
- double (64-bits)

5. supported operations :

- ADD
- SUB
- MUL
- AND
- OR
- CMP
- MOVE

(i) stack based organisation : in the stack based organisation ALU operations are performed
on stack data. The ALU operands are found in the stack, and the result goes there too.

so both operands (data) must be present in the
stack, and after processing the result is also stored in the stack.

a stack CPU has "0" (zero) explicit operands (or) all operands are
implicit (there is no need to pass an address separately; both
operands are found in the stack itself). that means the ALU operands
(data) must be present in the stack before the ALU operation.
(we simply write ADD)

- what is a stack?
a stack is a block (part) of memory in RAM; to control it, the CPU keeps a stack pointer
register.

- what does the stack pointer (SP) register do?
the stack pointer register points to the top of the stack (TOS) address.

- what is the top of the stack?
in the stack, insert and delete operations are performed at one end (the same end), called the top
of the stack, and this is the default address in the stack pointer register.

ADD R1 R2 R3 : R1 <- R2 + R3 (destination, source 1, source 2)
in a stack CPU all three are in the stack, at the top of the stack :
ADD : TOS <- TOS + TOS

[Figure: stack based organisation. The ALU operands come from the TOS and the result goes back to the TOS. The stack is a part of main memory; it is drawn separately only for visualisation.]

Push 100 : stack = 100 (TOS)
Push 200 : stack = 100, 200 (TOS)
Add : pop 200 and 100 from the stack, (POP) + (POP) = 300, push 300
stack = 300 (TOS)

In depth explanation : (memory : A = 100, B = 600)

I1 : PUSH A : push the value present at memory location A onto the top of the
stack
TOS <- M[A] (put the data of memory location A into the TOS)
stack : 100 (TOS)

I2 : PUSH B : push the value present at memory location B onto the top of the
stack
TOS <- M[B] (put the data of memory location B into the TOS)
stack : 100, 600 (TOS)

I3 : ADD : pop the top two elements from the
stack, perform the addition and save the result back to the top of the
stack.
ADD : 100 + 600 = 700 (TOS)
stack : 700 (TOS)

I4 : POP C : store the value (result) present at the top of the stack
to memory location C
M[C] <- TOS
stack : (empty), memory : C = 700

note:
- in a stack CPU, ALU operations have 0 address fields
- in a stack CPU, data transfer operations (PUSH, POP) do not have 0 address fields; they have 1
address field.

so the program is written directly as :
- PUSH A
- PUSH B
- ADD
- POP C

Q. (X * Y) + Z : how many machine instructions are required to execute this using stack
based organisation?

sol.
PUSH X : X is pushed onto the TOS; stack : X (TOS)
PUSH Y : Y is pushed onto the TOS; stack : X, Y (TOS)
MUL : X and Y are popped and multiplied; stack : XY (TOS)
PUSH Z : Z is pushed onto the TOS; stack : XY, Z (TOS)
ADD : XY and Z are popped and added; stack : XY + Z (TOS)

5 instructions are required (the result stays at the top of the stack).

I1: PUSH A
I2: PUSH B
I3: ADD
I4: PUSH C
I5: PUSH D
I6: ADD
I7: MUL
I8: POP X

after I8 the stack is EMPTY, because we popped the result from the stack

OPCODE = 1 word = 4byte (32bit)
address = 24 bit = 3byte (8+8+8)

instruction     size
I1: PUSH A      4B + 3B = 7B   (operation + address)
I2: PUSH B      4B + 3B = 7B
I3: ADD         4B             (only operation)
I4: PUSH C      4B + 3B = 7B
I5: PUSH D      4B + 3B = 7B
I6: ADD         4B
I7: MUL         4B
I8: POP X       4B + 3B = 7B
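
The eight instruction program above can be traced with a minimal stack machine sketch in Python. The function run_stack, the tuple encoding of instructions and the sample values of A, B, C and D are all assumptions made for illustration; the 4 byte opcode and 3 byte address sizes are the ones given in the notes.

```python
def run_stack(program, memory):
    """Execute (opcode, [address]) tuples on a stack; returns the final stack."""
    stack = []
    for op, *address in program:
        if op == "PUSH":                      # TOS <- M[address]
            stack.append(memory[address[0]])
        elif op == "POP":                     # M[address] <- TOS
            memory[address[0]] = stack.pop()
        else:                                 # ADD / MUL: pop two, push result
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
    return stack


memory = {"A": 1, "B": 2, "C": 3, "D": 4}     # assumed sample values
program = [("PUSH", "A"), ("PUSH", "B"), ("ADD",),
           ("PUSH", "C"), ("PUSH", "D"), ("ADD",),
           ("MUL",), ("POP", "X")]
stack = run_stack(program, memory)

# 4 byte opcode, plus a 3 byte address for PUSH and POP only
program_bytes = sum(4 + 3 * (len(instruction) - 1) for instruction in program)
print(memory["X"], stack, program_bytes)
```

Only PUSH and POP carry an address, which matches the note that ALU operations are 0 address instructions in a stack CPU.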

250 registers : register address field = 8bit (2^8 = 256)
180 operations : OPCODE = 8bit (2^8 = 256)
memory address field = 40bit
immediate field = 16bit
word size (data) = 24bit = 3byte

OPCODE  REG AF  MEM AF  IMMEDIATE FIELD
8bit    8bit    40bit   16bit

hence, length of instruction = 8+8+40+16 = 72bits = 9byte.

one instruction size = 9byte
program contains 400 instructions : program size
in bytes = 400 x 9 = 3,600 bytes
in words (3byte) = 3600/3 = 1200 words

--------------------------32bit-------------------------
OPCODE  REG1  REG2  IMMEDIATE
6bit    6bit  6bit  14bit

1 word long instruction, hence instruction size = 32bits
it has 64 registers : reg AF = 6bits
45 instructions / operations : OPCODE = 6bit

immediate field = 32 - (6+6+6)
= 14bits

immediate unsigned = 14bits
unsigned range = 0 to 2^n - 1
= 0 to 2^14 - 1
= 0 to 16383
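
The leftover bits and the unsigned range can be computed with a small Python sketch. The function name immediate_range and its parameters are illustrative assumptions; the field sizes are those of the 32bit example above.

```python
def immediate_range(instruction_bits, opcode_bits, register_bits, register_fields):
    """Return (immediate bits, minimum, maximum) for an unsigned immediate."""
    bits = instruction_bits - opcode_bits - register_fields * register_bits
    return bits, 0, 2 ** bits - 1


print(immediate_range(32, 6, 6, 2))
```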

Instruction format

OPCODE OPERAND (OR) OPERAND REFERENCE

single accumulator organisation

- in the accumulator based organisation the first ALU operand is always present in the accumulator
register and the second ALU operand is present in memory.

- after processing, the result is also stored in the accumulator (OR) the accumulator is used as the
destination to store the result of the ALU operation.

- the accumulator is a register in the CPU which is associated with the ALU.

- the operand in the accumulator is loaded from memory using the LOAD
instruction, and the result is stored to memory from the accumulator using the STORE
instruction.

note :
here 1 operand is implicit (present in the accumulator) and 1 operand is explicit (present at the
memory address)

format : 1 AI / 1 AF : OPCODE | ADDRESS FIELD

(i) ALU operation : AC <- AC + MEM (destination <- source 1 + source 2)
ex. ADD B ; AC <- AC + M[B]

(ii) data transfer (destination <- source) :
Memory <- AC [STORE]
AC <- Memory [LOAD]
ex. STORE [7000] ; M[7000] <- AC
ex. LOAD [6000] ; AC <- M[6000]

[Figure: single accumulator organisation. The AC and main memory supply the ALU operands; the result goes back to the AC.]

Accumulator based org.

ex.1 A+B using AC-CPU

I1 : LOAD A ; AC <- M[A]
I2 : ADD B ; AC <- AC + M[B]

2 instructions using accumulator CPU

Chapter - 2 (introduction format and addressing Page 85


ex. C=A+B using AC. CPU WHERE A, B, C are the memory address

I1: LOAD A; AC M[A]; AC 100

I2: ADD B; AC AC +M[B]; AC 500

I3: STORE C; M[C] AC; M[C] 600

I1: LOAD A; AC M[A]; AC 100


100

A 100
B 500 load the content (data) of
memory location/address (A)
to the accumulator

I2: ADD B; AC AC +M[B]; AC 500


100 500

A 100
B 500 fetch the content (data) from memory location
(address) B and add that contain with accumulator
(AC) data or value present in the accumulator and
save the result back to the Accumulator

I3: STORE C; M[C] AC; M[C] 600

Chapter - 2 (introduction format and addressing Page 86


A 100

B 500
store the value (data) from the
accumulator (AC) to the
memory location C.

C 600

Definitions

I1 : LOAD A ; AC <- M[A] ; load the content (data) of memory location (address) A into the
accumulator

I2 : ADD B ; AC <- AC + M[B] ; fetch the content (data) of memory location (address) B, add it to
the value present in the accumulator (AC), and save the result back in the accumulator

I3 : STORE C ; M[C] <- AC ; store the value (data) from the accumulator (AC) into memory
location C
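
the LOAD / ADD / STORE sequence above can be traced with a small simulator. this is only a
minimal sketch in Python and not part of the original notes : the function name
`run_accumulator` and the dictionary used as memory are illustrative assumptions, and only LOAD,
ADD, MUL and STORE are modelled.

```python
def run_accumulator(program, memory):
    """Run (opcode, address) pairs on a single-accumulator machine and return AC."""
    ac = 0
    for opcode, address in program:
        if opcode == "LOAD":        # AC <- M[address]
            ac = memory[address]
        elif opcode == "ADD":       # AC <- AC + M[address]
            ac = ac + memory[address]
        elif opcode == "MUL":       # AC <- AC * M[address]
            ac = ac * memory[address]
        elif opcode == "STORE":     # M[address] <- AC
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return ac


# C = A + B with M[A] = 100 and M[B] = 500, as in the example above
memory = {"A": 100, "B": 500, "C": 0}
program = [("LOAD", "A"), ("ADD", "B"), ("STORE", "C")]
final_ac = run_accumulator(program, memory)
print(final_ac, memory["C"])
```

every instruction carries exactly one address, which is why this organisation is called 1 AI : the
second operand and the destination are always the implicit AC.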

ex.3 Z = (X * Y)

I1 : LOAD X ; AC <- M[X]

I2 : MUL Y ; AC <- AC * M[Y]

I3 : STORE Z ; M[Z] <- AC

3 instructions using accumulator CPU

ex.4 (X * Y) + Z ; how many machine instructions are required using AC-CPU?

I1 : LOAD X ; AC <- M[X]

I2 : MUL Y ; AC <- AC * M[Y]

I3 : ADD Z ; AC <- AC + M[Z]

3 instructions using accumulator CPU

if the result must also be stored in memory location A :

I4 : STORE A ; M[A] <- AC

4 instructions using accumulator CPU

#memory spills : storing an intermediate result in memory

ex.5 X = [(A + B) * (C + D)] ; A, B, C, D and X are variables in memory.

I1 : LOAD A ; AC <- M[A]

I2 : ADD B ; AC <- AC + M[B]

I3 : STORE Temp ; M[Temp] <- AC
(memory spill : the intermediate result A + B is saved in a temporary location before another
variable is loaded; spills : 1)

I4 : LOAD C ; AC <- M[C]

I5 : ADD D ; AC <- AC + M[D]

I6 : MUL Temp ; AC <- AC * M[Temp]

I7 : STORE X ; M[X] <- AC
(not a spill : execution is complete and the question asks for the final result to be stored)

7 instructions using accumulator CPU

memory spills : 1
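
the spill in ex.5 can be reproduced mechanically. the sketch below is a hedged illustration, not
the method used in these notes : it assumes expressions are written as nested tuples
`(op, left, right)`, that only the commutative operators `+` and `*` appear, and that temporaries
are named `Temp1`, `Temp2`, ... (all hypothetical names).

```python
MNEMONIC = {"+": "ADD", "*": "MUL"}


def gen(expr, code, temps):
    """Emit accumulator instructions that leave the value of expr in AC."""
    if isinstance(expr, str):          # a plain variable: just load it
        code.append(("LOAD", expr))
        return
    op, left, right = expr
    if isinstance(right, str):         # right operand comes straight from memory
        gen(left, code, temps)
        code.append((MNEMONIC[op], right))
    elif isinstance(left, str):        # + and * commute, so evaluate the right side first
        gen(right, code, temps)
        code.append((MNEMONIC[op], left))
    else:                              # both sides are sub-expressions: spill the left result
        gen(left, code, temps)
        temp = f"Temp{len(temps) + 1}"
        temps.append(temp)
        code.append(("STORE", temp))
        gen(right, code, temps)
        code.append((MNEMONIC[op], temp))


def compile_assignment(target, expr):
    """Return (instruction list, number of memory spills) for target = expr."""
    code, temps = [], []
    gen(expr, code, temps)
    code.append(("STORE", target))     # final result; not counted as a spill
    return code, len(temps)


code, spills = compile_assignment("X", ("*", ("+", "A", "B"), ("+", "C", "D")))
for number, (opcode, address) in enumerate(code, start=1):
    print(f"I{number} : {opcode} {address}")
print("memory spills :", spills)
```

a spill is needed only when both operands of an operation are themselves sub-expressions, because
the single AC cannot hold two intermediate results at once.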

stack based :
- ALU operands : implicit (stack)
- ALU operation : 0 AI
- data transfer : 1 AI

accumulator based :
- 1 ALU operand is implicit (accumulator), the 2nd operand is in memory, and the result is stored
in the AC.

General register organisation

register based organisation has three types :
(i) register - memory    (ii) memory - memory    (iii) register - register

(i) register - memory : in this organisation the first ALU operand is present in a register
(general purpose register), the second ALU operand is present in memory, and after
processing the result is stored in a register.

this architecture differs from the accumulator architecture in that it has multiple registers
to hold the operands and the result.

instruction format (2 AI / 2 AF) :

OPCODE | ADDRESS 1 | ADDRESS 2

the OPCODE gives the type of operation.

(i) ALU operation : reg. <- reg. + MEM
(destination : register, source 1 : register, source 2 : memory)

ADD R1 [7000] ; R1 <- R1 + M[7000]

(ii) data transfer : register <- memory [LOAD]
memory <- register [STORE]
(destination on the left, source on the right)

- LOAD R1 [6000] ; R1 <- M[6000]

- STORE [9000] R2 ; M[9000] <- R2

ex. ADD R1 [4000]

R1 <- R1 + M[4000] (R1 gets R1 + the data at memory location 4000)
(destination : register, source 1 : register, source 2 : memory)
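#register file : the set of registers supported by the processor (R1, R2, R3, ...)

[figure : register-memory organisation. the register file (R1, R2, R3, ...) and main memory feed
the ALU, and the result goes back to the register file.]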

ex.1 Z = (X + Y) using register-memory CPU

I1 : LOAD R1 X ; R1 <- M[X]

I2 : ADD R1 Y ; R1 <- R1 + M[Y]

I3 : STORE Z R1 ; M[Z] <- R1

3 machine instructions using register-memory CPU

ex.2 (A + B) using register-memory CPU

I1 : LOAD R1 A ; R1 <- M[A]

I2 : ADD R1 B ; R1 <- R1 + M[B]

2 machine instructions using register-memory CPU

ex.3 (X * Y) + Z ; X, Y and Z are variables in memory (memory addresses)

I1 : LOAD R1 X ; R1 <- M[X]        (also written MOV R1 X)

I2 : MUL R1 Y ; R1 <- R1 * M[Y]

I3 : ADD R1 Z ; R1 <- R1 + M[Z]

3 machine instructions using register-memory CPU

(ii) memory - memory : all the operands must be present in memory
(not covered in these notes)

ADD A, B, C
M[A] <- M[B] + M[C]

(iii) register - register : in this architecture all the ALU operands must be present in
registers (RISC style; the fastest of the three).

- this architecture supports a larger number of registers.

instruction format :

data transfer (2 AI / 2 AF) : OPCODE | ADDRESS 1 | ADDRESS 2
ALU operation (as used in the examples of these notes) : OPCODE | destination register |
source 1 register | source 2 register

(i) ALU operation : regz <- regx + regy
(destination : register, source 1 : register, source 2 : register)

ADD R3 R1 R2 ; R3 <- R1 + R2

(ii) data transfer : register <- memory [LOAD]
memory <- register [STORE]
(destination on the left, source on the right)

- LOAD R1 [6000] ; R1 <- M[6000]

note :
arithmetic operations do not access memory; only LOAD and STORE instructions are used to
access memory.

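[figure : register-register organisation. both ALU operands come from the register file
(R1, R2, R3, ...) and the result goes back to it; main memory is reached only through LOAD and
STORE.]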

ex.1 Z = (X + Y) using register-register CPU

I1 : LOAD R1 X ; R1 <- M[X]

I2 : LOAD R2 Y ; R2 <- M[Y]

I3 : ADD R3 R1 R2 ; R3 <- R1 + R2

I4 : STORE Z R3 ; M[Z] <- R3

4 machine instructions using register-register CPU

ex.2 C = A + B using register-register CPU, where A, B and C are memory addresses

I1 : LOAD R1 A ; R1 <- M[A]

I2 : LOAD R2 B ; R2 <- M[B]

I3 : ADD R3 R1 R2 ; R3 <- R1 + R2

I4 : STORE C R3 ; M[C] <- R3

ex.3 (X * Y) + Z ; register-register CPU

I1 : LOAD R1 X ; R1 <- M[X]

I2 : LOAD R2 Y ; R2 <- M[Y]

I3 : MUL R3 R1 R2 ; R3 <- R1 * R2

I4 : LOAD R4 Z ; R4 <- M[Z]

I5 : ADD R5 R3 R4 ; R5 <- R3 + R4

5 machine instructions using register-register CPU

ex.4 X = (A + B) * (C + D) ; register-register CPU

I1 : LOAD R1 A ; R1 <- M[A]

I2 : LOAD R2 B ; R2 <- M[B]

I3 : ADD R3 R1 R2 ; R3 <- R1 + R2

I4 : LOAD R4 C ; R4 <- M[C]

I5 : LOAD R5 D ; R5 <- M[D]

I6 : ADD R6 R4 R5 ; R6 <- R4 + R5

I7 : MUL R7 R3 R6 ; R7 <- R3 * R6

I8 : STORE X R7 ; M[X] <- R7

what is the minimum number of registers required?

I1 : LOAD R1 A ; R1 <- M[A]

I2 : LOAD R2 B ; R2 <- M[B]

I3 : ADD R1 R1 R2 ; R1 <- A + B

I4 : LOAD R2 C ; R2 <- M[C]

I5 : LOAD R3 D ; R3 <- M[D]

I6 : ADD R2 R2 R3 ; R2 <- C + D

I7 : MUL R3 R1 R2 ; R3 <- R1 * R2

I8 : STORE X R3 ; M[X] <- R3

3 registers
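
the count of 3 can be cross-checked with Sethi-Ullman style labelling, a standard compiler
technique that these notes do not use themselves. the sketch assumes a load-store machine in
which every variable must first be loaded into a register and a destination register may reuse
one of its source registers (as ADD R2 R2 R3 does above); the tuple encoding and the name
`min_registers` are illustrative assumptions.

```python
def min_registers(expr):
    """expr is a variable name (str) or a tuple (op, left, right)."""
    if isinstance(expr, str):
        return 1                      # a variable must be loaded into one register
    _, left, right = expr
    need_left = min_registers(left)
    need_right = min_registers(right)
    if need_left == need_right:
        return need_left + 1          # one extra register holds the first result
    return max(need_left, need_right) # evaluate the more demanding side first


product_of_sums = ("*", ("+", "A", "B"), ("+", "C", "D"))
needed = min_registers(product_of_sums)
print(needed)
```

each sum needs 2 registers, and while the second sum is being computed the first result occupies
a third register, which matches the hand-written 8-instruction sequence.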

instruction format : OPCODE | REG1 | REG2 | MEM | IMMEDIATE

OPCODE : 110 instructions, so OPCODE = 7 bit
REG1, REG2 : 50 registers, so register AF = 6 bit each
MEM : 512 KB memory, so memory AF = 19 bit
IMMEDIATE : 13 bit

instruction size = 7 + 6 + 6 + 19 + 13 = 51 bit = 7 byte (rounded up to whole bytes)

program contains 300 instructions

program size : 300 x 7 = 2,100 byte.

instruction format : OPCODE | REG1 | REG2 | REG3 | IMMEDIATE

OPCODE : 12 instructions, so OPCODE = 4 bit
REG1, REG2, REG3 : 64 registers, so register AF = 6 bit each
IMMEDIATE : 12 bit

instruction size = 4 + 6 + 6 + 6 + 12 = 34 bit = 5 byte (rounded up to whole bytes)

program contains 100 instructions

program size : 100 x 5 = 500 byte.
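
the field widths and program sizes above follow one rule : a field that must distinguish N items
needs ceil(log2 N) bits, and the instruction is rounded up to whole bytes. a small sketch to
check the arithmetic; the helper names `bits_for` and `instruction_bytes` are made up for
illustration, and byte-addressable memory is assumed.

```python
def bits_for(count):
    """Minimum number of bits that gives `count` items distinct codes."""
    return max(1, (count - 1).bit_length())


def instruction_bytes(field_bits):
    """Return (total bits, bytes after rounding up) for a list of field widths."""
    total = sum(field_bits)
    return total, (total + 7) // 8


# first format: 110 opcodes, 50 registers, 512 KB memory, 13-bit immediate
first = [bits_for(110), bits_for(50), bits_for(50), bits_for(512 * 1024), 13]
first_bits, first_bytes = instruction_bytes(first)
print(first, first_bits, first_bytes, 300 * first_bytes)

# second format: 12 opcodes, three fields over 64 registers, 12-bit immediate
second = [bits_for(12), bits_for(64), bits_for(64), bits_for(64), 12]
second_bits, second_bytes = instruction_bytes(second)
print(second, second_bits, second_bytes, 100 * second_bytes)
```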

16 bit instruction, 2 AF : OPCODE (2 bit) | AF1 (7 bit) | AF2 (7 bit)

2 AF : total no. of operations = 2^2 = 4 operations

16 bit instruction, 1 AF : OPCODE (9 bit) | AF1 (7 bit)

1 AF : total no. of operations = 2^9 = 512 operations

expand opcode technique

the expand opcode technique is required in a CPU design that supports fixed length instructions,
in order to implement various instructions with different formats (that is, to support operations
of different formats).

variable length instruction /           fixed length instruction /
fixed length opcode                     variable length opcode

OPCODE = 8 bit, AF = 8 bit              OPCODE = 8 bit, AF = 8 bit

1 AF : OPCODE (8) | AF (8) = 16 bit     1 AF : OPCODE (8) | AF (8) = 16 bit

0 AF : OPCODE (8) = 8 bit               0 AF : OPCODE (16) = 16 bit

the instruction length is variable      the instruction length is fixed, so all 16 bits
                                        are given to the opcode

variable length instruction : fixed size OPCODE

fixed length instruction : variable size OPCODE

in the expand opcode technique we start from the primitive instruction (the format with the
fewest OPCODE bits) and move on in increasing order of OPCODE bits.

assume these categories :

(i) primitive instruction : lowest (smallest) number of bits in OPCODE.

(ii) derived instruction : more bits in OPCODE.

(iii) further derived instruction : highest number of bits in OPCODE.

step 1 : identify the primitive instruction (lowest number of OPCODE bits) for the given CPU

step 2 : calculate the total number of possible operations

step 3 : identify the free opcodes after allocating the existing instructions

step 4 : calculate the number of derived instructions possible :

free opcodes x 2^(increment in OPCODE bits)

6 bit instruction with two formats :

1 AF : OPCODE (2 bit) | AF (4 bit)      primitive instruction (type 1), lowest OPCODE bits
0 AF : OPCODE (6 bit)                   derived instruction (type 2)

given : number of 1-address instructions (AI) = 2

step 1 : find the primitive instruction (type 1) : first write down the formats that are given
(here 1 AF and 0 AF), then compare the OPCODE bits; the format with fewer OPCODE bits is the
primitive one.

step 2 : total number of possible operations : 2^2 = 4

step 3 : number of free OPCODEs after allocating the 1-address instructions : 4 - 2 = 2

step 4 : number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
increment in OPCODE bits : the OPCODE was 2 bit and is now 6 bit, so 6 - 2 = 4
= 2 x 2^(6-2) = 2 x 2^4 = 32

working of the concept

1 AF : OPCODE (2 bit) | AF (4 bit)
0 AF : the same 6 bits, where the 4 AF bits are merged into the OPCODE

total operations with the 2 bit OPCODE = 2^2 = 4 :

00  USED (given : 2)
01  USED
10  FREE
11  FREE

how many combinations can the 2 free OPCODEs form? each one is combined with all 2^4 patterns of
the remaining 4 bits :

10 0000        11 0000
10 0001        11 0001
  ...            ...
10 1110        11 1110
10 1111        11 1111
(16)           (16)

if no OPCODE had been used it would be 4 x 2^4, but 2 are already used here, so it is
2 x 2^4 = 32.
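
the 10xxxx / 11xxxx patterns can also be listed by a few lines of Python. this is only an
illustrative sketch : it assumes the used short opcodes are the lowest codes (00 and 01 here),
and the function name `derived_opcodes` is hypothetical.

```python
def derived_opcodes(old_bits, used, new_bits):
    """List the longer opcodes built from the short opcodes that are still free."""
    extra = new_bits - old_bits                 # increment in OPCODE bits
    free_prefixes = range(used, 2 ** old_bits)  # assumption: codes 0..used-1 are taken
    return [
        format((prefix << extra) | suffix, f"0{new_bits}b")
        for prefix in free_prefixes
        for suffix in range(2 ** extra)
    ]


codes = derived_opcodes(old_bits=2, used=2, new_bits=6)
print(len(codes), codes[0], codes[15], codes[16], codes[-1])
```

the length of the list is free opcodes x 2^(increment in OPCODE bits), which is the step 4
formula.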

1 word = 8 bit (1 byte)

memory is 256 words, hence 256 byte of memory = 2^8, so AF = 8 bit
instruction size = 3 words = 3 x 8 bit = 24 bit

type 1 (primitive)       : OPCODE (8 bit) | AF1 (8 bit) | AF2 (8 bit)
type 2 (derived)         : OPCODE (16 bit) | AF1 (8 bit)
type 3 (further derived) : OPCODE (24 bit)

type : 1

given : number of 2-address instructions (AI) = 254

step 2 : total number of possible operations : 2^8 = 256

step 3 : number of free OPCODEs after allocating the 2-address instructions : 256 - 254 = 2

type : 2

given : number of 1-address instructions (AI) = 256 = 2^8

number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
= 2 x 2^(16-8)
= 2 x 2^8
= 512

number of free OPCODEs after allocating the 1-address instructions : 512 - 256 = 256

type : 3

number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
= 256 x 2^(24-16)
= 256 x 2^8
= 2^16
= 65536.


16 bit instruction
registers : 15, so register address field = 4 bit (2^4 = 16 >= 15)

type 1 (primitive)       : OPCODE (8 bit) | AF1 (4 bit) | AF2 (4 bit)
type 2 (derived)         : OPCODE (12 bit) | AF1 (4 bit)
type 3 (further derived) : OPCODE (16 bit)

type : 1

given : number of 2-address instructions (AI) = x

step 2 : total number of possible operations : 2^8 = 256

step 3 : number of free OPCODEs after allocating the 2-address instructions : 256 - x

type : 2

number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
= (256 - x) x 2^(12-8)
= (256 - x) x 2^4
= (256 - x) x 16

equating this to 256 (the number of 1-address instructions implied by the next step) :
(256 - x) x 16 = 256
256 - x = 16
x = 256 - 16
x = 240.

11 bit instruction
address field : 4 bit

type 1 (primitive)       : OPCODE (3 bit) | AF1 (4 bit) | AF2 (4 bit)
type 2 (derived)         : OPCODE (7 bit) | AF1 (4 bit)
type 3 (further derived) : OPCODE (11 bit)

type : 1

given : number of 2-address instructions (AI) = 5

step 2 : total number of possible operations : 2^3 = 8

step 3 : number of free OPCODEs after allocating the 2-address instructions : 8 - 5 = 3

type : 2

number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
= 3 x 2^(7-3)
= 3 x 2^4
= 3 x 16
= 48

given : number of 1-address instructions = 32
number of free OPCODEs after allocating the 1-address instructions : 48 - 32 = 16

type : 3

number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
= 16 x 2^(11-7)
= 16 x 2^4
= 256.

16 bit instruction
address field : 6 bit

type 1 (primitive)       : OPCODE (4 bit) | AF1 (6 bit) | AF2 (6 bit)
type 2 (derived)         : OPCODE (10 bit) | AF1 (6 bit)
type 3 (further derived) : OPCODE (16 bit)

type : 1

given : number of 2-address instructions (AI) = 14

step 2 : total number of possible operations : 2^4 = 16

step 3 : number of free OPCODEs after allocating the 2-address instructions : 16 - 14 = 2

type : 2

number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
= 2 x 2^(10-4)
= 2 x 2^6
= 2 x 64
= 128

given : number of 1-address instructions = 127
number of free OPCODEs after allocating the 1-address instructions : 128 - 127 = 1

type : 3

number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
= 1 x 2^(16-10)
= 1 x 2^6
= 64

given : number of 0-address instructions = 60
number of free OPCODEs after allocating the 0-address instructions : 64 - 60 = 4

4 OPCODEs can be used in future for new operations.
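
the same four steps repeat at every level, so the whole chain can be computed in one loop. a
minimal sketch under the assumptions of these notes (formats ordered from fewest to most OPCODE
bits, every free opcode of one level expanded into the next); the function name
`expand_opcode` is hypothetical.

```python
def expand_opcode(opcode_bits, allocated):
    """Return the number of free opcodes left at each level after allocation.

    opcode_bits: OPCODE widths from primitive to most derived, e.g. [4, 10, 16]
    allocated:   instructions actually given at each level, e.g. [14, 127, 60]
    """
    free_after = []
    available = 2 ** opcode_bits[0]            # step 2: total possible operations
    for level, used in enumerate(allocated):
        if level > 0:
            increment = opcode_bits[level] - opcode_bits[level - 1]
            available = free_after[-1] * 2 ** increment   # step 4
        if used > available:
            raise ValueError(f"level {level}: {used} needed, only {available} possible")
        free_after.append(available - used)    # step 3: free opcodes
    return free_after


free = expand_opcode([4, 10, 16], [14, 127, 60])
print(free)
```

the last entry is the number of opcodes still free for future operations; an impossible
allocation raises an error instead of returning a negative count.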

16 bit processor : word length = 16 bit

instruction : 16 bit
registers = 30, so register address field = 5 bit
memory : 4 KB, so memory AF = 12 bit

type 1 (primitive)       : OPCODE (4 bit) | MEM (12 bit)
type 2 (derived)         : OPCODE (6 bit) | REG1 (5 bit) | REG2 (5 bit)
type 3 (further derived) : OPCODE (16 bit)

type : 1

given : number of 1-memory instructions (MI) = 10

step 2 : total number of possible operations : 2^4 = 16

step 3 : number of free OPCODEs after allocating the 1-memory instructions : 16 - 10 = 6

type : 2

given : number of 2-address (register) instructions (AI) = 11

number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
= 6 x 2^(6-4)
= 6 x 2^2
= 6 x 4
= 24

number of free OPCODEs after allocating the 2-address instructions : 24 - 11 = 13

type : 3

number of derived instructions/operations
= free OPCODE x 2^(increment in OPCODE bits)
= 13 x 2^(16-6)
= 13 x 2^10
= 13K (13,312)

topic : Addressing modes

- the addressing mode tells us where the operand is present (location of the operand).

- the addressing mode is the technique used to calculate the effective address and obtain the
operand.

- in short, the addressing mode tells where the operand is and how to get it.

instruction : OPCODE | OPERAND (OR) OPERAND REFERENCE

OPCODE (operational code) : type of operation
OPERAND (data) : where is the data? the addressing mode tells whether it is in
- memory
- a register
- the instruction itself

Effective address
the effective address is the actual address of the operand.

Addressing mode
the different ways in which the location of the operand is specified in an instruction are
referred to as addressing modes (where the operand is present and how to fetch it).

1. Fetch cycle : fetches (brings) the instruction from main memory to the CPU. it works like a
postman : it does not care what the instruction is.

[figure : fetch path. PC -> MAR -> address line (AL) -> memory, with the read control line
(RD CL) asserted; memory -> data line (DL) -> MBR -> IR. the PC supplies the address and the
fetched instruction is loaded into the IR.]

IR : OPCODE | MODE | OPERAND / ADDRESS FIELD

2. Execute cycle : the objective of the execute cycle is to execute (process) the fetched
instruction. the instruction is decoded and analysed (what the OPCODE is, how many operands),
followed by operand address calculation, operand fetch, processing and result storage.

example : I1 : Load [6000], stored at memory address 1000

- fetch : PC = 1000 -> MAR -> memory read -> MBR -> IR ; the IR now holds I1 : Load [6000]
- decode : the decoder analyses the instruction (which opcode, how many operands); I1 asks for
the content of memory location 6000 to be put into the accumulator
- operand address calculation : the operand address is 6000
- operand fetch : M[6000] is read into the MBR with the help of the system buses
- process and store the result : AC <- M[6000]

Load : memory read
Store : memory write

Why addressing mode?

reason 1 :
- to get the location of the operand
- to know where the required operand is present
- to know how to deal with the address field

reason 2 :
whenever we write a program in a high level language (C, C++) we use different structures, and
the machine language has to deal with them :
(i) constants
(ii) variables (global, local, static)
(iii) pointers
(iv) arrays
H.L.L -> assembly language -> machine language : all these features are implemented by
addressing modes in assembly language.

reason 3 : an instruction is in binary; we know the OPCODE, but the ADDRESS field can be
- a value (constant)
- a register (direct, indirect)
- memory (direct, indirect)

example :
OPCODE (4 bit) | ADDRESS (3 bit)
1101 | 101
what is 5 (101)?
- a value (constant)?
- a register (direct, indirect)?
- memory (direct, indirect)?
to remove this confusion we need addressing modes.

mode field (mode bit) :

how do we find the total number of addressing modes, and which addressing mode an instruction is
using?

OPCODE | MODE FIELD | OPERAND REFERENCE (immediate field / register / memory)

mode field / mode bit : tells how to get the operand, i.e. how to use the address part
(as an immediate value, a memory address or a register).

if there are 4 addressing modes (am), mode field = 2 bit
if there are 7 addressing modes, mode field = 3 bit

so in the computer the operand (data) is present in a register (or) memory (or) the instruction
itself.

based on these there are various types of addressing modes.

types of addressing modes

(i) immediate addressing mode : (#) or (I)

(ii) direct / absolute addressing mode : [ ]

(iii) memory indirect addressing mode : (@) or ( )

(iv) register direct addressing mode : reg. name

(v) register indirect addressing mode : @reg. name

displacement addressing modes :

(vi) PC relative addressing mode

(vii) index register addressing mode

(viii) base register addressing mode

(ix) auto decrement addressing mode

(x) auto increment addressing mode

(xi) implied / implicit addressing mode

when the operand (data) is present in either the immediate field (or) a register (or) memory,
why are various addressing modes used?

because with these 3 alone we cannot implement variables, pointers, arrays, loops etc. on the
way from H.L.L to assembly language to machine language, so we need different addressing modes.

(i) immediate addressing mode : in this addressing mode the operand is present (placed) in the
instruction itself
(OR)
the operand is present in the address field of the instruction.

instruction : OPCODE | ADDRESS (holds the DATA, i.e. the operand)

note : immediate addressing mode is used to access constants (or) to initialize a register
(or) variable with a value.

but immediate addressing mode has some limitations (disadvantages) :

- it cannot be used as a destination address, because a constant does not have any storage

MOV 100 R1
(R1 cannot be stored into the constant 100)

- the range of the constant is limited by the size of the address field

n bit unsigned range = 0 to 2^n - 1
n bit signed range = -2^(n-1) to 2^(n-1) - 1
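
the two range formulas can be checked for any field width. a small sketch; the function name
`immediate_range` is an illustrative assumption, and the signed case assumes two's complement
encoding.

```python
def immediate_range(n_bits, signed=False):
    """Return (smallest, largest) constant that fits in an n-bit immediate field."""
    if signed:
        # two's complement: -2^(n-1) to 2^(n-1) - 1
        return -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    # unsigned: 0 to 2^n - 1
    return 0, 2 ** n_bits - 1


unsigned_8 = immediate_range(8)
signed_8 = immediate_range(8, signed=True)
print(unsigned_8, signed_8)
```

a constant outside this range cannot be encoded as an immediate operand and has to be placed in
memory or a register instead.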

the data is available in the instruction itself, so memory does not have to be accessed to
get it.

example :

(i) MOV R1 #1000 (OR) MVI R1 1000 ; R1 <- 1000
move immediate : put 1000 into R1

(ii) ADD R1 #5000 (OR) ADDI R1 5000 ; R1 <- R1 + 5000
add immediate into R1

more examples (memory contents : M[4000] = 600, M[600] = 11) :

MOV R1 #4000 (OR) MVI R1 4000 ; R1 <- 4000

MOV R5 #600 ; R5 <- 600

ADD R7 #600 ; R7 <- R7 + 600 (R7 is a source and the destination as well)

ADD R0 #4000 (OR) ADDI R0 4000 ; R0 <- R0 + 4000

memory is never accessed in any of these.

(ii) memory direct / absolute addressing mode : in this addressing mode the operand is present
(placed) in memory and the instruction contains the effective address
(OR)
the address field of the instruction specifies the effective address.

instruction : OPCODE | AF
the AF is the effective address, which leads to the DATA (operand) in memory : the operand is in
memory and the address field of the instruction holds its read/write address.

note :
- this addressing mode is used to access variables.

- 1 (one) memory reference is required to access (read (OR) write) the data.

example :

(i) ADD R1 [1000] ; R1 <- R1 + M[1000]
add R1 and the data at memory location 1000, and put the result in R1.

(ii) MOV R3 [6000] ; R3 <- M[6000]
put the data at memory location 6000 into R3.

more examples (memory contents : M[4000] = 600, M[600] = 11) :

MOV R1 [4000] ; R1 <- M[4000] ; R1 <- 600
put the data at memory location 4000 (which is 600) into R1.

ADD R1 [4000] ; R1 <- R1 + M[4000] ; R1 <- R1 + 600
add the data at memory location 4000 (which is 600) to R1 and put the result in R1.

ADD R4 [600] ; R4 <- R4 + M[600] ; R4 <- R4 + 11
add the data at memory location 600 (which is 11) to R4 and put the result in R4.

OPCODE | destination | source 1 | source 2
ADD [7000] [4000] [6000]

memory contents : M[4000] = 100, M[6000] = 200

M[7000] <- M[4000] + M[6000]
M[7000] <- 100 + 200
M[7000] <- 300

memory after execution : M[7000] = 300

memory references (how many times memory had to be accessed) :
- destination [7000] : 1 memory reference for write
- source 1 [4000] : 1 memory reference for read
- source 2 [6000] : 1 memory reference for read

(iii) memory indirect addressing mode : in this addressing mode the operand is present (placed)
in memory and the effective address (address of the operand) is also present in memory; the
instruction contains the address of the effective address.

instruction : OPCODE | AF
the AF holds the address of the effective address; memory at that address holds the EA, and
memory at the EA holds the DATA (operand).

- 2 memory references to access the operand
  - 1 memory reference for the EA
  - 1 memory reference for the DATA (operand)

note :
- indirect addressing is used to access pointers.

- 2 memory references are required to access (read/write) the DATA (operand).

example :

(i) ADD R1 (1000) (OR) ADD R1 @1000 ; R1 <- R1 + M[M[1000]]
go to memory location 1000 to get the effective address, then add the data at that address to
R1 and put the result in R1.

(ii) MOV R5 @7000 ; R5 <- M[M[7000]]
go to memory location 7000 to get the effective address, then put the data at that address
into R5.

more examples (memory contents : M[4000] = 600 (EA), M[600] = 11 (DATA); 4000 is the address of
the effective address) :

ADD R1 @4000
R1 <- R1 + M[M[4000]]
R1 <- R1 + M[600]
R1 <- R1 + 11
go to memory location 4000, take the address found there (600), and add the data at 600
(which is 11) to R1.

MOV R3 @4000
R3 <- M[M[4000]]
R3 <- M[600]
R3 <- 11
go to memory location 4000, take the address found there (600), and put the data at 600
(which is 11) into R3.

#indirect : instead of going directly to the operand, we first go to an address that holds the
operand's address.

OPCODE | destination | source 1 | source 2
ADD [6500] @4000 @600

memory contents : M[4000] = 600, M[600] = 11, M[11] = 10

M[6500] <- M[M[4000]] + M[M[600]]
M[6500] <- M[600] + M[11]
M[6500] <- 11 + 10
M[6500] <- 21

memory after execution : M[6500] = 21

memory references :
- destination [6500] (memory direct) : 1 memory reference to write the DATA
- source 1 @4000 (memory indirect) : 1 memory reference for the EA + 1 memory reference to read
the DATA
- source 2 @600 (memory indirect) : 1 memory reference for the EA + 1 memory reference to read
the DATA

2 addressing modes are used here :
- memory direct
- memory indirect

note : we can have different types of addressing modes in one instruction

(iv) register addressing mode : this addressing mode is the same as memory direct addressing
mode, except that the operand is present in a register instead of memory.

instruction : OPCODE | ADDRESS (register name, e.g. R3)

in this addressing mode the operand is stored in a register, and that register's address
(register name) is held in the address field of the instruction. for example, if the address
field names R3, the DATA (operand) is found in R3 of the register file.

example :

(i) ADD [6000] R1 @7000 ; M[6000] <- R1 + M[M[7000]]

- destination [6000] : memory direct, 1 memory reference
- source 1 R1 : register direct, 1 register reference
- source 2 @7000 : memory indirect, 2 memory references

(v) register indirect addressing mode : in this addressing mode the operand is present (placed)
in memory and the effective address is present in a register.

instruction : OPCODE | ADDRESS (register name, e.g. R3)

the address field of the instruction holds the register's address (its name). the register holds
the effective address, and the DATA (operand) is found in memory at that address. for example, if
R3 holds 400 (EA), the operand is M[400].

- 1 register reference for the effective address

- 1 memory reference to read/write the DATA

example :
(i)
OPCODE | destination | source 1
LOAD R1 @R2

register file : R2 = 2000 (EA) ; memory : M[2000] = 100

R1 <- M[R2]
R1 <- M[2000]
R1 <- 100

- destination R1 : register direct, 1 register reference
- source 1 @R2 : register indirect, 1 register reference + 1 memory reference
when we have memory indirect, why is register indirect used?

(i) to shorten the instruction length.
example : if the memory size is 16 GB, a memory address needs 34 bits, but a processor (CPU)
typically has 16, 32 or 64 registers, so a register address needs only 4, 5 or 6 bits.
(ii) accessing a register is much faster than accessing memory.
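
modes (i) to (v) differ only in how many steps lie between the address field and the operand. the
sketch below makes that explicit and counts memory references; it is an illustration under
assumptions, not part of the notes : memory and the register file are plain dictionaries, the
mode names are strings chosen here, and `fetch_operand` is a hypothetical helper. the memory
contents reuse the values from the examples above.

```python
def fetch_operand(mode, field, memory, registers):
    """Return (operand, memory_references) for one source operand."""
    if mode == "immediate":            # operand is the address field itself
        return field, 0
    if mode == "direct":               # address field is the effective address
        return memory[field], 1
    if mode == "indirect":             # address field points at the effective address
        return memory[memory[field]], 2
    if mode == "register":             # operand is in the named register
        return registers[field], 0
    if mode == "register_indirect":    # named register holds the effective address
        return memory[registers[field]], 1
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {4000: 600, 600: 11, 2000: 100}
registers = {"R2": 2000}

results = {
    "immediate": fetch_operand("immediate", 4000, memory, registers),
    "direct": fetch_operand("direct", 4000, memory, registers),
    "indirect": fetch_operand("indirect", 4000, memory, registers),
    "register": fetch_operand("register", "R2", memory, registers),
    "register_indirect": fetch_operand("register_indirect", "R2", memory, registers),
}
for mode, (operand, refs) in results.items():
    print(f"{mode:18} operand = {operand:5} memory references = {refs}")
```

register indirect reaches the same kind of operand as memory indirect with one memory reference
fewer, which is the speed argument in point (ii).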

Displacement addressing mode

(vi) PC relative addressing mode :
effective address = PC value + address field (offset)
i.e. effective address = current PC value + relative value

(vii) index register addressing mode :
effective address = index register value + address field (offset)
used in array implementation

(viii) base register addressing mode :
effective address = base register value + address field (offset)
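
all three displacement modes compute the effective address as "some register + offset"; only the
register differs. a minimal sketch, with the function name `effective_address` and the mode
strings as illustrative assumptions. the PC and index values (202, 100, offset 500) are the ones
used in the worked example later in these notes; the base register value 1000 is made up.

```python
def effective_address(mode, offset, pc=0, index=0, base=0):
    """Effective address for the three displacement addressing modes."""
    if mode == "pc_relative":
        return pc + offset        # EA = PC value + address field
    if mode == "index":
        return index + offset     # EA = index register value + address field
    if mode == "base":
        return base + offset      # EA = base register value + address field
    raise ValueError(f"unknown displacement mode: {mode}")


ea_relative = effective_address("pc_relative", 500, pc=202)
ea_index = effective_address("index", 500, index=100)
ea_base = effective_address("base", 500, base=1000)   # hypothetical base register value
print(ea_relative, ea_index, ea_base)
```

for arrays, the address field holds the start address and the index register steps through the
elements, which is why index register mode suits array access.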

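(ix), (x) auto decrement and auto increment addressing mode : similar to register indirect
addressing mode, except that the register value is decremented (or) incremented.

Decrement : pre-decrement (the register is decremented before it is used as the effective
address)

Increment : post-increment (the register is incremented after it is used as the effective
address)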

(xi) implied / implicit addressing mode : in this addressing mode the operand (data information)
is present in the OPCODE itself

instruction : OPCODE
the OPCODE gives the type of operation and also the DATA information.

example :

STC (set carry) : cy = 1

CLC (clear carry) : cy = 0

stack based ADD : the OPCODE also tells us the operations that need to be performed :
TOS (POP)
TOS (POP)
ADD
PUSH (TOS)

important points

- constant : immediate addressing mode

- variable : direct / absolute addressing mode

- pointer : indirect addressing mode

- array : index register addressing mode

- relocation : base register addressing mode

summary :

- immediate : operand is placed in the instruction

- memory direct : operand is placed in memory and the instruction contains the effective address

- memory indirect : operand is placed in memory and the effective address (address of operand) is
also present in memory

- register direct : same as memory direct am, but the operand is present in a register instead of
memory

- register indirect : operand is placed in memory and the effective address is present in a
register

- implied / implicit : operand (DATA info) is present in the OPCODE itself

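example : 16 bit instruction (8 bit OPCODE, 8 bit AF) : LOAD to AC with address field 500

given : the address field (500) is stored at address 201, PC value = 202, index register = 100,
R1 = 400

addressing mode        effective address    content of AC
direct                 500                  800
immediate              201                  500
indirect               800                  300
PC relative            702                  325
index register         600                  900
register (R1)          -                    400
register indirect      400                  700
auto increment         400                  700
auto decrement         399                  450

- immediate : the operand is in the instruction itself, so the operand is 500.
- direct : the effective address is in the instruction, so we go to 500 and read 800.
- indirect : the effective address is itself in memory, so 500 is the address of the effective
address; M[500] = 800 is the effective address and the operand is M[800] = 300.

PC relative :
effective address = PC value + address field (offset)
= 202 + 500
= 702

index register :
effective address = index register value + address field (offset)
= 100 + 500
= 600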
example : 16 bit instruction stored at address 745 (OPCODE) and 746 (ADDRESS field = 222)

r1 = 111 ; M[222] = 155

(i) immediate

effective address = 746
operand (DATA) = 222

(ii) memory direct

effective address = 222
operand (DATA) = 155

(iii) memory indirect

effective address = 155
operand (DATA) = M[155] (not given)

(iv) displacement using r1

effective address = 111 + 222 = 333

loop : applies to the result of the previous instruction, i.e. it acts on the outcome of
MOV R1 , (3000)

- immediate : operand is in the instruction itself
    - used to access a constant
    - used to initialize a variable
- memory direct : operand is in memory and the instruction carries the effective address of the operand
    - 1 memory reference used
    - the effective address is in the instruction; the operand is found at that address
- memory indirect : operand is in memory and the instruction carries the address of the effective address
    - 2 memory references used
    - going to the address in the instruction gives the effective address; the operand is found at the effective address
- register direct : operand is in a register and the instruction contains the register name
    - 1 register reference used
    - the register named in the instruction holds the operand itself
- register indirect : operand is in memory and the instruction contains the name of the register that holds the effective address
    - 1 register reference
    - 1 memory reference
    - the content found in the register is the effective address

are register name and register address different?

a processor generally has 32 registers (or) 64 registers (GPR)

if 32 registers then register AF = 5 bit (2^5 = 32)

assume,
if I write R1 then the computer writes 00001

R1 (register name) : how I write
00001 (register address) : how the computer writes

memory :
address   content
20        40
30        50
40        60
50        70

1. Load immediate 20 : 20 (the operand is in the instruction itself)
   AC <- 20

2. Load direct 20 : 40 (operand)
   load [20]
   AC <- M[20]
   AC <- 40

3. Load indirect 20 : 60 (20 is the address of the effective address)
   load (20) or load @20
   AC <- M[M[20]]
   AC <- M[40]
   AC <- 60

4. Load immediate 30 : 30 (the operand is in the instruction itself)
   AC <- 30

5. Load direct 30 : 50 (operand)
   load [30]
   AC <- M[30]
   AC <- 50

6. Load indirect 30 : 70 (30 is the address of the effective address)
   load (30) or load @30
   AC <- M[M[30]]
   AC <- M[50]
   AC <- 70
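The six loads can be replayed in a few lines of Python. This is only a sketch of the idea: the dictionary mirrors the memory contents of the example, and the three function names are hypothetical.

```python
# Memory contents from the example: address -> content.
MEMORY = {20: 40, 30: 50, 40: 60, 50: 70}


def load_immediate(a):
    """Operand is the address field itself (no memory reference)."""
    return a


def load_direct(a):
    """Operand is M[a] (one memory reference)."""
    return MEMORY[a]


def load_indirect(a):
    """Operand is M[M[a]] (two memory references)."""
    return MEMORY[MEMORY[a]]


for a in (20, 30):
    print(a, load_immediate(a), load_direct(a), load_indirect(a))
```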

Displacement addressing mode

(vi) PC relative addressing mode
     effective address = PC value + Address field (offset)
     effective address = current PC value + relative value

(vii) index register addressing mode
     effective address = index register value + Address field (offset)
     used in array implementation also

(viii) base register addressing mode
     effective address = base register + Address field (offset)

references needed (index register case) :
1 register reference for index register value
1 arithmetic reference for EA calculation
1 memory reference for read/write data


topic : displacement addressing mode

format :
OPCODE | REGISTER | AF(A)

If Register is :

- Program counter then it will become relative addressing mode / PC relative

- index register then it will become indexed addressing mode

- base register then it will become base register addressing mode

effective address = R + A (A = address field / offset)

here, R may be implicit or explicit :

- current PC value : implicit, because the CPU has only one program counter
- index register : explicit (the register will be mentioned in the instruction)
- base register : explicit (the register will be mentioned in the instruction)


instruction formats in displacement addressing mode :

- PC relative : OPCODE | 500        (1 address; PC is implicit because the CPU has only one program counter)
- indexed     : OPCODE | XR | 500   (register is explicit)
- based       : OPCODE | BR | 500   (register is explicit)

OPCODE | R | AF(A) : the value of R (read from the register file) is added to A to give the effective address, which is then used to read/write the DATA (operand) in memory.


(i) index register addressing mode : the index addressing mode is used to implement arrays

to implement an array we require two things

- starting / base address of array : present in the address field of the instruction (offset)

- index value : stored in the index register

effective address = index register value + Address field (offset)


assume int = 4 byte and the array starts at address 1000

address   element
1000      a[0]
1004      a[1]
1008      a[2]
1012      a[3]
.         .
.         .
1036      a[9]

in H.L.L :
a[2] = 1008
here the compiler will convert a[2] into address 1008

a[n] = starting address of array + n x size of data type
a[2] = 1000 + 2 x 4
     = 1000 + 8
     = 1008

in assembly language :
here we have to tell the index value.
effective address = 8 + 1000
(8 = index value stored in the index register, 1000 = starting address)

OPCODE | R | AF(A)
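The address rule a[n] = starting address + n x size of data type is easy to verify numerically. A minimal sketch in Python, where `element_address` is a hypothetical helper and the base 1000 and element size 4 are the values assumed in the example:

```python
def element_address(base, index, element_size):
    """Address of a[index] for an array starting at `base`."""
    return base + index * element_size


# Addresses of a[0] .. a[9] for a 4-byte int array starting at 1000.
addresses = [element_address(1000, n, 4) for n in range(10)]
print(addresses)
```

In indexed addressing the term `index * element_size` is what sits in the index register, and `base` is the address field of the instruction.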

index value + A gives the effective address (EA), which is used to read/write the DATA (operand) :
- 1 register reference for the index value
- 1 arithmetic / ALU reference for EA calculation
- 1 memory reference for memory data (read/write)

example : load to AC 500, index register XR = 100

effective address = XR value + Address field      (1 register reference for index value)
                  = 100 + 500                     (1 ALU reference for EA calculation)
                  = 600

effective address = 600
operand = M[600] = 900                            (1 memory reference for DATA)

memory :
address          content
200, 201, 202    (instruction area)
500              800
600              900
702              325
800              300


consider a processor that has 32 registers

case 1 : at CPU design time, one special purpose register is made the index register (XR); the instruction then accesses the index value implicitly from this register. (one of the 32 registers is reserved as the index register and is used only for this purpose)
disadvantage : one register is wasted because it is not heavily utilised (the index register is needed only occasionally)

case 2 : in most cases, any one general purpose register is designated as the index register and that register name is mentioned in the instruction. (a general purpose register is used as the index register whenever needed, so all 32 registers remain available)
example : r1 as an index register
advantage : the register can later be used again as a general purpose register


RS = 1 (RS contains the index value)

format : OPCODE | A | RS

effective address = A + RS
RD <- m[1000 + RS]        (index addressing mode used here)
RD <- m[1000 + 1]
RD <- m[1001]

RD <- RD + 1000           (i) immediate am used here
RD <- 1 + 1000
RD <- 1001

m[0 + RD]                 (ii) index am used here
m[0 + 1001]
m[1001] : 20

instruction size = 24 bit = 3 byte, first instruction at address = 300
PC denotes the starting address of the instruction

memory :
start     content   bytes occupied
300                 300 - 302
303                 303 - 305
.
390       800       390 - 392
393       900       393 - 395
.
399       325       399, 400, 401
402       300       402 - 404

assume 400 is a starting address
300 + 3x = 400
3x = 100
x = 100/3 = 33.3333
no, it is not a starting address

assume 600 is a starting address
300 + 3x = 600
3x = 300
x = 300/3 = 100
yes, it is a starting address

why are the auto decrement and auto increment am used?

when we want to perform the same operation on multiple locations (assume 100 locations), a separate instruction is normally required for each location

but with the help of the auto decrement and auto increment am, a single instruction can perform the same operation on (or access) all of those locations

here we use loops.

example :
for (i = 1; i <= 100; i++)

    ----- address 1
    ----- address 2
    .
    .
    ----- address 100

given : R3 = 2000, M[3000] = 10, M[2000] to M[2010] = 100 each

before the loop :
- load the data at memory location 3000 into R1, hence R1 = 10
- the loop / branch condition depends on the result of the previous instruction

iteration 1 :
- R2 <- M[R3] (register indirect am) : R3 = 2000 and M[2000] = 100, hence R2 = 100
- R2 <- R1 + R2 : R1 = 10 and R2 = 100, hence R2 = 110
- M[R3] <- R2 : store 110 in M[2000]
- R3 <- R3 + 1 : R3 pointed to 2000, now it points to 2001 (auto increment)
- R1 <- R1 - 1 : 10 - 1 = 9 (auto decrement)
- R1 is not 0, so go back into the loop

iteration 2 :
- R2 <- M[R3] : M[2001] = 100
- R2 <- R1 (9) + R2 (100) = 109
- M[R3] <- R2 : store 109 in M[2001]
- R3 <- R3 + 1 : now it points to 2002
- R1 <- R1 (9) - 1 = 8
- R1 is not 0, so go back into the loop

memory :
address   before   after
2000      100      110
2001      100      109
2002      100      108
2003      100      107
2004      100      106
2005      100      105
2006      100      104
2007      100      103
2008      100      102
2009      100      101
2010      100      100   (out of loop)

note : by using auto decrement / auto increment am we are accessing multiple locations (or) performing the same operation on multiple locations.

so, the loop continues to execute until R1 becomes 0, and the value stored in successive memory locations keeps decreasing.

in 1 loop iteration we access the memory 2 times
loop iterations = 10
10 x 2 = 20

total memory references = 1 (outside the loop) + 20 (inside the loop) = 21
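The count of 21 can be confirmed by simulating the trace. The sketch below is an assumption-laden illustration: the instruction sequence is inferred from the trace in these notes (the original program listing is not reproduced here), only data reads and writes are counted (not instruction fetches), and `run_loop` is a hypothetical name.

```python
def run_loop():
    """Replay the traced loop and count data memory references."""
    memory = {3000: 10}
    memory.update({addr: 100 for addr in range(2000, 2011)})
    refs = 0

    r3 = 2000
    r1 = memory[3000]        # load the counter, outside the loop
    refs += 1

    while r1 != 0:
        r2 = memory[r3]      # register indirect read
        refs += 1
        r2 = r1 + r2
        memory[r3] = r2      # register indirect write
        refs += 1
        r3 += 1              # auto increment of the pointer
        r1 -= 1              # auto decrement of the counter
    return memory, refs


memory, refs = run_loop()
print(refs)
print([memory[a] for a in range(2000, 2011)])
```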

concept : how to find target address?

(i) in PC relative addressing mode

case (i) : forward jumping

address   instruction               PC after fetch
1000      i1                        1001
1001      i2                        1002
1002      i3 (JMP 1051 = JMP +48)   1003
1003      i4
.
.
1051      i51
1052      i52
1053      i53

after i3 is fetched, PC = 1003. when i3 is decoded (executed), it turns out that control must not go to 1003 but to 1051, so PC : 1051.
hence, target address : 1051

what is the displacement / relative value?

target address = current PC value + displacement / offset / relative value
1051 = 1003 + displacement / offset / relative value
relative value = 1051 - 1003
relative value = +48

case (ii) : backward jumping

address   instruction                PC after fetch
1000      i1                         1001
1001      i2                         1002
1002      i3                         1003
1003      i4                         1004
.
.
1051      i51                        1052
1052      i52 (JMP 1002 = JMP -51)   1053
1053      i53

after i52 is fetched, PC = 1053. when i52 is decoded (executed), it turns out that control must not go to 1053 but to 1002, so PC : 1002.
hence, target address : 1002.

what is the displacement / relative value?

target address = current PC value + displacement / offset / relative value
1002 = 1053 + displacement / offset / relative value
relative value = 1002 - 1053
relative value = -51
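Both cases follow from the same rearranged formula, relative value = target address - current PC value. A short Python sketch (the two function names are hypothetical) makes the sign convention explicit:

```python
def displacement(target, pc_after_fetch):
    """Relative value stored in the branch instruction."""
    return target - pc_after_fetch


def target_address(pc_after_fetch, relative_value):
    """Address the branch transfers control to."""
    return pc_after_fetch + relative_value


forward = displacement(1051, 1003)    # positive: forward jump
backward = displacement(1002, 1053)   # negative: backward jump
print(forward, backward)
```

Note that the PC value used is the one after the branch instruction has been fetched, i.e. the address of the next instruction.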

displacement / relative value / offset value :
+ve : forward jumping
-ve : backward jumping

note : relative AM and base register AM are used to write relocatable code.

branch instruction / program control category : instructions that change the normal flow of the program

unconditional :
- JMP (jump)
- GOTO (go to)
- SKIP (skip)

conditional :
- BNZ (branch on not zero)
- JNZ (jump on not zero)
- BE (branch on equal)

condition : if sub x y becomes 0 then go to 211
            (branch on zero)

condition : if R1 and R2 are equal then go to 235
            (branch if equal)

relative value or offset : displacement between the current location of PC and the target location.

case (i) : forward jumping
after i3 (address 1002) is fetched, PC : 1003. JMP +48 does the job of taking control to i51.
target address = current PC value + displacement / offset / relative value
1051 = 1003 + relative value
relative value = 1051 - 1003 = +48

case (ii) : backward jumping
after i52 (address 1052) is fetched, PC : 1053. JMP -51 does the job of taking control back to i2.
target address = current PC value + displacement / offset / relative value
1002 = 1053 + relative value
relative value = 1002 - 1053 = -51

benefit : relative AM and base register AM are used to write relocatable code.

concept : relocatable code

if the operating system changes (re-allocates) the memory address of the program, suppose it now starts from 4000 instead of 1000, then an absolute jump such as JMP 1051 is no longer applicable, but with the help of relative addressing mode the program still works perfectly.

but why does the OS change or re-allocate the address?

suppose a high priority process arrives; then the OS does swapping, which means this program is swapped out from memory to the backing store.

after some time the same program is reloaded, but at a different address

case (i) address changed from 1000 to 4000

address   instruction      PC after fetch
4000      i1               4001
4001      i2               4002
4002      i3 (JMP +48)     4003
4003      i4
.
.
4051      i51
4052      i52
4053      i53

JMP 1051 will not work here; JMP +48 still does the job of taking control to i51 (forward jumping).

target address = current PC value + displacement / offset / relative value
target address = 4003 + 48
target address = 4051

benefit : even if the memory location changes (re-allocation), the displacement / offset / relative value still works. with the previous method the program contained JMP 1051, so it could only operate when loaded at address 1000. with displacement we can work at any address, because the instruction is not tied to a particular address (i.e. 1000, 2000, etc.); JMP +48 performs the same forward jump wherever the program is loaded.
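A tiny simulation shows why the relative form survives relocation and the absolute form does not. This is a sketch only: `jump_target` is a hypothetical helper, and it assumes (as in the example) that the jump is the third instruction of the program and that each instruction occupies one address.

```python
def jump_target(load_address, absolute=None, relative=None):
    """Where control goes when the jump at load_address + 2 executes."""
    pc_after_fetch = load_address + 3   # PC already points to i4
    if absolute is not None:
        return absolute                 # fixed address, ignores relocation
    return pc_after_fetch + relative    # moves together with the program


for base in (1000, 4000):
    wanted = base + 51                  # address of i51 after loading at `base`
    print(base,
          jump_target(base, absolute=1051) == wanted,
          jump_target(base, relative=48) == wanted)
```

The absolute jump reaches i51 only when the program sits at 1000; the relative jump reaches it at both load addresses.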


target = 211
PC : 204
offset : +7

target = 202
PC : 211
offset : -9

target = 235
PC : 226
offset : +9

PC relative addressing mode : intra segment transfer of control (branching). when the target address is present in the same segment, control is transferred within that segment during program execution; this is called intra segment branching.

segment 1 : i1, i2, i3, ..., ix    -> intra segment (same segment)
segment 2 : iy                     -> inter segment (different segment)

the 8086 processor has 1 MB of memory, divided into 16 segments of 64 KB each

16 x 64 KB
= 2^4 x 2^16 B = 2^20 B
= 1 MB

in PC relative addressing mode

target address = current PC value + displacement / offset / relative value

in base addressing mode

target address = base register value + offset / displacement

(ii) base register addressing mode : the only difference is that in base register addressing we have to put the starting address of the program in the base register (because of relocation)

target = 211
PC : 204
offset : +7

target = 202
PC : 211
offset : -9

target = 235
PC : 226
offset : +9

ex. BR = 200



starting address = 278128
instruction size = 4 byte, so i1 occupies 278128 - 278131 and the next instruction starts at 278132
relative value = -8

target address = current PC value + displacement
target address = 278132 + (-8)
target address = 278124

word size = 16 bits = 2 byte
instruction size = 4 byte = 2 words

address   content
900       i1
901       i1
902       next instruction

PC value = 902
displacement value = -32 (jmp -32)

in PC relative addressing mode :
target address = current PC value + displacement
target address = 902 + (-32)
target address = 870

in base register addressing mode (base register value = 500) :
target address = base register value + displacement
target address = 500 + (-32)
target address = 468

1 word size = 16 bit
instruction size = 16 bit : opcode (8 bit) | AF (8 bit)

address   content
900       opcode
901       AF
902       next instruction

PC value = 902
target address = 614

target address = current PC value + displacement
614 = 902 + displacement
displacement = 614 - 902 = -288
backward jumping.

PC value
(i) before instruction fetch : 900
(ii) after instruction fetch : 902
(iii) after instruction execution : 614

assume the program starts from 1000 (each instruction is 4 byte)

1000 - 1003   i
1004 - 1007   i + 1
1008 - 1011   i + 2 (compare)
1012 - 1015   i + 3 (branch if equal)
1016 - 1019   i + 4

when only i + 3 has been fetched, the PC value becomes 1016
target address = 1000

target address = current PC value + displacement
1000 = 1016 + displacement
displacement = 1000 - 1016 = -16
backward jumping.


chapter - 3 (floating point representation)
Tuesday, June 18, 2024 6:29 PM

floating point representation

topic : number system

- unsigned : [+ve] only
- signed : [+ve] and [-ve]
    - signed magnitude
    - complement
        - 1's complement
        - 2's complement

sign bit : 0 [+ve], 1 [-ve]
a signed number is negative when it starts with 1

1's complement : 0 converts to 1 and 1 converts to 0
2's complement : 1's complement + 1

example : 1000 (negative because it starts with 1)
read as 1's complement : invert 1000 -> 0111 = 7, so the value is -7
read as 2's complement : invert and add 1 -> 0111 + 1 = 1000 = 8, so the value is -8

why is 2's complement used in computer systems?

signed magnitude : +0 (0000), -0 (1000)
1's complement   : +0 (0000), -0 (1111)

both have a redundant representation for '0'. in 2's complement there is no redundant representation of 0.

hence we use 2's complement to represent -ve numbers.

n bit 2's complement range = -(2^(n-1)) to +(2^(n-1) - 1)
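The range formula and the three readings of a bit pattern can be checked with a short Python sketch. The function names and the scheme labels (`"sign_magnitude"`, `"ones"`, `"twos"`) are hypothetical, introduced only for this illustration.

```python
def twos_complement_range(n):
    """(minimum, maximum) representable with n bits in 2's complement."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1


def decode(bits, scheme):
    """Value of a bit string under the given signed representation."""
    n = len(bits)
    value = int(bits, 2)
    if bits[0] == "0":                    # positive numbers read the same way
        return value
    if scheme == "sign_magnitude":
        return -(value - 2 ** (n - 1))    # drop the sign bit, negate
    if scheme == "ones":
        return -((2 ** n - 1) - value)    # invert every bit, negate
    if scheme == "twos":
        return value - 2 ** n             # invert, add 1, negate
    raise ValueError(f"unknown scheme: {scheme}")


print(twos_complement_range(4), twos_complement_range(16))
print(decode("1000", "ones"), decode("1000", "twos"))
print(decode("1000", "sign_magnitude"), decode("1111", "ones"))  # both are zero
```

Only the 2's complement reading has a single pattern for zero, which is why it also gains one extra negative value.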

place values :

2^6   2^5   2^4   2^3   2^2   2^1   2^0   .   2^-1   2^-2   2^-3    2^-4
64    32    16    8     4     2     1     .   1/2    1/4    1/8     1/16
64    32    16    8     4     2     1     .   0.5    0.25   0.125   0.0625

Q. how to write 2.5 in binary?
10.1
2^1  2^0  .  2^-1 (1/2)
1    0    .  1

Q. how to write 6 in binary?
110
4  2  1
1  1  0
4bit : 0110
5bit : 00110
6bit : 000110

Q. how to write 3.25 in binary?
11.01
2^1  2^0  .  (1/2)  (1/4)
1    1    .  0      1

Q. how to write 11.25 in binary?
1011.01

Q. how to write 13.75 in binary?
1101.11
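The conversions above follow one procedure: convert the whole part normally, then repeatedly multiply the fraction by 2 and take the integer digit. A minimal Python sketch, assuming non-negative inputs; `to_binary` and its `fraction_bits` limit are hypothetical names for this illustration.

```python
def to_binary(value, fraction_bits=8):
    """Binary string of a non-negative number, e.g. 2.5 -> '10.1'."""
    whole = int(value)
    frac = value - whole
    bits = bin(whole)[2:]          # whole part, without the '0b' prefix
    digits = []
    for _ in range(fraction_bits):
        if frac == 0:
            break
        frac *= 2                  # shift the next fraction bit out
        digit = int(frac)
        digits.append(str(digit))
        frac -= digit
    return bits + ("." + "".join(digits) if digits else "")


for number in (2.5, 6, 3.25, 11.25, 13.75):
    print(number, to_binary(number))
```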

topic : floating point representation

why do we use floating point representation?

(i) to represent very very large numbers (e.g. 9876556527....)

(ii) to represent very very small numbers (e.g. 0.000000000000000001)

eg.
16 bit number range = -(2^(16-1)) to +(2^(16-1) - 1)
                    = (-32k to +32k - 1)

if we want to store 51,000 it is not possible with 16 bit data because the range is (-32k to +32k - 1), hence this is not possible with fixed point.

syntax :

S (1 bit) | E (2 bit) | M (4 bit)

s (sign bit) : 0 [+ve], 1 [-ve]

general form : (+/-) 0.xxxxxx x 2^e
where +/- is the sign S, 0.xxxxxx is the mantissa and e is the exponent

E / BE = exponent / biased exponent

E = e + bias
(or)
BE = AE + bias

M = mantissa
e = exponent / actual exponent (AE)

write 110.1 (= 6.5) in SEM format.

S (1 bit) | E (2 bit) | M (4 bit)

note : whatever we represent in floating point must always be brought to this form :

(+/-) 0.xxxxxx x 2^e

110.1 = 0.1101 x 2^(+3)

(i) method to cross check

Q. is e = +3 correct or not?
[0.1101] x 2^(+3)
= [2^-1 + 2^-2 + 2^-4] x 2^(+3)
= [2^(-1+3) + 2^(-2+3) + 2^(-4+3)]
= 2^2 + 2^1 + 2^-1
= 4 + 2 + 0.5
= 6.5

(ii) second technique

right alignment = 2^(+) (power will be positive / increment in exponent)
left alignment = 2^(-) (power will be negative / decrement in exponent)

110.1 -> 0.1101 x 2^(+3)
the point stays where it is; three bits were moved to the right of the point, which means right alignment, hence the power is positive.

s (sign) = 0 (number is positive)
e = +3 [11]
M = 1101 (mantissa)

s (1 bit)   e (2 bit)   m (4 bit)
0           11          1101

Q. +(4.5)

S (1 bit) | E (2 bit) | M (4 bit)

100.1 = 0.1001 x 2^(+3)

s (sign) = 0 (number is positive)
e = +3 [11]
M = 1001 (mantissa)

s (1 bit)   e (2 bit)   m (4 bit)
0           11          1001



Q. +(4.75)

S (1 bit) | E (2 bit) | M (5 bit)

100.11 = 0.10011 x 2^(+3)

s (sign) = 0 (number is positive)
e = +3 [11]
M = 10011 (mantissa)

s (1 bit)   e (2 bit)   m (5 bit)
0           11          10011

note : the format will be given in the question.

if M was 4 bit, the last mantissa bit is dropped :

s (1 bit)   e (2 bit)   m (4 bit)
0           11          1001

Q. 0.00101

S (1 bit) | E (4 bit) | M (5 bit)

0.00101 = 0.101 x 2^(-2)

s (sign) = 0 (number is positive)

e = -2 : magnitude 2 = [10], but there is no provision to tell that the exponent is negative

the exponent is negative but the sign bit is 0 because the number is positive, so how to deal with the negative exponent?
one option : take the 2's complement.
e = 0010 (in 4 bits)
1's complement : 1101
2's complement : 1101 + 1 = 1110 (reads as 14)

M = 101 (mantissa), padded to 5 bits : 10100

s (1 bit)   e (4 bit)   m (5 bit)
0           1110        10100

this converted -2 into +14.
this creates confusion : we mean e = -2, but if the field is read on its own it looks like +14. this is why biasing is used.

q.(i) what is the biased exponent?

E

q.(ii) why is biasing (or) the biased exponent [E/BE] used?

to convert a negative / positive exponent into a positive number or '0'.
when e is -ve we would otherwise have to take the 2's complement
n bit 2's complement range = -(2^(n-1)) to +(2^(n-1) - 1)
5 bit 2's complement range = -(2^(5-1)) to +(2^(5-1) - 1) = -16 to +15

q.(iii) how is the bias value decided / selected?

if your exponent = k bit
then bias value = 2^(k-1)

if exponent is k bit then 2's complement range = -2^(k-1) to 2^(k-1) - 1
if exponent is 4 bit then 2's complement range = -2^(4-1) to 2^(4-1) - 1 = -8 to +7

in order to convert all numbers into non-negative numbers, take the magnitude of the most (highest) negative number and add it as the bias

if exponent is k bit then bias = 2^(k-1)
if exponent is 4 bit then bias = 2^(4-1) = 2^3 = 8



e + bias = E
-8 + 8 = 0
-7 + 8 = 1
-6 + 8 = 2
-5 + 8 = 3
-4 + 8 = 4
-3 + 8 = 5
-2 + 8 = 6
-1 + 8 = 7
 0 + 8 = 8
.
.
.
 7 + 8 = 15

whatever the number is, once we use the biased exponent there is nothing more to think about for E : it will always be 0 or positive.

note : sometimes the bias is given in this form
bias = 8
exponent = 4bit
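The bias table can be generated rather than written by hand. A minimal Python sketch, assuming the convention used in these notes (bias = 2^(k-1) for a k-bit exponent); `bias_for` and `biased_exponent` are hypothetical names.

```python
def bias_for(k):
    """Bias used in these notes for a k-bit exponent field."""
    return 2 ** (k - 1)


def biased_exponent(e, k):
    """E = e + bias, checked against the k-bit field width."""
    E = e + bias_for(k)
    if not 0 <= E < 2 ** k:
        raise ValueError(f"exponent {e} does not fit in {k} bits")
    return E


# Reproduce the 4-bit table: every actual exponent maps to 0..15.
for e in range(-8, 8):
    print(f"{e:>2} + {bias_for(4)} = {biased_exponent(e, 4)}")
```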

Q. 0.00101

S (1 bit) | E (5 bit) | M (4 bit)

0.00101 = 0.101 x 2^(-2)

s (sign) = 0 (number is positive)
e = -2
M = 101 (mantissa), padded to 4 bits : 1010

the exponent is negative but the sign bit is 0 because the number is positive, and there is no provision to tell that the exponent is negative. storing the 2's complement of e would make -2 read like a large positive number, which creates confusion, so we use the biased exponent.

E = e + bias

exponent : 5 bit
hence, bias = 2^(5-1) = 2^4 = 16

E = e + bias
E = -2 + 16 = 14

in 5 bits : 01110

s (1 bit)   e (5 bit)   m (4 bit)
0           01110       1010

q.(iv) how to write the mantissa?

we have to normalize the mantissa

normalized mantissa : explicit normalized (or) implicit normalized

explicit normalized
syntax : 0.1xxxxxx x 2^e (M is the part after the point)
the bit immediately after the point must be 1
formula to get number [value formula] :
(-1)^s x 0.M x 2^e
(-1)^s x 0.M x 2^(E - bias)
example : 101.11 = 0.10111 x 2^(+3)
M = 10111
e = 3

implicit normalized
syntax : 1.xxxxxx x 2^e (M is the part after the point)
there must be a 1 before the point (1.something)
formula to get number [value formula] :
(-1)^s x 1.M x 2^e
(-1)^s x 1.M x 2^(E - bias)
example : 101.11 = 1.0111 x 2^2
M = 0111
e = 2

in both cases :
E = e + bias
e = E - bias

Q. +6.75

S (1 bit) | E (4 bit) | M (5 bit)

6.75 = 110.11 x 2^0

exponent : 4 bit
hence, bias = 2^(4-1) = 2^3 = 8

E = e + bias
E = e + 8

explicit :
+110.11 x 2^0 = 0.11011 x 2^3
s (sign) = 0 (number is positive)
e = 3
M = 11011
E = 3 + 8 = 11 (1011)
S E M = 0 1011 11011

implicit :
+110.11 x 2^0 = 1.1011 x 2^2
s (sign) = 0 (number is positive)
e = 2
M = 1011, padded to 5 bits : 10110
E = 2 + 8 = 10 (1010)
S E M = 0 1010 10110

to fill 5 bits a 0 was added to the mantissa. why is the zero added at the end? because a trailing zero after the point does not change the value, as the check below shows.

check (decode both encodings) :

explicit : 0 1011 11011, E = 1011 = 11, bias = 8
(-1)^s x 0.M x 2^(E - bias)
= (-1)^0 x 0.11011 x 2^(11-8)
= 0.11011 x 2^3
= 110.11 x 2^0
= 6.75

implicit : 0 1010 10110, E = 1010 = 10, bias = 8
(-1)^s x 1.M x 2^(E - bias)
= (-1)^0 x 1.10110 x 2^(10-8)
= 1.10110 x 2^2
= 110.110 x 2^0
= 6.75

adding the 0 at the end made no difference; the result is the same.
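The encoding and decoding steps for this custom S/E/M format can be sketched in Python. Everything here is an illustration under stated assumptions: bias = 2^(k-1) as in these notes, non-zero inputs only, extra mantissa bits are truncated (not rounded), and `encode` / `decode` are hypothetical names.

```python
def encode(value, exp_bits, man_bits, implicit):
    """Return (S, E, M) bit strings for a non-zero value."""
    sign = "1" if value < 0 else "0"
    magnitude = abs(value)
    bias = 2 ** (exp_bits - 1)
    # Normalized range: 1.xxx (implicit) or 0.1xxx (explicit).
    low, high = (1.0, 2.0) if implicit else (0.5, 1.0)
    e = 0
    while magnitude >= high:
        magnitude /= 2
        e += 1
    while magnitude < low:
        magnitude *= 2
        e -= 1
    fraction = magnitude - 1 if implicit else magnitude
    mantissa = int(fraction * 2 ** man_bits)   # truncate extra bits
    return (sign,
            format(e + bias, f"0{exp_bits}b"),
            format(mantissa, f"0{man_bits}b"))


def decode(sign, exponent, mantissa, implicit):
    """Value formula: (-1)^s x (0.M or 1.M) x 2^(E - bias)."""
    bias = 2 ** (len(exponent) - 1)
    fraction = int(mantissa, 2) / 2 ** len(mantissa)
    significand = 1 + fraction if implicit else fraction
    return (-1) ** int(sign) * significand * 2 ** (int(exponent, 2) - bias)


for implicit in (False, True):
    fields = encode(6.75, exp_bits=4, man_bits=5, implicit=implicit)
    print(fields, decode(*fields, implicit=implicit))
```

Because the implicit form does not store the leading 1, it keeps one more significant bit in the same mantissa width.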

Q. +5.5

S (1 bit) | E (4 bit) | M (5 bit)

5.5 = 101.1 x 2^0

exponent : 4 bit
hence, bias = 2^(4-1) = 2^3 = 8

E = e + bias
E = e + 8

explicit :
+101.1 x 2^0 = 0.1011 x 2^3
s (sign) = 0 (number is positive)
e = 3
M = 1011, padded to 5 bits : 10110
E = 3 + 8 = 11 (1011)
S E M = 0 1011 10110
hexa : (176)16

implicit :
+101.1 x 2^0 = 1.011 x 2^2
s (sign) = 0 (number is positive)
e = 2
M = 011, padded to 5 bits : 01100
E = 2 + 8 = 10 (1010)
S E M = 0 1010 01100
hexa : (14C)16

Q. +4.875

S (1 bit) | E (4 bit) | M (5 bit)

4.875 = 100.111 x 2^0

exponent : 4 bit
hence, bias = 2^(4-1) = 2^3 = 8

E = e + bias
E = e + 8

explicit :
+100.111 x 2^0 = 0.100111 x 2^3
s (sign) = 0 (number is positive)
e = 3
M = 100111 (6 bits); to fit 5 bits the last 1 is removed : 10011
E = 3 + 8 = 11 (1011)
S E M = 0 1011 10011

implicit :
+100.111 x 2^0 = 1.00111 x 2^2
s (sign) = 0 (number is positive)
e = 2
M = 00111
E = 2 + 8 = 10 (1010)
S E M = 0 1010 00111

check (decode both encodings) :

explicit : 0 1011 10011, E = 1011 = 11, bias = 8
(-1)^s x 0.M x 2^(E - bias)
= (-1)^0 x 0.10011 x 2^(11-8)
= 0.10011 x 2^3
= 100.11 x 2^0
= 4.75
removing the 1 made the result inaccurate.

implicit : 0 1010 00111, E = 1010 = 10, bias = 8
(-1)^s x 1.M x 2^(E - bias)
= (-1)^0 x 1.00111 x 2^(10-8)
= 1.00111 x 2^2
= 100.111 x 2^0
= 4.875

to avoid the loss : either increase the bits in the mantissa (or) use implicit normalization.

mantissa : gives precision (accuracy); more bits in the mantissa means a more accurate value, e.g. for very small fractions

exponent : gives range (power); more bits in the exponent means larger and larger numbers

format : S | E | M

the bias of the exponent is either directly given

(or)

given in excess form (excess 32 : bias = 32 = 2^(k-1) = 2^(6-1) = 2^5, so k = 6)
exponent : 6 bit

Q. -(29.75)10

S (1 bit) | E (6 bit) | M (9 bit)

29.75 = 11101.11 x 2^0

excess : 32
bias : 2^(k-1) = 2^(6-1) = 2^5 = 32
k = 6 bit

E = e + bias
E = e + 32

explicit normalized :

11101.11 x 2^0
= 0.1110111 x 2^5

s (sign) = 1 (number is negative)
e = 5
E = 5 + 32
E = 37 (100101)
M = 1110111, padded to 9 bits : 111011100

S E M = 1 100101 111011100
hexa : 1100 1011 1101 1100 = (CBDC)16

Q. 21.75

S (1 bit) | E (7 bit) | M (8 bit)

21.75 = 10101.11 x 2^0

bias : 2^(k-1) = 2^(7-1) = 2^6 = 64
k = 7 bit

E = e + bias
E = e + 64

explicit :
+10101.11 x 2^0 = 0.1010111 x 2^5
s (sign) = 0 (number is positive)
e = 5
M = 1010111, padded to 8 bits : 10101110
E = 5 + 64 = 69 (1000101)
S E M = 0 1000101 10101110

implicit :
+10101.11 x 2^0 = 1.010111 x 2^4
s (sign) = 0 (number is positive)
e = 4
M = 010111, padded to 8 bits : 01011100
E = 4 + 64 = 68 (1000100)
S E M = 0 1000100 01011100

Q. 13.5

S (1 bit) | E (6 bit) | M (9 bit)

13.5 = 1101.1 x 2^0

excess : 32
bias : 2^(k-1) = 2^(6-1) = 2^5 = 32
k = 6 bit

E = e + bias
E = e + 32

explicit :
+1101.1 x 2^0 = 0.11011 x 2^4
s (sign) = 0 (number is positive)
e = 4
M = 11011, padded to 9 bits : 110110000
E = 4 + 32 = 36 (100100)
S E M = 0 100100 110110000
hexa : (49B0)16

implicit :
+1101.1 x 2^0 = 1.1011 x 2^3
s (sign) = 0 (number is positive)
e = 3
M = 1011, padded to 9 bits : 101100000
E = 3 + 32 = 35 (100011)
S E M = 0 100011 101100000
hexa : (4760)16

check (decode both encodings) :

explicit : 0 100100 110110000, E = 100100 = 36, bias = 32
(-1)^s x 0.M x 2^(E - bias)
= (-1)^0 x 0.110110000 x 2^(36-32)
= 0.110110000 x 2^4
= 1101.10000 x 2^0
= 13.5

implicit : 0 100011 101100000, E = 100011 = 35, bias = 32
(-1)^s x 1.M x 2^(E - bias)
= (-1)^0 x 1.101100000 x 2^(35-32)
= 1.101100000 x 2^3
= 1101.100000 x 2^0
= 13.5
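Writing the result in hexadecimal is just a matter of concatenating the S, E and M fields and grouping the bits in fours. A small Python sketch, with `pack_hex` as a hypothetical helper name:

```python
def pack_hex(sign, exponent, mantissa):
    """Concatenate the S, E, M bit strings and return upper-case hex."""
    bits = sign + exponent + mantissa
    width = (len(bits) + 3) // 4          # number of hex digits needed
    return format(int(bits, 2), f"0{width}X")


# The two 16-bit encodings of 13.5 (explicit, then implicit).
print(pack_hex("0", "100100", "110110000"))
print(pack_hex("0", "100011", "101100000"))
```

The sign bit is part of the first hex digit, so a negative number changes that digit (a leading bit of 1 gives a first digit of 8 or more).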

S (1 bit) | E (7 bit) | M (8 bit)

excess : 64
bias : 2^(k-1) = 2^(7-1) = 2^6 = 64
k = 7 bit

E = e + bias
E = e + 64

implicit
(i) smallest mantissa : 0000 0000
(ii) largest / highest mantissa : 1111 1111
(iii) smallest exponent : 0000 000
(minimum value in exponent)
(iv) largest / highest exponent : 1111 111
(maximum value in exponent)

explicit
(i) smallest mantissa : 1000 0000 (M is the part after the point in 0.M, and its first bit must be 1)
(ii) largest / highest mantissa : 1111 1111
(iii) smallest exponent : 0000 000
(minimum value in exponent)
(iv) largest / highest exponent : 1111 111
(maximum value in exponent)

note : '0' cannot be represented in either form
(i) in the explicit representation we cannot represent '0' because 0.1something is not 0 (even 0.1 itself is not zero)
(ii) in the implicit representation we cannot represent '0' because 1.something is not 0 (even 1.0 is not zero)
for the first smallest number we put all 0's in E and the mantissa, but the actual value is still not zero.

so to represent '0' we use IEEE 754 single precision and double precision

(i) first smallest positive number :

S   E (7 bit)   M (8 bit)
0   0000000     00000000

bias : 2^(7-1) = 2^6 = 64

implicit :
(-1)^s x 1.00000000 x 2^(E - bias)
= (-1)^0 x 1.00000000 x 2^(0-64)
= +1.00000000 x 2^(-64)

first smallest positive number : +1.00000000 x 2^(-64)

(ii) second smallest positive number :

S   E (7 bit)   M (8 bit)
0   0000000     00000001     (for the second smallest, the last mantissa bit becomes 1)

bias : 2^(7-1) = 2^6 = 64

implicit :
(-1)^s x 1.00000001 x 2^(E - bias)
= (-1)^0 x 1.00000001 x 2^(0-64)
= +1.00000001 x 2^(-64)

second smallest positive number : +1.00000001 x 2^(-64)

(iii) difference between 1st smallest and 2nd smallest

(+1.00000001 x 2^(-64)) - (+1.00000000 x 2^(-64))
= [1.00000001 - 1.00000000] x 2^(-64)
= [0.00000001] x 2^(-64)
= 2^(-8) x 2^(-64)
= 2^(-72)
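The gap of 2^(-72) can be checked exactly with rational arithmetic. A sketch in Python, assuming the 1 | 7 | 8 format with bias 64 and implicit normalization used above; `implicit_value` is a hypothetical helper that takes the E and M fields as integers.

```python
from fractions import Fraction


def implicit_value(E, M, exp_bits=7, man_bits=8):
    """Exact value of +1.M x 2^(E - bias) for integer fields E and M."""
    bias = 2 ** (exp_bits - 1)
    significand = 1 + Fraction(M, 2 ** man_bits)
    return significand * Fraction(2) ** (E - bias)


first = implicit_value(0, 0)     # E = 0000000, M = 00000000
second = implicit_value(0, 1)    # E = 0000000, M = 00000001
print(first == Fraction(1, 2 ** 64))
print(second - first == Fraction(1, 2 ** 72))
```

Using `Fraction` avoids any rounding, so the comparison is an exact check of the hand calculation.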

(i) first highest positive number :

S   E (7 bit)   M (8 bit)
0   1111111     11111111

bias : 2^(7-1) = 2^6 = 64
E : 127

implicit :
(-1)^s x 1.11111111 x 2^(E - bias)
= (-1)^0 x 1.11111111 x 2^(127-64)
= +1.11111111 x 2^63
= [111111111] x 2^(-8) x 2^63     (point moved 8 places)
= (2^9 - 1) x 2^(-8) x 2^63
= (2^9 - 1) x 2^55

first highest positive number : +1.11111111 x 2^63

(ii) second highest positive number :

S   E (7 bit)   M (8 bit)
0   1111111     11111110     (for the second highest, the last mantissa bit becomes 0)

bias : 2^(7-1) = 2^6 = 64

implicit :
(-1)^s x 1.11111110 x 2^(E - bias)
= (-1)^0 x 1.11111110 x 2^(127-64)
= +1.11111110 x 2^63

second highest positive number : +1.11111110 x 2^63

(iii) difference between 1st highest and 2nd highest

(+1.11111111 x 2^63) - (+1.11111110 x 2^63)
= [1.11111111 - 1.11111110] x 2^63
= [0.00000001] x 2^63
= 2^(-8) x 2^63
= 2^55

how to deal with bits?

(i) 111 = 2^3 - 1

(ii) 1111 = 2^4 - 1

(iii) 11111 = 2^5 - 1

(iv) 111111 = 2^6 - 1

(i) 0.111 = 1 - 1/2^3 (or) 1 - 2^-3

(ii) 0.1111 = 1 - 1/2^4 (or) 1 - 2^-4

(iii) 0.11111 = 1 - 1/2^5 (or) 1 - 2^-5

(iv) 0.111111 = 1 - 1/2^6 (or) 1 - 2^-6

proof :

(i) 0.111 = 1 - 1/2^3 (or) 1 - 2^-3

proof : 0.111 x 2^0
= 111 x 2^-3 (left alignment)
= (2^3 - 1) x 2^-3
= 2^(3-3) - 2^-3
= 1 - 2^-3
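The two shortcuts above can be checked numerically. This is only a small sketch; the helper names `ones` and `fraction_of_ones` are made up for this illustration and are not part of the notes.

```python
def ones(n: int) -> int:
    """Integer whose binary form is n consecutive 1 bits (e.g. n=3 -> 111)."""
    return int("1" * n, 2)


def fraction_of_ones(n: int) -> float:
    """Value of the binary fraction 0.11...1 with n ones after the point."""
    return sum(2.0 ** -i for i in range(1, n + 1))


for n in range(3, 7):
    # n ones as an integer equal 2^n - 1
    assert ones(n) == 2 ** n - 1
    # n ones after the binary point equal 1 - 2^-n
    assert fraction_of_ones(n) == 1 - 2.0 ** -n
    print(n, ones(n), fraction_of_ones(n))
```

All the values involved are small powers of two, so the floating point comparison is exact here.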

maximum +ve value using explicit :

S(1bit) E(6bit) M(9bit)
0 111111 111111111

bias : 2^(6-1) = 32
E : 63

explicit : (-1)^S x 0.M x 2^(E-bias)

(-1)^0 x 0.111111111 x 2^(63-32)
0.111111111 x 2^31

left alignment :
111111111 x 2^-9 x 2^31
111111111 x 2^22
(2^9 - 1) x 2^22
2^(9+22) - 2^22
2^31 - 2^22, approximately 2^31

right alignment :
(1 - 2^-9) x 2^31
2^31 - 2^(-9+31)
2^31 - 2^22, approximately 2^31

maximum positive value with 16 bits : approximately 2^31

The answer is quoted as the nearest power of two. For example, 2^8 - 2^1 = 256 - 2 = 254, which is still treated as roughly 2^8.

Disadvantage of conventional floating point representation :

(i) it cannot represent '0'

(ii) it cannot represent infinity '∞'

(iii) it cannot represent the number which is not normalized.

that is why we use IEEE 754 floating point.



IEEE 754 floating point

                     single precision      double precision
total size           32 bit                64 bit
S (sign)             1 bit                 1 bit
E (exponent)         8 bit                 11 bit
M (mantissa)         23 bit                52 bit
bias : 2^(k-1) - 1   2^(8-1) - 1 = 127     2^(11-1) - 1 = 1023

note : by default we use the implicit form in IEEE, because the computer stores numbers in the 1.M form and this gives more accuracy.

single precision
default : implicit
119 : 1110111.0 x 2^0

1.110111 x 2^6 (right shift)

S : 0
M : 110111
e : 6
bias : 127
E : 6 + 127 = 133
E : 10000101

S(1bit) E(8bit) mantissa(23bit)
0 10000101 11011100000000000000000

hexa : 42EE0000

check (implicit) :
E = 133
bias = 127
(-1)^S x 1.M x 2^(E-bias)

(-1)^0 x 1.11011100000000000000000 x 2^(133-127)

1.11011100000000000000000 x 2^6
+1110111.00000000000000000
+119
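The hand encoding of 119 can be cross-checked with Python's standard `struct` module, which packs a value as an IEEE 754 single. This is a sketch for checking the worked examples; the function name `float32_fields` is invented for this illustration.

```python
import struct


def float32_fields(x: float):
    """Return (S, E, M, hex string) of x encoded as IEEE 754 single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1-bit sign S
    exponent = (bits >> 23) & 0xFF    # 8-bit biased exponent E
    mantissa = bits & 0x7FFFFF        # 23-bit mantissa M (hidden 1 not stored)
    return sign, exponent, mantissa, f"{bits:08X}"


sign, E, M, hexa = float32_fields(119.0)
print(sign, f"{E:08b}", f"{M:023b}", hexa)
```

For 119.0 the fields come out as S = 0, E = 10000101 and hexa 42EE0000, the same as the hand calculation.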

double precision
default : implicit
119 : 1110111.0 x 2^0

1.110111 x 2^6 (right shift)

S : 0
M : 110111
e : 6
bias : 2^(11-1) - 1 = 1023
E : 6 + 1023 = 1029
E : 10000000101

S(1bit) E(11bit) mantissa(52bit)
0 10000000101 110111000...0      (110111 followed by 46 zeros)

hexa : 405DC00000000000

check (implicit) :

(-1)^S x 1.M x 2^(E-bias)

(-1)^0 x 1.110111000...0 x 2^(1029-1023)

1.110111000...0 x 2^6
+1110111.0
+119

note : how to write 1029 in binary

2^10  2^9  2^8  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
1024  512  256  128  64   32   16   8    4    2    1
1     0    0    0    0    0    0    0    1    0    1

1029 = 1024 + 4 + 1 = 10000000101

single precision
default : implicit
56 : 111000.0 x 2^0

1.11 x 2^5 (right shift)

S : 0
M : 11000000000000000000000
e : 5
bias : 127
E : 5 + 127 = 132
E : 10000100

S(1bit) E(8bit) mantissa(23bit)
0 10000100 11000000000000000000000

hexa : 42600000

check (implicit) :
(-1)^S x 1.M x 2^(E-bias)

(-1)^0 x 1.11000000000000000000000 x 2^(132-127)

1.11000000000000000000000 x 2^5
+111000.00000000000000000
+56

single precision
default : implicit
-14.25 : -1110.01 x 2^0

-1.11001 x 2^3 (right shift)

S : 1
M : 11001
e : 3
bias : 127
E : 3 + 127 = 130
E : 10000010

S(1bit) E(8bit) mantissa(23bit)
1 10000010 11001000000000000000000

hexa : C1640000

check (implicit) :

(-1)^S x 1.M x 2^(E-bias)

(-1)^1 x 1.11001000000000000000000 x 2^(130-127)

-1.11001000000000000000000 x 2^3
-1110.01000000000000000000
-14.25

1bit 2bit 4bit


S E M

in conventional representation
bias = 2^(k-1)

but in IEEE 754 floating point representation
bias = 2^(k-1) - 1

single precision : bias = 2^(8-1) - 1 = 127

double precision : bias = 2^(11-1) - 1 = 1023

why bias = 2^(k-1) - 1 ?

in single precision

S(1bit) E(8bit) M(23bit)

In IEEE the values E = all 0's and E = all 1's are reserved. With E = all 0's we can represent zero and also denormalized numbers. With E = all 1's we can represent infinity and also NaN.

some things are fixed in IEEE :

when E = 0 (E = 00000000)
- M = 0 : +/- 0
- M ≠ 0 : fraction / denormalized number

when E = 255 (E = 11111111)
- M = 0 : +/- infinity
- M ≠ 0 : NaN (not a number)

in double precision

S(1bit) E(11bit) M(52bit)

E is again reserved for all 0's and all 1's :

when E = 0 (E = 00000000000)
- M = 0 : +/- 0
- M ≠ 0 : fraction / denormalized number

when E = 2047 (E = 11111111111)
- M = 0 : +/- infinity
- M ≠ 0 : NaN (not a number)
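The reserved values of E can be turned into a small classifier for a 32 bit single precision pattern. This is a sketch only; the function name `classify_single` and the sample bit patterns are chosen for this illustration.

```python
def classify_single(bits: int) -> str:
    """Classify a 32-bit IEEE 754 single precision pattern by its E and M fields."""
    exponent = (bits >> 23) & 0xFF   # 8-bit E field
    mantissa = bits & 0x7FFFFF       # 23-bit M field
    if exponent == 0:                # E = all 0's is reserved
        return "zero" if mantissa == 0 else "denormalized"
    if exponent == 255:              # E = all 1's is reserved
        return "infinity" if mantissa == 0 else "NaN"
    return "normalized"              # 1 <= E <= 254


for pattern in (0x00000000, 0x80000000, 0x00000001,
                0x7F800000, 0x7FC00000, 0x42EE0000):
    print(f"{pattern:08X}", classify_single(pattern))
```

The sign bit is ignored by the classifier, so 00000000 and 80000000 are both reported as zero (+0 and -0).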

why bias = 2^(k-1) - 1 in IEEE 754 floating point?

in single precision
if we take bias = 2^(k-1) = 2^(8-1) = 2^7 = 128 (excess-128), then there is a chance of getting E = 255.

8-bit 2's complement range = -2^(8-1) to +(2^(8-1) - 1) = -128 to +127

assume e = 127
if we take only 2^(k-1), the bias is 128 :
E = 127 + 128
E = 255
But E = 255 is reserved for infinity and NaN, so a problem arises.

1st highest (largest) number in single precision :

S E(8bit) M(23bit)
0 1111 1110 1111 1111 1111 1111 1111 111

If all bits of E were 1 we would get E = 255, which is reserved for NaN and infinity. So the last bit of E is 0 and we use one less : E = 254.

S = 0
E = 254
1 ≤ E ≤ 254 (for single precision)

(-1)^S x 1.M x 2^e      (e = E - bias)

(-1)^0 x 1.M x 2^(254-127)
1.1111 1111 1111 1111 1111 111 x 2^127
1111 1111 1111 1111 1111 1111 x 2^-23 x 2^127      (24 ones)
(2^24 - 1) x 2^104
2^128 - 2^104, approximately 2^128

So with 32 bits we can represent values up to about 2^128.

1st highest(largest number) without IEEE 754

S E(6bit) M(7bit)
0 111111 1111111

2nd highest number without IEEE 754

S E(6bit) M(7bit)
0 111111 1111110

topic : denormalized number concept

single precision : E = e + bias, bias = 127 (in single precision)

IEEE : E = 1 to 254 (1 ≤ E ≤ 254)

The bias is taken as 2^(k-1) - 1 so that E can reach 254 but does not overflow into 255.

The minimum value of E is 1.

for E = 1, what will 'e' be?
E = e + bias
1 = e + 127
e = -126
To get E = 1 we need e = -126, which means that in the worst case e = -126. If the value of 'e' is smaller than -126 (-127, -128, ...) then the number cannot be normalized and we store it as a denormalized number.

eg. 1.1001 x 2^-129

e = -129
E = e + bias
E = -129 + 127 = -2
E comes out negative, but we use E precisely so that it is always positive, so it is better to store this value as a denormalized number.

note : in single precision the worst case for a normalized number is 1.M x 2^-126. Down to this value the number is normalized; anything smaller is a denormalized number.

• smallest positive single precision normalized number :

1.0000 0000 0000 0000 0000 000 x 2^-126
1.M x 2^-126

• smallest positive single precision denormalized number :

0.0000 0000 0000 0000 0000 001 x 2^-126
(23 bits shifted, left alignment, to write it in the 1.M form)
1.0 x 2^-23 x 2^-126
1.0 x 2^-149

• double precision range (smallest positive denormalized number) :

1.0 x 2^-52 x 2^-1022 = 1.0 x 2^-1074

double precision

S(1bit) E(11bit) M(52bit)

E = 1, bias = 1023

E = e + bias
1 = e + 1023
e = -1022

smallest normalized magnitude : +/- 1.0 x 2^-1022
if e = -1023, -1024, ... then store as a denormalized number
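The range limits derived above can be checked against real IEEE 754 bit patterns. A sketch, assuming that Python's `float` is an IEEE 754 double on the platform (true on common platforms); the helper name `single_from_bits` is made up for this illustration.

```python
import math
import struct
import sys


def single_from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


largest = single_from_bits(0x7F7FFFFF)            # E = 254, M = all ones
smallest_normal = single_from_bits(0x00800000)    # E = 1,   M = 0
smallest_denormal = single_from_bits(0x00000001)  # E = 0,   M = 00...01

assert largest == math.ldexp(2 ** 24 - 1, 104)    # (2^24 - 1) x 2^104
assert smallest_normal == math.ldexp(1.0, -126)   # 1.0 x 2^-126
assert smallest_denormal == math.ldexp(1.0, -149) # 1.0 x 2^-149

# Double precision (assumption: Python float is an IEEE 754 double).
assert sys.float_info.min == math.ldexp(1.0, -1022)  # smallest normalized
assert math.ldexp(1.0, -1074) > 0.0                  # smallest denormalized
assert math.ldexp(1.0, -1075) == 0.0                 # below that rounds to 0
print(largest, smallest_normal, smallest_denormal)
```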

topic : floating point addition, subtraction, division and multiplication

(i) Add/subtraction rule :

let's say
A = ....... x 2^-2
and B = ....... x 2^-3
The powers (exponents) must be made the same before adding.

(i) choose the number with the smaller exponent and shift its mantissa right a number of steps equal to the difference in exponents.
(ii) set the exponent of the result equal to the larger exponent.
(iii) perform addition / subtraction on the mantissas and determine the sign of result.
(iv) normalize the resulting value, if necessary : 1.something x 2^e

flow of the algorithm :
1. compare the exponents.
2. right shift the smaller number until its exponent matches the larger one.
3. add the mantissa parts.
4. normalize if needed, by a right shift or a left shift : a right shift increases the exponent and a left shift decreases it.
5. check that E lies between 1 and 254. Underflow means E is less than 1 and overflow means E is more than 254; in either case raise an exception.
6. round off.
7. check whether the result still needs to be normalized; if so, shift again.

example :
show the IEEE 754 binary representation of the number -0.75 (base 10) in single and double precision

-0.75 : -0.11 x 2^0

single precision

S(1bit) E(8bit) M(23bit)

to make it implicit we have to left shift :

-1.1 x 2^-1

S = 1
M = 1
e = -1
bias = 2^(8-1) - 1 = 128 - 1 = 127
E = e + bias = -1 + 127 = 126
E = 126 (01111110)

S(1bit) E(8bit) M(23bit)
1 01111110 1000 0000 0000 0000 0000 000

addition of two numbers

A : 9.999 x 10^1
B : 1.610 x 10^-1

1.610 x 10^-1 = 0.1610 x 10^0 (right shift)
              = 0.01610 x 10^1 (right shift)

now the exponents are equal.

  9.999 x 10^1
+ 0.016 x 10^1
= 10.015 x 10^1

the result is not normalized, so normalize it :

1.0015 x 10^2

now round off :

1.002 x 10^2
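The decimal addition above can be sketched in code with the standard `decimal` module. This is an illustration under stated assumptions: the function name `add_decimal_fp`, the 4 significant digits, truncation of the shifted mantissa and round-half-up at the end are choices made here to match the worked example, not a definitive algorithm.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def add_decimal_fp(m_a, e_a, m_b, e_b, digits=4):
    """Add m_a x 10^e_a and m_b x 10^e_b, keeping `digits` significant digits."""
    quantum = Decimal(1).scaleb(-(digits - 1))   # 0.001 when digits = 4
    a, b = Decimal(m_a), Decimal(m_b)
    if e_a < e_b:                                # make `a` the larger-exponent operand
        a, b, e_a, e_b = b, a, e_b, e_a
    # Step 1: right shift the smaller operand until the exponents match.
    b = b.scaleb(e_b - e_a).quantize(quantum, rounding=ROUND_DOWN)
    # Step 2: add the mantissas; the result takes the larger exponent.
    total, exponent = a + b, e_a
    # Step 3: normalize to the form d.ddd x 10^e.
    while abs(total) >= 10:
        total, exponent = total.scaleb(-1), exponent + 1
    while total != 0 and abs(total) < 1:
        total, exponent = total.scaleb(1), exponent - 1
    # Step 4: round off, then renormalize if rounding produced 10.000.
    total = total.quantize(quantum, rounding=ROUND_HALF_UP)
    if abs(total) >= 10:
        total = total.scaleb(-1).quantize(quantum, rounding=ROUND_DOWN)
        exponent += 1
    return total, exponent


print(add_decimal_fp("9.999", 1, "1.610", -1))
```

For the example operands the function returns the mantissa 1.002 with exponent 2, matching 1.002 x 10^2.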

(ii) multiplication rule

(i) add the exponents and subtract 127
(ii) multiply the mantissas and determine the sign of the result.
(iii) normalize the resulting value if necessary.

example :

(1.110 x 10^10) x (9.200 x 10^-5)

(i) add the exponents and subtract 127

unbiased : 10 + (-5) = 5
biased : 10 + 127 = 137
and -5 + 127 = 122
sum of the biased exponents : 137 + 122 = 259

subtract 127 :
259 - 127 = 132      (= 5 + 127)
E : range is 1 to 254, hence 132 is allowed

(ii) multiply the mantissas and determine the sign of the result.
1.110 (3 digits after the point)
9.200 (3 digits after the point)
10.212000 (total 6 digits after the point)
10.212 x 10^5

(iii) normalized : 1.0212 x 10^6
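The multiplication rule can be sketched the same way. The function name `multiply_decimal_fp` and the use of decimal mantissas with an excess-127 exponent are assumptions made to mirror the worked example; real IEEE 754 hardware works on binary mantissas.

```python
from decimal import Decimal

BIAS = 127  # excess-127 notation, as in single precision


def multiply_decimal_fp(m_a, e_a, m_b, e_b):
    """Multiply m_a x 10^e_a by m_b x 10^e_b using biased exponents."""
    # Rule (i): add the biased exponents and subtract the bias once.
    biased = (e_a + BIAS) + (e_b + BIAS) - BIAS
    # Rule (ii): multiply the mantissas (the sign comes out of the product).
    product = Decimal(m_a) * Decimal(m_b)
    # Rule (iii): normalize so that one digit is left of the point.
    while abs(product) >= 10:
        product, biased = product.scaleb(-1), biased + 1
    if not 1 <= biased <= 254:
        raise OverflowError("biased exponent outside 1..254")
    return product, biased, biased - BIAS


mantissa, E, e = multiply_decimal_fp("1.110", 10, "9.200", -5)
print(mantissa, E, e)
```

Before normalization the biased exponent is 132, as in the notes; normalizing 10.212 to 1.0212 raises it by one, so the final unbiased exponent is 6.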

(iii) divide rule

(i) subtract the exponents and add 127
(ii) divide the mantissas and determine the sign of the result.
(iii) normalize the resulting value if necessary.

the addition or subtraction of 127 in the multiply and divide rules results from using excess-127 notation for exponents.



chapter - 4 (ALU and control unit)
Friday, June 21, 2024 11:59 PM

ALU and control unit


index :

(i) Component of computer

(ii) Register, ALU, timing and signal control

(iii) working of register

(iv) working of MUX, common bus

(v) micro operation

(vi) micro program

(vii) ALU data path

(viii) control unit

what is a micro operation : how the smallest operation is performed

instruction cycle

sub cycles -
(i) fetch cycle : to fetch the instruction from memory to the CPU
PC → MAR, MAR → Memory, Memory → MBR, MBR → IR

(ii) execute cycle : to execute (to process) the fetched instruction.
it decodes the instruction, i.e. does the analysis of the instruction.

objective of micro operation : how will the execute cycle, data path, registers, MUX, common bus and system bus work, and how will the control signals be implemented? how, what, when and why?

topic : structure of computer

component of the computer :

1. CPU
- ALU (arithmetic logical unit)
- Registers
- CU (control unit)

2. Memory
- Primary / main memory
- Secondary / auxiliary memory

3. I/O
- Input device
- Output device

cpu organisation

(i) registers
(ii) ALU
(iii) control unit

#memory address register : stores the address of the memory location used for a read/write operation.

why MAR/AR?
because it is connected to the address lines of the system bus, so it knows the address and where to go.

#memory buffer register : holds the instruction or data

why MBR/DR/MDR (data)?
because it is connected to the data lines of the system bus

Rin : the content of the bus is loaded into the register (coming in)

Rout : the content of the register is placed on the bus (going out)



Register : collection of bits / sequence of bits, each bit stored in a flip-flop
(fastest) - stores the data (temporary storage)
- registers are present inside the CPU
- made with flip-flops (a flip-flop is a 1 bit storage device)

8 bit register : 8 bit storage device / stores 8 bit data.

based on the information they have register types :


(i) data register : stores the data
(ii) address register : stores the address

based on the task/purpose assigned to them they have register types


(i) general purpose register : use for any purpose
(ii) special purpose register : use for specific purpose / pre-determined functionality

special purpose registers :

registers that store an address :
1. Memory address register (MAR)
2. I/O address register (I/O AR)
3. Program counter (PC)
4. Stack pointer register (SP)

registers that store an instruction (or) data :
5. Memory buffer register (MBR/MDR/DR)
6. I/O data/buffer register (I/O BR)
7. Instruction register (IR)
8. Accumulator (AC)
9. Flag register / program status word (PSW)

Program counter : when an instruction is fetched (fetch cycle executed), the PC holds the starting address of the next instruction.

PC is incremented by 1, (or) by 4 (1 word = 4 bytes), or by 8 (1 word = 8 bytes), or by some value 'x'.

The PC is also called the instruction address register / instruction pointer register.

q. what is the size of each and every register?

memory : 4096 x 16 bit = 2^12 x 16 bit
address = 12 bit (bits 11..0)
data = 16 bit (bits 15..0)

[figure : basic computer registers and memory. Memory has 4096 words of 16 bits per word. The IR holds a 4 bit OPCODE and a 12 bit address.]

register   size     bit positions
PC         12 bit   11..0
AR         12 bit   11..0
IR         16 bit   15..0
TR         16 bit   15..0
DR         16 bit   15..0
AC         16 bit   15..0
OUTR       8 bit    7..0
INPR       8 bit    7..0

registers that store the address :
(i) MAR / AR : 12 bit address register
(ii) PC / stack pointer register : 12 bit

registers that store the data :
(iii) MBR / DR : 16 bit data register
(iv) IR : 16 bit
(v) AC : 16 bit
(vi) TR / flag register / program status word : 16 bit

ALU (arithmetic logical unit) : the ALU is the hardware that performs arithmetic operations, logical operations, condition checking, etc.
(or)
it performs multiple operations

[figure : the ALU takes two inputs (i/p 1 and i/p 2). The result goes to the accumulator (AC), and from the accumulator it is sent to the respective destination. The ALU also updates the flag register (PSW), which holds the status of the instruction.]

conditional flags :
(i) carry flag
(ii) sign flag
(iii) zero flag
(iv) auxiliary carry flag
(v) parity flag
(vi) overflow flag

CU (control unit) : generates the timing signals and control signals; it decides how and when each and every thing is done.

• timing signal : to execute the instructions in proper sequence

Ex. fetch → decode → execute

in fetch cycle :
T1 : PC → MAR
T2 : MAR → Memory → MBR
T3 : MBR → IR

non technical example :

T1 : enrollment
T2 : admit card
T3 : exam writing
T4 : result

q. consider that we have 4 components in total (ALU, register, PSW, memory). how many connections are required to connect every pair directly?

C(4,2) = 6 connections. To connect all four to each other we need 6 connections.

q. consider that we have 16 registers, 1 memory, 1 ALU, 1 PSW and 1 other component (total 20 components). how many connections are required?

C(20,2) = 190 connections.

why is a common bus used?
Instead of using 190 connections, we connect all components to a common bus (internal bus). Which components communicate at a given time? Control signals are required to decide that.
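The connection counts are just "n choose 2", which can be checked with `math.comb` from the standard library. The function name `point_to_point_links` is made up for this illustration.

```python
from math import comb


def point_to_point_links(components: int) -> int:
    """Number of direct links needed to connect every pair of components."""
    return comb(components, 2)


print(point_to_point_links(4))    # 4 components
print(point_to_point_links(20))   # 20 components
```

With a common bus each component needs only one connection to the bus, so the count grows linearly instead of quadratically.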

topic : working of registers and multiplexer

(i) the number of multiplexers required = size of register (# of bits in a register)
(ii) size of each multiplexer = (number of registers) x 1

q. 4 registers : A, B, C and D, each register is of 4 bits

(i) the number of multiplexers required = 4 (size of register)
(ii) size of multiplexer required = 4 x 1 (number of registers)

q. if we have m registers and each register size is n bits, then what is the number of mux and the size of mux?
(i) the number of multiplexers required = n (size of register)
(ii) size of multiplexer required = m x 1 (number of registers)

q. if we have 32 registers and each register size is 8 bits, then what is the number of mux and the size of mux?
(i) the number of multiplexers required = 8 (size of register)
(ii) size of multiplexer required = 32 x 1 (number of registers)

working of register and multiplexer

q. 4 registers : A, B, C and D, each register is of 4 bits

(i) the number of multiplexers required = 4 (size of register)
(ii) size of multiplexer required = 4 x 1 (number of registers)

There are 4 multiplexers, and each 4 x 1 multiplexer needs 2 select lines (S1, S0), which are shared by all of them.

S1 S0   input selected across all mux   register selected
0  0    input 0                         A
0  1    input 1                         B
1  0    input 2                         C
1  1    input 3                         D

In each case the content of the selected register is passed through the multiplexers onto the common bus.
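The selection table can be simulated with one multiplexer per bit position. This is a sketch under simple assumptions: registers are modelled as lists of bits, and the names `mux`, `drive_bus` and `mux_requirements` are invented for this illustration.

```python
def mux(inputs, select):
    """An m x 1 multiplexer: pass the selected input through to the output."""
    return inputs[select]


def drive_bus(registers, s1, s0):
    """Place the register chosen by select lines S1 S0 on the common bus."""
    select = (s1 << 1) | s0
    width = len(registers[0])
    # One multiplexer per bit position; all of them share the same select lines.
    return [mux([reg[bit] for reg in registers], select) for bit in range(width)]


def mux_requirements(m, n):
    """For m registers of n bits: (number of muxes, mux size, select lines)."""
    return n, f"{m} x 1", max(1, (m - 1).bit_length())


A, B, C, D = [1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]
for s1, s0 in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(s1, s0, drive_bus([A, B, C, D], s1, s0))
print(mux_requirements(32, 8))
```

The select-line count uses the standard fact that an m x 1 multiplexer needs ceil(log2 m) select lines.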

how data is transferred

register A to register B : RA → RB

Process : the content (data) of register A is given to the MUX, the MUX places that data on the common bus, and the data is then loaded from the common bus into register B.

step (i) : RA → MUX → common bus
select lines S1 S0 = 00, so A is selected/enabled and its bits (a1 a2 a3 a4) are placed on the common bus
RAout = 1

step (ii) : common bus → RB
RBin = 1, so the bus content (a1 a2 a3 a4) is loaded into register B

why are Rout and Rin used?

The output of a register (RA) is connected to the MUX, the MUX drives the common bus, and the common bus is connected to all registers. When Rout of a register is set to 1, its data passes through the MUX onto the common bus. Only the register whose Rin is set to 1 loads the data from the bus.

RA to RB : RAout, RBin
With select 00 [A] and RAout = 1, the data of RA goes to the MUX and then to the common bus; with RBin = 1 the bus content is loaded into register B.

working of computer

each register has 3 control inputs :
LD : load
INC : increment
CLR : clear

[figure : basic computer registers and memory. PC and AR are 12 bit (bits 11..0); IR, TR, DR and AC are 16 bit (bits 15..0); OUTR and INPR are 8 bit (bits 7..0). The IR holds a 4 bit OPCODE and a 12 bit address. Memory has 4096 words of 16 bits per word.]

memory : 4096 x 16 = 2^12 x 16

address line = 12 bit
data line = 16 bit

[figure : basic computer registers connected to a 16 bit common bus. Select lines S2 S1 S0 choose which component places its content on the bus. Bus inputs : AR = 1, PC = 2, DR = 3, AC = 4 (fed by the adder and logic circuit, with flip-flop E), IR = 5, TR = 6, memory unit (4096 x 16) = 7. INPR feeds the adder and logic circuit, and OUTR loads from the bus. Control inputs such as LD, INC and CLR are shown on the registers, and a clock is shown at the bottom.]

total components on the bus : 7

AR : 1

PC : 2

DR : 3

AC : 4

IR : 5

TR : 6

memory : 7

then 3 select lines are required

S2 S1 S0   enables
1  1  1    7 (memory) : read control signal, for load from memory
0  1  0    2 (PC)
1  0  1    5 (IR)

read : load : memory read
write : store : memory write

Memory has 'n' locations, so which memory address has its content (data) loaded onto the bus? That is given by AR (MAR), and the address in MAR comes from the PC.

111 : 7 (memory)
010 : 2 (PC). The PC is enabled and its content is placed on the common bus. Load (IN) of AR (MAR) is set to 1 (active), so AR (MAR) gets the memory address.
101 : 5 (IR)

fetch cycle here : memory to CPU (IR), using
- memory
- PC
- IR
but in what sequence (timing) is each operation performed?

That is the responsibility of the timing signals and control signals; the timing signal decides the order.

fetch cycle : the instruction is fetched from memory to the CPU (IR)

T1 : PC → MAR
T2 : MAR → Memory, Memory → MBR      [MAR → MBR]
T3 : MBR → IR

The address goes from the PC to AR (MAR), from there to memory, from memory to MBR, and from MBR to IR. This is what we call the ALU data path.

[figure : fetch cycle. T1 : PC supplies the instruction address to MAR. T2 : MAR drives the address lines (AL) of memory, the read control line (RD, CL) is activated, and the instruction (binary) comes back on the data lines (DL) to MBR. T3 : the fetched instruction is loaded from MBR into IR.]

IR format : mode field (1 bit) | opcode (3 bit) | address (12 bit)

The mode field and opcode go from the IR to the decoder.

fetch cycle (micro operations with their control signals) :
T1 : PC → MAR ; PCout, MARin
T2 : M[MAR] → MBR ; MARout, MBRin (system bus)
     PC + I → PC (increment) ; PCout, PCin (local bus)
T3 : MBR → IR ; MBRout, IRin

ALU, data path and control unit

(i) micro instruction/operation

(ii) micro program

(iii) control unit design

what is data path?
MUX, ALU, buses and registers : how they process the data, i.e. the path the data takes while it is being processed

every register has two switches :

Rin : the content of the bus is loaded into the register (coming in)

Rout : the content of the register is placed on the bus (going out)

(i) micro instruction/operation : in the fetch cycle, execute cycle and interrupt cycle we have small operations [x operations]

computer → programs → instruction/data → instruction cycle (sub cycles : fetch, execute, interrupt)

Each sub cycle consists of very small operations, and we call these micro operations. They are carried out by control signals, and the control signals come from the control unit.

micro operation example :


register to register transfer

micro operation consume 1 cycle to complete the execution

the functional, or atomic, operations of a processor

(i) series of steps, each of which involves the processor registers.

(ii) micro refers to the fact that each step is very simple and accomplishes very little

(iii) the execution of program consists of the sequential execution of instructions.

each instruction is executed during an instruction cycle made up of shorter sub cycles (fetch,
indirect, execute, interrupt)

the execution of each sub cycle involves one or more shorter operations (micro-operations)



(ii) micro program : a collection (in sequence) of micro operations. If a task takes 3 micro operations in sequence, that collection is its micro program.

The micro program performs the work that is done at the hardware level.

instruction cycle

(i) fetch cycle : the instruction is fetched from memory to the CPU (IR)

four registers are involved :
- PC : holds the address of the next instruction to be fetched (here 1000)
- MAR : connected to the address bus; specifies the address for a read or write operation
- MBR : connected to the data bus; holds the data to write or the last data read
- IR : holds the last instruction fetched (the one currently being fetched)

[figure : PC (1000) supplies the address to MAR; MAR addresses memory over the address lines; memory location 1000 holds I1 : Load [6000]; the instruction returns over the data lines to MBR, and the fetched instruction is loaded into the IR.]

T1 : PC → MAR

T2 : MAR → Memory
T2 : Memory → MBR

T3 : MBR → IR

micro operations in fetch cycle

[figure : fetch cycle. T1 : PC supplies the instruction address to MAR. T2 : MAR drives the address lines (AL) of memory, the read control line (RD, CL) is activated, and the instruction (binary) comes back on the data lines (DL) to MBR. T3 : the fetched instruction is loaded from MBR into IR.]

T1 : PC → MAR ; PCout, MARin
T2 : M[MAR] → MBR ; MARout, MBRin (system bus)
     PC + I → PC (increment) ; PCout, PCin (local bus)
T3 : MBR → IR ; MBRout, IRin

proper sequence must be followed

(i) MAR ← (PC) must precede MBR ← (memory)

conflicts must be avoided

(ii) must not read and write the same register at the same time

(iii) MBR ← (memory) and IR ← (MBR) must not be in the same cycle.

micro operations that use different components can be performed in a single cycle (no conflict)

example : register contents during the fetch cycle (memory location (0064)H holds (1020)H)

(i) the address is in the PC
(ii) PC to MAR
(iii) MAR gives the memory address; we go to location (0064)H
(iv) at (0064)H we find (1020)H, and it goes into MBR
(v) MBR to IR

       (a) beginning (before t1)   (b) after first step    (c) after second step   (d) after third step
MAR    -                           0000 0000 0110 0100     0000 0000 0110 0100     0000 0000 0110 0100
MBR    -                           -                       0001 0000 0010 0000     0001 0000 0010 0000
PC     0000 0000 0110 0100         0000 0000 0110 0100     0000 0000 0110 0101     0000 0000 0110 0101
IR     -                           -                       -                       0001 0000 0010 0000
AC     -                           -                       -                       -

at the end of the fetch cycle the instruction has been fetched from memory into the CPU (IR) register
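The register trace above can be reproduced with a tiny simulation of the three fetch steps. This is a sketch only: the dictionary-based register file and the function name `fetch_cycle` are assumptions made for this illustration, and the PC increment is placed in the second step as in the trace.

```python
def fetch_cycle(memory, pc):
    """Run the fetch micro-operations and return the register state after each step."""
    regs = {"MAR": None, "MBR": None, "PC": pc, "IR": None}
    trace = []

    # t1 : MAR <- PC
    regs["MAR"] = regs["PC"]
    trace.append(dict(regs))

    # t2 : MBR <- M[MAR], and PC <- PC + 1 in the same time unit
    regs["MBR"] = memory[regs["MAR"]]
    regs["PC"] += 1
    trace.append(dict(regs))

    # t3 : IR <- MBR
    regs["IR"] = regs["MBR"]
    trace.append(dict(regs))
    return trace


memory = {0x0064: 0x1020}          # location (0064)H holds (1020)H
trace = fetch_cycle(memory, 0x0064)
for step, state in enumerate(trace, start=1):
    print(step, {name: (None if v is None else f"{v:04X}") for name, v in state.items()})
```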

IR will have : OPCODE OPERAND REFERENCE

(ii) execute cycle :

- decode the instruction (analysis of the instruction) [which opcode, how many operands, what the instruction is asking for]

- operand address calculation

- OPERAND fetch (data)

- then process the data and store the result (write back)

example : I1 : LOAD [4000]      AC ← M[4000]
Put whatever is available at memory location 4000 into the accumulator.

[figure : after the fetch, the IR holds LOAD 4000. The decoder analyses the instruction (which opcode, how many operands). The address 4000 is given to MAR, which drives the address lines (AL); the read control line (RD, CL) is activated; memory location [4000] holds 11, which returns over the data lines (DL) to MBR (operand fetch) and is then loaded into AC (data process and result store), so AC ← M[4000] = 11.]

(a) ID stage : enables the hardware to perform the operation (instruction decode / analysis)

(b) OF stage : addressing modes (AM's) are required to access the operand (operand fetch)

example : execute cycle of direct addressing mode

[figure : IR (LOAD 4000) → DECODER → MAR (4000) → memory over the address lines, with the read control line active; [4000] = 11 returns over the data lines to MBR and then goes to AC / ALU, giving AC ← M[4000] = 11.]

execute cycle of direct addressing mode

T1 : IR[address] → MAR ; IRout, MARin
T2 : M[MAR] → MBR ; MARout, MBRin      (MAR holds the EA)
T3 : MBR → AC/ALU ; MBRout, AC/ALUin

why ALU? because when the instruction has to process the data (for example an ADD), the operand must go through the ALU to perform that operation.

I2 : ADD R1, X : R1 ← R1 + M[X]

direct AM : the (EA) location X must first be given to MAR

ADD R1, X : add the content of location X to register R1

step (i) the address field of the IR goes to MAR

step (ii) the referenced memory location is read

step (iii) the contents of R1 and MBR are added through the ALU and the result is placed in the accumulator

Additional micro-operations may be required to extract the register reference from the IR and perhaps to stage the ALU inputs or outputs in some intermediate registers.

(i) direct addressing mode : load [4000] : memory read (done)

example : execute cycle of indirect addressing mode

load @4000 : memory read

4000 is the address of the effective address (EA)

AC ← M[M[4000]]
AC ← M[EA]

[figure, first memory access : IR (LOAD @4000) → DECODER → MAR (4000) → memory over the address lines, with the read control line active; [4000] = 100, so MBR receives the EA = 100 (operand address calculation).]

[figure, second memory access : MBR (EA = 100) → MAR → memory, with the read control line active; [100] = 11 returns over the data lines to MBR and then goes to AC / ALU, so AC ← M[M[4000]] = 11 (operand fetch, data process and result store).]
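The difference between the direct and indirect execute cycles is one extra memory read. A minimal sketch, assuming memory is modelled as a dictionary and using the values from the figures ([4000] = 11 for the direct case; [4000] = 100 and [100] = 11 for the indirect case); the function names are invented for this illustration.

```python
def load_direct(memory, address):
    """LOAD [address] : AC <- M[address], one memory read."""
    mar = address          # T1 : MAR <- IR[address]
    mbr = memory[mar]      # T2 : MBR <- M[MAR]
    return mbr             # T3 : AC  <- MBR


def load_indirect(memory, address):
    """LOAD @address : AC <- M[M[address]], two memory reads."""
    mar = address          # MAR <- IR[address]
    mbr = memory[mar]      # MBR <- M[MAR]   (this is the effective address)
    mar = mbr              # MAR <- MBR
    mbr = memory[mar]      # MBR <- M[MAR]   (this is the operand)
    return mbr             # AC  <- MBR


print(load_direct({4000: 11}, 4000))
print(load_indirect({4000: 100, 100: 11}, 4000))
```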

q. R0 ← R1 + R2

[figure : R1 and R2 are copied into temporary registers T1 and T2, which feed the ALU; the ALU output goes to AC.]

T1 : T1 ← R1 ; R1out, T1in
T2 : T2 ← R2 ; R2out, T2in
T3 : AC ← T1 + T2 ; T1out, T2out, ALUadd, ACin
T4 : R0 ← AC ; ACout, R0in

control signals of the fetch cycle :
PC to MAR : PCout, MARin
MAR to MBR : MARout, MBRin
MBR to IR : MBRout, IRin

instruction given : R0 ← R1 + R2      (r = read, w = write)

fetch cycle :
step (i) 3. PCr, MARw, MEMr : PC → MAR and memory read
step (ii) 5. MDRr, IRw : MDR → IR

execute cycle :
step (iii) 2. R1r, temp1w : temp1 ← R1
step (iv) 1. R2r, temp1r, ALUadd, temp2w : temp2 ← R2 + R1
step (v) 4. temp2r, R0w : R0 ← temp2

memory read

load [4000]
AC ← M[4000]
control signal : RD
(done)

#memory store concept :

memory store = memory write

store [6000]
M[6000] ← AC
control signal : WR

micro program :
T1 : AC → MBR ; ACout, MBRin
T2 : IR(address) → MAR ; IRout, MARin
T3 : MBR → M[MAR] ; MBRout, MARout, WR

data : MBR deals with data
address : MAR deals with address

[figure : AC supplies the data (11) to MBR, which drives the data lines (DL) of memory. IR supplies the EA (6000) to MAR, which drives the address lines (AL). The write control line (WR, CL) is activated, so the content is written at memory location 6000.]

M[6000] ← AC
M[6000] ← data (11)
M[6000] ← content of accumulator

Instruction cycle with interrupt cycle :

When the CPU encounters an interrupt, the interrupt is serviced after the current instruction finishes executing.

The CPU then pushes the PC (program counter) value onto the stack as the return address, and control transfers to the ISR (interrupt service routine).

flow : fetch cycle → execute cycle → check interrupt (an unusual event that disturbs the flow)
- no : go back to the fetch cycle
- yes : service the interrupt

if interrupt cycle occurs :

the nature of this cycle varies greatly from one machine to another

step (i) the contents of the PC are transferred to the MBR so that they can be saved for return from the interrupt.
(the PC value is pushed onto the stack as the return address, with the help of MAR)

step (ii) then the MAR is loaded with the address at which the contents of the PC are to be saved, and the PC is loaded with the address of the start of the interrupt processing routine.

- these two actions may each be a single micro-operation (varies, depends on the processor)
- because most processors provide multiple types or levels of interrupts, it may take one or more additional micro-operations to obtain the save address and the routine address before they can be transferred to the MAR and PC respectively

step (iii) once this is done, the final step is to store the MBR, which contains the old value of the PC, into memory

step (iv) the processor is now ready to begin the next instruction cycle.

figure (saving the return address on the stack) :
- PC → MBR : the MBR supplies the data (return address) on the data lines (DL)
- SP → MAR : the stack pointer holds the top-of-stack (TOS) address; the MAR supplies it on the address lines (AL)
- the WR control signal is supplied on the write control line (CL), so the return address is written at TOS
- vector address → PC

note : the stack pointer (SP) holds the address of the stack memory.

interrupt vector table : a data structure that associates a list of interrupt handlers with a list of
interrupt requests in a table of interrupt vectors

micro program

T1 : PC → MBR ; PCout, MBRin
T2 : SP → MAR ; SPout, MARin
T3 : MBR → M[MAR] ; MBRout, WR (written at TOS)

the ISR address is then loaded into the PC to service the interrupt. once the service is
completed, RETI/IRET returns from the interrupt by popping the saved PC value from the
stack (top of stack, addressed by SP).

topic : control unit functional requirements

(i) by reducing the operation of the processor to its most fundamental level, we are able
to define exactly what it is that the control unit must cause to happen.

(ii) a three-step process leads to a characterization of the control unit :

- define the basic elements of the processor
- describe the micro-operations that the processor performs
- determine the functions that the control unit must perform to cause the
micro-operations to be performed

(iii) the control unit performs two basic tasks :

- sequencing : stepping through the micro-operations in the proper order (e.g. T1 : PC → MAR)
- execution : causing each micro-operation to be performed (e.g. PCout, MARin)

control unit :
the control unit is the supervisor of the system; it controls each and every activity.

the control unit takes several inputs and produces control signals, and these control signals are
required to perform the operations.

(i) control signals are generated by the control unit.
(ii) control signals are required to execute the micro-operations.
(iii) a micro-operation is the elementary operation in the hardware.
(iv) the control unit generates the sequence of control signals.
(v) control signals are applied directly to the base hardware (H/W),
so the hardware generates the desired response.
functionally, a computer system performs program execution.

the fetched instruction is placed in the IR because the IR has a pre-defined format :

fetch : opcode | operand reference

how many address fields     where the operand is found     what operation to perform
3AF                         accumulator                    ADD
2AF                         stack                          MUL
1AF                         register                       JMP
0AI                         ...

recap : when the operand is in memory, the addressing mode tells how to reach it (there are 11
types of addressing modes); to bring the data we studied the working of registers and
the MUX (ALU data path); for syntax and format we studied micro-operations; for the fetch
cycle we studied the micro program; and execution is carried out by the control unit
through control signals, which are generated from control words.

note : in fetch and decode the same type of operations are performed for every instruction.
after the decode, the input is given to the control unit and, according to the
operation type, control signals are generated to perform the micro-operations.

fetch cycle : memory → IR
then : IR → decoder → control unit

figure : functioning of microprogrammed control unit
instruction register → decoder → sequencing logic → control address register
→ (read) control memory → control buffer register → decoder
→ control signals within CPU and control signals to system bus
(the control buffer register also sends next address control back to the sequencing logic)

block diagram of control unit :
inputs  : instruction register, flags, clock, control signals from control bus
outputs : control signals within CPU, control signals to control bus

topic : micro operations and control signal



            micro-operations                                   active control signals

fetch     : T1 : MAR ← PC  (or)  PC → MAR                      C2
            T2 : MBR ← memory ; PC ← (PC) + 1                  C5, CR
            T3 : IR ← (MBR)                                    C4

indirect  : T1 : MAR ← IR(address)                             C8
            T2 : MBR ← memory                                  C5, CR
            T3 : IR(address) ← MBR(address)                    C4

interrupt : T1 : MBR ← PC                                      C1
            T2 : MAR ← save address ; PC ← routine address
            T3 : memory ← MBR                                  C12, CW

CR : read control signal, CW : write control signal

whatever component we have (register, ALU, MUX, bus, etc) we use control word for it.

control unit : PC → MAR ; PCout, MARin

PCin PCout MARin MARout IRin IRout MBRin MBRout ACin ACout MUX ALU GPR SPin SPout
0    1     1     0      0    0     0     0      0    0     0   0   0   0    0

C2 : one micro-operation

control unit : memory → MBR ; memoryout, MBRin

PCin PCout MARin MARout IRin IRout MBRin MBRout ACin ACout MUX ALU GPR SPin SPout
0    0     0     0      0    0     1     0      0    0     0   0   0   0    0

C5 : one micro-operation

control unit : MBR → IR ; MBRout, IRin

PCin PCout MARin MARout IRin IRout MBRin MBRout ACin ACout MUX ALU GPR SPin SPout
0    0     0     0      1    0     0     1      0    0     0   0   0   0    0

C4 : one micro-operation

control signals on the data path (figure) :
C2 : PCout, MARin (PC → MAR)
C5 : memoryout, MBRin (memory → MBR)
C4 : MBRout, IRin (MBR → IR)
C7 : ACout, ALUin
C9 : ALUout, ACin
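
a control word is just one bit per control signal, so it can be built mechanically. a minimal Python sketch, assuming the 15-column order used in the control word tables of these notes (the function name `control_word` is hypothetical) :

```python
# Column order taken from the control word tables in these notes.
SIGNALS = ["PCin", "PCout", "MARin", "MARout", "IRin", "IRout", "MBRin",
           "MBRout", "ACin", "ACout", "MUX", "ALU", "GPR", "SPin", "SPout"]


def control_word(active):
    """Return the control word as a list of 0/1 bits, one bit per signal."""
    unknown = set(active) - set(SIGNALS)
    if unknown:
        raise ValueError(f"unknown control signals: {sorted(unknown)}")
    return [1 if name in active else 0 for name in SIGNALS]


c2 = control_word({"PCout", "MARin"})    # PC -> MAR
c5 = control_word({"MBRin"})             # memory -> MBR
c4 = control_word({"MBRout", "IRin"})    # MBR -> IR
print(" ".join(map(str, c2)))
```

only the signals named for a micro-operation are set to 1; every other bit stays 0.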

to design the control unit we need to know :

- how many control lines are present in the hardware [S0, S1, S2 ...]
- how many instructions are implemented in the hardware [I1, I2, I3 ...]
- how many micro-operations are required for each instruction [T1, T2, T3 ...]
- which control signals are required for each micro-operation of each instruction

fetch                   execute                   indirect AM

T1 : PC → MAR           T1 : IR(AF) → MAR         T1 : IR(AF) → MAR
T2 : M(MAR) → MBR       T2 : M(MAR) → MBR         T2 : M(MAR) → MBR
T3 : MBR → IR           T3 : MBR → AC/ALU         T3 : MBR → MAR
                                                  T4 : M(MAR) → MBR
                                                  T5 : MBR → AC/ALU

working : the control unit generates the control signals. at control unit design time, the
designer decides which control signals are generated in which cycle (T1, T2, T3 ...)
of each instruction, and this is stored in a table.

after that we make a boolean (logic) function for each control signal for implementation.

control signals are implemented in the control unit using one of the following approaches :

control unit
- hardwired control unit
- microprogrammed control unit

(i) hardwired control unit : in a hardwired control unit, control signals are expressed in
S.O.P (sum of product) form and they are realized directly on the
hardware.

- a hardwired control unit uses a fixed logic circuit to interpret the instruction and then
generate the control signals.

advantage : the hardwired control unit is the fastest control unit.
RISC processors use a hardwired control unit.

disadvantage : - it is not flexible.
- even a minor modification requires redesigning and rewiring.
- it does not support new operations (once designed).

example (figure) : T1 and ADD feed one AND gate, T3 and BR (branch) feed another AND gate;
the two AND outputs and T5 feed an OR gate whose output is yin.

- here yin is enabled
  - during T1 for the add instruction
  - during T3 for the branch instruction
  - during T5 for all instructions

yin = T1.ADD + T3.BR + T5
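
the yin expression above is an ordinary boolean function of the timing step and the decoded instruction. a minimal Python sketch (the function name and the string instruction names are assumptions for illustration; real hardware uses AND/OR gates, not code) :

```python
def y_in(t, instruction):
    """yin = T1.ADD + T3.BR + T5, where t is the current timing step (1, 2, 3, ...)."""
    return (
        (t == 1 and instruction == "ADD")    # T1 AND ADD
        or (t == 3 and instruction == "BR")  # T3 AND BR
        or t == 5                            # T5 for every instruction
    )


# list every (step, instruction) pair for which yin is enabled
enabled = [(t, ins) for t in range(1, 6) for ins in ("ADD", "BR", "MUL") if y_in(t, ins)]
print(enabled)
```

each product term maps to one AND gate, and the `or` chain maps to the single OR gate that drives yin.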

Ain = T1 I1 + T1 I2 + T2 I2 + T2 I3 + T4 I3
Ain = T1 (I1 + I2) + T2 (I2 + I3) + T4 I3

Bout = T1 I1 + T1 I2 + T1 I3 + T3 I1 + T3 I2 + T3 I3
Bout = T1 (I1 + I2 + I3) + T3 (I1 + I2 + I3)
Bout = T1 + T3

(I1 + I2 + I3) = 1, because there are only three instructions here and one of them is
always being executed (every instruction needs the signal in T1, so the term is 1).

S5 = T1 I1 + T1 I2 + T1 I3 + T1 I4 + T3 I2 + T3 I4
S5 = T1 (I1 + I2 + I3 + I4) + T3 (I2 + I4)
S5 = T1 + T3 (I2 + I4)

S10 = T2 I2 + T2 I3 + T3 I4 + T4 I1 + T4 I3 + T5 I2 + T5 I4
S10 = T2 (I2 + I3) + T3 I4 + T4 (I1 + I3) + T5 (I2 + I4)

(I1 + I2 + I3 + I4) = 1 (here there are four instructions)

(ii) microprogrammed control unit : in a microprogrammed control unit, control words are stored in
the control memory; then, according to the type of operation, control signals are
generated from them.
the control memory is associated with the CAR (control address register) and the CDR
(control data register), which hold the control memory address and data respectively.

format of micro instruction (held in the CDR) :

| branch conditions (BC) | flags | control field | address field / NIA / CAR / CMA |

how to find bits : BC field = log2 (no. of branch conditions), flag field = log2 (no. of flags),
address field = log2 (control memory size)

NIA : next instruction address
CAR : control address register
CMA : control memory address

note : the size of the control field depends upon the number of control signals.

control unit / control signals :

hardwired control unit
- sum of product (S.O.P.)
- realized directly on the hardware
- fastest
- not flexible
- RISC

decoded format (horizontal programming)
- 1 bit = 1 control signal
- n control signals = n bits
- 8 bits = 8 control signals ; 8 control signals = 8 bits

encoded format (vertical programming)
- 1 bit = 2^1 control signals
- n control signals = log2 n bits
- 8 bits = 2^8 = 256 control signals ; 256 control signals = log2 256 = 8 bits

(i) horizontal programming :

(1) number of control signals in the hardware : S0 S1 S2 S3

(2) decoded format of control signal : control field = _ _ _ _
(one bit each for S3 S2 S1 S0 on the base hardware)

0 : disable
1 : enable

4 control signals on the hardware, so the control field is 4 bits.

(3) design a horizontal micro instruction for control signals (cs) [S0 S2] :

| BC | flag | control field : 0 1 0 1 (S3 S2 S1 S0) | CM address |

(4) operational state : with control field 0101, S0 and S2 become active on the base hardware.

(ii) vertical programming :

(1) number of control signals in the hardware : [S0 S1 S2 S3]

(2) encoded format of control signal : log2 4 = 2-bit control field

00 : S0
01 : S1
10 : S2
11 : S3

control field → decoder → S0 S1 S2 S3

in vertical microprogramming an external decoder is required.

(3) design a vertical micro instruction for control signals [S0 S2] :

| BC | flag | control field : FC1 = 00 (S0), FC2 = 10 (S2), 2 bits each | CM address |

the function code (FC) is generated by the control unit to signal the CPU to perform an operation.

(4) operational state :

| BC | flag | 00 10 | CM address |

the decoder turns 00 into S0 and 10 into S2 on the base hardware.
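
the two formats can be compared with a short Python sketch. it assumes the four signals S0 to S3 from the example; the function names are hypothetical and `decode_vertical` only stands in for the external decoder.

```python
SIGNALS = ["S0", "S1", "S2", "S3"]


def horizontal_field(active):
    """Decoded format: one bit per signal, written in the order S3 S2 S1 S0."""
    return "".join("1" if s in active else "0" for s in reversed(SIGNALS))


def vertical_code(signal):
    """Encoded format: log2(n) bits (rounded up) select exactly one signal."""
    width = max(1, (len(SIGNALS) - 1).bit_length())
    return format(SIGNALS.index(signal), f"0{width}b")


def decode_vertical(code):
    """What the external decoder does: code -> the single active signal."""
    return SIGNALS[int(code, 2)]


print(horizontal_field({"S0", "S2"}))              # one 4-bit field activates both signals
print(vertical_code("S0"), vertical_code("S2"))    # one 2-bit function code per signal
```

the horizontal field can enable S0 and S2 together in a single field, while the vertical format needs one function code per signal plus a decoder.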

horizontal programming vs vertical programming :

(i) control signal format
    horizontal : control signals are expressed in decoded format
    vertical   : control signals are expressed in encoded format

(ii) bits needed
    horizontal : n bits = n control signals ; n control signals = n bits ; 200 control signals = 200 bits
    vertical   : n bits = 2^n control signals ; n control signals = log2 n bits (rounded up) ; 200 control signals = 8 bits

(iii) control word size
    horizontal : larger control word
    vertical   : smaller control word

(iv) external decoder
    horizontal : no external decoder is required to generate the control signals
    vertical   : an external decoder is required to generate the control signals

(v) flexibility
    horizontal : more flexible than hardwired
    vertical   : more flexible than horizontal

(vi) parallelism
    horizontal : high degree of parallelism (none / more than one operation at a time)
    vertical   : low degree of parallelism (none / one operation at a time)

note : the default microprogrammed control unit is the vertical microprogrammed control unit,
used in CISC.

speed : hardwired > horizontal > vertical

flexibility : vertical > horizontal > hardwired

example :
control memory : 1024 control words = 2^10 control words
AF/NIA/CAR : 10 bits
16 flags : 4 bits
48 control signals

horizontal microprogramming :
control field : 48 control signals = 48 bits
control word : flag (4 bits) + CF (48 bits) + CAR/NIA (10 bits) = 62 bits
control memory : 1024 control words x 62 bits
               = (1024 x 62) / 8 bytes
               = 7936 bytes ≈ 8 K bytes

vertical microprogramming :
control field : 48 control signals = log2 48 = 6 bits (rounded up)
control word : flag (4 bits) + CF (6 bits) + CAR/NIA (10 bits) = 20 bits
control memory : 1024 control words x 20 bits
               = (1024 x 20) / 8 bytes
               = 2560 bytes = 2.5 K bytes (≈ 3 K bytes)
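
the same sizing can be checked with a few lines of Python. a sketch only: the function names are hypothetical, and it assumes the micro instruction has just a flag field, a control field and an address field, as in this example.

```python
from math import ceil, log2


def bits_for(count):
    """Bits needed to encode `count` distinct values."""
    return ceil(log2(count))


def control_word_bits(n_signals, n_flags, n_control_words, horizontal):
    """flag field + control field + address field (CAR/NIA)."""
    control_field = n_signals if horizontal else bits_for(n_signals)
    return bits_for(n_flags) + control_field + bits_for(n_control_words)


def control_memory_bytes(n_control_words, word_bits):
    return n_control_words * word_bits / 8


h_bits = control_word_bits(48, 16, 1024, horizontal=True)
v_bits = control_word_bits(48, 16, 1024, horizontal=False)
print(h_bits, control_memory_bytes(1024, h_bits))
print(v_bits, control_memory_bytes(1024, v_bits))
```

running it gives the exact byte counts, which the notes round up to whole K bytes.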

example : control signals (cs) divided into groups

group          horizontal    vertical
G1 : 20 cs     20 bits       5 bits
G2 : 70 cs     70 bits       7 bits
G3 : 2 cs      2 bits        1 bit
G4 : 10 cs     10 bits       4 bits
G5 : 23 cs     23 bits       5 bits
total          125 bits      22 bits

bits saved by vertical encoding : 125 - 22 = 103

example :
32 branch conditions : 5 bits
AF/NIA/CAR : 20 bits
16 flags : 4 bits

control field :
G1 : vertical (none/one), 400 cs : 9 bits
G2 : horizontal (none/one), 6 cs : 6 bits
control field : 9 + 6 = 15 bits

micro instruction (CDR) :

| branch conditions (BC) | flags  | control field | address field / NIA / CAR / CMA |
| 5 bits                 | 4 bits | 15 bits       | 20 bits                         |

control word : 5 + 4 + 15 + 20 = 44 bits

control memory : 2^20 control words
               : 2^20 x 44 bits

example :
125 control signals : 125 bits (horizontal)

total number of instructions : 140
number of cycles per instruction : 7 cycles (T1, T2 ... T7)
total number of micro-operations : 140 x 7 = 980 control words

control memory : 980 control words

CAR/AF : log2 980 = 10 bits (rounded up)

horizontal micro instruction :

| control field | address field / NIA / CAR / CMA |
| 125 bits      | 10 bits                         |

control word : 125 + 10 = 135 bits

pre-requisites for pipelining


topic : CPU time calculation
(i) CPU time calculation means calculating the program execution time.
(ii) program execution time is calculated based on the clock.
(iii) the processor contains clock pins, and these clock pins are externally connected to the clock
generator.
(iv) all operations in the computer system are controlled by the clock.
(v) the clock generator operates at a constant frequency to generate the clock pulse (clock
signal).

program ET (execution time) is calculated based on 2 factors :

(i) cycle (T1, T2 .... Tn) : a cycle is one complete transition of the clock pulse, either from
rising edge to rising edge or from falling edge to falling edge.

(ii) cycle time : the time taken by one such transition (rising edge to rising
edge, or falling edge to falling edge) is called the cycle time.

cycle time depends on the clock frequency :

cycle time ∝ 1 / clock frequency

my computer properties :
64-bit processor (word length = 64 bits ; operations are performed on 64 bits)
8 GB RAM (AL)
1 TB hard disk (2^40 bytes)

what is a 2 GHz processor ?

its clock rate / frequency is 2 GHz.

figure : clock generator → clock pulse / signal → clk pin of the processor (CPU)

- the clock generator works at a constant frequency.
- the clock pulse comes out of the clock generator
- and is fed into the processor through its clock pins.

at a constant frequency of 1 GHz :

cycle time = 1 / clock frequency
           = 1 / 1 GHz

example : 1 GHz processor
clock frequency : 1 GHz
cycle time : 1 / 1 GHz
           : 1 / 10^9
           : 10^-9 second
cycle time : 1 nano second

example : 2 GHz processor
clock frequency : 2 GHz
cycle time : 1 / 2 GHz
           : 1 / (2 x 10^9)
           : 0.5 x 10^-9 second
cycle time : 0.5 nano second

some conversions :

2^10 = 1 K ≈ 10^3        1 milli second = 10^-3 second
2^20 = 1 M ≈ 10^6        1 micro second = 10^-6 second
2^30 = 1 G ≈ 10^9        1 nano second = 10^-9 second
2^40 = 1 T ≈ 10^12
2^50 = 1 P
2^60 = 1 E

q. a CPU has a 1 GHz processor and program P1 has 100 instructions, each taking 5
cycles. what is the program execution time?
(i) in cycles
(ii) in time

1 GHz processor
cycle time : 1 nano second

program P1 : 100 instructions (number of instructions : instruction count (IC))
each instruction : 5 cycles (CPI : cycles per instruction)
program execution time (in cycles) : 100 x 5 = 500 cycles
program execution time (in time) : 500 cycles = 500 x 1 nsec = 500 nsec

q. a CPU has a 2 GHz processor and program P1 has 100 instructions, each taking 5
cycles. what is the program execution time?
(i) in cycles
(ii) in time

2 GHz processor
cycle time : 0.5 nano second

program P1 : 100 instructions (IC)
each instruction : 5 cycles (CPI)
program execution time (in cycles) : 100 x 5 = 500 cycles
program execution time (in time) : 500 cycles = 500 x 0.5 nsec = 250 nsec

q. a CPU has a 4 GHz processor and program P1 has 100 instructions, each taking 5
cycles. what is the program execution time?
(i) in cycles
(ii) in time

4 GHz processor
cycle time : 0.25 nano second

program P1 : 100 instructions (IC)
each instruction : 5 cycles (CPI)
program execution time (in cycles) : 100 x 5 = 500 cycles
program execution time (in time) : 500 cycles = 500 x 0.25 nsec = 125 nsec

note :
- number of instructions : instruction count (IC)
- CPI : cycle per instruction

concept : CPU time calculation

CPU time means program execution time

program execution time = no. of seconds / program

= (no. of instructions / program) x (no. of cycles / instruction) x (no. of seconds / cycle)
=      instruction count (IC)     x   cycles per instruction (CPI) x       cycle time

program execution time (CPU time) = IC x CPI x cycle time
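
the formula translates directly into code. a minimal Python sketch (function names are hypothetical; times are kept in nanoseconds for readability) :

```python
def cycle_time_ns(clock_hz):
    """Cycle time in nanoseconds = 1 / clock frequency."""
    return 1e9 / clock_hz


def cpu_time_ns(instruction_count, cpi, clock_hz):
    """CPU time = IC x CPI x cycle time."""
    return instruction_count * cpi * cycle_time_ns(clock_hz)


# 100 instructions of 5 cycles each, on 1 GHz, 2 GHz and 4 GHz processors
for clock_hz in (1e9, 2e9, 4e9):
    print(clock_hz, cpu_time_ns(100, 5, clock_hz))
```

the cycle count (IC x CPI) stays at 500 in all three cases; only the cycle time changes with the clock.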

q. a CPU has a 1 GHz processor and program P1 has 100 instructions, each taking 5
cycles. what is the program execution time? (using the formula)

P1 : IC = 100 instructions, CPI = 5 cycles, cycle time = 1 nano second

program execution time : IC x CPI x cycle time
program execution time : 100 x 5 x 1 = 500 nano second

CPU time calculation

a program is a combination of data transfer, data manipulation and transfer
of control instructions. different instructions take different numbers of cycles
to complete execution. so,

program execution time : (∑ (ICi x CPIi)) x cycle time

i : type of instruction

modification : cycle time = 1 nano sec

P1 : 40 instructions of 8 cycles, 40 instructions of 6 cycles, 20 instructions of 4 cycles

program execution time : (∑ (ICi x CPIi)) x cycle time
= ((40 x 8) + (40 x 6) + (20 x 4)) x 1 nsec
= (320 + 240 + 80) x 1 nsec
= 640 x 1 nsec
= 640 nsec
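
the weighted sum is easy to verify in code. a minimal Python sketch, with a hypothetical function name, taking the instruction mix as (count, CPI) pairs :

```python
def program_time_ns(mix, cycle_time_ns):
    """mix: list of (instruction count, CPI) pairs, one per instruction type."""
    total_cycles = sum(ic * cpi for ic, cpi in mix)
    return total_cycles * cycle_time_ns


p1 = [(40, 8), (40, 6), (20, 4)]     # 40 instr of 8 cycles, 40 of 6, 20 of 4
print(program_time_ns(p1, cycle_time_ns=1))
```

for this mix the total is 320 + 240 + 80 = 640 cycles, so the program takes 640 nsec at a 1 nsec cycle time.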

q. consider a CPU with a clock frequency (rate) of 400 MHz. if the CPU has an average CPI of 6,
what is the average instruction execution time?

sol. cycle time = 1 / (400 x 10^6) second
= (1/4) x 10^-8 second
= (10/4) x 10^-9 second
= 2.5 x 10^-9 second
= 2.5 nsec

average instruction execution time : CPI x cycle time
: 6 x 2.5 nsec
: 15 nsec

q. consider a CPU that runs at an 800 MHz clock rate and is executing a program consisting of
4000 instructions. if each instruction takes 5 cycles (CPI = 5), what is the total program
execution time (CPU time)?

sol. cycle time = 1 / (800 x 10^6) second
= (1/8) x 10^-8 second
= (10/8) x 10^-9 second
= 1.25 x 10^-9 second
= 1.25 nsec

program ET : IC x CPI x cycle time
: 4000 x 5 x 1.25 nsec
: 25000 nsec (= 25 micro second)

q. what is average instruction execution time?


q. what is the MIPs rate?
q. what is total program execution time?

a. what is average instruction execution time?

total cycles : (300 x 11) + (200 x 9) + (250 x 7) + (150 x 6) + (50 x 4) = 7,950
total instructions : 300 + 200 + 250 + 150 + 50 = 950
average CPI : 7950 / 950 = 8.3684

cycle time : 1 / 1.5 GHz = (1/1.5) x 10^-9 second = 0.667 nano second

average instruction ET : 8.3684 x 0.667 = 5.58 nano second

b. what is the MIPS rate?

1 instruction ET : 5.58 x 10^-9 second
in 1 second, how many instructions are executed?
in 1 second : 1 / (5.58 x 10^-9) instruction / second
: 0.1792 x 10^9 instruction / second
: 179.2 x 10^6 instruction / second
: 179.2 MIPS

c. what is total program execution time?

total program ET : number of instructions in program x avg. instruction ET
: 950 x 5.58 nano second
: 5300 x 10^-9 second
: 5.3 micro second
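
the three answers can be reproduced in Python while keeping the cycle time unrounded. a sketch with hypothetical function names, assuming the instruction mix and the 1.5 GHz clock used in the solution above; rounding 1/1.5 early shifts the last digits slightly.

```python
def average_cpi(mix):
    """mix: list of (instruction count, CPI) pairs."""
    total_cycles = sum(ic * cpi for ic, cpi in mix)
    total_instructions = sum(ic for ic, _ in mix)
    return total_cycles / total_instructions


def mips(clock_hz, cpi):
    """Million instructions per second = clock rate / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


mix = [(300, 11), (200, 9), (250, 7), (150, 6), (50, 4)]
clock_hz = 1.5e9

cpi = average_cpi(mix)
avg_et_ns = cpi * 1e9 / clock_hz                      # average instruction ET
rate = mips(clock_hz, cpi)                            # MIPS rate
total_et_ns = sum(ic for ic, _ in mix) * avg_et_ns    # total program ET
print(round(cpi, 4), round(avg_et_ns, 2), round(rate, 1), round(total_et_ns))
```

with full precision the average instruction ET is about 5.58 nsec, the rate about 179.2 MIPS and the total about 5300 nsec.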

a. what is average instruction execution time?

average CPI : (0.40 x 4) + (0.40 x 6) + (0.20 x 2) cycles
= 1.6 + 2.4 + 0.4 = 4.4 cycles

cycle time : 1 nano second

average instruction ET : 4.4 x 1 = 4.4 nano second

b. what is the MIPS rate?

1 instruction ET : 4.4 x 10^-9 second
in 1 second, how many instructions are executed?
in 1 second : 1 / (4.4 x 10^-9) instruction / second
: (1/4.4) x 10^9 instruction / second
: (1000/4.4) x 10^6 instruction / second
: 227.27 x 10^6 instruction / second
: 227.27 MIPS

c. if the program contains 10^6 instructions then what is total program ET?

total program ET : number of instructions in program x avg. instruction ET
: 10^6 x 4.4 x 10^-9 second
: 4.4 x 10^-3 second
: 4.4 milli second

note :
super computer : performance is measured in FLOPS
(floating point operations per second)

concept : performance gain (speed up factor)

performance gain (speed up factor) = performance of new / performance of old
                                   = (1/ETnew) / (1/ETold)
                                   = ETold / ETnew

example : the same work is done by
- aryan in 10 hours
- vipul in 4 hours
so vipul's performance is better (he is faster).

performance is inversely proportional to execution time : performance ∝ 1/ET

old design :

average instruction ET : (∑ (fraction i x CPIi)) x cycle time
: [0.40 x 9 + 0.40 x 7 + 0.20 x 3] x 2.3 nsec
: [3.6 + 2.8 + 0.6] x 2.3 nsec
average ETold : 16.1 nsec

new design : CPI = 1, cycle time increased by 40%

new cycle time : 2.3 + 40% of 2.3
: 2.3 + 0.92 = 3.22 nsec

average instruction ET : [0.40 x 1 + 0.40 x 1 + 0.20 x 1] x 3.22 nsec
: [0.4 + 0.4 + 0.2] x 3.22 nsec
average ETnew : 3.22 nsec

performance gain (speed up factor) = ETold / ETnew = 16.1 nsec / 3.22 nsec

s = 5
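
the old/new comparison can be checked with a short Python sketch (the function name is hypothetical; the mix fractions, CPIs and the 2.3 nsec cycle time are the ones used in the worked solution) :

```python
def avg_instruction_time(mix, cycle_time):
    """mix: list of (fraction of instructions, CPI) pairs."""
    return sum(fraction * cpi for fraction, cpi in mix) * cycle_time


old_cycle = 2.3                      # nsec
new_cycle = old_cycle * 1.40         # cycle time increased by 40%

et_old = avg_instruction_time([(0.4, 9), (0.4, 7), (0.2, 3)], old_cycle)
et_new = avg_instruction_time([(0.4, 1), (0.4, 1), (0.2, 1)], new_cycle)
speedup = et_old / et_new            # speed up = ETold / ETnew
print(round(et_old, 2), round(et_new, 2), round(speedup, 2))
```

even though the new cycle is 40% longer, dropping the CPI from an average of 7 to 1 still makes the new design 5 times faster.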



chapter - 5 (pipelining)
Monday, June 24, 2024 5:24 PM

instruction pipelining
non-pipelining : a new input is accepted only after the previously accepted input has
appeared as output at the other end. in non-pipelining, execution is non-overlapping.

cycle           : 1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
s4 (output end) :          i1          i2          i3          i4
s3              :       i1          i2          i3          i4
s2              :    i1          i2          i3          i4
s1 (input end)  : i1          i2          i3          i4

s : stage ; total 16 cycles
everything is controlled by a common clock pulse.

- at clock 1, i1 executes in stage 1
- at clock 2, i1 executes in stage 2
- at clock 3, i1 executes in stage 3
- at clock 4, i1 executes in stage 4
.
.
- at clock 15, i4 executes in stage 3
- at clock 16, i4 executes in stage 4

pipelining : pipelining is a mechanism used to improve the performance of the system, in
which tasks (instructions) are executed in an overlapping (parallel) manner.

- pipelining is a decomposition technique : the problem is divided into sub problems, the
sub problems are assigned to the pipes, and the pipes are operated under the same clock.

cycle           : 1  2  3  4  5  6  7
s4 (output end) :          i1 i2 i3 i4
s3              :       i1 i2 i3 i4
s2              :    i1 i2 i3 i4
s1 (input end)  : i1 i2 i3 i4

s : stage ; total 7 cycles (i1 ends at cycle 4)
everything is controlled by a common clock pulse.

- at clock 1, i1 executes in stage 1
- at clock 2, i1 executes in stage 2 and i2 in stage 1
- at clock 3, i1 executes in stage 3, i2 in stage 2 and i3 in stage 1
- at clock 4, i1 executes in stage 4, i2 in stage 3, i3 in stage 2 and i4 in stage 1
- at clock 5, i2 executes in stage 4, i3 in stage 3 and i4 in stage 2
- at clock 6, i3 executes in stage 4 and i4 in stage 3
- at clock 7, i4 executes in stage 4

pipelining means accepting a new input at one end before the previously accepted input appears as
output at the other end. this means new inputs are executed along with old inputs in an overlapping
manner (in a pipeline).

instruction : i1 i2 i3 i4
stage : s1 s2 s3 s4

with pipelining the work finishes in 7 cycles; without pipelining it takes 16 cycles,
because in non-pipelining a new input enters only after the previously accepted input
has appeared as output.
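
the two space-time diagrams follow one simple rule, sketched here in Python. the function names are hypothetical, and the sketch assumes every stage takes exactly one cycle, as in the 4-stage, 4-instruction example.

```python
def stage_of(instruction, cycle, k, pipelined):
    """Stage (1..k) that instruction number `instruction` occupies in `cycle`, or None."""
    # pipelined: instruction i enters stage 1 in cycle i
    # non-pipelined: instruction i enters only after the previous one has left stage k
    start = instruction if pipelined else (instruction - 1) * k + 1
    stage = cycle - start + 1
    return stage if 1 <= stage <= k else None


def diagram(n, k, pipelined):
    """Rows of the space-time diagram, output stage first."""
    last_cycle = k + (n - 1) if pipelined else n * k
    rows = []
    for stage in range(k, 0, -1):
        cells = []
        for cycle in range(1, last_cycle + 1):
            busy = [i for i in range(1, n + 1)
                    if stage_of(i, cycle, k, pipelined) == stage]
            cells.append(f"i{busy[0]}" if busy else "  ")
        rows.append(f"s{stage} : " + " ".join(cells))
    return rows


for row in diagram(n=4, k=4, pipelined=True):
    print(row)
```

calling `diagram(4, 4, pipelined=False)` prints the 16-cycle non-overlapping version instead.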

example : pipeline
boxes box 1, box 2, box 3, box 4 wait at the input end and enter one after another (box 4 first).

cycle   : 1      2      3      4      ...
stage 4 :                      box 4  ...
stage 3 :               box 4  box 3  ...
stage 2 :        box 4  box 3  box 2  ...
stage 1 : box 4  box 3  box 2  box 1  ...

new inputs are accepted at one end before previously accepted inputs
appear as output at the other end.

example : non-pipeline
boxes box 1, box 2, box 3, box 4 wait at the input end; box 4 enters first.

cycle   : 1      2      3      4
stage 4 :                      box 4
stage 3 :               box 4
stage 2 :        box 4
stage 1 : box 4

only after box 4 reaches the output end does box 3 enter :

cycle   : 5      6      7      8
stage 4 :                      box 3
stage 3 :               box 3
stage 2 :        box 3
stage 1 : box 3

a new input is accepted only after completion of the old input.

everything is controlled by a common clock pulse.

stage = segment

every pipeline has 2 ends, an input end and an output end, and between these
ends multiple pipes are interconnected to make the pipeline function.
- these pipes are called stages (or) segments.
- between the stages, buffers are used to store the intermediate results.
- these buffers are called pipeline registers (or) interface registers (or) buffers (or) latches.
- all the stages, along with the buffers, are controlled by a common clock.

why buffers : the result of S1 has to move into S2, but S2 may not be free yet. so the
result of S1 is placed in the buffer between the stages; this frees S1, and the next
instruction (I2) can enter it.

Execution time calculation / performance evaluation

(i) execution time calculation in pipelining :

ETpipeline = k x tp + (n-1) x tp

ETpipeline = [k + (n-1)] x tp

k : no. of stages (segments)
n : no. of instructions
tp : each stage delay in the pipeline

(the first instruction needs k x tp to come out of the pipeline; each of the remaining
n-1 instructions completes one tp after the previous one.)

for a 4-stage pipeline executing 4 instructions (i1 to i4) :

k : no. of stages (segments) = 4
n : no. of instructions = 4
tp : each stage delay in pipeline (assume) = 1 cycle

ETpipeline = [4 + (4-1)] x 1
ETpipeline = [4 + 3] x 1
ETpipeline = 7 cycles

(ii) execution time in non-pipelining :

in non-pipelining, if 1 instruction takes tn time to execute, then the total time taken to
execute n instructions is

ETnon-pipeline = n x tn

n : no. of instructions
tn : each instruction execution time in non-pipeline

for 4 instructions (i1 to i4), each passing through 4 stages of 1 cycle :

n : no. of instructions = 4
tn : each instruction execution time in non-pipeline = 4 cycles

ETnon-pipeline = 4 x 4
ETnon-pipeline = 16 cycles
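
both execution time formulas fit in two one-line functions. a minimal Python sketch (the function names are hypothetical) :

```python
def et_pipeline(k, n, tp):
    """ET of n instructions in a k-stage pipeline with stage delay tp."""
    return (k + (n - 1)) * tp


def et_non_pipeline(n, tn):
    """ET of n instructions without pipelining, tn per instruction."""
    return n * tn


print(et_pipeline(k=4, n=4, tp=1))      # cycles with pipelining
print(et_non_pipeline(n=4, tn=4))       # cycles without pipelining
```

for k = 4, n = 4 and 1-cycle stages this gives 7 cycles against 16 cycles.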

performance gain of pipeline over non-pipeline :

performance gain (speed up factor) (s) = performance of pipeline / performance of non-pipeline

performance ∝ 1 / ET

s = (1 / ETpipeline) / (1 / ETnon-pipeline)

s = ETnon-pipeline / ETpipeline

s = (n x tn) / ([k + (n-1)] x tp)

when a large number of instructions is executed (or) the n value is not given (or) in the ideal case :

s = tn / tp

tn : each instruction execution time in non-pipeline
tp : each stage delay in pipeline

example : s = (n x tn) / ([k + (n-1)] x tp)

k = 4, n = 4     : s = (4 x tn) / ([4 + (4-1)] x tp) = 4 tn / 7 tp
k = 4, n = 100   : s = (100 x tn) / ([4 + (100-1)] x tp) = 100 tn / 103 tp
k = 4, n = 10000 : s = (10000 x tn) / ([4 + (10000-1)] x tp) = 10000 tn / 10003 tp

as n grows, s approaches tn / tp.
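
the trend in the three cases is easier to see with numbers. a Python sketch under an assumed balanced pipeline (tp = 1, tn = k x tp = 4; these values are chosen for illustration and are not given in the example) :

```python
def speedup(n, k, tn, tp):
    """s = (n x tn) / ([k + (n - 1)] x tp)."""
    return (n * tn) / ((k + (n - 1)) * tp)


k, tp = 4, 1
tn = k * tp                         # assumption: perfectly balanced stages
for n in (4, 100, 10000):
    print(n, round(speedup(n, k, tn, tp), 4))
```

the printed values climb towards tn / tp = 4 but never exceed it, which is why tn / tp is used when n is large or not given.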

execution time for one instruction :

figure : 4 stages with stage delays 2ns 2ns 2ns 2ns
perfectly balanced (uniform delay) pipeline : every stage has the same delay

(i) ETpipeline for 1 instruction
tp : each stage delay in pipeline = 2 nsec
ETpipeline = [k + (n-1)] x tp
ETpipeline = [4 + (1-1)] x 2
ETpipeline = 8 nsec

(ii) ETnon-pipeline for 1 instruction
tn : each instruction execution time = 2 + 2 + 2 + 2 = 8 nsec
ETnon-pipeline = n x tn
ETnon-pipeline = 1 x 8
ETnon-pipeline = 8 nsec

ideal case (or) when perfectly balanced :

when all stages are perfectly balanced (uniform delay), 1 task ET in
non-pipelining is

tn = k x tp (only when perfectly balanced)

k = 4
tp = 2
tn = 4 x 2 = 8 nsec

performance gain (speed up factor) (s) = tn / tp
                                       = (k x tp) / tp
                                       = k (no. of stages)

in the ideal case, when the pipeline is perfectly balanced, the maximum speed up factor is equal to the
number of stages in the pipeline.

types of pipeline :

(i) uniform delay pipeline

(ii) non-uniform delay pipeline

(i) uniform delay pipeline (perfectly balanced) : in a uniform delay pipeline each stage
takes the same amount of time (delay) to complete its assigned task.

figure : 4 stages (s) with stage delays 2ns 2ns 2ns 2ns and buffer delays 1ns 1ns 1ns 1ns

(i) ETpipeline for 1 instruction (buffer delay ignored)
k = 4
n = 1
tp : each stage delay in pipeline = 2 nsec
ETpipeline = [k + (n-1)] x tp
ETpipeline = [4 + (1-1)] x 2
ETpipeline = 8 nsec

(ii) ETnon-pipeline for 1 instruction
n = 1
tn : each instruction execution time = 2 + 2 + 2 + 2 = 8 nsec
ETnon-pipeline = n x tn
ETnon-pipeline = 1 x 8
ETnon-pipeline = 8 nsec

in uniform delay :
1 task ET in pipeline = 1 task ET in non-pipeline

if buffer delay is given or included then :

tp : stage delay + buffer delay

tp : 2 + 1 = 3 nsec

tn : k x stage delay (only when perfectly balanced ; buffer delay is not included in non-pipeline)

(ii) non-uniform delay pipeline : in a non-uniform delay pipeline each stage takes a
different amount of time to complete its assigned task.

tp : max (stage delay)

and if buffer delay is given then :

tp : max (stage delay + buffer delay)

figure : 4 stages (s) with stage delays 2ns 4ns 8ns 2ns and buffer delays 1ns 1ns 1ns 1ns

(i) ETpipeline for 1 instruction
k = 4
n = 1
tp : max (stage delay)
tp : max (2, 4, 8, 2)
tp : 8 nsec
ETpipeline = [k + (n-1)] x tp
ETpipeline = [4 + (1-1)] x 8
ETpipeline = 32 nsec

(ii) ETnon-pipeline for 1 instruction
n = 1
tn : each instruction execution time = 2 + 4 + 8 + 2 = 16 nsec
ETnon-pipeline = n x tn
ETnon-pipeline = 1 x 16
ETnon-pipeline = 16 nsec

q. if n = 1000 then ET in non pipeline and pipeline for previous data

(i) ETpipeline for 1000 instructions
k = 4
n = 1000
tp : max (2, 4, 8, 2) = 8 nsec
ETpipeline = [4 + (1000-1)] x 8
ETpipeline = 8024 nsec

(ii) ETnon-pipeline for 1000 instructions
n = 1000
tn : 2 + 4 + 8 + 2 = 16 nsec
ETnon-pipeline = 1000 x 16
ETnon-pipeline = 16000 nsec

if buffer delay is given or included then

tp : max (stage delay + buffer delay)

tp : max (2+1, 4+1, 8+1, 2+1)

tp : 9 nsec
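
the non-uniform case only changes how tp is picked. a Python sketch using the stage delays 2, 4, 8, 2 nsec and the 1 nsec buffer delay from the example (function names are hypothetical) :

```python
def pipeline_et(stage_delays, n, buffer_delay=0):
    """ET = [k + (n - 1)] x tp, with tp = max(stage delay + buffer delay)."""
    k = len(stage_delays)
    tp = max(delay + buffer_delay for delay in stage_delays)
    return (k + (n - 1)) * tp


def non_pipeline_et(stage_delays, n):
    """ET = n x tn, with tn = sum of stage delays (no buffers in non-pipeline)."""
    return n * sum(stage_delays)


delays = [2, 4, 8, 2]                                   # nsec
print(pipeline_et(delays, 1), non_pipeline_et(delays, 1))
print(pipeline_et(delays, 1000), non_pipeline_et(delays, 1000))
print(pipeline_et(delays, 1000, buffer_delay=1))        # tp becomes 9 nsec
```

for a single instruction the pipeline is slower (32 against 16 nsec), but for 1000 instructions it is about twice as fast, even after the buffer delay is added.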

when the pipeline is perfectly balanced, one task takes the same time with and without
pipelining; when the delays are non-uniform, the times differ.

important points :

(i) when the pipeline is perfectly balanced (uniform delay), 1 task ET in pipelining is the same as 1
task ET in non-pipelining.

(ii) when the pipeline is not perfectly balanced (non-uniform delay), 1 task ET in pipeline is greater
than 1 task ET in non-pipeline.

T1 ≥ T2
T1 : 1 task execution time in pipeline
T2 : 1 task execution time in non-pipeline

(iii) but as the number of tasks (instructions) increases, the pipeline performs better than
non-pipeline, for both non-uniform delay and uniform delay pipelines.

(iv) buffer delay is included only in pipeline; in non-pipeline we do not store the intermediate
results, so buffer delay is not included there.

k = 4

tp : max (stage delay)
tp = max (10, 15, 20, 30) = 30 nsec

tn : each instruction execution time
tn = 10 + 15 + 20 + 30 = 75 nsec

s = tn / tp
s = 75 / 30 = 2.5

η = s / k (η, eta : efficiency)
η = 2.5 / 4 = 0.625 = 62.5%

k=4

tp : max (stage delay + buffer delay)


tp = max(8+2, 11+2, 20+2, 2+2) = 22nsec

tn : each instruction execution time


tn = 8 + 11 +20 +2 = 41 nsec

s = tn / tp

s = 41 / 22 = 1.86



k=4

tp : max (stage delay + buffer delay)


tp = max(150+5, 120+5, 160+5, 140+5) = 165nsec

n : 1000

ETpipeline = [k+(n-1)] tp
ETpipeline = [4+(1000-1)] x 165 nsec
ETpipeline = 1003 x 165 x 10^-9 sec
ETpipeline ≈ 165.5 x 10^-6 sec

k=5

tp : max (stage delay + buffer delay)


tp = max(150+5, 120+5, 160+5, 140+5) = 165nsec

n : 100

ETpipeline = [k+(n-1)] tp
ETpipeline = [5+(100-1)] x 165
ETpipeline = 104 x 165nsec
ETpipeline = 17160 nsec

k=3

tp : max (stage delay)


tp = max(10,20,14) = 20nsec

n : 100

ETpipeline = [k+(n-1)] tp
ETpipeline = [3+(100-1)] x 20
ETpipeline = 102 x 20
ETpipeline = 2040 nsec



concept - throughput :

in operating system : number of processes completed per unit of time

in computer network : throughput = efficiency x bandwidth

in general : throughput means rate of output (the rate at which you are getting output)

throughput : number of tasks completed per unit of time

for n instructions :
ETpipeline = [k+(n-1)] tp time

means in [k+(n-1)] tp time, n instructions (tasks) are executed.

so, in 1 unit of time : n / ([k+(n-1)] tp) instructions

throughput = n / ([k+(n-1)] tp)

when number of instructions is large (or) not given (or) in ideal case:

throughput = 1 / tp
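A small Python sketch of the throughput idea, assuming throughput = n / ([k+(n-1)] tp). The function name and the sample values (k = 4, tp = 8 nsec) are only chosen for illustration. It shows the throughput approaching the ideal 1/tp as n grows.

```python
def throughput(k, n, tp):
    """Instructions completed per unit of time for a k-stage pipeline."""
    return n / ((k + (n - 1)) * tp)


k, tp = 4, 8                      # 4 stages, tp = 8 nsec (illustrative values)
for n in (1, 10, 1000, 1_000_000):
    print(n, throughput(k, n, tp))

print("ideal:", 1 / tp)           # 0.125 instructions per nsec
```

For every finite n the value stays below 1/tp, because the first instruction needs k cycles before any output appears.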

summary:

ETpipeline = [k+(n-1)] tp
k : no of stages (segments)
n : no of instructions
tp : each stage delay in pipeline

ETnon-pipeline = n x tn
n : no of instructions
tn : each instruction execution time in non-pipeline

performance gain (speed up factor) (s) = performance of pipeline / performance of non-pipeline
s = ETnon-pipeline / ETpipeline
s = (n x tn) / ([k+(n-1)] tp)

when number of instructions is large (or) not given (or) in ideal case:
s = tn / tp

throughput = n / ([k+(n-1)] tp)

when number of instructions is large (or) not given (or) in ideal case:
throughput = 1 / tp

η = s / k
η (eta) : efficiency
s : speed up factor
k : number of stages

in non-uniform delay:
tp : max (stage delay + buffer delay)

when perfectly balanced / ideal case:
tn = k x tp
tp = tn / k
s = k

figure : two stage instruction pipeline (instruction -> instruction fetch -> instruction execute)

s = 20

η = s / k (η, eta : efficiency)

80% = 20 / k

k = (20 x 100) / 80 = 25


k=4

tp : max (stage delay + buffer delay)


tp = max(20, 80, 50, 10) = 80nsec

tn : each instruction execution time


tn = 20 + 80 +50 +10 = 160 nsec

in new design :
largest stage is split into 2 equal stage delay

tp = max(20, 40, 40, 50, 10) = 50nsec

s = performance of new / performance of old

s = ETold / ETnew

s = 80 / 50 = 1.6

tp = 50nsec

frequency = 1 / time

frequency = 1 / (50 x 10^-9) Hz

frequency = (1/50) x 10^9 Hz

frequency = (1000/50) x 10^6 Hz = 20 MHz



k=5

old design
tp : max (stage delay)
tp = max(900, 600, 550, 450, 400) = 900nsec
1 instruction takes : 900 x 10^-9 sec
in one second : (1/900) x 10^9 instructions
throughput : 1/900 instruction per nsec

new design
tp : max (stage delay)
tp = max(440, 460, 600, 550, 450, 400) = 600nsec
throughput : 1/600 instruction per nsec

% of increment in throughput = (new - old) / old x 100

= (1/600 - 1/900) / (1/900) x 100

= (1/6 - 1/9) / (1/9) x 100

= ((9-6) / 54) x (9/1) x 100 = 50%
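The same percentage can be checked in Python. This is only a sketch: the function name is made up, and it assumes the ideal-case throughput 1/tp for both designs.

```python
def throughput_gain_percent(tp_old, tp_new):
    """Percentage increase in ideal throughput (1/tp) from the old to the new design."""
    old = 1 / tp_old
    new = 1 / tp_new
    return (new - old) / old * 100


# old design: tp = 900 nsec, new design: tp = 600 nsec
print(round(throughput_gain_percent(900, 600), 2))   # 50.0
```

Since throughput is 1/tp, the gain simplifies to (tp_old / tp_new - 1) x 100, which is 900/600 - 1 = 0.5 here.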

design D1

k = 5
tp : max (stage delay)
tp = max(3, 2, 4, 2, 3) = 4nsec
ETpipeline = [k+(n-1)] tp
ETpipeline = [5+(100-1)] x 4
ETpipeline = 104 x 4
ETpipeline = 416 nsec

design D2

k = 8
tp : 2nsec
ETpipeline = [k+(n-1)] tp
ETpipeline = [8+(100-1)] x 2
ETpipeline = 107 x 2
ETpipeline = 214 nsec

time saved = 416 - 214 = 202nsec

- nonpipelined processor
- operating at 100MHz
so cycle time : 1 / (100 x 10^6) = 1 / 10^8 = 10^-8 sec

= 10 x 10^-9 sec

cycle time : 10nsec

k=5
tp : max (stage delay + buffer delay)
tp = max(2+0.5, 1.5+0.5, 2+0.5, 1.5+0.5, 2.5+0.5) = 3nsec

s = tn / tp

s = 10 / 3 = 3.33

q.(i) why is cycles per instruction (CPI) = 1 in a pipeline? what is the meaning of CPI = 1?

q.(ii) how to design (construct) the pipeline?

q.(iii) why is a clock pulse required?

q.(iv) how to set this CPI in uniform delay? and how to set this CPI in non-uniform delay?

q.(i) why cycle per instruction (CPI) = 1 in pipeline.

in the pipeline average CPI = 1

ETpipeline for n instructions = [k+(n-1)] cycles

average for 1 instruction = [k+(n-1)] / n cycles

(k-1) is negligible : compared with n (for example 1000 instructions), the value of k (3, 4 or 5) is very small, so [k+(n-1)] / n ≈ n / n

Cycles Per Instruction (CPI) ≈ 1
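A quick numeric check of this argument in Python (the function name is only illustrative): the average cycles per instruction, [k+(n-1)] / n, moves towards 1 as n grows.

```python
def average_cpi(k, n):
    """Average cycles per instruction for n instructions in a k-stage pipeline."""
    return (k + (n - 1)) / n


for n in (1, 10, 1000, 1_000_000):
    print(n, average_cpi(5, n))
```

For n = 1 the value is k itself (5 cycles here), and for n = 1000 it is already 1.004, so the extra (k-1) cycles hardly matter.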



q.(ii) how design (construct) the pipeline?

figure : pipeline stages, each stage built from a circuit and logic gates

RISC : 5 stages
if we want to construct an N stage pipeline then the entire CPU is divided into 'N' functional units that are independent of each other.

independent functional unit : it means that while one functional unit performs its task, at the same time the other functional units perform other tasks (operations).
(functional unit : adder, subtractor, logic gate, hardware etc)

figure : 4 stages, each a circuit and logic gate with delay 20ns, each followed by a register with delay 2nsec

ex. instruction pipeline :

WB
EX
ID       i1
IF   i1  i2
     1   2   3   4

i1 is in the decoding stage and i2 is in the fetch stage in the same clock cycle number 2.

note : the characteristic of the pipeline is that in every new cycle a new input must be inserted into the pipeline.

figure : clock cycle / pulse / signal (cc1, cc2, cc3)

q.(iii) why is a clock pulse required?

- because only when we enable the clock will the operation be performed (data will move from one register to another register (or) any task)

- to provide the synchronization between the stages.

q.(iv) how to set this CPI in uniform delay and in non-uniform delay?

(i) uniform delay :

figure : 4 stages, each a circuit and logic gate with delay 20ns, each followed by a register with delay 2nsec

time : 22 nsec
frequency : 1 / time Hz = 1 / 22nsec Hz

cycle time : 22nsec

every 22 nsec a new input is inserted in the pipeline

figure : clock cycle / pulse / signal (cc1 = 22 nsec, cc2 = 22 nsec)

in uniform delay :

WB               i1  i2  i3
EX           i1  i2  i3
ID       i1  i2  i3
IF   i1  i2  i3
     1   2   3   4   5   6

at
clock cycle (CC) 4 : i1 out
clock cycle (CC) 5 : i2 out
clock cycle (CC) 6 : i3 out
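The space-time diagram for a uniform delay pipeline can also be generated with a few lines of Python. This is only a sketch under the assumption that every stage takes exactly 1 cycle and there are no stalls; the function name and the "--" marker for an idle stage are made up for the example.

```python
def space_time_rows(stage_names, n):
    """Return (stage, cells) rows; cells[c] is the instruction in that stage in cycle c+1."""
    k = len(stage_names)
    total_cycles = k + (n - 1)
    rows = []
    for s, name in enumerate(stage_names):
        cells = []
        for cycle in range(1, total_cycles + 1):
            i = cycle - s          # instruction number that occupies stage s in this cycle
            cells.append(f"i{i}" if 1 <= i <= n else "--")
        rows.append((name, cells))
    return rows


# print the last stage on top, as in the diagram in the notes
for name, cells in reversed(space_time_rows(["IF", "ID", "EX", "WB"], 3)):
    print(f"{name:>2} " + " ".join(f"{c:>3}" for c in cells))
```

With 4 stages and 3 instructions the diagram has 4 + (3-1) = 6 columns, and WB holds i1, i2, i3 in cycles 4, 5 and 6.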

first instruction takes : k cycles = 4 x 22 nsec

remaining instructions take : (n-1) x 22 nsec, because after every 1 cycle we are getting an output

ETpipeline = [k+(n-1)] cycle

actually : k x cycle (for the very first instruction) + (n-1) x cycle (for the remaining (n-1) instructions)

= 4 x 22 + (n-1) x 22

the first instruction takes 4 cycles; after that, for the remaining (n-1) instructions, we are getting one output on every cycle

(ii) non-uniform delay :

figure : 4 stages (circuit and logic gate) with delays 20ns, 25ns, 28ns, 26ns, each followed by a register with delay 2nsec

tp : max (stage delay + buffer delay)

tp : max (20+2, 25+2, 28+2, 26+2) = 30nsec
cycle time : 30nsec (CPI = 1 cycle of 30nsec)

q. what if we take the minimum, i.e. (20+2) = 22 nsec?

time : 22 nsec
frequency : 1 / time Hz = 1 / 22nsec Hz

cycle time : 22nsec

Stage 1 will complete its work in 22 nanoseconds, and data movement from stage 1 to
stage 2 will occur within this period. However, for the remaining stages, tasks will not
be completed within this same 22 nanosecond clock cycle. Therefore, the maximum
stage delay is considered for the proper functioning of the pipeline.

maximum :

tp : 30nsec

figure : clock cycle / pulse / signal (cc1 = 30 nsec, cc2 = 30 nsec)

q. if we take maximum : 30nsec then what does synchronization mean?

In stage 1, the task will finish in 22 nanoseconds, but the clock cycle is set to 30
nanoseconds. This means that every 30 nanoseconds, the output from each stage
will become available. Therefore, after one complete cycle of 30 nanoseconds, the
output from stage 1 will be passed on to stage 2, and this sequence will continue
for each subsequent stage.

so synchronization!

q. what if register delay is also different?

figure : 4 stages (circuit and logic gate) with delays 20ns, 25ns, 28ns, 26ns, followed by registers with delays 4nsec, 6nsec, 1nsec, 4nsec

tp : max (stage delay + buffer delay)

tp : max (20+4, 25+6, 28+1, 26+4) = 31nsec
cycle time : 31nsec (CPI = 1 cycle of 31nsec)
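A minimal Python sketch of how the cycle time is chosen when each stage has its own register (buffer) delay. The function name is only illustrative; the rule is tp = max (stage delay + buffer delay).

```python
def cycle_time(stage_delays, register_delays):
    """tp = max over all stages of (stage delay + that stage's register delay)."""
    return max(s + r for s, r in zip(stage_delays, register_delays))


stages = [20, 25, 28, 26]                     # nsec
print(cycle_time(stages, [2, 2, 2, 2]))       # 30
print(cycle_time(stages, [4, 6, 1, 4]))       # 31
```

In the second case the slowest stage by itself (28 nsec) does not decide the cycle time; the 25 nsec stage with its 6 nsec register does.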



nonpipelined processor
- operating at 2.5 GHz
so cycle time : 1 / (2.5 x 10^9) sec = (1/2.5) x 10^-9 sec

= 0.4 nsec

cycle time : 0.4 nsec

k = 4 (each instruction takes 4 cycles)
tn = 4 x 0.4 = 1.6 nsec

pipelined processor
- operating at 2 GHz
so cycle time : 1 / (2 x 10^9) sec = (1/2) x 10^-9 sec

= 0.5 nsec

cycle time : 0.5 nsec

k = 5
tp = 0.5 nsec

s = tn / tp

s = 1.6 / 0.5 = 3.2

q. relation between CPI and tp

tp : max (stage delay + buffer delay)

tp : max (5+1, 6+1, 11+1, 8+1)
tp : 12nsec

and CPI = 1 cycle

so we will set 1 cycle = 12nsec

whatever tp comes out, we set it as the cycle time.


ETpipeline = no of instruction x cycle time x CPI

same number of instruction in both processor P1 and P2

ETp1 = cycle time x CPI

20%more CPI in P2
ETp2 = cycle time x 1.2 CPI

and 25%less time


ETp2 = 0.75 x ETp1

clock frequency : 1GHZ


cycle time1 : 1nsec

0.75 x ETp1 = 1.2 CPI x cycle time2

0.75 x CPI x 1nsec = 1.2 CPI x cycle time2
0.75 CPI = 1.2 CPI x cycle time2
cycle time2 = 0.75 / 1.2 = 0.625 nsec

frequency = 1 / cycle time2

frequency = 1 / (0.625 x 10^-9) Hz

frequency = 1.6 GHz
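The same steps as a Python sketch. The function and parameter names are assumptions made for the example; it uses ET = no of instructions x CPI x cycle time with the same number of instructions on both processors.

```python
def required_frequency_ghz(f1_ghz, cpi_ratio, time_ratio):
    """Clock of P2 so that ET_p2 = time_ratio * ET_p1 when CPI_p2 = cpi_ratio * CPI_p1."""
    cycle1_ns = 1 / f1_ghz                           # 1 GHz -> 1 nsec
    # time_ratio * CPI * cycle1 = cpi_ratio * CPI * cycle2
    cycle2_ns = cycle1_ns * time_ratio / cpi_ratio
    return 1 / cycle2_ns


# P1 runs at 1 GHz, P2 has 20% more CPI and must take 25% less time
print(round(required_frequency_ghz(1.0, 1.2, 0.75), 3))   # 1.6
```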



s1 s2 s3 s4
for i = 1 i1 1 3 4 6
i2 3 4 6 7
i3 4 5 8 9
i4 6 7 10 11

for i = 2 i1 7 9 11 13
i2 9 10 13 14
i3 10 11 15 16
i4 12 13 17 18

s4 i1 i1 i2 i3 i4 i1 i1
s3 i1 i2 i2 i3 i3 i4 i4 i1
s2 i1 i1 i2 i3 i4 i1 i1
s1 i1 i2 i2 i3 i4 i4 i1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

completion cycle of each instruction in each stage :

                s1   s2   s3   s4
for i = 1  i1    2    3    4    5
           i2    3    6    8   10
           i3    5    7    9   13
           i4    6    9   11   15

for i = 2  i1    8   10   12   16
           i2    9   13   15   18
           i3   11   14   16   21
           i4   14   16   18   23
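A table like this can be produced by a small simulation. The sketch below is an assumption-heavy illustration: the per-stage cycle counts in `durations` are not given in the notes and are inferred here from the first rows of the completion table, and the model is the simplest one (an instruction enters a stage as soon as it has left the previous stage and the stage is free). With this simple model an intermediate entry can differ from the hand-made table, but the final completion cycle comes out as 23.

```python
def completion_table(durations, iterations):
    """durations[i][s] = cycles instruction i needs in stage s (assumed input).

    Returns one row per executed instruction with its finish cycle in each stage.
    """
    k = len(durations[0])
    stage_free = [0] * k            # cycle at which each stage becomes free
    table = []
    for _ in range(iterations):
        for row in durations:
            finish_prev = 0         # when this instruction left the previous stage
            finishes = []
            for s, d in enumerate(row):
                start = max(finish_prev, stage_free[s])
                finish_prev = start + d
                stage_free[s] = finish_prev
                finishes.append(finish_prev)
            table.append(finishes)
    return table


# assumed stage times (cycles) for i1..i4 in s1..s4, inferred from the table
durations = [[2, 1, 1, 1],
             [1, 3, 2, 2],
             [2, 1, 1, 3],
             [1, 2, 2, 2]]

table = completion_table(durations, iterations=2)
for finishes in table:
    print(finishes)
print("total cycles:", table[-1][-1])   # 23
```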

each instruction takes a different amount of delay in different stages

ETpipeline = 11 (i=1) one round

ETnon-pipeline = 23

speed up factor = ETnon-pipeline / ETpipeline = 23 / 11 = 2.09


Wednesday, July 10, 2024 10:40 PM

[weightage : 3-4marks]

cache memory

register      cache memory      main memory       secondary memory
fastest       fast              slow              slowest
flip-flop     SRAM/flip-flop    DRAM/capacitor

q. why cache memory?

- it is much faster than main memory

registers are even faster, but register size is small.

memory hierarchy : the hierarchy design organizes the memory supported by the system into 4 levels to minimize the access time.

figure : memory hierarchy (CPU generated request, locality of reference)

CPU (register) <- words -> cache <- blocks (hardware mapping) -> main memory <- pages (paging) -> secondary / logical / virtual memory

if the data is found in the cache : cache hit
if not then cache miss

if the data is found in the main memory : main memory hit
if not then main memory miss (page fault)

secondary memory hit ratio : 1

q. what is hit ratio?

hit ratio = number of hits / total number of accesses

if cache hit ratio is 80% that means 80% of the references are found in the cache.

topic : memory

(i) a CPU generated request initially refers to the cache

(ii) if the reference (respective data) is found in the cache then it is called a cache hit (the operation is called a hit), and the respective data is given from the cache to the CPU in the form of words.

(iii) if the reference is not found in the cache then it is called a cache miss, and the request is forwarded to main memory.

(iv) if the reference is found in the main memory then it is called a main memory hit (page hit), and the respective data is given from main memory to cache in the form of blocks and from cache to CPU in the form of words.

(v) if the reference is not found in the main memory then it is called a main memory miss (page fault), and the reference is forwarded to secondary memory.

(vi) secondary memory is the last level of memory, in which the hit ratio is always '1', so the respective data is transferred from secondary memory to main memory in the form of pages, from main memory to cache memory in the form of blocks, then from cache to CPU in the form of words.

mapping : the process of transferring the data from main memory to cache memory is called mapping.

h : cache hit ratio
tc : cache access time
tm : main memory access time

figure : CPU (register) <- words -> cache <- blocks -> main memory

access time : time needed by the processor to read/write memory (ROM/RAM)

average memory access time : [Tavg]

Tavg = H x (time taken by memory when there is a hit) + (1-H) x (time taken by memory when there is a miss)

hit + miss = 1

hit ratio = number of hits / total number of accesses

total CPU request : 100

hit : 90 times, time taken : 20ns
miss : 10 times, time taken : 150ns

hit ratio : number of hits / total number of accesses = 90 / 100 = 0.9

miss ratio : (1-H) = 1-0.9 = 0.1

Tavg = H x (time taken by memory when there is a hit) + (1-H) x (time taken by memory when there is a miss)
Tavg = 0.9 x 20 + 0.1 x 150
Tavg = 18 + 15
Tavg = 33ns

(or)

total time : 90 x 20 + 10 x 150


total time : 1800 + 1500
total time : 3300nsec
Tavg = 3300 / 100 = 33nsec
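Both ways of getting Tavg can be compared in a few lines of Python. The function names are only illustrative; the first uses the hit ratio formula and the second divides the total time by the number of requests.

```python
def tavg_from_ratio(hits, misses, t_hit, t_miss):
    """Tavg = H x t_hit + (1 - H) x t_miss."""
    h = hits / (hits + misses)
    return h * t_hit + (1 - h) * t_miss


def tavg_from_total(hits, misses, t_hit, t_miss):
    """Tavg = total time of all requests / number of requests."""
    return (hits * t_hit + misses * t_miss) / (hits + misses)


print(round(tavg_from_ratio(90, 10, 20, 150), 2))    # 33.0
print(tavg_from_total(90, 10, 20, 150))              # 33.0
print(tavg_from_total(300, 100, 20, 150))            # 52.5
```

The two functions are the same formula written in two ways, so they agree up to floating point rounding.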

total CPU request : 400


hit : 300 times, time taken : 20ns
miss : 100 times, time taken : 150ns

hit ratio : number of hits / total number of accesses = 300 / 400 = 0.75

miss ratio : (1-H) = 1-0.75 = 0.25

Tavg = hit x time taken by memory when there is a hit + (1-H) time
taken by memory when there is a miss.
Tavg = 0.75 x 20 + 0.25 x 150
Tavg = 15 + 37.5
Tavg = 52.5ns

(or)

total time : 300 x 20 + 100 x 150


total time : 6000 + 15000
total time : 21000nsec
Tavg = 21000 / 400 = 52.5nsec

hit : 80%, time taken : 5nsec
miss : 20%, time taken : 50nsec

hit ratio : 80% / 100% = 0.8

miss ratio : (1-H) = 1-0.8 = 0.2

Tavg = H x (time taken by memory when there is a hit) + (1-H) x (time taken by memory when there is a miss)
Tavg = 0.8 x 5 + 0.2 x 50
Tavg = 4 + 10
Tavg = 14ns

types of memory organisation :


(i) simultaneous access memory organisation

(ii) hierarchical access memory organisation

(i) simultaneous access memory organisation : all the levels of memory are directly connected to the CPU (or) the CPU communicates with all the levels of memory directly, but accesses them in sequence

- when there is a miss in level 1 (l1) and a hit in level 2 (l2), the data is given directly from level 2 (l2) memory to the CPU without copying it into level 1 (l1) memory

- when there is a miss in level 2 (l2) and a hit in level 3 (l3), the data is given directly from level 3 (l3) memory to the CPU without copying it into level 2 (l2) and level 1 (l1) memory

figure : CPU connected directly to every level

level    hit ratio    time
l1       h1           t1
l2       h2           t2
l3       h3           t3
.        .            .
ln       hn           tn

time : access time of the respective level memory

1 word access time : Tavg

number of words / sec (data transfer rate) (bandwidth) = 1 / Tavg

time required to access (read/write) 1 word from memory is called Tavg

Tavg = H1t1 + (1-H1)H2t2 + (1-H1)(1-H2)H3t3 + ...... + (1-H1)(1-H2).....(1-Hn-1)Hntn

H1t1 : found in level 1

(1-H1)H2t2 : not found in level 1, found in level 2

(1-H1)(1-H2)H3t3 : not found in level 1 and level 2, found in level 3

(1-H1)(1-H2).....(1-Hn-1)Hntn : not found in level 1, level 2, ... level n-1, found in level n

hn : always 1 (last level hit ratio is always 1)
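The simultaneous access formula can be written as a short Python loop. This is a sketch with an illustrative function name; it assumes the lists are ordered from level 1 to level n and that the last hit ratio is 1.

```python
def tavg_simultaneous(hit_ratios, times):
    """Tavg = H1*t1 + (1-H1)*H2*t2 + (1-H1)(1-H2)*H3*t3 + ..."""
    tavg = 0.0
    miss_so_far = 1.0               # probability of a miss in all earlier levels
    for h, t in zip(hit_ratios, times):
        tavg += miss_so_far * h * t
        miss_so_far *= (1 - h)
    return tavg


# 2 levels: h1 = 0.9, t1 = 20 ns, last level h2 = 1, t2 = 150 ns
print(round(tavg_simultaneous([0.9, 1.0], [20, 150]), 2))        # 33.0
# 2 levels: h1 = 0.8, t1 = 5 ns, t2 = 50 ns
print(round(tavg_simultaneous([0.8, 1.0], [5, 50]), 2))          # 14.0
```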

(ii) hierarchical access memory organisation : in hierarchical access the CPU communicates only with level 1 (l1) memory

- when there is a miss in level 1 (l1) and a hit in level 2 (l2), the data is first copied from level 2 (l2) to level 1 (l1), then from level 1 (l1) to the CPU

- when there is a miss in level 1 (l1) and level 2 (l2) but a hit in level 3 (l3), the data is first copied from level 3 (l3) to level 2 (l2), then level 2 (l2) to level 1 (l1), then level 1 (l1) to the CPU. here locality of reference is present (cache works on locality of reference).

n level

figure : CPU <-> l1 (h1, t1) <-> l2 (h2, t2) <-> l3 (h3, t3) <-> ........ <-> ln (hn, tn)

Tavg = H1t1 + (1-H1)H2(t2+t1) + (1-H1)(1-H2)H3(t3+t2+t1) + ...... + (1-H1)(1-H2).....(1-Hn-1)Hn(tn+tn-1+....+t3+t2+t1)

H1t1 : found in level 1

(1-H1)H2(t2+t1) : not found in level 1, found in level 2, so the data is brought from l2 to l1

(1-H1)(1-H2)H3(t3+t2+t1) : not found in level 1 and level 2, found in level 3, so the data is brought from l3 to l2 and from l2 to l1

(1-H1)(1-H2).....(1-Hn-1)Hn(tn+tn-1+....+t2+t1) : not found in level 1, level 2, ... level n-1, found in level n, so the data is brought from ln to ln-1, ... from l3 to l2 and from l2 to l1
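The hierarchical access formula differs from the simultaneous one only in that the access times add up level by level. A Python sketch (illustrative function name, lists ordered from level 1 to level n, last hit ratio assumed to be 1):

```python
def tavg_hierarchical(hit_ratios, times):
    """Tavg = H1*t1 + (1-H1)*H2*(t2+t1) + (1-H1)(1-H2)*H3*(t3+t2+t1) + ..."""
    tavg = 0.0
    miss_so_far = 1.0               # probability of a miss in all earlier levels
    cumulative_time = 0.0           # t1 + t2 + ... up to the current level
    for h, t in zip(hit_ratios, times):
        cumulative_time += t
        tavg += miss_so_far * h * cumulative_time
        miss_so_far *= (1 - h)
    return tavg


# 2 levels with cache hit ratio 0.8, tc = 2 nsec and tm = 100 nsec
h, tc, tm = 0.8, 2, 100
print(round(tavg_hierarchical([h, 1.0], [tc, tm]), 2))    # 22.0
print(round(tc + (1 - h) * tm, 2))                        # 22.0, the short form tc + (1-h)tm
```

For 2 levels the long form h x tc + (1-h)(tm + tc) and the short form tc + (1-h)tm are algebraically the same, which is why both lines print the same value.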

h : cache hit ratio
tc : cache access time
tm : main memory access time

figure : CPU (register) <- words -> cache <- blocks -> main memory

Tavg = h x tc + (1-h) [tm + tc] (or) Tavg = tc + (1-h) tm

3 level

figure : CPU (register) <- words -> l1 cache (h1, t1) <-> l2 cache (h2, t2) <-> main memory (tm)

Tavg = h1 t1 + (1-h1) h2 (t2+t1) + (1-h1)(1-h2)(tm+t2+t1)

h3 is missing because the last level hit ratio is always 1

case (i) : if all data are available in level 1 (h = 1)

Tavg = h x tc + (1-h) [tm + tc]
Tavg = 1 x tc + 0
Tavg = tc

everything is found in level 1 itself, so there is no need to go to main memory.

q. hit ratio : 80%


tc : 2nsec
tm : 100nsec
by using hierarchical access.

Tavg = h x tc + (1-h) (tm + tc)


Tavg = 0.8 x 2 + 0.2 (102)
Tavg = 1.6 + 20.4
Tavg = 22nsec

(or)

Tavg = tc + (1-h)tm
Tavg = 2 + 0.2 (100)
Tavg = 2 + 20
Tavg = 22nsec

case (ii) : if 80% found in level 1 and remaining in level 2.

Tavg = h x tc + (1-h) (tm + tc)


Tavg = 0.8 x tc + 0.2 (tm + tc)

(or)

Tavg = tc + (1-h)tm
Tavg = tc + 0.2 (tm)

2 level

figure : CPU (register) <- words -> cache l1 (t1) <- blocks -> main memory l2 (t2)

hit ratio : h

Tavg = ht1 + (1-h) (t2+t1) (or) Tavg = t1 + (1-h) t2

Tavg = h x tc + (1-h) [tm + tc]


Tavg = 0.9 x 1 + (1-0.9) [100+1]
Tavg = 0.9+0.1x101
Tavg = 0.9+10.1
Tavg = 11nsec

(or)

Tavg = tc+ (1-h)tm


Tavg = 1 + 0.1 x 100
Tavg = 1 + 10
Tavg = 11nsec

Tavg = h x tc + (1-h) [tm + tc]


Tavg = 0.8 x 1 + (1-0.8) [100+1]
Tavg = 0.8+0.2x101
Tavg = 0.8+20.2
Tavg = 21nsec

(or)

Tavg = tc+ (1-h)tm


Tavg = 1 + (1-0.8)100
Tavg = 1 + 0.2 x 100
Tavg = 1 + 20
Tavg = 21nsec

Tavg = h x tc + (1-h) [tm + tc]


Tavg = 0.6 x 1 + (1-0.6) [100+1]
Tavg = 0.6+0.4x101
Tavg = 0.6+40.4
Tavg = 41nsec


(or)

Tavg = tc+ (1-h)tm


Tavg = 1 + (1-0.6)100
Tavg = 1 + 0.4 x 100
Tavg = 1 + 40
Tavg = 41nsec

l1 = t1
l2 = t2

performance of l1 / performance of l2 = 5
(1/t1) / (1/t2) = t2 / t1 = 5
t2 = 5t1

t1 = Tavg - 10
Tavg = T1 + 10

T1 = 20nsec
T2 = 5 x 20 = 100ns

Tavg = 20 + 10 = 30

Tavg = h x t1 + (1-h)t2
30 = h x 20 + (1-h) 100
30 = 20h + 100 -100h
70 = 80h
h = 70/80 = 0.875

note : cache is fast memory

- when the hit ratio of the cache is high, it takes less time to access data
- when the hit ratio of the cache is low, the data is found in the cache very few times, so in that case access time increases.


t1 = 20nsec
t2 = 150nsec
Tavg = 20nsec

Tavg = h x t1 + (1-h)t2
30 = h x 20 + (1-h) 100
30 = 20h + 100 -100h
70 = 80h
h = 70/80 = 0.875

Tavg = h x t1+ (1-h)t2


Tavg = 1 x 20 + 0 x 150
Tavg = 20 nsec

Tavg = 30 + 10% of 30
Tavg(new) = 30 + 3
Tavg(new) = 33

Tavg(new) = hnew x tc + (1-hnew) tm


33 = h x 20 + (1-h) 150
33 = 20h + 150 - 150h
130h = 117
h = 117/130 = 0.9 = 90%

new hit ratio : 90%


old hit ratio : 92.33%
percentage of change : 3.22%

Tavg increases so hit ratio decreases.

hit ratio ∝ 1 / Tavg

topic : locality of reference


figure : locality of reference
CPU (register) <- 1 word -> cache <- block : 32 words -> main memory <- pages -> secondary / logical / virtual memory

1 block = 32 words

main memory to cache memory : 1 complete block
but cache memory to CPU : only the 1 word requested/demanded

assume block size = 32 words, then 1 complete block (32 words) will be transferred from main memory to cache memory

but only the 1 word that is requested/demanded by the CPU is given from cache memory to the CPU.

block size >>>>>> word size

locality of reference : accessing the data of a higher level of memory from level 1 memory is called locality of reference. (wherever the data is, we will take it from cache memory)

types :
(i) temporal LOR
(ii) spatial LOR

(i) temporal LOR : means the same word in the same block is referenced by the CPU again in the near future (frequently)
(or)
the same data is accessed again and again

figure : 1 block = 32 words; the same word will be accessed by the CPU again

(ii) spatial LOR : means the adjacent words in the same block are referenced by the CPU in sequence.

figure : 1 block = 32 words; after word x, the adjacent word x+1 will be accessed by the CPU
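Both kinds of locality can be seen in a toy Python simulation. This is only a sketch under strong assumptions that are not in the notes: the cache is modelled as a set of block numbers that is never full (no replacement), and word addresses are plain integers. On a miss the whole block is brought in, so later accesses to the same or adjacent words become hits.

```python
def count_hits(addresses, block_size=32):
    """Return (hits, misses) for a sequence of word addresses."""
    cached_blocks = set()           # assumed: cache never evicts a block
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size  # block that contains this word
        if block in cached_blocks:
            hits += 1
        else:
            misses += 1             # miss: the complete block is brought into the cache
            cached_blocks.add(block)
    return hits, misses


print(count_hits(range(64)))        # spatial: words 0..63 in sequence -> (62, 2)
print(count_hits([5] * 10))         # temporal: the same word 10 times -> (9, 1)
```

With 32-word blocks, a sequential scan of 64 words misses only on the first word of each block.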

the CPU always accesses the data from the cache (faster / level 1) memory. if there is a miss in level 1 (cache) memory and a hit in main memory (level 2 memory) then one complete block is transferred from l2 memory to l1 memory, and the addressed word (requested/demanded by the CPU) is given from level 1 (cache) to the CPU.

figure : CPU (register) <- 1 word (t1) -> cache l1 <- block : 32 words (TB) -> main memory l2 (t2)

TB : block transfer time from l2 memory to l1 memory

case (i) : if block size is 1 word

TB = T2

case (ii) : if block size is n words

TB = n x T2


2 level

Tavg = ht1 + (1-h) (t2+t1) (or) Tavg = t1 + (1-h) t2

if locality of reference considered

Tavg = ht1 + (1-h)[tb + t1] (or) Tavg = t1 + (1-h) tb

3 level

Tavg = h1 t1 + (1-h1) h2 (t2+t1) + (1-h1)(1-h2)(tm+t2+t1)

if locality of reference considered

Tavg = h1 t1 + (1-h1) h2 (tb1+t1) + (1-h1)(1-h2)(tb2+tb1+t1)

(or)
Tavg = h1 t1 +m1 h2 (tb1+t1) + m1m2(tb2+tb1+t1)

m1 : miss in level 1
m2 : miss in level 2



level 1 memory access time : T1
level 1 hit ratio : h1

level 2 memory access time : T2 (TB1)


level 2 hit ratio : h2

level 3 memory access time : T3 (TB2)

Tavg = h1t1 + (1-h1)h2(t2+t1) + (1-h1)(1-h2)(t3+t2+t1)


Tavg = h1t1 + (1-h1)h2(TB1+t1) + (1-h1)(1-h2)(TB2+TB1+t1)

Tavg = t1

h2 = 100% = 1
h1 = 0
Tavg = h1t1 + (1-h1)h2(t2+t1)
Tavg = 0xt1 + (1-0)1(t2+t1)
Tavg = 1(t2+t1)
Tavg = t2+t1

h3 = 1
h2 = 0
h1 = 0
Tavg = h1t1 + (1-h1)h2(t2+t1) + (1-h1)(1-h2)(t3+t2+t1)
Tavg = 0xt1 + (1-0)0(t2+t1) + (1-0)(1-0)(t3+t2+t1)
Tavg = 1x0(t2+t1) + (1)(1)(t3+t2+t1)
Tavg = t3+t2+t1

with locality of reference :

level 1 memory access time : T1


level 1 hit ratio : h1

level 2 memory access time : T2 (TB1)


level 2 hit ratio : h2

level 3 memory access time : T3 (TB2)

Tavg = h1t1 + (1-h1)h2(TB1+t1) + (1-h1)(1-h2)(TB2+TB1+t1)

Tavg = t1

h2 = 100% = 1
h1 = 0
Tavg = h1t1 + (1-h1)h2(TB1+t1)
Tavg = 0xt1 + (1-0)1(TB1+t1)
Tavg = 1(TB1+t1)
Tavg = TB1+t1

h3 = 1
h2 = 0
h1 = 0
Tavg = h1t1 + (1-h1)h2(TB1+t1) + (1-h1)(1-h2)(TB2+TB1+t1)
Tavg = 0xt1 + (1-0)0(TB1+t1) + (1-0)(1-0)(TB2+TB1+t1)
Tavg = 1x0(TB1+t1) + (1)(1)(TB2+TB1+t1)
Tavg = TB2+TB1+t1

level 1 memory access time : T1 = 30nsec


level 1 hit ratio : h1 = 90%

level 2 memory access time : T2 = 250nsec/word

on a miss in level 1, a 4 word block must be transferred from level 2 to level 1

(we calculate T1, T2 and Tavg for one word, but here the block that comes in has n words; T1 brings only one word, but T2 brings 4 words)
if block size is n words : TB = n x T2
TB = 4 x 250 = 1,000 nsec

Tavg = h1T1 + (1-h1)(TB+T1)


Tavg = 0.9 x 30 + (1-0.9)(1000+30)
Tavg = 27 + (0.1)(1030)
Tavg = 27 + 103
Tavg = 130nsec.

(or)

Tavg = T1 + (1-h)TB
Tavg = 30 + (1-0.9)1000
Tavg = 30 + 0.1 x 1000
Tavg = 30 + 100
Tavg = 130nsec.
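The block transfer case as a Python sketch. The function and parameter names are made up for the example; it assumes TB = n x T2 and the hierarchical formula Tavg = h x T1 + (1-h)(TB + T1).

```python
def tavg_with_block_transfer(h, t1, t2_per_word, words_per_block):
    """Average access time when a miss brings a whole block into level 1."""
    tb = words_per_block * t2_per_word       # TB = n x T2
    return h * t1 + (1 - h) * (tb + t1)


# h1 = 90%, T1 = 30 nsec, T2 = 250 nsec/word, 4 word block
print(round(tavg_with_block_transfer(0.9, 30, 250, 4), 2))   # 130.0
```

With a 1 word block the same function reduces to the ordinary 2 level formula h x t1 + (1-h)(t2 + t1).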

cache hit : 80%, time taken : 5nsec

cache miss : 20%, time taken : 50nsec

Tavg = h x t1+ (1-h)t2


Tavg = 0.8 x 5 + (1-0.8)(50)
Tavg = 4 + (0.2)(50)
Tavg = 4 + 10
Tavg = 14nsec.

data transfer rate : 1 / Tavg

= 1 / (14 x 10^-9) words/sec

= (1/14) x 10^9 words/sec

= (1000/14) x 10^6 words/sec

= 71.4 x 10^6 words/sec

≈ 72 million words per second

word size : 8 bit = 1 byte

= 72 x 10^6 words/sec
= 72 x 10^6 byte/sec
= 72 MB/sec

cache memory : 1mb


block size : 256bytes

access time : 3ns


hit rate : 94%

cache miss : 20ns for first word and 5ns for each remaining word
word size : 64 bits = 8 bytes
words in a block : block size / word size = 256B / 8B = 32 words

in general, if block size is n words : TB = n x T2
here the first word and the remaining words take different times :

first word : 1 x 20 = 20ns
remaining words : 31 x 5 = 155ns
TB = 20 + 155 = 175ns

Tavg = hT1 + (1-h)(TB+T1)

Tavg = 0.94 x 3 + (1-0.94)[20+(31x5)+3]
Tavg = 2.82 + (0.06)[20+(155)+3]
Tavg = 2.82 + (0.06)[178]
Tavg = 2.82 + 10.68
Tavg = 13.5ns
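The same numbers in Python, as a sketch with made-up helper names. It assumes the first word of the block costs 20 ns, every following word 5 ns, and the hierarchical formula Tavg = h x T1 + (1-h)(TB + T1).

```python
def block_transfer_time(words, first_word_ns, next_word_ns):
    """TB when the first word and the remaining words take different times."""
    return first_word_ns + (words - 1) * next_word_ns


words = 256 // 8                              # block size / word size = 32 words
tb = block_transfer_time(words, 20, 5)        # 20 + 31 x 5 = 175 ns
h, t1 = 0.94, 3
tavg = h * t1 + (1 - h) * (tb + t1)
print(words, tb, round(tavg, 2))              # 32 175 13.5
```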

cycle time : 500nsec

in 500nsec (cycle time), 1 byte is accessed

in one second : 1 / (500 x 10^-9) = (1000/500) x 10^6 byte/sec

= 2 x 10^6 byte/sec = 2 MB/sec


cache block size : 8 words
1 word size : 4 byte
cache block size : 8 x 4 = 32 byte
memory system uses : 60MHz clock

cycle time = 1 / (60 x 10^6) sec

total time taken to transfer a block : 1 cycle (accept the address) + 3 cycles (to fetch the complete block) + 8 x 1 cycles (1 word per cycle) = 12 cycles

transferring 32 bytes takes 12 cycles = 12 x 1 / (60 x 10^6) sec

in one sec = (32 x 60 x 10^6) / 12 byte

= 160 x 10^6 byte/sec
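A one-function Python sketch of this bandwidth calculation (the function name is only illustrative): bytes per block divided by the time of the cycles needed for one block.

```python
def block_bandwidth(bytes_per_block, clock_hz, cycles_per_block):
    """Bytes per second when one block needs cycles_per_block clock cycles."""
    return bytes_per_block * clock_hz / cycles_per_block


cycles = 1 + 3 + 8 * 1                           # address + fetch + 1 word per cycle
print(block_bandwidth(8 * 4, 60e6, cycles))      # 160000000.0 byte/sec
```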
