Instruction Set Architecture 24
An Instruction Set Architecture (ISA) defines the communication rules between the hardware and
software of the computer. The ISA is a design principle (conceptual) and not stored in a computer’s
memory.
An Instruction Set Architecture (ISA) is part of the abstract model of a computer that defines how
the CPU is controlled by the software.
The ISA acts as an interface between the hardware and the software, specifying both what the
processor is capable of doing as well as how it gets done.
The ISA provides the only way through which a user is able to interact with the hardware. It can be
viewed as a programmer’s manual because it’s the portion of the machine that’s visible to the
assembly language programmer, the compiler writer, and the application programmer.
The ISA defines the supported data types, the registers, how the hardware manages main memory,
key features (such as virtual memory), which instructions a microprocessor can execute, and the
input/output model of multiple ISA implementations. The ISA can be extended by adding
instructions or other capabilities, or by adding support for larger addresses and data values.
Understanding what the instruction set can do and how the compiler makes use of those
instructions can help developers write more efficient code.
It can also help them understand the output of the compiler which can be useful for debugging.
Arm is opening its instruction set architecture for Cortex M cores. By allowing licensees to build their
own custom instructions, developers are able to accelerate specialized workloads. The Arm ISA
family allows developers to write software and firmware that conforms to the Arm specifications,
secure in the knowledge that any Arm-based processor will execute it in the same way.
Complex Instruction Set Computers (CISC)
CISC (Complex Instruction Set Computer) is an ISA design practice that focuses
on multi-step instructions and complex, power-consuming hardware. These
designs primarily focus on hardware components and binary instruction
complexity. Processing components are typically not interchangeable with RISC-
designed systems.
CISC Instructions Attributes:
- Single instructions take more than one CPU
cycle to complete
- Instruction length varies based on the
instruction type
- Hardware must be designed to accept more
complicated instructions
Computer Instructions
Computer instructions are written in binary, also known as machine code.
Computer hardware operates on a series of these binary instructions
through pulsating power signals that signify either OFF or ON based on the
binary digits 0 and 1 respectively.
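To make the idea of a binary instruction concrete, the following sketch packs an instruction into a 16-bit word and unpacks it again. The format used here (a 4-bit opcode followed by a 12-bit address) and the opcode values are hypothetical, chosen only for illustration; real ISAs define their own encodings.

```python
# Hypothetical 16-bit instruction format (an assumption for illustration):
# the top 4 bits hold the opcode, the low 12 bits hold an operand address.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011}


def encode(mnemonic, address):
    """Pack an opcode and a 12-bit address into one 16-bit instruction word."""
    if not 0 <= address < 2 ** 12:
        raise ValueError("address must fit in 12 bits")
    return (OPCODES[mnemonic] << 12) | address


def decode(word):
    """Split a 16-bit instruction word back into (mnemonic, address)."""
    names = {code: name for name, code in OPCODES.items()}
    return names[word >> 12], word & 0x0FFF


word = encode("ADD", 5)
print(format(word, "016b"))  # the pattern of 0s and 1s the hardware would see
print(decode(word))
```

The same word can be read either as a number or as an instruction; it is the CPU's decoder that gives the bit pattern its meaning.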
Registers
A register is a volatile memory system that provides the CPU with rapid
access to information it is immediately using.
Functions of a register:
- Store temporary data for immediate
processing by the ALU
- Hold "flag" information if an operation
results in overflow or triggers other flags
- Hold the location of the next instruction
to be processed by the CPU
Instruction set architecture (ISA) describes the processor (CPU) in terms of what the assembly
language programmer sees, i.e. (a) the instruction set and instruction format, (b) Memory Model
and addressing methods and (c) the programmer accessible Registers. These three details of the
computer are also called Programmer's Model of a Computer. The architecture design goes along
with all the above.
Size of Operands
Although the CPU fetches instructions from main memory and executes them, some kind of internal
storage is required to carry out instruction execution. This internal storage has a place in the
instruction format and is also accessible to assembly language programmers. There may also be
additional storage in the CPU that is not accessible at the instruction level.
Stack Architecture
CPU internal storage is organized as stack memory and is used in the execution of instructions. A
stack resembles an array of contiguous storage locations stacked one over the other in the order of
their addresses. All read/write operations refer to the top of the stack (TOS), which is the only
accessible location in the stack. A PUSH operation writes a word into the next location, TOS+1, and
makes this location the new TOS, i.e. a PUSH causes the stack to grow. Similarly, a POP operation
reads the word from the top of the stack and updates TOS to TOS-1, thus shrinking the stack.
Figure 5.1 illustrates the stack CPU operation.
Figure 5.1 Stack CPU Operation
The addition operation C=A+B in a stack machine has the following set of instructions:
PUSH A
PUSH B
ADD
POP C
The ADD instruction causes the top two words of the stack to be added in the ALU, with the result
left on top of the stack; PUSH loads data onto the stack and POP reads the TOS. A key component of
a stack CPU is the Stack Pointer (SP) register, which stores the internal address of the TOS and is
automatically adjusted on every PUSH and POP operation. A program counter keeps track of
instruction addresses in the usual manner, while the operands are taken from the stack.
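The four-instruction sequence for C=A+B can be traced with a small simulation. This is only a sketch: the function name run_stack_machine, the use of a Python list as the stack, and a dictionary standing in for main memory are assumptions made for illustration, and the sample values of A and B are arbitrary.

```python
def run_stack_machine(program, memory):
    """Execute PUSH/ADD/POP instructions against a dict that models memory."""
    stack = []  # the top of stack (TOS) is the last element of the list
    for instruction in program:
        op, *operand = instruction.split()
        if op == "PUSH":
            stack.append(memory[operand[0]])   # stack grows, new TOS
        elif op == "POP":
            memory[operand[0]] = stack.pop()   # TOS is read, stack shrinks
        elif op == "ADD":
            right = stack.pop()                # operands come implicitly
            left = stack.pop()                 # from the top of the stack
            stack.append(left + right)         # result becomes the new TOS
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


memory = {"A": 2, "B": 3, "C": 0}
run_stack_machine(["PUSH A", "PUSH B", "ADD", "POP C"], memory)
print(memory["C"])  # 5
```

Note that ADD names no operand at all; only PUSH and POP mention a memory location.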
A stack CPU evaluates arithmetic and other instructions using reverse Polish (postfix) notation,
wherein X+Y is written as X Y +. The expressions are evaluated from left to right. The advantage
of this notation is that it eliminates the need for parentheses.
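The following sketch evaluates an expression written with the operator after its operands (postfix, or reverse Polish, form), scanning left to right with a stack. The function name evaluate_postfix and the restriction to integer operands and the operators +, - and * are assumptions made to keep the example short.

```python
import operator

# Supported binary operators (a deliberately small, assumed set).
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate_postfix(expression):
    """Evaluate a space-separated postfix expression from left to right."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(int(token))  # operands are pushed as they appear
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack.pop()


# (2 + 3) * (7 - 4) needs parentheses in infix form but none in postfix form.
print(evaluate_postfix("2 3 + 7 4 - *"))  # 15
```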
In a stack machine, operands are implicitly taken from the stack. There is no explicit memory
access for operand fetch, and an arithmetic instruction such as ADD carries no operand address. For
this reason, stack architecture machines are called "Zero Address Machines" and their instructions
are called zero-address instructions.
The older Burroughs B5000 and HP systems are examples of this unique stack architecture. A more
recent example is the Sun picoJava microprocessor, designed for fast execution of compiled Java
code. The stack concept is also widely used in pocket calculators.
The advantages of zero-address machines are good code density and a simple model for evaluating
expressions. However, it is difficult to generate efficient code for them.
Accumulator Architecture
In the accumulator architecture, one operand is implicitly in the accumulator; the other operand is
explicitly given as a memory location. For this reason, these CPUs are called "1-Address Machines"
and their instructions are called 1-address instructions, i.e. only one operand can be passed along
with an instruction. This architecture strictly follows the stored program concept of the Von
Neumann architecture.
An accumulator-based CPU comprises a small set of registers, namely the Memory Address Register
(MAR), Memory Data Register (MDR), Instruction Register (IR) and Program Counter (PC). The data
path of the accumulator-based CPU is shown in Figure 5.2, which also shows how these registers are
connected. A system bus connects the main memory and the registers. The Accumulator (AC) plays a
central role in the execution of instructions.
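Figure 5.2 Accumulator Architecture
The addition operation C=A+B in an accumulator machine has the following set of instructions: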
LOAD A
ADD B
STORE C
In the above example, note that each instruction has only one operand. The merit of the accumulator
architecture is the reduced internal complexity of the CPU; its demerit is heavy memory traffic,
since every instruction accesses memory.
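The LOAD/ADD/STORE sequence can be traced with a short simulation that also counts operand accesses to memory. This is a sketch under stated assumptions: the function name run_accumulator_machine, the dictionary used as main memory, and the sample values are illustrative only.

```python
def run_accumulator_machine(program, memory):
    """Run 1-address instructions; return the accumulator and access count."""
    ac = 0                # the accumulator, the implicit operand
    memory_accesses = 0   # operand reads and writes to main memory
    for instruction in program:
        op, address = instruction.split()
        if op == "LOAD":
            ac = memory[address]
        elif op == "ADD":
            ac += memory[address]
        elif op == "STORE":
            memory[address] = ac
        else:
            raise ValueError(f"unknown instruction: {instruction}")
        memory_accesses += 1  # every instruction here touches memory once
    return ac, memory_accesses


memory = {"A": 2, "B": 3, "C": 0}
ac, accesses = run_accumulator_machine(["LOAD A", "ADD B", "STORE C"], memory)
print(memory["C"], accesses)  # 5 3
```

Three instructions cause three operand accesses, which is the memory traffic that the text lists as the cost of this simple design.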
General Purpose Register (GPR) Architecture
The CPU has a set of General Purpose Registers (GPRs) as internal storage, in addition to the
registers mentioned in the accumulator architecture. The GPRs are accessible to assembly language
programmers. Registers are not only structurally closer to the CPU but also have a much shorter
access time than main memory. Operands and intermediate results are stored in these registers,
thereby reducing memory accesses during program execution. This, in turn, increases the
performance of the CPU. Both CISC and RISC follow this architecture. Modern CPU configurations
include a large number of GPRs, in the range of 32 to 100+.
GPR architecture designs come with one, two or more internal buses in the CPU data path,
facilitating the effective use of the registers for operand read/write. A single-bus GPR
architecture is shown in Figure 5.3. It can be extended to two or three buses to reduce the
instruction execution cycle. In this architecture the instruction is generally long, so that two or
three operands can be provided along with the instruction. CISC has provision for two operand
addresses in the instruction, while RISC has provision for three operands or operand locations.
RISC uses the GPRs more effectively than CISC for this purpose. For this reason, this architecture
is called a "Two Address machine" in the case of CISC implementations and a "Three Address
machine" in the case of RISC implementations.
Where an instruction requires more than one operand, operand handling in the GPR architecture is
classified into three types:
- Register-Memory type: One operand is referred to in memory and the other in a GPR.
- Register-Register type: Both operands, or a maximum of three operands, are referred to in
registers. The operands are preloaded into registers from memory. This type is also known as
the "Load/Store architecture" practised in RISC.
- Memory-to-Memory type: Both operands are referred to in memory. An example is a string
manipulation instruction, where large strings are moved from one variable to another.
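The Register-Register (load/store) type can be sketched as a small simulation in which only LOAD and STORE touch memory and ADD names three registers. The instruction syntax, the register names R0 to R3 and the function name run_load_store_machine are hypothetical and are not taken from any particular ISA.

```python
def run_load_store_machine(program, memory, register_count=4):
    """Run a three-address, load/store style program given as tuples."""
    registers = {f"R{i}": 0 for i in range(register_count)}
    for op, *args in program:
        if op == "LOAD":       # LOAD Rd, address: memory -> register
            rd, address = args
            registers[rd] = memory[address]
        elif op == "STORE":    # STORE address, Rs: register -> memory
            address, rs = args
            memory[address] = registers[rs]
        elif op == "ADD":      # ADD Rd, Rs1, Rs2: registers only
            rd, rs1, rs2 = args
            registers[rd] = registers[rs1] + registers[rs2]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers


memory = {"A": 2, "B": 3, "C": 0}
program = [
    ("LOAD", "R0", "A"),
    ("LOAD", "R1", "B"),
    ("ADD", "R2", "R0", "R1"),  # three operands, all in registers
    ("STORE", "C", "R2"),
]
registers = run_load_store_machine(program, memory)
print(memory["C"])  # 5
```

Once A and B are in registers they can be reused by later instructions without further memory accesses, which is where this type gains over the accumulator design.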
The merit of the GPR architecture is faster instruction completion. The architecture is also well
suited to code optimization by compilers, which assign free registers to operands. Although the
longer instruction length was once considered a demerit, with technological advancement it is no
longer a disadvantage.
Table 5.1 shows the effectiveness of the GPR architecture in creating efficient equivalent code for
an arithmetic equation.
Two examples are dealt with to bring out how the number of machine instructions is reduced in
solving an equation. Beyond the instructions themselves, the GPR architecture facilitates a data
path with more internal buses, which in turn reduces the time required to execute each
instruction.
Architecture                        Instruction format includes   Operand location
Zero Address machine                Zero operands                 --
One Address machine                 One operand                   Usually, the operand address is a memory address
GPR architecture (two operands)     Two operands                  Usually, both operands are in registers, or one is in a register and the other in memory
GPR architecture (three operands)   Three operands                All the operands are preloaded into GPRs
POP C
Add Store T
Store T
We deal with more details in the data path chapter. The last part of ISA, memory models and
addressing is handled in the next chapter.
Processor organisation
Processor organization includes the architecture, instruction set, and the design of the processor's
internal components, such as its registers, arithmetic logic unit (ALU), and control unit.
There are several components inside a CPU, namely the ALU, control unit, general-purpose
registers, instruction register, etc. We now see how these components are organized inside the CPU.
There are several ways to place these components and interconnect them. One such organization is
shown in Figure 5.6. In this case, the arithmetic and logic unit (ALU) and all CPU registers are
connected via a single common bus. This bus is internal to the CPU and is used to transfer
information between the different components of the CPU. This organization is termed single bus
organization, since only one internal bus is used for transferring information between the
components of the CPU. There are also external buses that connect the CPU with the memory module
and I/O devices. The external memory bus is also shown in Figure 5.6, connected to the CPU via the
memory data and address registers, MDR and MAR.
MAR: In a computer, the Memory Address Register (MAR) is the CPU register that stores either the
memory address from which data will be fetched to the CPU or the address to which data will be
sent and stored. In other words, the MAR holds the memory location of the data that needs to be
accessed. When reading from memory, the data addressed by the MAR is fed into the MDR (Memory
Data Register) and then used by the CPU. When writing to memory, the CPU writes data from the MDR
to the memory location whose address is stored in the MAR.
MDR: The Memory Data Register (MDR), or Memory Buffer Register (MBR), is the register of a
computer's control unit that contains the data to be stored in the computer storage (e.g. RAM), or
the data after a fetch from the computer storage. It acts like a buffer and holds anything that is
copied from memory, ready for the processor to use. The MDR is a two-way register. When data is
fetched from memory and placed into the MDR, it is written in one direction. When there is a write
instruction, the data to be written is placed into the MDR from another CPU register, and the MDR
then puts the data into memory.
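The read and write sequences described for the MAR and MDR can be sketched as follows. The class name MemoryInterface, its method names and the 16-word memory are assumptions made for illustration; a real CPU performs these steps in hardware over the memory bus.

```python
class MemoryInterface:
    """Models the MAR/MDR pair that sits between the CPU and main memory."""

    def __init__(self, size):
        self.memory = [0] * size
        self.mar = 0  # Memory Address Register: which location to access
        self.mdr = 0  # Memory Data Register: the data going in or out

    def read(self, address):
        self.mar = address                 # 1. CPU places the address in MAR
        self.mdr = self.memory[self.mar]   # 2. memory delivers the word to MDR
        return self.mdr                    # 3. CPU takes the data from MDR

    def write(self, address, value):
        self.mar = address                 # 1. CPU places the address in MAR
        self.mdr = value                   # 2. CPU places the data in MDR
        self.memory[self.mar] = self.mdr   # 3. MDR contents go to memory


bus = MemoryInterface(16)
bus.write(3, 42)
value = bus.read(3)
print(value)  # 42
```

The MDR is used in both directions, which is what the text means by calling it a two-way register.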
The number and function of registers R0 to R(n-1) vary considerably from one machine to
another. They may be general-purpose registers for the use of the programmer. Alternatively,
some of them may be dedicated as special-purpose registers, such as index registers or stack
pointers. In this organization, two registers, namely Y and Z, are transparent to the user: the
programmer cannot directly access them. They serve as the input and output buffers of the ALU
during ALU operations, and the CPU uses them as temporary storage for some instructions.
For the execution of an instruction, we need to perform an instruction cycle. An instruction cycle
consists of two phases:
- Fetch cycle
- Execution cycle
Most of the operations of a CPU can be carried out by performing one or more of the following
functions in some prespecified sequence:
1. Fetch the contents of a given memory location and load them into a CPU register.
2. Store a word of data from a CPU register into a given memory location.
3. Transfer a word of data from one CPU register to another or to the ALU.
4. Perform an arithmetic or logic operation, and store the result in a CPU register.
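The instruction cycle and the four basic functions listed above can be sketched as a loop that fetches an instruction and then executes it. The instruction names (LOAD, STORE, MOVE, ADD, HALT), the tuple encoding, the two registers R0 and R1 and the memory layout are all hypothetical, chosen so that each of the four functions appears once.

```python
def run(memory):
    """Fetch and execute instructions stored in memory until HALT."""
    pc = 0                              # program counter
    registers = {"R0": 0, "R1": 0}
    while True:
        # Fetch cycle: read the instruction at PC, then advance PC.
        ir = memory[pc]
        pc += 1
        # Execution cycle: decode the instruction and carry it out.
        op, *args = ir
        if op == "LOAD":                # function 1: memory -> register
            registers[args[0]] = memory[args[1]]
        elif op == "STORE":             # function 2: register -> memory
            memory[args[1]] = registers[args[0]]
        elif op == "MOVE":              # function 3: register -> register
            registers[args[0]] = registers[args[1]]
        elif op == "ADD":               # function 4: ALU operation
            registers[args[0]] += registers[args[1]]
        elif op == "HALT":
            return registers
        else:
            raise ValueError(f"unknown instruction: {op}")


# Instructions occupy addresses 0-4; data occupies addresses 5-7.
memory = [
    ("LOAD", "R0", 6),
    ("MOVE", "R1", "R0"),
    ("ADD", "R0", "R1"),
    ("STORE", "R0", 7),
    ("HALT",),
    0, 21, 0,
]
registers = run(memory)
print(memory[7])  # 42
```

Instructions and data share one memory here, in keeping with the stored program concept.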
For any operation on a computer, the processor must interpret the instructions it receives from
the operating system and applications. A processor consists of an arithmetic logic unit and a
control unit (CU), and its capability is measured in terms of:
Components of a Processor
A processor has four components: a floating point unit (FPU), an arithmetic logic unit (ALU),
registers, and cache memories.
1. Arithmetic Logic Unit (ALU)
The ALU is the main component in a processor that performs various arithmetic and logic
operations. It is an integrated circuit within the CPU or GPU and is also known as an integer unit
(IU). This is the last component that performs calculations in the processor.
2. Floating Point Unit (FPU)
The FPU is the part of the computer system used for carrying out operations on floating-point
numbers. These operations include square root, multiplication, division, subtraction, and
addition. It can also perform transcendental functions such as trigonometric and exponential
functions; however, the results may not be exact.
3. Registers
Registers are a type of computer memory that accepts, transfers, and stores the data and
instructions being used. They supply the ALU with the operations that must be carried out and store
the results of these operations.
4. Cache
Cache is a smaller yet faster memory located close to the processor’s core. It stores copies of
data from frequently used main memory locations. There are three levels of cache: L1, L2 and L3.
L1 is the primary cache, which is embedded in the processor chip. Since it is small, it has limited
storage. L2 is the secondary cache, which is either embedded on the processor chip or placed on a
separate chip with a high-speed bus that connects it to the CPU. Also known as processor cache, L3
is a specialized backup memory for L1 and L2 that boosts their performance.
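The effect of keeping copies of frequently used locations can be sketched with a tiny single-level cache that counts hits and misses. The class name SimpleCache, the two-entry capacity and the least-recently-used replacement policy are assumptions for illustration; the text does not specify a replacement policy, and real caches work on blocks of memory rather than single words.

```python
from collections import OrderedDict


class SimpleCache:
    """A small cache in front of main memory (assumed LRU replacement)."""

    def __init__(self, main_memory, capacity):
        self.main_memory = main_memory
        self.capacity = capacity       # assumed to be at least 1
        self.lines = OrderedDict()     # address -> cached copy of the data
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)      # mark as most recently used
        else:
            self.misses += 1
            self.lines[address] = self.main_memory[address]  # copy from memory
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)   # evict least recently used
        return self.lines[address]


main_memory = list(range(100, 116))   # address i holds the value 100 + i
cache = SimpleCache(main_memory, capacity=2)
for address in [1, 2, 1, 1, 3, 2]:
    cache.read(address)
print(cache.hits, cache.misses)  # 2 4
```

Repeated reads of address 1 are served from the cache, while address 2 is evicted when address 3 arrives and must be fetched from main memory again.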
Types of Processors
Let us now discuss the different types of processors that are available at present.
1. Application-Specific Instruction Set Processor (ASIP)
An ASIP is a component used in system-on-a-chip design. The instruction set of an ASIP is
customized to benefit specific applications. For certain ASIPs, this instruction set is
configurable. An ASIP can be an alternative to hardware accelerators for video coding or baseband
signal processing.
According to Flynn’s taxonomy, processors can be classified based on concurrent instructions and
data streams available in architecture. Let us now discuss each of the above one by one.
2.1 Single Instruction, Single Data (SISD)
Here, instructions are sent to the control unit from the memory module. They are then decoded and
sent to a processing unit, which processes data retrieved from the memory module and then sends it
back. Examples of SISD are traditional uniprocessor machines such as PCs, old mainframes, and
pipelined and superscalar processors.
2.2 Single Instruction, Multiple Data (SIMD)
It is a type of computer that comes with multiple processing elements. It performs the same
operation on multiple data points simultaneously: the computations run in parallel, but only a
single instruction is executed at any given time. SIMD may be a part of the hardware design and is
directly accessible through the instruction set architecture (ISA). These machines exploit
data-level parallelism but not concurrency.
2.3 Multiple Instruction, Single Data (MISD)
It is a type of parallel computing architecture where multiple functional units perform different
operations on the same data. Every CU here handles and processes one instruction stream through
its corresponding processing elements. This architecture is used for fault tolerance. MISD
computers are rarely used. The Space Shuttle flight control computer is an example of MISD.
2.4 Multiple Instruction, Multiple Data (MIMD)
This refers to a technique used for achieving parallelism. MIMD machines have several processors
that function independently and asynchronously. At any time, the multiple autonomous processors
execute different instructions on different pieces of data.
These machines fall into either the shared memory or the distributed memory category, based on how
the MIMD processors access memory. Shared memory may be of the bus-based, hierarchical or extended
type. Distributed memory may be of the hypercube or mesh type.
Following are the different types of processors based on the number of cores:
3.1 Single-core
A single-core microprocessor has a single core on its die. Because it runs only a single thread,
it works through the fetch-decode-execute cycle for one instruction stream at a time. These
processors are less in demand because of their lower processing power, and their slow speed has
made multi-core systems more popular.
3.2 Multi-core
Multi-core processors are microprocessors with two or more cores on a single integrated unit. Each
core reads and executes program instructions, so a single processor can run instructions on
separate cores simultaneously. Due to this, the overall speed of programs that support
multithreading and parallel computing techniques increases.
3.3 Hyper-Threading
It is a technology that is used in the Intel microprocessors. This technology allows a single
microprocessor to act as two processors for the operating system and the application. Through
hyperthreading, processor resources are more efficiently used, allowing multiple threads to run on
each core.
4. Special processors
4.1 Graphics Processing Unit (GPU)
It is a specialized electronic circuit that manipulates and alters memory to accelerate the
creation of images in frame buffers intended for output to display devices. GPUs handle image
processing and computer graphics efficiently.
Due to their highly parallel structure, they are more efficient than general-purpose CPUs for
algorithms in which large blocks of data are processed in parallel. A GPU may be embedded on a
motherboard or on a video card.
4.2 Physics Processing Unit (PPU)
Also known as a Physics Acceleration card, it is a dedicated microprocessor that, unlike a GPU,
handles physics calculations. It is used specifically for the physics engine of video games. This
microprocessor helps offload time-consuming tasks from the computer’s Central Processing Unit. It
provides physics simulation data and communicates this data to the CPU. PPUs are used in
high-performance computers.
4.3 Digital Signal Processor (DSP)
It is another specialized microprocessor, with an architecture optimized for the operational needs
of digital signal processing. A DSP measures, compresses, and filters continuous real-world analog
signals. DSPs are more power efficient, due to which they can be used in portable electronic
devices. These processors fetch multiple instructions and data at the same time. DSPs are
cost-effective since they are cheaper yet provide better performance and lower latency. They do not
require specialized cooling or larger batteries.
4.4 Network Processor
It is a special-purpose, programmable hardware device. Network processors are low-cost and
flexible like RISC processors, and scalable and fast like ASIC chips. Such processors are used for
designing networking applications.
They have characteristics similar to the general-purpose CPUs used in many types of equipment and
products. Firewalls, routers, switches and network security devices use network processors.
4.5 Front-end Processor
These are smaller computers that connect networks to host computers. Data is transferred between
the front-end processor and the host computer through high-speed parallel interfaces.
They relieve the host computer of managing peripheral devices, packet assembly and disassembly,
and error detection and correction. These processors communicate with peripheral devices using
serial interfaces via communication networks.
Register Organisation
General Register Organization is the processor architecture that stores and manipulates data for
computations. The main components of a register organization include registers, memory, and
instructions. The registers act as memory within the processor and are used to process instructions
as they are executed.
In computer organisation, a register is used to accept, store, and transfer data and instructions
that are being used immediately by the CPU. There are different kinds of registers used for
different purposes. Some of the commonly used registers are:
o AC ( accumulator )
o DR ( Data registers )
o AR ( Address registers )
o PC ( Program counter )
o IR ( index registers )
When the CPU performs an operation, it uses these registers. When we provide input to the system
for a certain operation, that input is stored in the registers. Once the arithmetic and logic unit
(ALU) has processed it, the result is again made available to us through the registers.
The sole reason for having registers is the quick retrieval of information that the CPU will
process.
The CPU can fetch data from RAM rather than from the hard disk, which is a much faster option, but
even the speed of RAM is not enough.
Therefore, we have cache memory, which is faster than RAM, and registers, which are faster still.
The registers work together with the cache and RAM to complete tasks quickly.
Registers take part in the following operations:
- Fetch: The fetch operation takes in the instructions given by the user. Instructions that are
stored in main memory for later processing are fetched into registers.
- Decode: This operation interprets the instructions. Once an instruction is decoded, the CPU
knows which operation is to be performed.
- Execute: The CPU performs the operation. The results produced by the CPU are then stored in
memory, and after that they are shown on the user's screen.
Types of Register in Computer Organization
No. Register (Symbol): Function
1. Accumulator (AC): The accumulator is the most frequently used register. It stores data taken
from memory.
2. Memory address register (MAR): Holds the address of the memory location to be accessed. The
MAR and MDR are used together for memory accesses.
3. Memory data register (MDR): Holds the data that is to be written to, or that has been read
from, a given memory address.
4. General-purpose registers (GPR): A series of registers generally numbered R0 to Rn-1. They
store any temporary data needed while a process is running. More GPRs enable register-to-register
addressing, which increases processing speed.
5. Program counter (PC): Keeps track of the program being executed. It holds the memory address
of the next instruction to be fetched from main memory once the previous instruction has completed
successfully. The PC also serves to count instructions. How far the PC is incremented depends on
the architecture: with fixed 32-bit (4-byte) instructions, the PC is incremented by 4 each time
the next instruction is fetched.
7. Condition code registers: Hold flags that describe the status of operations, for example
whether the result of an operation was zero or negative.
11. Index registers (BX): Hold values that are combined with the address information in an
instruction to form the effective address. They are also called base registers and are used to
modify the operand address at execution time.
12. Memory buffer register (MBR): Holds the data or instructions being written to or read from
storage. Its basic function is to hold data fetched from memory. The MBR is very similar to the
MDR.
13. Stack control registers (SCR): A stack is a set of memory locations where data is stored and
retrieved in last-in, first-out (LIFO) order: the item at the second position can be retrieved
only after the first one has been removed. Stack control registers are used to manage the stacks
in the computer. SP and BP are stack control registers. DI, SI, SP, and BP can be used as 2-byte
registers; EDI, ESI, ESP, and EBP are the 4-byte registers.
14. Flag register (FR): Indicates particular conditions. The flag register is 1 to 2 bytes in
size, and each byte is further divided into 8 bits, each of which defines a condition or flag.
The basic flags are:
- Zero flag
- Carry flag
- Parity flag
- Sign flag
- Overflow flag
15. Segment register (SR): Holds a memory address.
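The five basic flags listed for the flag register can be illustrated by computing them for an 8-bit addition. This is a sketch under stated assumptions: the function name add_with_flags is hypothetical, the operands are treated as 8-bit two's complement values, and the parity flag is taken to be set when the result contains an even number of 1 bits.

```python
def add_with_flags(a, b, bits=8):
    """Add two values as fixed-width integers and report the status flags."""
    mask = (1 << bits) - 1        # 0xFF for 8 bits
    sign_bit = 1 << (bits - 1)    # 0x80 for 8 bits
    a &= mask
    b &= mask
    total = a + b
    result = total & mask         # what actually fits in the register
    flags = {
        "zero": result == 0,
        "carry": total > mask,    # a bit was carried out of the top position
        # Assumed convention: parity is set for an even number of 1 bits.
        "parity": bin(result).count("1") % 2 == 0,
        "sign": bool(result & sign_bit),
        # Signed overflow: operands share a sign that the result does not.
        "overflow": bool(~(a ^ b) & (a ^ result) & sign_bit),
    }
    return result, flags


print(add_with_flags(0x7F, 0x01))  # 127 + 1 no longer fits in a signed byte
print(add_with_flags(0xFF, 0x01))  # 255 + 1 wraps around to zero
```

The first addition sets the sign and overflow flags; the second sets the zero, carry and parity flags. A conditional branch instruction would test flags like these to decide which instruction the PC should point to next.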