Answer Key
1. How is the stack implemented in the 8051? (or) What is the stack pointer? Write the stack
level of the 8051. (NOV 2007)
The 8051 stack is a LIFO (last-in, first-out) structure that can reside anywhere in the internal RAM.
An 8-bit stack pointer (SP) indicates the top of the stack, which is accessed using the PUSH and POP
instructions.
During PUSH the SP is incremented by one, and during POP the SP is decremented by one.
2. List the features of 8051 microcontroller.[ MAY 2007][NOV 2007, NOV 2011]
The 8051 is an 8-bit Microcontroller:
The CPU can work on only 8 bits of data at a time
The 8051 has
128 bytes of RAM
4K bytes of on-chip ROM
Two timers
Auxiliary carry
User flag 0
00 - Register bank 0
01 - Register bank 1
10 - Register bank 2
11 - Register bank 3
5. Give the function of the SP register of 8051. [ NOV/DEC2011]
SP: SP stands for stack pointer.
SP is an 8-bit wide register.
It is incremented before data is stored during PUSH and CALL instructions.
The stack pointer is initialized to 07H after a reset.
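The SP behaviour described above can be illustrated with a small Python model. This is only a sketch: the class name Stack8051 and its methods are hypothetical, and the model assumes that POP reads the byte at SP before decrementing, which mirrors the increment-before-store rule for PUSH.

```python
class Stack8051:
    """Toy model of the 8051 stack in the 128-byte internal RAM."""

    def __init__(self):
        self.ram = [0] * 128   # internal RAM, addresses 00H-7FH
        self.sp = 0x07         # SP value after reset

    def push(self, value):
        # PUSH: SP is incremented first, then the byte is stored at the new SP.
        self.sp = (self.sp + 1) & 0xFF
        self.ram[self.sp] = value & 0xFF

    def pop(self):
        # POP: the byte at SP is read first, then SP is decremented.
        value = self.ram[self.sp]
        self.sp = (self.sp - 1) & 0xFF
        return value


stack = Stack8051()
stack.push(0x25)
stack.push(0x12)
print(hex(stack.sp))      # 0x9
print(hex(stack.pop()))   # 0x12
print(hex(stack.pop()))   # 0x25
print(hex(stack.sp))      # 0x7
```

Because SP starts at 07H, the first byte pushed lands at address 08H, which is also the first location of register bank 1.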
6 How do you calculate baud rate for serial communication for 8051? (NOV 2007,
MAY 2013,2015)
The 8051 divides the crystal frequency by 12 to get the machine cycle frequency. The 8051
UART circuitry then divides the machine cycle frequency by 32.
Timer 1 is used to set the baud rate using the TH1 register. The values below apply to an
11.0592 MHz crystal:
Baud rate   TH1 (decimal)   TH1 (hex)
9600        -3              FD
4800        -6              FA
2400        -12             F4
1200        -24             E8
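The two divisions described above can be turned into a short calculation. The sketch below assumes an 11.0592 MHz crystal, Timer 1 in 8-bit auto-reload mode, and no baud-rate doubling (SMOD = 0); the function name th1_reload is hypothetical.

```python
def th1_reload(baud, xtal_hz=11_059_200):
    """Return the TH1 reload value (0-255) for Timer 1 in 8-bit auto-reload mode."""
    machine_cycle_hz = xtal_hz / 12          # crystal frequency divided by 12
    uart_clock_hz = machine_cycle_hz / 32    # UART divides by a further 32
    counts = round(uart_clock_hz / baud)     # timer counts per bit time
    return (256 - counts) & 0xFF             # stored as the negative of the count


for baud in (9600, 4800, 2400, 1200):
    print(baud, format(th1_reload(baud), "02X"))
```

With these assumptions the UART clock is 28,800 Hz, so 9600 baud needs a count of 3, and -3 as an 8-bit value is FDH.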
7 What is jump range? (NOV 2015)
What is the difference between AJMP and LJMP instruction? (May 2014)
Jump instruction   Meaning         Jump range
SJMP               Short jump      256 bytes (-128 to +127)
AJMP               Absolute jump   2 KB
LJMP               Long jump       64 KB
Both AJMP and LJMP transfer program control to the specified target address.
They differ in range: AJMP can reach targets within a 2 KB block, while LJMP can reach any
address in the 64 KB program memory.
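As a rough illustration of the three ranges, the sketch below picks the shortest jump that can reach a target. It assumes that the SJMP offset and the AJMP 2 KB block are both measured from the address of the instruction following the jump (PC + 2); the function name shortest_jump is hypothetical.

```python
def shortest_jump(pc, target):
    """Pick the shortest 8051 jump placed at address pc that can reach target."""
    next_pc = (pc + 2) & 0xFFFF                   # SJMP and AJMP are 2 bytes long
    offset = target - next_pc
    if -128 <= offset <= 127:                     # signed 8-bit relative offset
        return "SJMP"
    if (next_pc & 0xF800) == (target & 0xF800):   # same 2 KB block
        return "AJMP"
    return "LJMP"                                 # full 16-bit address, 64 KB


print(shortest_jump(0x0100, 0x0110))   # SJMP
print(shortest_jump(0x0100, 0x0700))   # AJMP
print(shortest_jump(0x0100, 0x0800))   # LJMP
```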
8 What is the size of the on-chip program memory and on-chip data memory of 8051
microcontroller? (AU May 2012, NOV 2011)
The size of the on-chip program memory of 8051 microcontroller : 4K
The size of the on-chip data memory of 8051 microcontroller : 128 bytes
12. Summarize the challenges in embedded computing design (Dec20, Apr21 and Dec21)
How much hardware do we need?
How do we meet deadlines?
How do we minimize power consumption?
How do we design for upgradeability?
Does it really work?
13. List the features of ARM processor.
The ARM processors provide advanced features for a variety of applications.
Several extensions provide improved digital signal processing.
Saturation arithmetic can be performed with no overhead.
A new instruction is used for arithmetic normalization.
Multimedia operations are supported by single instruction multiple data operations.
Draw the architectural block diagram of 8051 microcontroller and explain. (NOV 2011, MAY 2010, NOV
2009, NOV2008, May 2008, MAY 2007, MAY 2006, NOV 2016, May 2016)
It has a Harvard architecture (separate program and data memories) with a CISC (Complex Instruction
Set Computer) instruction set.
The block diagram of the 8051 microcontroller is shown in Fig. 3.
The 8051 has an 8-bit ALU.
The ALU performs 8-bit arithmetic and logical operations, most of them in one machine cycle (MUL AB
and DIV AB take four).
The ALU is associated with two registers A & B
A and B Registers:
The A and B registers are special function registers.
A & B registers hold the results of many arithmetic and logical operations of 8051.
The A register is also called the Accumulator.
A register is used as a general register to accumulate the results of a large number of instructions.
By default, it is used for all mathematical operations and data transfer operations between CPU and
external memory.
The B register is mainly used for multiplication and division operations along with A register.
Ex: MUL AB, DIV AB.
It has no function other than to store data.
R registers:
"R" registers are a set of eight registers that are named R0, R1, etc. up to R7.
These registers are used as auxiliary registers in many operations.
The "R" registers are also used to temporarily store values.
Fig.3. Block Diagram of 8051 Microcontroller
The bits PSW3 and PSW4 are denoted as RS0 and RS1.
These bits are used to select the bank registers of the RAM location.
The register banks and their addresses are selected as follows:
RS1   RS0   Bank   Address
0     0     0      00H-07H
0     1     1      08H-0FH
1     0     2      10H-17H
1     1     3      18H-1FH
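The bank selection table can be expressed as a one-line address calculation. This is a minimal sketch; the function name register_address is hypothetical.

```python
def register_address(rs1, rs0, n):
    """Internal RAM address of register Rn for the bank selected by RS1:RS0."""
    if rs1 not in (0, 1) or rs0 not in (0, 1) or not 0 <= n <= 7:
        raise ValueError("RS1/RS0 must be 0 or 1 and n must be 0-7")
    bank = (rs1 << 1) | rs0      # bank number 0-3
    return bank * 8 + n          # each bank holds eight registers


print(hex(register_address(1, 0, 3)))   # 0x13, i.e. R3 of bank 2
```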
RAM & ROM:
The 8051 microcontroller has 128 bytes of Internal RAM and 4KB of on chip ROM.
The RAM is also known as Data memory and the ROM is known as program (Code) memory.
Code memory holds program that is to be executed.
Program Address Register holds address of the ROM/ Flash memory.
Data Address Register holds address of the RAM.
I/O ports:
The 8051 microcontroller has 4 parallel I/O ports, each of 8-bits.
So, it provides 32 I/O lines for connecting the microcontroller to the peripherals.
The four ports are P0 (Port 0), P1 (Port1), P2 (Port 2) and P3 (Port3).
Explain different types addressing modes of 8051 microcontroller. (NOV 2008, NOV 2015, April 2017)
The way in which the data operands are specified is known as the addressing modes. There are various
methods of denoting the data operands in the instruction.
The 8051 microcontroller supports 5 addressing modes. They are
1. Immediate addressing mode
2. Direct Addressing mode
3. Register addressing mode
4. Register indirect addressing mode
5. Indexed addressing mode
Immediate addressing mode:
The addressing mode in which the data operand is a constant and it is a part of the instruction itself is
known as Immediate addressing mode.
Normally the data must be preceded by a # sign.
This addressing mode can be used to transfer the data into any of the registers including DPTR.
Examples:
MOV A, #27H : The data (constant) 27H is moved to the accumulator register
ADD A, #45H : Add the constant 45H to the contents of the accumulator
MOV DPTR, #8245H : Move the data 8245H into the data pointer register.
Direct addressing mode:
In this addressing mode, the data operand is in a RAM location (00H-7FH) and the address of the
data operand is given in the instruction.
The direct addressing mode uses the lower 128 bytes of internal RAM and the SFRs.
Examples:
MOV R1, 42H : Move the contents of RAM location 42 into R1 register
MOV 49H, A : Move the contents of the accumulator into the RAM location 49.
ADD A, 56H : Add the contents of the RAM location 56 to the accumulator
Arithmetic instructions:
With example, explain arithmetic instructions in 8051 microcontroller. (NOV 2012)
ADD
• 8-bit addition between the accumulator (A) and a second operand.
• The result is always in the accumulator.
• The CY flag is set/reset appropriately.
ADDC
• 8-bit addition between the accumulator, a second operand and the previous value of the
CY flag.
• Useful for 16-bit addition in two steps.
• The CY flag is set/reset appropriately.
DA A
• Decimal adjust the accumulator.
• Format the accumulator into a proper 2 digit packed BCD number.
• Operates only on the accumulator.
• Works only after an ADD or ADDC instruction.
SUBB
• Subtract with Borrow.
• Subtract an operand and the previous value of a borrow (carry) flag from the
accumulator.
• A ← A - <operand> - CY.
• The result is always saved in the accumulator.
• The CY flag is set/reset appropriately.
INC
• Increment the operand by one.
• The operand can be a register, a direct address, an indirect address, the data pointer.
DEC
• Decrement the operand by one.
• The operand can be a register, a direct address, an indirect address.
MUL AB / DIV AB
• MUL AB: Multiply A by B and place the low byte of the result in A and the high byte in B.
• DIV AB: Divide A by B and place the quotient in the A register and the remainder in the B register.
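The two-step 16-bit addition mentioned under ADDC can be modelled as follows. This is a sketch of the flag behaviour only (CY, not AC or OV); the helper names add8 and add16 are hypothetical.

```python
def add8(a, operand, carry_in=0):
    """Model ADD (carry_in=0) or ADDC: return (8-bit result, CY flag)."""
    total = (a & 0xFF) + (operand & 0xFF) + carry_in
    return total & 0xFF, 1 if total > 0xFF else 0


def add16(x, y):
    """Add two 16-bit values the way ADD followed by ADDC would."""
    low, cy = add8(x & 0xFF, y & 0xFF)              # ADD on the low bytes
    high, cy = add8(x >> 8, y >> 8, carry_in=cy)    # ADDC on the high bytes
    return (high << 8) | low, cy


result, carry = add16(0x12FF, 0x0001)
print(hex(result), carry)   # 0x1300 0
```

The carry out of the low-byte ADD is what ADDC adds into the high byte, which is why the second step cannot use a plain ADD.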
Logical instructions in 8051.
ANL: It performs a logical AND operation between two operands.
Works on byte-sized operands or the CY flag.
• ANL A, Rn
• ANL A, direct
• ANL A, @Ri
• ANL A, #data
• ANL direct, A
• ANL direct, #data
• ANL C, bit
• ANL C, /bit
ORL: It performs a logical OR operation between two operands.
Works on byte-sized operands or the CY flag.
• ORL A, Rn
• ORL A, direct
• ORL A, @Ri
• ORL A, #data
XRL
Works on bytes only.
• XRL A, Rn
• XRL A, direct
CPL / CLR
Complement / Clear.
Work on the accumulator or a bit.
• CLR P1.2
• CPL A
RL / RLC / RR / RRC
Rotate the accumulator.
• RL and RR without the carry
• RLC and RRC rotate through the carry.
• SWAP A: Swap the upper and lower nibbles of the accumulator.
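The rotate and swap instructions can be modelled on an 8-bit value as shown below. This is an illustrative sketch; the function names simply mirror the mnemonics, and each function assumes its accumulator argument is already in the range 0-255.

```python
def rl(a):
    """RL A: rotate left; bit 7 wraps round to bit 0."""
    return ((a << 1) | (a >> 7)) & 0xFF


def rr(a):
    """RR A: rotate right; bit 0 wraps round to bit 7."""
    return ((a >> 1) | (a << 7)) & 0xFF


def rlc(a, cy):
    """RLC A: rotate left through carry; returns (new A, new CY)."""
    return ((a << 1) | cy) & 0xFF, (a >> 7) & 1


def rrc(a, cy):
    """RRC A: rotate right through carry; returns (new A, new CY)."""
    return ((a >> 1) | (cy << 7)) & 0xFF, a & 1


def swap(a):
    """SWAP A: exchange the upper and lower nibbles."""
    return ((a << 4) | (a >> 4)) & 0xFF


print(hex(rl(0x81)))     # 0x3
print(hex(swap(0x3C)))   # 0xc3
```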
Explain interrupt structure of 8051 microcontroller. (NOV 2011, MAY 2009)
Interrupt Structure :
An interrupt is an external or internal event that interrupts the microcontroller to inform it that a
device needs its service.
The program associated with the interrupt is called the interrupt service routine (ISR) or
interrupt handler.
Upon receiving the interrupt signal, the microcontroller finishes the current instruction and saves the
PC on the stack.
It jumps to a fixed location in memory that depends on the type of interrupt.
It executes the interrupt service routine until it reaches RETI.
Upon executing RETI, the microcontroller pops the PC from the stack and returns to the place where it
was interrupted.
The 8051 microcontroller has five interrupts in addition to Reset: external interrupt 0 (INT0),
Timer 0 overflow (TF0), external interrupt 1 (INT1), Timer 1 overflow (TF1) and the serial port
interrupt (RI/TI).
Each interrupt has a specific place in code memory where program execution begins.
EA : Global enable/disable. To enable the interrupts, this bit must be set high.
Upon reset, the interrupts have the following priority from top to down. The interrupt with the highest
PRIORITY gets serviced first.
IP.7: reserved
IP.6: reserved
1. Explain the different modes of operation of timers in 8051 in detail with its associated registers.
Describe different modes of operation of timers /counters in 8051 with its associated registers.
(NOV 2009, MAY 2009. May 2007, May 2016)
Draw and explain the functions of TCON and TMOD registers of 8051. (Dec 2008)
Explain the on-chip timer modes of an 8051 Microcontroller. (April 2010, NOV 2016)
Timer Registers.
The 8051 has two timers/counters, they can be used either as timers (used to generate a time delay)
or as event counters.
TIMER 0:
Timer 0 is a 16-bit register and can be treated as two 8-bit registers (TL0 & TH0).
These registers can be accessed like any other register, such as A, B or R1.
Ex : The instruction MOV TL0,#07 moves the value 07 into lower byte of Timer0.
Similarly MOV R1, TH0 saves the contents of TH0 in the R1 register.
TIMER 1:
Timer 1 is also a 16-bit register and can be treated as two 8-bit registers (TL1 & TH1).
These registers can be accessed like any other register, such as A, B or R1.
Ex : The instruction MOV TL1,#05 moves the value 05 into lower byte of Timer1.
Similarly MOV R0,TH1 saves the contents of TH1 in the R0 register.
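Because each timer is a 16-bit register accessed as two bytes, a 16-bit count has to be split before it is loaded. The helper names below are hypothetical and the sketch only shows the byte arithmetic, not any timer mode behaviour.

```python
def split_count(count):
    """Split a 16-bit timer count into its (TH, TL) bytes."""
    if not 0 <= count <= 0xFFFF:
        raise ValueError("count must fit in 16 bits")
    return (count >> 8) & 0xFF, count & 0xFF


def join_count(th, tl):
    """Combine the TH and TL bytes back into a 16-bit count."""
    return ((th & 0xFF) << 8) | (tl & 0xFF)


th, tl = split_count(0xFC18)
print(hex(th), hex(tl))   # 0xfc 0x18
```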
2. Explain the serial programming of 8051 with its associated registers. (May 2014, 2013)(Or)
Explain how to program for sending and receiving data serially using 8051 (April 2010, 2011)
Explain 8051 serial port programming with examples. (May 2016, NOV 2012)
Explain the serial modes of operation of 8051 microcontroller. (May 2007)
RS232
RS232 is an interfacing standard for serial communication.
It was set by the Electronics Industries Association (EIA) in 1960.
The standard was set long before the advent of the TTL logic family.
Its input and output voltage levels are not TTL compatible.
In RS232, a 1 is represented by -3 to -25 V, while a 0 bit is +3 to +25 V.
IBM introduced the DB-9 version of the serial I/O standard.
MAX232
A line driver (such as the MAX232) is required to convert RS232 voltage levels to TTL levels, and vice versa.
The 8051 has two pins that are used specifically for transmitting and receiving data serially.
These two pins are called TxD and RxD and are part of port 3 (P3.1 and P3.0, respectively).
These pins are TTL compatible.
They require a line driver to make them RS232 compatible.
Baud rate:
The baud rates in 8051 are programmable.
8051 divides the crystal frequency by 12 to get machine cycle frequency.
8051 UART circuitry divides the machine cycle frequency by 32.
Explain in detail the serial communication registers of the 8051. (NOV 2009)
SBUF:
It is an 8-bit register used for serial communication.
For a byte of data to be transferred via the TxD line:
The byte must be placed in the SBUF register.
The byte is framed with the start and stop bits and transferred serially via the TxD line.
SBUF also holds the byte of data when it is received on the 8051 RxD line.
When the bits are received serially via RxD, the 8051 de-frames the byte by eliminating the
start and stop bits and places it in SBUF.
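The framing and de-framing described for SBUF can be sketched as below. The sketch assumes the usual 10-bit asynchronous frame: a start bit of 0, eight data bits sent least significant bit first, and a stop bit of 1. The function names are hypothetical.

```python
def frame_byte(value):
    """Frame one byte as start bit, 8 data bits (LSB first), stop bit."""
    data_bits = [(value >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]          # start bit is 0, stop bit is 1


def deframe(bits):
    """Strip the start and stop bits and rebuild the byte."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


frame = frame_byte(0x41)
print(frame)                 # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
print(hex(deframe(frame)))   # 0x41
```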
SCON:
It is an 8-bit register used to program the start bit, stop bit and data bits of data framing.
SM0 SM1 SM2 REN TB8 RB8 TI RI
1. TMOD register is loaded with the value 20H, indicating the use of timer 1 in mode 2 (8-bit auto-
reload) to set baud rate.
2. The TH1 is loaded with one of the values to set baud rate for serial data transfer.
3. The SCON register is loaded with the value 50H, indicating serial mode 1, where an 8-bit data is framed
with start and stop bits.
4. TR1 is set to 1 to start timer 1
5. TI is cleared by CLR TI instruction.
6. The character byte to be transferred serially is written into SBUF register.
7. The TI flag bit is monitored with the use of instruction JNB TI, xx, to see if the character has been
transferred completely.
8. To transfer the next byte, go to step 5.
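The register values used in the steps above can be checked by decoding them bit by bit. The sketch below uses the SCON bit order given earlier (SM0 in bit 7 down to RI in bit 0) and assumes the standard TMOD layout in which the Timer 1 mode bits M1 and M0 occupy bits 5 and 4; the function names are hypothetical.

```python
SCON_BITS = ("SM0", "SM1", "SM2", "REN", "TB8", "RB8", "TI", "RI")  # bit 7 down to bit 0


def decode_scon(value):
    """Return a dict mapping each SCON bit name to 0 or 1."""
    return {name: (value >> (7 - i)) & 1 for i, name in enumerate(SCON_BITS)}


def timer1_mode(tmod):
    """Return the Timer 1 mode number (0-3) encoded in TMOD bits 5 and 4."""
    return (tmod >> 4) & 0x03


print(decode_scon(0x50))   # SM1 and REN are 1, all other bits are 0
print(timer1_mode(0x20))   # 2
```

SM0 = 0 with SM1 = 1 selects serial mode 1, and REN = 1 enables reception, which matches the description of the value 50H.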
THE EMBEDDED SYSTEM DESIGN PROCESS:
Briefly explain about the steps involved in embedded system design. (NOV/DEC
2006, 2007, 2009, May 2012, April 2018, Dec20) Designing with Computing platforms
(Dec2022/Jan 2023, May 2023)
This overview of the embedded system design process is aimed at two objectives.
□ First, it will give us an introduction to the various steps in embedded system design before
we delve into them in more detail. Second, it will allow us to consider the design
methodology itself. A design methodology is important for three reasons.
□ First, it allows us to keep a scorecard on a design to ensure that we have done everything
we need to do, such as optimizing performance or performing functional tests.
□ Second, it allows us to develop computer-aided design tools. Developing a single program
that takes in a concept for an embedded system and emits a completed design would be a
daunting task, but by first breaking the process into manageable steps, we can work on
automating the steps one at a time.
□ Third, a design methodology makes it much easier for members of a design team to
communicate.
□ By defining the overall process, team members can more easily understand what they are
supposed to do, what they should receive from other team members at certain times and
what they are to hand off when they complete their assigned steps.
□ Since most embedded systems are designed by teams, coordination is perhaps the most
important role of a well-defined design methodology.
□ In this top–down view, we start with the system requirements. In the next step,
specification, we create a more detailed description of what we want.
□ But the specification states only how the system behaves, not how it is built.
□ The details of the system’s internals begin to take shape when we develop the architecture,
which gives the system structure in terms of large components.
□ Once we know the components we need, we can design those components, including both
software modules and any specialized hardware we need. Based on those components, we
can finally build a complete system.
Major levels of abstraction in the design process.
□ The top–down design will begin with the most abstract description of the system and
conclude with concrete details. The alternative is a bottom–up view in which we start with
components to build a system.
□ Bottom–up design steps are shown in the figure as dashed- line arrows. We need bottom–up
design because we do not have perfect insight into how later stages of the design process
will turn out.
□ Decisions at one stage of design are based upon estimates of what will happen later: How
fast can we make a particular function run? How much memory will we need? How much
system bus capacity do we need? We also need to consider the major goals of the design.
□ Manufacturing cost.
□ Performance (both overall speed and deadlines); and
□ Power consumption.
□ We must also consider the tasks we need to perform at every step in the design
process. At each step in the design, we add detail:
□ We must analyze the design at each step to determine how we can meet the
specifications.
□ We must then refine the design to add detail.
□ We must verify the design to ensure that it still meets all system goals, such as
cost, speed and so on.
1.1.1 Requirements
□ Clearly, before we design a system, we must know what we are designing. The initial
stages of the design process capture this information for use in creating the architecture and
components.
◻ We generally proceed in two phases: First, we gather an informal description from the
customers known as requirements, and we refine the requirements into a specification that
contains enough information to begin designing the system architecture.
◻ Separating out requirements analysis and specification is often necessary because of the
large gap between what the customers can describe about the system they want and what
the architects need to design the system.
◻ Consumers of embedded systems are usually not themselves embedded system designers or
even product designers. Their understanding of the system is based on how they envision
users’ interactions with the system. They may have unrealistic expectations as to what can
be done within their budgets; and they may also express their desires in a language very
different from system architects’ jargon.
◻ Capturing a consistent set of requirements from the customer and then massaging those
requirements into a more formal specification is a structured way to manage the process of
translating from the consumer’s language to the designer’s.
◻ Requirements may be functional or nonfunctional. We must of course capture the basic
functions of the embedded system, but functional description is often not sufficient.
Typical nonfunctional requirements include:
Performance: The speed of the system is often a major consideration both for the usability
of the system and for its ultimate cost. As we have noted, performance may be a
combination of soft performance metrics such as approximate time to perform a user- level
function and hard deadlines by which a particular operation must be completed.
◻ Cost: The target cost or purchase price for the system is almost always a consideration.
Cost typically has two major components: manufacturing cost includes the cost of
components and assembly; nonrecurring engineering (NRE) costs include the personnel
and other costs of designing the system.
◻ Physical size and weight: The physical aspects of the final system can vary greatly
depending upon the application. An industrial control system for an assembly line may be
designed to fit into a standard-size rack with no strict limitations on weight. A handheld
device typically has tight requirements on both size and weight that can ripple through the
entire system design.
◻ Power consumption: Power, of course, is important in battery-powered systems and is
often important in other applications as well. Power can be specified in the requirements
stage in terms of battery life—the customer is unlikely to be able to describe the allowable
wattage.
◻ Validating a set of requirements is ultimately a psychological task since it requires
understanding both what people want and how they communicate those needs. One good
way to refine at least the user interface portion of a system’s requirements is to build a
mock-up.
□ The mock- up may use canned data to simulate functionality in a restricted demonstration,
and it may be executed on a PC or a workstation. But it should give the customer a good
idea of how the system will be used and how the user can react to it.
□ Physical, nonfunctional models of devices can also give customers a better idea of
characteristics such as size and weight.
Sample requirements form.
Name
Purpose
Inputs
Outputs
Functions
Performance
Manufacturing cost
Power
Physical size and weight
□ Requirements analysis for big systems can be complex and time consuming. However,
capturing a relatively small amount of information in a clear, simple format is a good start
toward understanding system requirements.
□ To introduce the discipline of requirements analysis as part of system design, we will use a
simple requirements methodology.
We can use the requirement form as a checklist in considering the basic characteristics of the
system.
□ Name: This is simple but helpful. Giving a name to the project not only simplifies talking
about it to other people but can also crystallize the purpose of the machine.
□ Purpose: This should be a brief one- or two- line description of what the system is supposed
to do. If you can’t describe the essence of your system in one or two lines, chances are that
you don’t understand it well enough.
□ Inputs and outputs: These two entries are more complex than they seem. The inputs and
outputs to the system encompass a wealth of detail:
□ Types of data: Analog electronic signals? Digital data? Mechanical inputs?
□ Data characteristics: Periodically arriving data, such as digital audio samples? Occasional
user inputs? How many bits per data element?
□ Types of I/O devices: Buttons? Analog/digital converters? Video displays?
□ Functions: This is a more detailed description of what the system does. A good way to
approach this is to work from the inputs to the outputs: When the system receives an input,
what does it do? How do user interface inputs affect these functions? How do different
functions interact?
□ Performance: Many embedded computing systems spend at least some time controlling
physical devices or processing data coming from the physical world. In most of these cases,
the computations must be performed within a certain time frame. It is essential that the
performance requirements be identified early since they must be carefully measured during
implementation to ensure that the system works properly.
□ Manufacturing cost: This includes primarily the cost of the hardware components. Even if
you don’t know exactly how much you can afford to spend on system components, you
should have some idea of the eventual cost range. Cost has a substantial influence on
architecture: A machine that is meant to sell at $10 most likely has a very different internal
structure than a $100 system.
□ Power: Similarly, you may have only a rough idea of how much power the system can
consume, but a little information can go a long way. Typically, the most important decision
is whether the machine will be battery powered or plugged into the wall. Battery-powered
machines must be much more careful about how they spend energy.
□ Physical size and weight: You should give some indication of the physical size of the
system to help guide certain architectural decisions. A desktop machine has much more
flexibility in the components used than, for example, a lapel mounted voice recorder.
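The requirements form can also be kept as a small data structure so that empty entries are easy to spot. This is only a sketch: the class name Requirements, the helper missing_entries and the example values are hypothetical, and every entry is held as free text.

```python
from dataclasses import dataclass, fields


@dataclass
class Requirements:
    """One field per entry of the sample requirements form."""
    name: str = ""
    purpose: str = ""
    inputs: str = ""
    outputs: str = ""
    functions: str = ""
    performance: str = ""
    manufacturing_cost: str = ""
    power: str = ""
    physical_size_and_weight: str = ""


def missing_entries(form):
    """Return the names of the form entries that are still empty."""
    return [f.name for f in fields(form) if not getattr(form, f.name).strip()]


# Hypothetical, partly filled-in form used only to exercise the checklist.
form = Requirements(name="Example system", purpose="One-line description goes here")
print(missing_entries(form))
```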
Putting the requirements into chart format:
1.2.2 DCC
□ The Digital Command Control (DCC) was created by the National Model Railroad
Association to support interoperable digitally-controlled model trains.
□ Hobbyists started building homebrew digital control systems in the 1970s and
Marklin developed its own digital control system in the 1980s.
□ DCC was created to provide a standard that could be built by any manufacturer so
that hobbyists could mix and match components from multiple vendors.
The DCC standard is given in two documents:
□ Standard S-9.1, the DCC Electrical Standard, defines how bits are encoded
on the rails for transmission.
□ Standard S-9.2, the DCC Communication Standard, defines the packets that
carry information.
□ Any DCC-conforming device must meet these specifications. DCC also provides
several recommended practices. These are not strictly required but they provide some
hints to manufacturers and users as to how to best use DCC.
□ The DCC standard does not specify many aspects of a DCC train system. It doesn’t
define the control panel, the type of microprocessor used, the programming language
to be used, or many other aspects of a real model train system.
□ The standard concentrates on those aspects of system design that are necessary for
interoperability.
□ Over standardization, or specifying elements that do not really need to be
standardized, only makes the standard less attractive and harder to implement.
□ The Electrical Standard deals with voltages and currents on the track
□ The standard must be carefully designed because the main function of the track is to
carry power to the locomotives. The signal encoding system should not interfere with
power transmission either to DCC or non-DCC locomotives. A key requirement is
that the data signal should not change the DC value of the rails.
□ The data signal swings between two voltages around the power supply voltage. Bits
are encoded in the time between transitions, not by voltage levels. A 0 is at least 100
µs while a 1 is nominally 58 µs.
□ The durations of the high (above nominal voltage) and low (below nominal voltage)
parts of a bit are equal to keep the DC value constant. The specification also gives the
allowable variations in bit times that a conforming DCC receiver must be able to
tolerate.
□ The standard also describes other electrical properties of the system, such as
allowable transition times for signals.
□ The DCC Communication Standard describes how bits are combined into packets and
the meaning of some important packets.
□ Some packet types are left undefined in the standard but typical uses are given in
Recommended Practices documents.
We can write the basic packet format as a regular expression:
PSA(sD)+E    (1.1)
□ D is the data byte, which includes eight bits. A data byte may contain an address,
instruction, data, or error correction information.
□ E is a packet end bit, which is a 1 bit.
□ A packet includes one or more data byte start bit/data byte combinations. Note that
the address data byte is a specific type of data byte.
□ A baseline packet is the minimum packet that must be accepted by all DCC
implementations. More complex packets are given in a Recommended Practice
document.
□ A baseline packet has three data bytes: an address data byte that gives the intended
receiver of the packet; the instruction data byte provides a basic instruction; and an
error correction data byte is used to detect and correct transmission errors.
□ The instruction data byte carries several pieces of information. Bits 0–3 provide a 4-
bit speed value.
□ Bit 4 has an additional speed bit, which is interpreted as the least significant speed
bit. Bit 5 gives direction, with 1 for forward and 0 for reverse. Bits 6–7 are set at 01
to indicate that this instruction provides speed and direction.
□ The error correction data byte is the bitwise exclusive OR of the address and
instruction data bytes.
□ The standard says that the command unit should send packets frequently since a
packet may be corrupted. Packets should be separated by at least 5 ms.
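The three data bytes of a baseline packet can be sketched as follows. This is an illustration under stated assumptions, not the standard's reference encoding: the 01 pattern is assumed to occupy the top two bits of the instruction byte, the speed is treated as a plain 5-bit number whose least significant bit goes into bit 4, and the function names are hypothetical. Preamble, start bits and the end bit are left out.

```python
def instruction_byte(speed, forward):
    """Build the speed-and-direction instruction data byte."""
    if not 0 <= speed <= 31:
        raise ValueError("speed must be a 5-bit value")
    upper_four = (speed >> 1) & 0x0F     # bits 0-3: 4-bit speed value
    lsb = speed & 1                      # bit 4: least significant speed bit
    direction = 1 if forward else 0      # bit 5: 1 = forward, 0 = reverse
    return (0b01 << 6) | (direction << 5) | (lsb << 4) | upper_four


def baseline_packet(address, speed, forward):
    """Return [address, instruction, error] data bytes for a baseline packet."""
    instr = instruction_byte(speed, forward)
    return [address & 0xFF, instr, (address & 0xFF) ^ instr]


def is_valid(packet):
    """Check that the error byte is the XOR of the address and instruction bytes."""
    address, instr, error = packet
    return (address ^ instr) == error


packet = baseline_packet(3, 0b10101, True)
print([hex(b) for b in packet])   # ['0x3', '0x7a', '0x79']
print(is_valid(packet))           # True
```

A receiver can run the same XOR over the first two bytes and discard any packet whose third byte does not match.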
UML collaboration diagram for major subsystems of the train controller system.
A UML class diagram for the train controller showing the composition of the subsystems.
□ Let’s break down the command unit and receiver into their major components. The
console needs to perform three functions: read the state of the front panel on the
command unit, format messages, and transmit messages. The train receiver must
also perform three major functions: receive the message, interpret the message,
and control the motor.
□ The class diagram shows the Console class using three classes, one for each of its
major components. These classes must define some behaviors, but for the moment we
will concentrate on the basic characteristics of these classes:
□ The Console class describes the command unit’s front panel, which contains the
analog knobs and hardware to interface to the digital parts of the system.
□ The Formatter class includes behaviors that know how to read the panel knobs and
create a bit stream for the required message.
□ The Transmitter class interfaces to analog electronics to send the message along
the track.
□ There will be one instance of the Console class and one instance of each of the component
classes, as shown by the numeric values at each end of the relationship links. We have also
shown some special classes that represent analog components, ending the name of each with
an asterisk:
□ Knobs describes the actual analog knobs, buttons, and levers on the control panel.
□ Sender describes the analog electronics that send bits along the track.
□ Likewise, the Train makes use of three other classes that define its components:
□ The Receiver class knows how to turn the analog signals on the track into digital form.
□ The Controller class includes behaviors that interpret the commands and figure out how to
control the motor.
□ The Motor interface class defines how to generate the analog signals required to control the
motor.
We define two classes to represent analog components:
□ Detector detects analog signals on the track and converts them into digital
form.
□ Pulser turns digital commands into the analog signals required to control
the motor speed.
□ We have also defined a special class, Train set, to help us remember that the system
can handle multiple trains.
□ The values on the relationship edge show that one train set can have t trains. We would not
actually implement the train set class, but it does serve as useful
documentation of the existence of multiple receivers.
Q1. Discuss about ARM architecture versions.
ARM Architecture Versions:
The ARM architecture has evolved significantly and will continue to be
developed in the future. Six major versions of the instruction set have been defined to
date, denoted by the version numbers from 1 to 6. Of these, the first three versions
including the original 26-bit architecture (the 32-bit architecture was introduced at
ARMv3) are now OBSOLETE. The other versions are Version 4, Version 5 and
Version 6. Versions can be qualified with variant letters to specify collections of
additional instructions that are included as an architecture extension. Extensions are
typically included in the base architecture of the next version number, ARMv5T being
the notable exception. Provision is also made to exclude variants by prefixing the
variant letter with x, for example the xP variant described below in the summary of
version 5 features.
The valid architecture variants are as follows: ARMv4, ARMv4T, ARMv5T,
(ARMv5TExP), ARMv5TE, ARMv5TEJ, and ARMv6
The following architecture variants are now OBSOLETE: ARMv1, ARMv2, ARMv2a,
ARMv3, ARMv3G, ARMv3M, ARMv4xM, ARMv4TxM, ARMv5, ARMv5xM,
and ARMv5TxM.
Version 4 and the introduction of Thumb (T variant):
The Thumb instruction set is a re-encoded subset of the ARM instruction set. Thumb
instructions execute in their own processor state, with the architecture defining the
mechanisms required to transition between ARM and Thumb states. The key difference
is that Thumb instructions are half the size of ARM instructions (16 bits compared with
32 bits).
Thumb code usually uses more instructions for a given task, making ARM code
best for maximizing performance of time-critical code.
ARM state and some associated ARM instructions are required for exception
handling.
New features in Version 5T:
Improved efficiency of ARM/Thumb interworking.
Count leading zeros (ARM only) and software breakpoint (ARM and Thumb)
instructions added
Additional options for coprocessor designers (coprocessor support is ARM only)
Tighter definition of flag setting on multiplies (ARM and Thumb).
Introduction of the E variant, adding ARM instructions which enhance
performance of an ARM processor on typical digital signal processing (DSP)
algorithms:
o Several multiply and multiply-accumulate instructions that act on 16-bit
data items.
o Addition and subtraction instructions that perform saturated signed
arithmetic. Saturated arithmetic produces the maximum positive or
negative value instead of wrapping the result if the calculation overflows
the normal integer range.
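The difference between saturated and wrapping arithmetic can be shown with a short model of signed 32-bit addition. This is a sketch of the behaviour only, not of any particular instruction encoding; the function names are hypothetical.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -2**31


def saturating_add32(a, b):
    """Signed 32-bit add that clamps to the extreme value on overflow."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


def wrapping_add32(a, b):
    """Signed 32-bit add that wraps round on overflow."""
    total = (a + b) & 0xFFFFFFFF
    return total - 2**32 if total >= 2**31 else total


print(saturating_add32(INT32_MAX, 1))   # 2147483647
print(wrapping_add32(INT32_MAX, 1))     # -2147483648
```

In signal processing code the clamped result is usually the less harmful of the two, since a wrapped value flips sign and produces a large error.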
New features in Version 6:
The following ARM instructions are added in Version 6:
• CPS, SRS and RFE instructions for improved exception handling
• REV, REV16 and REVSH byte reversal instructions
• SETEND for a revised endian (memory) model
• LDREX and STREX exclusive access instructions
• SXTB, SXTH, UXTB, UXTH byte/half word extend instructions
• A set of Single Instruction Multiple Data (SIMD) media instructions
• Additional forms of multiply instructions with accumulation into a 64-bit result.
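The effect of the byte reversal and extend instructions can be sketched in C. The functions below are portable equivalents of what REV, SXTB and UXTB compute on a 32-bit word; the function names are hypothetical and the sketch says nothing about instruction encoding or timing.

```c
#include <assert.h>
#include <stdint.h>

/* What REV computes: reverse the order of the four bytes in a word. */
uint32_t rev32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* What SXTB computes: sign-extend the low byte to 32 bits. */
int32_t sxtb(uint32_t x)
{
    int32_t low = (int32_t)(x & 0xFFu);
    return (low ^ 0x80) - 0x80;   /* maps 0x80..0xFF to -128..-1 */
}

/* What UXTB computes: zero-extend the low byte to 32 bits. */
uint32_t uxtb(uint32_t x)
{
    return x & 0xFFu;
}
```

Byte reversal is typically what is needed when converting a word between little-endian and big-endian representations.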
Q2. Explain ARM Architecture. (DEC 21)/Functional Blocks of ARM processor
(Dec 2022/Jan 2023)
3. ARM Architecture (ARM7 TDMI):
T: Thumb, 16-bit instruction set
D: on-chip Debug support, enabling the processor to halt in response to a debug
request
M: enhanced Multiplier, yielding a full 64-bit result, high performance
I: Embedded ICE hardware
The ARM architecture supports two basic types of data:
The standard ARM word is 32 bits long.
The word may be divided into four 8-bit bytes.
ARM7 consists of
1 dedicated program counter
1 dedicated current program status register
5 dedicated saved program status registers
30 general purpose registers
The architecture of ARM7 is given below.
Fig: Architecture of ARM7 Processor.
The current processor mode governs which of several banks is accessible. Each mode
can access
a particular set of r0-r12 registers
a particular r13 (the stack pointer, sp) and r14 (the link register)
the program counter, r15 (pc)
the current program status register, CPSR
Privileged modes (except System) can also access
a particular SPSR(saved program status register)
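The register counts listed earlier (30 general purpose registers, 1 program counter, 1 CPSR and 5 SPSRs) can be cross-checked against the banking scheme. The sketch below assumes the usual ARM7TDMI banking, which is not spelled out in these notes: User and System share r0-r14, FIQ has its own r8-r14, and IRQ, Supervisor, Abort and Undefined each have their own r13 and r14.

```c
#include <assert.h>

/* Assumed ARM7TDMI banking (not stated in the notes above). */
enum {
    SHARED_USER_SYSTEM = 15,  /* r0-r14, shared by User and System modes */
    FIQ_BANKED         = 7,   /* r8_fiq-r14_fiq */
    OTHER_PRIV_MODES   = 4,   /* IRQ, Supervisor, Abort, Undefined */
    BANKED_PER_MODE    = 2,   /* each has its own r13 (sp) and r14 (lr) */
    SPSR_COUNT         = 5    /* one per privileged mode except System */
};

/* General purpose registers, excluding the program counter. */
int arm7_general_purpose_registers(void)
{
    return SHARED_USER_SYSTEM + FIQ_BANKED
         + OTHER_PRIV_MODES * BANKED_PER_MODE;
}

/* Status registers: one CPSR plus the SPSRs. */
int arm7_status_registers(void)
{
    return 1 + SPSR_COUNT;
}

/* Everything: general purpose registers, the pc, and status registers. */
int arm7_total_registers(void)
{
    return arm7_general_purpose_registers() + 1 + arm7_status_registers();
}
```

Under these assumptions the general purpose count is 15 + 7 + 8 = 30, and adding the program counter and six status registers gives 37 registers in total.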
Features of ARM Processor:
The ARM processors provide advanced features for a variety of applications.
Several extensions provide improved digital signal processing.
Saturation arithmetic can be performed with no overhead.
A new instruction is used for arithmetic normalization.
Multimedia operations are supported by single instruction multiple data
operations.
A separate monitor mode allows the processor to enter a secure world to
perform operations not permitted in normal mode.
Write short notes on assembly and linking.(NOV/DEC 2008, May 2011)
□ Assembly and linking are the last steps in the compilation process; they turn a list of
instructions into an image of the program’s bits in memory.
□ Loading actually puts the program in memory so that it can be executed.
□ The compilation process is often hidden from us by compilation commands that do
everything required to generate an executable program.
□ As the figure shows, most compilers do not directly generate machine code, but instead
create the instruction-level program in the form of human-readable assembly language.
□ Generating assembly language rather than binary instructions frees the compiler writer
from details extraneous to the compilation process, which include the instruction format
as well as the exact addresses of instructions and data.
□ The assembler’s job is to translate symbolic assembly language statements into bit-level
representations of instructions known as object code.
□ The assembler takes care of instruction formats and does part of the job of translating
labels into addresses.
□ However, since the program may be built from many files, the final steps in determining
the addresses of instructions and data are performed by the linker, which produces an
executable binary file.
□ That file may not necessarily be located in the CPU’s memory, however, unless the
linker happens to create the executable directly in RAM. The program that brings the
program into memory for execution is called a loader.
□ The simplest form of the assembler assumes that the starting address of the assembly
language program has been specified by the programmer. The addresses in such a
program are known as absolute addresses.
3.9.3. Assemblers
□ When translating assembly code into object code, the assembler must translate opcodes
and format the bits in each instruction, and translate labels into addresses.
□ Labels make the assembly process more complex, but they are the most important
abstraction provided by the assembler.
□ Labels let the programmer (a human programmer or a compiler generating assembly
code) avoid worrying about the locations of instructions and data. Label processing
requires making two passes through the assembly source code as follows:
□ The first pass scans the code to determine the address of each label.
□ The second pass assembles the instructions using the label values computed in the first
pass.
□ The name of each symbol and its address is stored in a symbol table that is built during
the first pass. The symbol table is built by scanning from the first instruction to the last.
□ During scanning, the current location in memory is kept in a program location counter
(PLC).
□ Despite the similarity in name to a program counter, the PLC is not used to execute the
program, only to assign memory locations to labels.
□ For example, the PLC always makes exactly one pass through the program, whereas the
program counter makes many passes over code in a loop.
□ Thus, at the start of the first pass, the PLC is set to the program’s starting address and the
assembler looks at the first line.
□ After examining the line, the assembler updates the PLC to the next location and looks at
the next instruction.
□ If the instruction begins with a label, a new entry is made in the symbol table, which
includes the label name and its value. The value of the label is equal to the current value
of the PLC.
□ At the end of the first pass, the assembler rewinds to the beginning of the assembly
language file to make the second pass.
□ During a second pass when a label name is found, the label is looked up in the symbol
table and its value substituted into the appropriate place in the instruction.
□ But how do we know the starting value of the PLC? The simplest case is absolute
addressing.
□ In this case, one of the first statements in the assembly language program is a pseudo-op
that specifies the origin of the program, that is, the location of the first address in the
program.
□ A common name for this pseudo-op (e.g., the one used for the ARM) is ORG. The
statement below puts the start of the program at location 2000.
ORG 2000
□ This pseudo-op accomplishes this by setting the PLC’s value to its argument’s value,
2000 in this case.
□ Assemblers generally allow a program to have many ORG statements in case instructions
or data must be spread around various spots in memory.
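The two-pass scheme above can be sketched in C. This is a toy model under stated assumptions, not a real assembler: the line format (struct line), the fixed 4-byte instruction size and all function names are hypothetical, and only label handling is modelled (no opcode encoding).

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 16
#define INSTR_SIZE  4   /* assumed fixed instruction width in bytes */

/* One line of a toy assembly program (hypothetical format). */
struct line {
    const char *label;   /* label defined on this line, or NULL */
    const char *op;      /* mnemonic, or the "ORG" pseudo-op */
    const char *target;  /* label referenced by the instruction, or NULL */
    unsigned    value;   /* ORG argument; ignored on other lines */
};

struct symbol { const char *name; unsigned addr; };
struct symtab { struct symbol sym[MAX_SYMBOLS]; int count; };

/* Look up a label; returns 1 and stores its address, or 0 if absent. */
int lookup(const struct symtab *t, const char *name, unsigned *addr)
{
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->sym[i].name, name) == 0) {
            *addr = t->sym[i].addr;
            return 1;
        }
    }
    return 0;
}

/* Pass 1: scan once, tracking the PLC and recording each label's address. */
void first_pass(const struct line *prog, int n, struct symtab *t)
{
    unsigned plc = 0;
    t->count = 0;
    for (int i = 0; i < n; i++) {
        if (strcmp(prog[i].op, "ORG") == 0) {
            plc = prog[i].value;      /* ORG sets the PLC to its argument */
            continue;                 /* a pseudo-op occupies no memory */
        }
        if (prog[i].label != NULL) {
            assert(t->count < MAX_SYMBOLS);
            t->sym[t->count].name = prog[i].label;
            t->sym[t->count].addr = plc;   /* label value = current PLC */
            t->count++;
        }
        plc += INSTR_SIZE;            /* advance to the next location */
    }
}

/* Pass 2: replace each referenced label by its address.
   resolved[i] receives the address for line i (0 if the line has no
   reference). Returns the number of references that could not be found. */
int second_pass(const struct line *prog, int n,
                const struct symtab *t, unsigned *resolved)
{
    int unresolved = 0;
    for (int i = 0; i < n; i++) {
        resolved[i] = 0;
        if (prog[i].target != NULL &&
            !lookup(t, prog[i].target, &resolved[i]))
            unresolved++;
    }
    return unresolved;
}
```

Because the symbol table is complete before the second pass starts, a branch to a label defined later in the file (a forward reference) resolves just as easily as a backward one.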
3.9.4 Linking:
□ Many assembly language programs are written as several smaller pieces rather than as a
single large file.
□ Breaking a large program into smaller files helps delineate program modularity.
□ If the program uses library routines, those will already be preassembled, and assembly
language source code for the libraries may not be available for purchase.
□ A linker allows a program to be stitched together out of several smaller pieces.
□ The linker operates on the object files created by the assembler and modifies the
assembled code to make the necessary links between files.
□ Some labels will be both defined and used in the same file. Other labels will be defined in
a single file but used elsewhere.
□ The place in the file where a label is defined is known as an entry point. The place in the
file where the label is used is called an external reference.
□ The main job of the linker is to resolve external references based on available entry
points.
□ As a result of the need to know how definitions and references connect, the assembler
passes to the linker not only the object file but also the symbol table.
□ Even if the entire symbol table is not kept for later debugging purposes, the assembler
must at least pass the entry points.
□ External references are identified in the object code by their relative symbol identifiers.
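The resolution of external references against entry points can be sketched in C. This is a simplified model under stated assumptions: object files are laid out back to back from a given origin, and the struct layouts and the function name link_objects are hypothetical rather than any real object-file format.

```c
#include <assert.h>
#include <string.h>

/* A label defined in a file, at an offset from the start of that file. */
struct entry_point { const char *name; unsigned offset; };

/* A label used in a file but defined elsewhere; filled in by the linker. */
struct ext_ref { const char *name; unsigned resolved; };

struct object_file {
    unsigned size;                 /* bytes of assembled code */
    unsigned base;                 /* start address, assigned by the linker */
    struct entry_point *entries;   /* entry points passed by the assembler */
    int n_entries;
    struct ext_ref *refs;          /* external references to resolve */
    int n_refs;
};

/* Place the files one after another starting at origin, then resolve
   every external reference against the entry points of all files.
   Returns the number of references left unresolved. */
int link_objects(struct object_file *objs, int n, unsigned origin)
{
    unsigned addr = origin;
    for (int i = 0; i < n; i++) {
        objs[i].base = addr;
        addr += objs[i].size;
    }

    int unresolved = 0;
    for (int i = 0; i < n; i++) {
        for (int r = 0; r < objs[i].n_refs; r++) {
            int found = 0;
            for (int j = 0; j < n && !found; j++) {
                for (int e = 0; e < objs[j].n_entries; e++) {
                    if (strcmp(objs[j].entries[e].name,
                               objs[i].refs[r].name) == 0) {
                        /* absolute address = file base + offset in file */
                        objs[i].refs[r].resolved =
                            objs[j].base + objs[j].entries[e].offset;
                        found = 1;
                        break;
                    }
                }
            }
            if (!found)
                unresolved++;
        }
    }
    return unresolved;
}
```

A nonzero return value corresponds to the familiar "undefined symbol" error: a label is used somewhere but no file supplies a matching entry point.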
Discuss in detail about basic compilation technique. (DEC20, APR21, May 2023)
□ It is useful to understand how a high-level language program is translated into
instructions.
□ Since implementing an embedded computing system often requires controlling the
instruction sequences used to handle interrupts, placement of data and instructions in
memory, and so forth, understanding how the compiler works can help you know when
you cannot rely on the compiler.
□ Next, because many applications are also performance sensitive, understanding how code
is generated can help to meet the performance goals, either by writing high-level code
that gets compiled into the instructions you want or by recognizing when we must write
our own assembly code.
□ Compilation begins with high-level language code. Simplifying arithmetic expressions is
one example of a machine-independent optimization. Not all compilers do such
optimizations, and compilers can vary widely regarding which combinations of
machine-independent optimizations they do perform.
□ Instruction-level optimizations are aimed at generating code.
□ They may work directly on real instructions or on a pseudo-instruction format that is later
mapped onto the instructions of the target CPU.
□ This level of optimization also helps modularize the compiler by allowing code
generation to create simpler code that is later optimized. For example, consider the
following array access code:
x[i] = c*x[i];
□ A simple code generator would generate the address for x[i] twice, once for each
appearance in the statement.
□ While in this simple case it would be possible to create a code generator that never
generated the redundant expression, taking into account every such optimization at code
generation time is very difficult.
□ Better code and more reliable compilers are obtained by generating simple code first and then
optimizing it.
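The redundant address computation in x[i] = c*x[i] can be made visible by writing the array access with explicit pointer arithmetic. The two functions below are a hand-written illustration of the "before" and "after" forms, not actual compiler output; their names are hypothetical.

```c
#include <assert.h>

/* What a simple code generator effectively emits: the address of x[i]
   is formed twice, once for the load and once for the store. */
void scale_naive(int *x, int i, int c)
{
    int tmp = *(x + i);     /* address computation 1 (load)  */
    *(x + i) = c * tmp;     /* address computation 2 (store) */
}

/* After a later optimization pass: the address is formed once and reused. */
void scale_optimized(int *x, int i, int c)
{
    int *p = x + i;         /* single address computation */
    *p = c * *p;
}
```

Both functions leave the array in the same state; only the number of address calculations differs.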
3.9.5. Explain several techniques for optimizing software performance? (May 2023)
Loops are important targets for optimization because programs with loops tend to spend a lot
of time executing those loops.
There are three important techniques in optimizing loops:
code motion,
induction variable elimination, and
strength reduction.
Code motion lets us move unnecessary code out of a loop. If a computation's result
does not depend on operations performed in the loop body, then we can safely move
it out of the loop.
Code motion opportunities can arise because programmers may find some
computations clearer and more concise when put in the loop body, even though they
are not strictly dependent on the loop iterations.
A simple and common example of code motion involves the loop bound. Consider the following loop:
for (i = 0; i < N*M; i++)
{
z[i] = a[i] + b[i];
}
The code motion opportunity becomes more obvious when we draw the loop's
CDFG as shown in Figure 5.23.
The loop bound computation is performed on every iteration during the loop test,
even though the result never changes.
We can avoid N*M - 1 unnecessary executions of this statement by moving it
before the loop, as shown in the figure.
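The transformation can be written out by hand in C. The sketch below wraps the loop in two hypothetical functions, one evaluating N*M in every loop test and one with the bound hoisted into a local variable; an optimizing compiler may well perform this step itself.

```c
#include <assert.h>

/* Before code motion: N * M is recomputed on every loop test. */
void add_arrays_before(int *z, const int *a, const int *b, int N, int M)
{
    for (int i = 0; i < N * M; i++)
        z[i] = a[i] + b[i];
}

/* After code motion: the loop-invariant bound is computed once. */
void add_arrays_after(int *z, const int *a, const int *b, int N, int M)
{
    int bound = N * M;      /* hoisted out of the loop */
    for (int i = 0; i < bound; i++)
        z[i] = a[i] + b[i];
}
```

The move is safe because neither N nor M is modified inside the loop body, so the bound has the same value on every iteration.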
An induction variable is a variable whose value is derived from the loop iteration
variable's value.
The compiler often introduces induction variables to help it implement the loop.
Properly transformed, we may be able to eliminate some variables and apply
strength reduction to others.
A nested loop is a good example of the use of induction variables.
Here is a simple nested loop:
for (i = 0; i < N; i++)
for (j = 0; j < M; j++)
z[i][j] = b[i][j];
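One way to see the induction variable in this loop is to store the arrays in flattened, row-major form, so that element [i][j] lives at index i*M + j. That index is an induction variable: it grows by exactly one on each inner iteration. The sketch below is a hand-written illustration with hypothetical function names, showing the multiply replaced by an increment (strength reduction).

```c
#include <assert.h>

/* Direct form: the index i*M + j is recomputed, with a multiply,
   on every inner iteration. */
void copy_indexed(int *z, const int *b, int N, int M)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            z[i * M + j] = b[i * M + j];
}

/* Transformed form: k tracks i*M + j and is updated by an increment,
   so i and j are no longer needed for addressing, only for counting. */
void copy_reduced(int *z, const int *b, int N, int M)
{
    int k = 0;
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < M; j++) {
            z[k] = b[k];
            k++;            /* replaces the i*M + j multiply and add */
        }
    }
}
```

Since i and j now serve only as counters, the two loops could be merged into a single loop over k from 0 to N*M - 1, which eliminates those variables altogether.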
b) Cache Optimizations
A loop nest is a set of loops, one inside the other.
Loop nests occur when we process arrays. A large body of techniques has been
developed for optimizing loop nests.
Rewriting a loop nest changes the order in which array elements are accessed.
This can expose new parallelism opportunities that can be exploited by later
stages of the compiler, and it can also improve cache performance.
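Loop interchange is one such rewrite. The sketch below sums a small two-dimensional array in two orders; C stores arrays row by row, so the second version touches memory sequentially, which is generally friendlier to a cache. The array dimensions and function names are assumptions made for illustration, and no timing is claimed.

```c
#include <assert.h>

#define ROWS 4
#define COLS 8

/* Column-by-column traversal of a row-major array: consecutive accesses
   are COLS elements apart in memory. */
long sum_column_order(int a[ROWS][COLS])
{
    long sum = 0;
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            sum += a[i][j];
    return sum;
}

/* Interchanged loops: consecutive accesses are adjacent in memory. */
long sum_row_order(int a[ROWS][COLS])
{
    long sum = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            sum += a[i][j];
    return sum;
}
```

Both functions return the same result; only the order in which array elements are visited changes, and that order is what determines how well each fetched cache line is reused.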