Practical Guide
COS2621
Year 2025
School of Computing
COS2621/102/0/2024
Dear Student
Study unit 0
We would like to instill the following attitude: "Look, I don't know this particular brand
of computer, but since I know that almost all computers operate in basically the same
way, give me one or two weeks and the set of relevant manuals and I will be quite at
home with this computer".
The theoretical part of the syllabus for COS2621 is covered in the prescribed book.
This tutorial letter (often referred to as the guide, or the study guide) contains all the
study material you will need for the practical component of this module.
Stallings, W. 2021. Computer Organization & Architecture: Designing for performance. 11th
edition. Prentice Hall.
Note that we are not going to study the chapters in Stallings in the order in which they
are presented. We have to introduce the assembly language concepts as soon as
possible in order for you to master the practical work. A short summary of the concepts
that we study in this module follows:
Study unit 1: Computer Organisation and Computer Architecture, Boolean
algebra and number systems (This study unit supplements chapters 1, 10 and 12
as well as appendix B of Stallings.)
more operand fetches, followed by zero or more operand stores, followed by an interrupt check
(if interrupts are enabled).
The major computer system components (processor, main memory, I/O modules)
need to be interconnected in order to exchange data and control signals. The most
popular means of interconnection is the use of a shared system bus consisting of
multiple lines. In contemporary systems, there typically is a hierarchy of buses to
improve performance. (This study unit supplements chapter 3 of Stallings.)
Study unit 5: Computer Arithmetic (This study unit supplements Chapter 10 of Stallings.)
Study unit 6: Instruction sets: Characteristics and Functions (This study unit
supplements chapter 13 and appendix B, as well as appendix O (online) of
Stallings.)
Study unit 7: Instruction sets: Addressing mode and formats (This study unit
supplements chapter 13 of Stallings.)
Study unit 8: Cache memory (This study unit supplements Chapter 4 of Stallings.)
Study unit 9: Internal memory (This study unit supplements chapter 5 of Stallings).
Study unit 10: External memory (This study unit supplements chapter 6 of Stallings.)
Study unit 11: Input/ Output (This study unit supplements chapter 7 of Stallings.)
Study unit 12: RISC computers (This study unit supplements chapter 15 of Stallings.)
DEBUG: DEBUG, which is available as part of DOS within the Windows environment, creates an
environment in which relatively small assembly language programs can be developed, run and
debugged with ease. It can also be used to debug larger programs.
This is a very powerful debugging tool which you should learn to use effectively.
Important!
This study guide supplements the prescribed book. You cannot study this module using
only this study guide. All the theoretical study material is contained in Stallings.
Study unit 1
Organisation and architecture
This study unit supplements the preface, chapters 1, 10 and 12, as well as Appendix B of
Stallings. The most important components of a computer are introduced.
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
• convert values between the binary, decimal and hexadecimal number
systems
Stallings
Study chapter 1: section 1.1 and section 1.2; chapter 10; chapter 12: section
Chapter 1
Introduction
Chapter 1 introduces the concept of the computer as a hierarchical system. A computer can
be viewed as a structure of components and its function described in terms of the collective
function of its cooperating components.
Each component, in turn, can be described in terms of its internal structure and function. The
major levels of this hierarchical view are introduced. The remainder of the book is organized,
top down, using these levels.
Stallings: Study chapter 1, sections 1.1 and 1.2. Read sections 1.3 and 1.4.
1.1 Basic building blocks
Stallings: Study chapter 10. When working on the low level
of computer systems, we frequently use the binary and hexadecimal number systems.
It is important that you feel comfortable with both these number systems.
Stallings: Study chapter 12.1. The concepts discussed in chapter 12 form the basic
building blocks of computer components.
Activity 1-1
Simplify the following expressions using Boolean algebra. Name the identity used in every
step.
a) [(CD)′ + A]′+ A + CD + AB
a) [(CD)′ + A]′ + A + CD + AB
= ((CD)′)′ A′ + A + CD + AB De Morgan
= xy + x′y Complement
= y(x+x′) Distributive
=y Complement
= A + AC + CB' Complement
It is important that you understand the functions of multiplexers and decoders respectively.
1.3 Summary
The scope of the study material in the text book is discussed in the preface. It is advisable
to read this through to get an overview of the study material covered in this module.
Remember that we will not study all the chapters in Stallings and will not work through
the text book in the order in which the chapters are presented.
Make sure that you are able to answer the relevant review questions given at the end of
each of the chapters covered in this unit!
----oooOooo----
Study unit 2
Computer performance
Performance Issues
2.1 Designing for performance - Microprocessor speed, performance balance, improvements in Chip
Organization and Architecture
2.2 Multicore, MICs, and GPGPUs
2.3 Two Laws that Provide Insight: Amdahl's Law and Little's Law - Amdahl's Law, Little's Law
2.4 Basic Measures of Computer Performance - Clock Speed, Instruction Execution Rate
2.5 Calculating the Mean - Arithmetic Mean, Harmonic Mean, Geometric Mean
Hardware and software are generally logically equivalent. This is a simple but very important
concept. Hardware and software are frequently capable of performing the same function.
Compare performing the same logical function in Boolean arithmetic (section 12.1), and
using hardware (sections 12.2 and 12.3). In computer design, we often have to decide which
functions to implement in hardware and which in software.
Both methods have pros and cons that should be considered. Hardware offers speed but not
flexibility, whereas software offers flexibility but less speed. This phenomenon, known as a
"trade-off", is often relevant in this module. Remember that it is a design decision. Except for
the reasons mentioned here, there is no special reason why it is advisable to implement a
specific operation in hardware rather than in software or vice versa, but cost often plays a role
in the trade-off between the different approaches.
2.6 Key Questions
----oooOooo----
Study unit 3
Computer functions and interconnections
This study unit supplements chapter 3 of Stallings. The instruction cycle is described. We
also look at different ways in which computer components can be connected.
• Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
Stallings
Chapter 3 of Stallings gives a general introduction to the structures that are used to connect
different computer components. The instruction cycle is explained as well as the handling of
different types of interrupt. We look at ways in which buses are used for interconnection.
Stallings: study section 3.1. Stallings discusses the components that a computer system
consists of and explains why connections between these components are necessary.
Stallings: study section 3.2. The instruction cycle is explained. We look at interrupts and
how and why an interrupt cycle is introduced into the instruction cycle. In this way we can test
for an interrupt after the execution of every instruction. Stallings also discusses ways in which
multiple interrupts could be handled.
3.4 Interconnections
Stallings: study section 3.3. The structure of the different interconnections in a computer system is
discussed in this section.
Stallings: study section 3.4. The general structure of a bus, as well as the way in
which multiple buses can be used to optimise performance, is discussed. Different types of
bus are described and we look at bus arbitration and bus width. The concepts of
synchronous/asynchronous timing between communicating components are considered.
Stallings: read sections 3.5 and 3.6. PCI, as well as the newer PCI Express buses are frequently
used in microcomputers. It is not necessary to pay attention to detail.
3.6 Summary
Make sure that you understand the key terms and that you know what the acronyms listed
----oooOooo----
Study unit 4
Processor Structure and Function
This study unit supplements chapter 15 of Stallings. The most important components of the
CPU are described and processor register organisation is discussed. We look in detail at the
register organisation of the Intel x86 and the ARM processors respectively. The
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
Stallings
4.1 Introduction
We look at CPU organisation using the instruction cycle to explain the need for the different
components comprising the CPU. The basic principles of instruction pipelining are
discussed without going into too much detail. The organisation of the Intel x86 family is studied
in detail since we need this knowledge to write assembly language programs for this family
of microprocessors. Finally, we look at the organization of the ARM processor.
Stallings, study section 15.1. Once you understand the instruction cycle, the organisation of the
CPU will become clear.
Stallings, study section 15.2. The purpose of a variety of sets of CPU registers is explained.
We study the register organisation of the Intel x86 family. We give more detail regarding the
register set of the Intel x86 relevant to our practical work in section
4.5 of this study unit.
Stallings, study section 15.3. Indirect addressing on the Intel x86 is discussed in
4.5 The x86 processor
Stallings, study section 15.5. Stallings does not give much detail regarding the assembly
language of the Intel x86 family. As you will be doing your practical work in x86 Assembly
Language, we provide you with additional notes regarding the architecture and assembly
language of the Intel x86 family. It is important to realise that this particular assembly
language is but one of many different assembly languages. Each family of machines has its
own assembly language.
To avoid having to buy an expensive second text book for the practical work, we include a
description of the most important assembly language instructions in this section as well as in
the next two study units. It might take a while before you will be able to see how the different
concepts fit together, but working through some examples is always the easiest way to learn
a new programming language.
Registers are special storage components inside the CPU (Central Processing Unit). The
registers common to all the Intel x86 processors are 16 bits in size and can be classified as
follows:
Flags register: Nine of the bits (flags) in this register are of some
importance to us.
Most of the processors in the Intel family have extended general-purpose registers, extended
index registers and extended pointer registers that are 32 bits in size. These are written as
above with a prefix E. The extended versions of the general-purpose registers are called
EAX, EBX, ECX and EDX. Current Intel processors have 64-bit registers.
For your practical work, we are concerned only with the 16‐bit registers of the Integer
Unit common to all the Intel chips. In this section, we therefore restrict our attention to these
ones.
The general-purpose registers are used for data movement and for arithmetic. Each of these
registers can be addressed as either one 16-bit (2 bytes) register or as two 8-bit (1 byte)
registers. The leftmost byte is the high portion and the rightmost byte is the low portion.
For example, AH and AL are the high and low portions of the AX register:
Bit:   15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0
       |<------------- 16-bit AX register ------------>|
       |<-------- AH -------->|  |<-------- AL ------->|
Bit:    7  6  5  4  3  2  1  0    7  6  5  4  3  2  1  0
Note that the bits are numbered from right to left on the Intel machines.
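The relationship between AX, AH and AL can be tried out with a few MOV instructions. The following is a minimal sketch in NASM syntax; the values are arbitrary illustrations, and the org 100h line assumes that the program is assembled as a DOS .COM file. If you type the instructions into DEBUG instead, leave out the org line and the h suffixes, because DEBUG treats all numbers as hexadecimal.

```nasm
        org 100h            ; .COM layout (assumption: flat binary output)
        mov ax,1234h        ; AX = 1234h, so AH = 12h and AL = 34h
        mov al,0FFh         ; only the low byte changes:  AX = 12FFh
        mov ah,0            ; only the high byte changes: AX = 00FFh
        int 20h             ; terminate and return to DOS
```

Tracing these instructions one at a time and inspecting AX after each step shows that writing to AL or AH never disturbs the other half of the register.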
AX register: The AX register is known as the primary accumulator. Some text books refer
to the AX register as the accumulator. The AX register is mainly used for operations involving
data movement, input/output and arithmetic. Some instructions such as MUL and DIV
assume that AX contains the multiplicand and dividend respectively.
BX register: The BX register is the only general-purpose register that can be used as a
pointer to extend addressing. It is therefore referred to as the base register. It is also used for
arithmetic. We will have another look at extended addressing when we discuss addressing
modes.
CX register: The CX register is known as the count register since it is used, for example, to
control the number of times a loop is to be executed, or the number of shifts that should be
performed. It can also be used for arithmetic.
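The two uses of CX mentioned here, as a loop counter and as a shift count, can be sketched as follows. This is an illustration in NASM syntax with arbitrary values; the label name next is hypothetical.

```nasm
        org 100h
        mov ax,0            ; running total
        mov cx,5            ; loop counter: the body executes 5 times
next:   add ax,cx           ; adds 5, 4, 3, 2 and 1 in turn
        loop next           ; CX = CX - 1; jump back to 'next' while CX is not 0
                            ; here AX = 000Fh (15 decimal) and CX = 0
        mov cl,3            ; a shift count is placed in CL, the low byte of CX
        shl ax,cl           ; AX = 0078h (15 * 8 = 120 decimal)
        int 20h             ; terminate and return to DOS
```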
DX register: Some input/output operations such as IN and OUT use the DX register. Multiply
and divide operations involving 16-bit registers make use of the combination of DX and AX.
The DX register is thus known as the data register.
Memory consists of a collection of segments each of which is 64K bytes long. A segment is
an area in memory that begins on a paragraph boundary, in other words on an address that
is a multiple of 16 decimal (10h). This means that one segment could start at address 00000h,
another at address 00010h, another at address 00020h, et cetera. This also means that
segments can overlap. Note that the starting address of a segment always ends with a 0
when written as a hexadecimal number. For this reason, the designers decided that it would
be unnecessary to store the last digit, which is always 0, in the segment register, so the
address of a segment is always stored without the rightmost 0.
For example, the address of the segment starting at 18A30h will be stored as 18A3h.
Some text books show the rightmost 0 in square brackets, for example 18A3[0]h.
A segment may be located anywhere in memory (as long as it starts on a paragraph boundary)
and may be as small as 1 paragraph (10h bytes) and as large as 64K bytes in length.
However, it requires only as much space as is required for execution by the program that
uses it.
A program written in x86 assembly language is divided into three primary segments, namely:
The code segment (pointed to by the CS register) contains the machine instructions of the
program. All references to memory locations that contain instructions of a program are relative
to the start of a segment specified by the contents of the CS register. A byte of memory is
referenced by the segment that contains it, followed by an offset from the beginning of that
segment. The offset is indicated by the IP register (instruction pointer).
The notation used is segment:offset. The offset is between 0000h and FFFFh (FFFFh = 64K-1).
The IP register always contains the offset (relative to the start of the code segment) of the
next instruction to be executed, so CS:IP forms the actual 20-bit address of the next
instruction to be executed. The actual 20-bit address is also referred to as the effective
address.
Suppose CS contains BCEFh and IP contains 123h. The actual 20-bit hexadecimal address referred
to by BCEF:0123 is calculated as follows:
Segment (CS):    BCEF0   (the rightmost 0 is restored)
Offset (IP):   +  0123
Actual address:  BD013   (add the two addresses)
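The same calculation can be carried out by the processor itself. The sketch below (NASM syntax) is only an illustration of the arithmetic: it multiplies the segment value BCEFh by 10h and adds the offset 0123h, keeping the 20-bit result in the register pair DX:AX. The choice of DX:AX to hold the result is an assumption made for this example.

```nasm
        org 100h
        mov ax,0BCEFh       ; segment value from the example
        mov dx,ax           ; copy, used to recover the bits shifted out of AX
        mov cl,4
        shl ax,cl           ; AX = CEF0h (low 16 bits of BCEF0h)
        mov cl,12
        shr dx,cl           ; DX = 000Bh (the 4 bits that did not fit in AX)
        add ax,0123h        ; add the offset: AX = D013h
        adc dx,0            ; propagate a carry, if any: DX = 000Bh
                            ; DX:AX = 000B:D013, i.e. the 20-bit address BD013h
        int 20h             ; terminate and return to DOS
```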
Activity 4.1
(ii) 74D6:0100
(iii) 3DA3:311
Solution
We do not always refer to addresses in the notation given above. When a memory location
in a specific segment is addressed from within that segment, we use only the offset part.
Example: If a program is stored in some code segment, only the offset is used to refer to
other memory locations within this segment. Thus, JMP 123h will cause a jump to offset 123h
within the memory segment where the program is loaded.
The IP register cannot be referenced directly in a program by a programmer. Its value can be
changed when using DEBUG during the debugging and testing phase of a program. (Refer
to Appendix A.3.2.) However, the value of IP can be changed indirectly by the programmer by
using JMP (jump) and conditional branch instructions.
You need not worry about the CS register. During normal execution of a program, instructions
are automatically fetched from memory. You do not need to concern yourself with the values
of CS and IP or any of the values of the segment registers.
The data segment (pointed to by the DS register) contains the variables, constants and work areas
of a program.
The actual address where data is fetched from memory when an instruction such as MOV
AL,[120h] is executed, is not the physical memory location 120h but the memory location
120h relative to the contents of the DS register.
Suppose the DS register contains 0600h. The actual 20-bit location from which the byte of data
will be moved can be calculated as follows:
Actual address: 06000h + 0120h = 06120h
You need not worry about the DS register either. The operating system sets the DS
register to a suitable value when a program is loaded. In actual fact, the operating system
decides where in memory both the program and the data should be located. One can set
the DS register to insist on using specific locations for data, but we will not use this
functionality in COS2621.
The stack segment (pointed to by the SS register) contains the program stack. The stack is
an area in memory that is used to save data and addresses that need to be temporarily stored
while the program is executing.
We demonstrate the use of the SS register in study unit 6 of this study guide.
The extra segment register (ES) is used during the manipulation of sequences of characters
by some string operations. The ES register is associated with the DI register which is
discussed in a later section. We will not work with the ES register in COS2621.
SI and DI: Index registers contain the offset of memory positions relative to the start of the
segment. SI (Source Index) and DI (Destination Index) are used to indicate source and
destination addresses when strings of characters are written to and read from memory. SI
usually contains an offset value from the DS register, but it can address any memory position.
The source string is pointed to by the SI register. The DI register is the destination for string
movement instructions. It usually contains an offset from the ES register but can address any
memory position.
The use of the SI and DI registers will be demonstrated in examples in study unit 6 of this study
guide.
The stack is a special area in memory that is used for the temporary storage of addresses
and data. The stack is located in the stack segment. You may think of the stack as a number
of boxes stacked on top of each other. The last one to be added on top (pushed onto the
stack) is also the first one to be removed (popped) from the stack.
The Stack Pointer (SP) register holds the address (in the stack) of the last element to be added
to (pushed onto) the stack. In other words, the SP register contains the offset from the beginning
of the stack segment to the top of the stack, so SS:SP contains the address of the top of the stack.
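The last-in, first-out behaviour can be seen with two PUSH and two POP instructions. The sketch below (NASM syntax, arbitrary values) swaps the contents of AX and BX by popping them in the same order in which they were pushed.

```nasm
        org 100h
        mov ax,1111h
        mov bx,2222h
        push ax             ; SP decreases by 2; top of stack = 1111h
        push bx             ; SP decreases by 2; top of stack = 2222h
        pop ax              ; last value pushed comes off first: AX = 2222h
        pop bx              ; BX = 1111h; SP is back at its original value
        int 20h             ; terminate and return to DOS
```

Tracing this in DEBUG and watching SP after every instruction shows that each PUSH lowers SP by 2 and each POP raises it by 2.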
The Base Pointer (BP) register contains an offset from the SS register. Normally, the only
word in the stack that is accessed is the one on top of the stack. However, the BP register
can also keep an offset in the stack segment and can be used in procedure calls, especially
when parameters are passed to subroutines. SS:BP contains the address of the current
word being processed in the stack. More information about the stack and the BP and SP
registers will be given in study unit 6 of this study guide.
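As a preview, the following sketch shows one common way in which BP is used to reach a parameter that was pushed onto the stack before a procedure call. It is an illustration only, not necessarily the convention used in study unit 6; the procedure name double_it and the parameter value 7 are hypothetical.

```nasm
        org 100h
        mov ax,7            ; hypothetical parameter value
        push ax             ; pass the parameter on the stack
        call double_it      ; pushes the return offset, then jumps
        int 20h             ; AX = 000Eh (14 decimal) at this point

double_it:
        push bp             ; save the caller's BP
        mov bp,sp           ; BP marks a fixed position in the stack
        mov ax,[bp+4]       ; [bp] = saved BP, [bp+2] = return offset,
                            ; [bp+4] = the parameter
        add ax,ax           ; AX = 2 * parameter
        pop bp              ; restore the caller's BP
        ret 2               ; return and discard the 2-byte parameter
```

Because BP stays fixed while SP moves with every PUSH and POP, offsets relative to BP remain valid for the whole procedure.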
Stallings: figure 15.22. The EFLAGS register is used as a flags register. Nine of the 32 bits
of this register are common to all x86 processors. The other bits are not important for
COS2621 and we will not consider them in this section.
Each time an arithmetic instruction is executed, certain flags will be set either to 1 or 0 to
indicate the outcome of the operation. Not all instructions affect the flags and not all flags are
affected by instructions that do have an effect on flags. The way in which the individual flags
are affected by an instruction is shown in Appendix E where the instructions comprising the
x86 instruction set are listed.
The following bits in the flags register represent the individual flags we are interested in:
Bit 0, CF: Carry Flag
   Indicates: 1) The carry from the high-order (leftmost) bit following an arithmetic
                 operation
              2) The result of a CoMPare instruction
              3) The bit which has been shifted or rotated out of a register or memory
                 location
   Set to 1:  1) If an addition produced a carry or if a subtraction produced a borrow
              2) When the first data element compared is smaller than the second
                 (unsigned comparison)
              3) If the bit which has been shifted or rotated out of a register or memory
                 location is equal to 1
Bit 2, PF: Parity Flag
   Indicates: Mainly used when transmitting data
   Set to 1:  If the result of an operation has an even number of 1-bits
Bit 4, AF: Auxiliary Flag
   Indicates: Similar to CF, except that it indicates the presence or absence of a carry
              or overflow based on a 4-bit numeric representation in bits 0, 1, 2, 3. It is
              useful for operations on 'packed decimal' (BCD) numbers
   Set to 1:  See CF
Bit 6, ZF: Zero Flag
   Indicates: Whether or not the result of an arithmetic or compare operation is equal
              to zero
   Set to 1:  For a zero result!
Bit 7, SF: Sign Flag
   Indicates: The resulting sign after an arithmetic operation
   Set to 1:  If the result is negative
For signed data, the range of signed numbers that can be stored in n bits is -2^(n-1) to 2^(n-1) - 1.
The following table summarises the conditions under which overflow occurs:
Carry into sign bit    Carry out of sign bit    Overflow
No                     No                       No
No                     Yes                      Yes
Yes                    No                       Yes
Yes                    Yes                      No
Confusing as it may seem, the contents of a data field mean whatever you intend it to mean.
As long as the process writing the data and the one reading it interpret it the same way, there
is no problem. Apart from signed and unsigned integers we may also store floating-point
numbers or character data in ASCII, EBCDIC, UNICODE or any other character code.
Let us look at a few examples to illustrate how the OF and CF are used to indicate overflow for
signed as well as unsigned numbers.
Note: The results in columns 2 and 3 are the equivalent of the binary numbers in column
1 interpreted as unsigned binary numbers and signed numbers in twos complement
representation, respectively. Look at the second addition for example:
Binary 1000 0100 is the twos complement representation of decimal -124 in column 3, which
differs from decimal 119 + 13 = 132 in column 2.
Examples:
     1111 1111 1110 0111     -25
   + 1111 1111 1111 0110     -10
  (1)1111 1111 1101 1101     -35
No overflow. The carry into the sign bit (1) is equal to the carry out (1) of the sign bit.
Thus: OF = 0. Valid.
This is the twos complement representation of -35. Remember that we discard the carry out
of the sign bit.
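Both situations discussed in this section can be reproduced on the processor. The sketch below (NASM syntax, decimal values) first adds 119 and 13 as bytes, which overflows for signed data but not for unsigned data, and then adds -25 and -10 as words, which is valid for signed data although a carry leaves the sign bit. When tracing in DEBUG, the register display is assumed to show OF as OV/NV and CF as CY/NC.

```nasm
        org 100h
        mov al,119          ; 0111 0111
        add al,13           ; AL = 84h (1000 0100)
                            ; OF = 1: as signed bytes the sum reads as -124, not 132
                            ; CF = 0: as unsigned bytes 132 fits in 8 bits
        mov ax,-25          ; FFE7h
        add ax,-10          ; AX = FFDDh (-35)
                            ; OF = 0: the signed result -35 is valid
                            ; CF = 1: carry out of the sign bit; it matters only
                            ;         if the operands are taken as unsigned
        int 20h             ; terminate and return to DOS
```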
The x86 machines store bytes in reverse byte order in memory. Consider the number
0015h. Two bytes are required to store this number. The 16-bit number 0015h consists of a
high-order (most significant) byte 00h, and a low-order (least significant) byte 15h.
Suppose we want to store 0015h in memory from location 101h onwards. In memory the
low-order byte, 15h, will be stored in the low-order position 101h, and the high-order byte,
00h, will be stored in the high-order position 102h. Thus 0015h is stored in memory as
follows:
Memory position    101h    102h
Contents           15h     00h
The CPU expects numeric data in memory (not registers) to be stored in reverse byte
order and processes it accordingly. You must be aware of this peculiarity otherwise you may
get confused when you examine the contents of memory locations. The CPU reverses the
bytes again if data is loaded into registers from memory locations.
Activity 4.2
Execute the following two instructions and inspect the contents of memory locations 200h and
201h:
[200h],ax
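A variation on this activity can be sketched as follows. The value 1234h is an assumption chosen so that the two bytes are easy to tell apart; the activity itself may use a different value. The code is in NASM syntax; in DEBUG the h suffixes are omitted.

```nasm
        org 100h
        mov ax,1234h        ; hypothetical value: AH = 12h, AL = 34h
        mov [200h],ax       ; word store: 34h goes to 200h, 12h goes to 201h
        mov bl,[200h]       ; BL = 34h (the low-order byte is stored first)
        mov bh,[201h]       ; BH = 12h, so BX = 1234h again
        int 20h             ; terminate and return to DOS
```

Dumping locations 200h and 201h in DEBUG after the second instruction should show the bytes in the order 34 12.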
4.5.3 Interrupts
An interrupt causes the interruption of normal program execution. In other words, the flow of
control is interrupted. Interrupts can be caused by hardware or software. Interrupts are
handled either by DOS or by BIOS. The DOS routines are part of the particular operating
system that is loaded, whereas most of the BIOS routines are hardwired onto the
motherboard.
service from the operating system. Interrupts can be invoked by using the INT instruction.
Software interrupts can also be referred to as system calls.
The steps taken when a hardware interrupt occurs are similar to those when a software interrupt
occurs. The hardware interrupt is handled by the interrupt handler that
associates a specific vector in the interrupt vector table with that particular hardware interrupt.
Control is passed to the relevant interrupt service routine in the same way as is done when
a software interrupt occurs.
4.5.3.1 The interrupt vector table
Stallings, table 15.3. A part of memory is reserved for the interrupt vector table. Each of
the interrupts available on the Intel chips has a 4-byte (2-word) entry in the interrupt vector
table to accommodate an offset and a code segment address. For interrupt type 0, the
instruction offset is stored in the word at address 0, and the code segment address is stored
in the word at address 2. For interrupt type 1, the offset is stored in the word at address 4,
and the code segment address is stored in the word at address 6. In general, for interrupt
type n, the instruction offset is stored in the word at address 4*n, and the code segment
address is stored in the word at address (4 * n) + 2.
Interrupt number    Address    Contents of address
.                   .          .
.                   .          .
.                   .          .
1                   6          Code segment address
1                   4          Offset address
0                   2          Code segment address
0                   0          Offset address
Each code segment and offset points to its own interrupt handler (also called interrupt
service routine). This is a block of code similar to a procedure, which must be executed if
that particular interrupt occurs.
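The rule "offset at 4*n, code segment at (4*n) + 2" can be checked by reading an entry of the table directly. The sketch below (NASM syntax) fetches the entry for interrupt 21h into DX:AX. It assumes real-mode DOS, where the table starts at address 0, and it uses the ES register only to reach segment 0; the practical work in COS2621 does not otherwise require ES.

```nasm
        org 100h
        xor ax,ax
        mov es,ax           ; ES = 0000h: the table starts at 0000:0000
        mov bx,21h*4        ; entry for interrupt 21h starts at address 84h
        mov ax,[es:bx]      ; AX = offset of the interrupt handler
        mov dx,[es:bx+2]    ; DX = code segment address of the handler
        int 20h             ; terminate and return to DOS
```

The values found in AX and DX depend on the system on which the program is run.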
4.5.3.2 The INT instruction ‐ software interrupts
The above description of how interrupts function might seem complicated but fortunately, the
operating system does most of the work. The programmer needs only to place one or more
values in registers and to invoke the relevant interrupt instruction. The instruction used for
interrupts by the Intel family is the INT instruction. Since 256 (decimal) different types
of interrupt (0 to FFh) are allowed, we must specify which one is required.
The format of the INT instruction is as follows: int number (number is an integer from 0 to
FFh).
The interrupt instructions which we will use frequently are INT 20h and INT 21h. INT 20h
causes the program to terminate and control to be transferred back to DOS. INT 21h allows
operations such as reading a character from the keyboard, printing a character on a printer,
and displaying a character on the screen. We call the specific action to be performed a
function or service. The AH register is used to specify which function is required. Example
(refer to INT 21h in Appendix D):
mov ah,2 ; Select function 2, display character
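A complete program built around this function could look like the sketch below (NASM syntax). Function 2 of INT 21h expects the character to be displayed in the DL register; the character 'A' is an arbitrary choice for this illustration.

```nasm
        org 100h
        mov ah,2            ; select function 2, display character
        mov dl,'A'          ; character to display (arbitrary example)
        int 21h             ; request the service from DOS
        int 20h             ; terminate and return control to DOS
```

With NASM, a program of this kind is typically assembled into a .COM file, for example with nasm -f bin prog.asm -o prog.com, where the file names are placeholders.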
There are a number of routes we can follow when creating an assembly language program on
the Intel x86. Two of the possibilities are shown in Figure 4.1. We can use NASM to assemble
the source code. We can then use DEBUG, which is available under DOS, to trace through the
program. DEBUG can also be used to assemble small programs but it has limitations.
[Figure 4.1: Routes from algorithm to source program: an editor producing a *.ASM file, or DEBUG input, both leading to DEBUG]
Stallings, read section 15.6. It should become clear why we need to study the register
organisation of the machine we are working with at a low level if we compare the register
organisation of the ARM processor to that of the Intel x86.
4.7 Summary
Stallings, section 15.8. Make sure that you understand the processor organisation of the Intel
x86 and that you know the meaning of all the key words listed in this section.
----oooOooo----
Study unit 5
Computer arithmetic
This study unit supplements chapter 11 of Stallings. The way in which integer and floating-point
numbers are represented in a computer is explained. We look at instructions that can be used
on the Intel x86 to perform arithmetic and logical, as well as bit-manipulation
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
• use assembly language to perform bit manipulation operations on the Intel x86
• calculate the value of a floating-point number that is stored in a computer given the
Stallings
5.1 Introduction
Stallings: study section 11.1 The way in which integers and floating-point numbers are stored
in a computer is discussed. We look at some instructions that can be used on the Intel x86 to
do arithmetic and logical operations.
Stallings: study section 11.2. There are a number of different ways in which integers can be
represented on a computer. Stallings only discusses sign-magnitude and twos complement
representation. Remember that integers can also be referred to as fixed-point numbers.
This section of the study guide contains examples that illustrate the use of arithmetic and
bit-manipulation instructions. You can use either DEBUG or NASM to assemble these instructions
to see what happens.
Remember that DEBUG treats all numbers as hexadecimal whereas NASM treats numbers as
decimal unless otherwise indicated.
Activity 5.1
Consider the following arithmetic operations:
Very important: Note that the first operand, i.e. a, must first be loaded into AL/AX in order to
execute an IMUL/IDIV on 8/16 bits.
(a) a*b (both a and b are 8 bits long) use IMUL (integer multiplication)
(b) a/b (both a and b are 8 bits long) use IDIV (integer division)
; 16-bit remainder in DX
Things to remember:
With IMUL, MUL, IDIV and DIV, the first operand is assumed to be in AL or AX. Be careful:
IMUL AL,BL will not be flagged as a syntax error but will give incorrect results because BL will
be ignored: the contents of AL will be multiplied with the contents of the first operand as
specified in the instruction, namely AL, i.e. IMUL AL,BL is interpreted as IMUL AL.
DX:AX is used for 16-bit multiplication operations giving a 32-bit result stored in the combination
of the DX and AX registers.
With 16-bit division operations the dividend is assumed to be in the 32-bit register DX:AX.
The quotient is stored in AX and the remainder is stored in DX.
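The register conventions listed above can be seen at work in the following sketch (NASM syntax, decimal values unless marked with h). The operand values are arbitrary and were chosen so that every quotient fits in its destination register.

```nasm
        org 100h
        ; 8-bit multiplication: AL * operand, product in AX
        mov al,-5
        mov bl,12
        imul bl             ; AX = FFC4h (-60)
        ; 8-bit division: AX / operand, quotient in AL, remainder in AH
        mov bl,7
        idiv bl             ; AL = F8h (-8), AH = FCh (-4)
        ; 16-bit multiplication: AX * operand, product in DX:AX
        mov ax,1000
        mov bx,300
        imul bx             ; DX:AX = 0004:93E0h (300000)
        ; 16-bit division: DX:AX / operand, quotient in AX, remainder in DX
        mov dx,0            ; high word of the (positive) dividend
        mov ax,1000
        mov bx,7
        idiv bx             ; AX = 008Eh (142), DX = 0006h
        int 20h             ; terminate and return to DOS
```

Note that DX must hold a sensible value before a 16-bit division; leaving an old product in DX changes the dividend.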
We will write a simple assembly language program to illustrate the movement of data between
registers and memory.
Activity 5.2
Use memory location temp as a temporary storage area. Assume that a, b, c, d and e are 8-
bit integers and that they are stored in memory positions a, b, c, d and e respectively.
Note the following:
(i) The x86 assembly language does not allow direct data movement from one memory
position to another. Such data movement must occur via one of the registers.
(ii) The first operand is the destination operand, e.g. MOV AX,BX will result in the contents of
BX being stored in the AX register.
(iii) DEBUG and NASM do not allow the specification of a memory position in a multiplication
operation. Thus imul [a] is illegal in both DEBUG and NASM. However, it is legal in some
commercially available assemblers such as MASM.
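AX = a*b:
mov al,[a] ; Move the contents of ‘a’ to AL
mov bl,[b] ; Move the contents of ‘b’ to BL
imul bl ; Product in AX
AX = c*d
mov [temp],ax ; Store the contents of AX in ‘temp’
mov al,[c] ; Move the contents of ‘c’ to AL
mov bl,[d] ; Move the contents of ‘d’ to BL
imul bl ; Product in AX
AX = c*d*e
mov bl,[e] ; Move the contents of ‘e’ to BL
imul bl ; Product in AX
AX = a*b + c*d*e
add ax,[temp] ; Add a*b, which was saved in ‘temp’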
Our program might not give the correct result. Why not? Think about this for a moment.
We assumed that the result of c * d and of c * d * e would fit into AL, in other words into 8 bits.
Things to remember:
MOV AL,[a] moves one byte from memory to register AL because AL is an 8-bit register, therefore
it is a byte operation.
MOV AX,[b] moves two bytes from memory to register AX because AX is a 16-bit register. It
is thus a word operation.
Activity 5.3
MOV [a],'N' is illegal in NASM and DEBUG because the size of the operands cannot be
determined, i.e. whether 8 bits or 16 bits should be moved. One way to resolve this is to store
the ASCII character 'N' in either an 8-bit or a 16-bit register and move the contents of this register to a.
mov cl,'N'
mov [a],cl
The instructions available in x86 Assembly Language for Boolean operations that can be used
for bit manipulation are AND, OR, NOT and XOR.
Activity 5.4
For Intel x86 machines: Remember that bits are numbered from right to left, starting from 0.
Suppose we want to determine whether or not bit 5 of the AL register is set to 1. We set up a
so-called mask in another register (BL in the code below) with all the bits equal to 0 except bit
5 which is equal to 1. Use BL for the mask (BL = 00100000).
When we execute AND BL,AL all the bits in the destination register, BL, except bit 5 will be
set to 0, and bit 5 will also be set to 0 if bit 5 of AL is equal to 0.
and bl,al
The destination register BL will contain all 0s if bit 5 of AL was equal to 0, otherwise bit 5 of BL
will be equal to 1.
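A minimal sketch of this test follows. The value loaded into AL and the labels bit_clear and show are assumptions made for the illustration. The AND instruction also sets the Zero Flag, so a conditional jump can act on the outcome directly.

```nasm
bits 16
org 0x100
        mov al,10110101b    ; Assumed test value: bit 5 is 1
        mov bl,00100000b    ; Mask: only bit 5 is set
        and bl,al           ; BL = 00100000b if bit 5 of AL is 1, else all 0s
        jz bit_clear        ; ZF = 1 means the result was all 0s
        mov dl,'1'          ; Bit 5 was set
        jmp show
bit_clear:
        mov dl,'0'          ; Bit 5 was clear
show:
        mov ah,02h          ; DOS function 02h: display the character in DL
        int 21h
        int 20h             ; Return to DOS
```

AL is left unchanged because the mask, not AL, is the destination operand.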
Direct bit access instructions such as BSF, BSR, BT, BTC, BTR and BTS are available
Instruction   Contents of AL    CF   Explanation
              after execution
              of instruction
              10111011
SHL AL,1      01110110          1    Shifts each bit one position to the left, filling with
                                     a 0 on the right (low-order bit).
*SHL AL,2     11011000          1    Shifts each bit two positions to the left, filling
                                     with 0s on the right. High-order bits move
                                     through CF, so the last one to move out will be in
                                     CF.
MOV CL,3                             CL = 3
Instruction   Contents of AL    CF   Explanation
              after execution
              of instruction
              10111011
SHR AL,1      01011101          1    Shifts each bit one position to the right, filling
                                     with 0 on the left (high-order bit).
                                     Low-order bit moves to Carry Flag (CF).
*SHR AL,2     00010111          0    Shifts each bit two positions to the right, filling
                                     with 0s on the left. Second bit out moves to
                                     CF.
MOV CL,3                             CL = 3
The SAL (Shift Arithmetic Left) and SAR (Shift Arithmetic Right) instructions
SAR is not identical to SHR. The sign bit is not regarded as a data bit. Bits are shifted to the
right a specified number of times and the sign bit retains its original value. The low-order bit
moves to the Carry Flag (CF).
Instruction   Contents of AL    CF   Explanation
              after execution
              of instruction
              10111011
SAR AL,1      11011101          1    Shifts each bit one position to the right, filling left-
                                     most position (high-order bit) with a copy of the
                                     sign bit. Low-order bit (right-most bit) moves to
                                     Carry Flag (CF).
*SAR AL,2     11110111          0    Shifts each bit two positions to the right, duplicating
                                     the sign bit on the left. Second bit that moves out
                                     on the right-hand side moves to CF.
MOV AL,07     00000111               AL = 07
SAR AL,1      00000011          1    Shifts to the right. Duplicate sign bit on the left. Sign
                                     bit retains its original value (0). Low-order bit (right-
                                     most bit) moves to Carry Flag (CF).
Instruction   Contents of AL    CF   Explanation
              after execution
              of instruction
              10111011
ROL AL,1      01110111          1    Rotates each bit one position to the left. The
                                     high-order bit moves into the right-most (low-order)
                                     position and is also copied to CF.
ROL AL,1      11101110          0    Rotates each bit one position to the left again. The
                                     high-order bit (now 0) moves into the low-order
                                     position and is also copied to CF.
ROR AL,1      01110111          0    Rotates each bit to the right. The bit on the right
                                     (low-order) moves into position on the left (most-
                                     significant bit) and into the CF.
The RCL (Rotate through Carry Left) and RCR (Rotate through Carry Right) instructions
Instruction   Contents of AL    CF   Explanation
              after execution
              of instruction
              10111011          0
RCL AL,1      01110110          1    Shifts (rotates) each bit one position to the left,
                                     filling right (low-order bit) with a copy of the CF.
                                     The high-order bit moves into the Carry Flag
                                     (CF).
RCR AL,1      10111011          0    Shifts (rotates) each bit to the right. The bit in
                                     the CF moves into the position on the left
                                     (most-significant) and the bit on the right
                                     moves into the CF.
Activity 5.5
the program. DEBUG interprets the assembled instruction differently, but executes it correctly!
Rather code shifts as shown in the example given above.
Suppose we want to determine the value of bit 0 of the AL register and jump to memory position
200h if it is equal to 0.
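One possible way of doing this with a shift is sketched below. The value in AL is an assumption, and the label bit0_is_zero stands in for the destination at 200h so that the example stays within the range of a short conditional jump. Note that SHR changes AL; if the original value is still needed, it should be copied to another register first.

```nasm
bits 16
org 0x100
        mov al,00000110b    ; Assumed test value: bit 0 is 0
        shr al,1            ; Bit 0 moves into CF (AL itself is changed)
        jnc bit0_is_zero    ; CF = 0 means bit 0 was 0
        mov dl,'1'          ; Bit 0 was 1
        jmp show
bit0_is_zero:               ; Stands in for the destination at 200h
        mov dl,'0'          ; Bit 0 was 0
show:
        mov ah,02h          ; DOS function 02h: display the character in DL
        int 21h
        int 20h             ; Return to DOS
```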
Stallings: study section 11.4. Stallings only considers the IEEE and the IBM S390 standards.
To give you some additional background, we look at different representations that are used on
other machines. These are the older IBM 370 format and the format used on the DEC PDP/11
and VAX machines.
The format for single-precision floating-point numbers as discussed in Stallings can be represented as
follows:
1 bit   8 bits   23 bits
s       e        f
To get the value of a number given a 32-bit bit pattern, we use the following formula:
$\text{value} = (-1)^{s} \times 1.f \times 2^{e-127}$
Note the implied 1 to the left of the binary point. Remember that in the IEEE format the number
is normalised in such a way that we have one 1 to the left of the binary point.
Since this 1 is always present, it is not necessary to store it. This is known as a hidden bit.
Note that this formula cannot be used for numbers that have not been normalised.
Activity 5.6
$7.6875_{10} = 111.1011_2$
$= 1.111011_2 \times 2^{2}$ (normalise)
Remember that we have a hidden bit to the left of the binary point and that this is not stored
as part of the number. This means that f = 1110110...0.
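The exponent and fraction fields can be worked out as follows. This is a sketch that assumes the standard IEEE 754 single-precision bias of 127 and covers the magnitude 7.6875 only; the sign bit s is set separately (0 for a positive number, 1 for a negative number).

```latex
\begin{align*}
7.6875_{10} &= 111.1011_2 = 1.111011_2 \times 2^{2} \\
e &= 2 + 127 = 129_{10} = 10000001_2 \\
f &= 111011\underbrace{0\ldots0}_{17\ \text{zeros}} \quad \text{(23 bits, hidden leading 1 omitted)}
\end{align*}
```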
s   e          f
1   10000001   1110110...0
1 bit   7 bits   24 bits
s       e        f
Note that the radix of the exponent is 16 and not 2! The exponent is represented using excess
64.
Activity 5.7
$7.6875_{10} = 111.1011_2$
s   e         f
1   1000001   011110110...0
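The fields for this format can be derived as sketched below. The radix of 16 and the excess-64 exponent are stated above; the assumption made here is that the fraction has the form 0.f with no hidden bit, as in the usual IBM 370 short format. Only the magnitude 7.6875 is worked out; the sign bit s is set separately.

```latex
\begin{align*}
7.6875_{10} &= 111.1011_2 = 0.0111\,1011_2 \times 16^{1} \\
e &= 1 + 64 = 65_{10} = 1000001_2 \\
f &= 0111\,1011\underbrace{0\ldots0}_{16\ \text{zeros}} \quad \text{(24 bits)}
\end{align*}
```

Because the radix is 16, the binary point moves four bit positions at a time, which is why the fraction starts with a leading 0.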
The format of a single-precision floating-point number in DEC PDP 11/VAX format is as follows:
1 bit   8 bits   23 bits
s       e        f
This format also uses a radix of 2 for the exponent. Note that in this case the 1 is to the right of
the binary point in the formula. Again, since we are working with normalised numbers, it is not
necessary to store the first 1 after the binary point. So we again have a hidden bit in this
representation. Let us look at an example.
Activity 5.8
Show how you would represent 7.6875 using the DEC single-precision floating-point representation.
$7.6875_{10} = 111.1011_2$
$= 0.1111011_2 \times 2^{3}$ (normalise)
So e = 10000011
We have a hidden bit to the right of the binary point so the first 1 of the significand is not stored.
f = 1110110...0
s   e          f
0   10000011   1110110...0
5.5 Summary
Stallings: section 11.7. Make sure that you understand the arithmetic, shift and logical
instructions discussed in this study unit and that you can convert floating-point numbers to the
format in which they will be stored. You should know the meaning of all the key terms that are
listed in this section.
----oooOooo----
15 Study unit 6
Instruction sets 16
This study unit supplements chapter 13 and Appendix B in Stallings. Several very
important concepts are discussed. These include operand types, instruction types and types
of operation. The use of subroutines and macros are discussed and we also look at t i h i
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
Stallings
Study sections 13.1 - 13.4; the first part of section 13.5; Appendix B.1, and also section
14.5.
6.1 Introduction
Stallings: study section 13.2. Operands are data on which machine instructions
operate. Make sure that you understand the different categories of data.
Note that an address is also a data item in its own right.
Stallings: study section 13.3. It is important that you work through sections 4 to 6 of
Appendix B in this study guide before continuing with this study unit. You will not
understand the examples if you do not know the different data types and how storage
space is reserved when using NASM.
Stallings: study section 13.5 up to, but not including, the section on x86 SIMD instructions.
Read the rest of section 13.5.
We consider some of the concepts discussed in this chapter of Stallings in more detail:
The JMP instruction is used for branching and is called an unconditional jump.
To branch or not to branch! A conditional jump uses the status of the flags to decide
whether or not to branch. A conditional branch frequently follows after a compare statement,
CMP. Example:
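A minimal sketch of a compare followed by a conditional jump is given here. The values in AX and BX and the labels is_less and show are assumptions made for the illustration.

```nasm
bits 16
org 0x100
        mov ax,5            ; Assumed values for the illustration
        mov bx,9
        cmp ax,bx           ; Sets the flags according to AX - BX
        jl is_less          ; Signed comparison: jump if AX < BX
        mov dl,'G'          ; Fall through: AX >= BX
        jmp show            ; Unconditional jump
is_less:
        mov dl,'L'          ; AX < BX
show:
        mov ah,02h          ; DOS function 02h: display the character in DL
        int 21h
        int 20h             ; Return to DOS
```

CMP performs the subtraction only to set the flags; neither AX nor BX is changed.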
Note that a conditional jump instruction is only two bytes long. The destination address is
stored in one byte as an offset from the current value of IP, in other words, the difference
between the current value of IP and the destination address. This offset must fit into one
byte, otherwise NASM will flag an error. This means that we can only branch one "byte" far!
Suppose we want to assemble the instruction JL 300h at address 200h. You will see that
the offset from address 200h to address 300h is 100h bytes (300h - 200h). If you try to
assemble this instruction it will be flagged as an error by NASM because we have only one
byte to store the offset from 202h (the current value of IP which points to the next instruction
to be executed) to the destination address. This means that, from address 202h, we cannot
conditionally jump further than address 281h forward (281h - 202h = 7Fh = (127)10), or
further than address 182h backward (202h - 182h = 80h = (128)10).
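A common way around the one-byte limit is to invert the condition and jump over an unconditional JMP, which is not restricted to a one-byte offset. The sketch below assumes example values in AX and BX; the labels not_less and far_away and the filler bytes are hypothetical and only serve to place the destination out of short-jump range.

```nasm
bits 16
org 0x100
        mov ax,1            ; Assumed values: AX < BX, so the branch is taken
        mov bx,2
        cmp ax,bx
        jnl not_less        ; Inverted condition: short jump over the JMP
        jmp far_away        ; Unconditional jump, not limited to one byte
not_less:
        int 20h             ; Reached only when AX >= BX
        times 200 nop       ; Filler: far_away is now more than 127 bytes away
far_away:
        int 20h             ; Reached when AX < BX
```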
Activity 6.1
Conditional branching
temp = y;
x = 0;
top:
x = x + 1;
temp = temp - z;
main:
There are, of course, various ways in which loop control can be handled. In x86 assembly
language, the CX register, in combination with the LOOP instruction, can be used for this
purpose. It works as follows:
Initially, we set CX to the number of times we want the loop to be repeated. When the LOOP
instruction is executed, CX is decremented by 1. If CX ≠ 0, we branch back to LOOP_1. If CX
= 0, the instruction following the LOOP instruction is executed.
Activity 6.2
Write an assembly language program equivalent to the following short algorithm: This algorithm
implements the operation x = y * z by repeated addition.
x = 0;
for temp = y to 0
x = x + z;
add al,bl ; AL = AL + z
loop loop_1 ; Repeat loop if CX <> 0
mov [x],al ; x = y * z
int 20h
;end
Note that we assume that the result will fit into one byte.
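For reference, a complete version of such a program could look as follows. This is a sketch: the values of y and z, the guard against y = 0 and the label done are assumptions, and y is treated as an unsigned count.

```nasm
bits 16
org 0x100
        jmp main            ; Jump over the data
x:      db 0                ; Result
y:      db 4                ; Assumed value
z:      db 3                ; Assumed value
main:
        mov al,0            ; AL = running total (x = 0)
        mov bl,[z]          ; BL = z
        mov cl,[y]          ; CL = y (loop counter)
        mov ch,0            ; Clear CH so that CX = y
        jcxz done           ; Skip the loop if y = 0
loop_1:
        add al,bl           ; AL = AL + z
        loop loop_1         ; CX = CX - 1, repeat loop if CX <> 0
done:
        mov [x],al          ; x = y * z (0Ch with the values above)
        int 20h             ; Return to DOS
```

Without the JCXZ guard, a count of 0 would make LOOP wrap CX around to FFFFh and repeat the loop 65536 times.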
Refer to the examples in Appendices C and D in this study guide. You will see that we use a
set of DOS and BIOS routines for all input and output. When we want to use one of these
routines, we have to set up the registers in a predetermined way and issue an interrupt
instruction. When the INT instruction is executed, control is transferred to either DOS or BIOS
(depending on the interrupt issued) that handles the I/O and returns control back to our
program.
Appendix D of this guide contains a number of I/O functions that we will find handy to use.
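As a small illustration of this pattern, the sketch below displays a string using DOS function 09h, which is also used elsewhere in this guide. The label msg and its text are assumptions.

```nasm
bits 16
org 0x100
        jmp main                    ; Jump over the data
msg:    db 'Hello',0dh,0ah,'$'      ; '$' marks the end of the string
main:
        mov dx,msg          ; Set up: DX = offset of the string
        mov ah,09h          ; Set up: AH = 09h, the display string function
        int 21h             ; Issue the interrupt: control passes to DOS
        int 20h             ; Return to DOS
```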
Subroutines
Procedures are used in Pascal, and methods in C++ and Java, to divide a program into smaller
portions. In assembly language, a subroutine can be regarded as the equivalent of a
procedure.
There are various reasons for using subroutines in a program. The program may be very
large and different parts of it may be written by different programmers. Each programmer
may write his/her parts of the program in the form of subroutines. One programmer will be
responsible for writing the main program and all the routines will eventually be linked
together into one program. Another reason may be that parts of the program are written in
assembly language for efficiency and the rest of the program in a high-level language like
C++ or Java. Thirdly, a subroutine may be reusable; written once and used by different programs.
One of the reasons for using subroutines in COS2621 is to make the writing and testing of our
assembly language programs easier. The program can be divided into small routines that can
be tested in isolation. If we know that a routine is working correctly, there will be no need to
step through the code again as we test the rest of our program. The program needs a main
routine that will call the individual subroutines as required. All the code is combined into one
program. In COS2621, subroutines are not assembled and stored separately.
. .
. .
When the CALL instruction is executed, the current value of the IP, i.e. the address of the
instruction following the call, is automatically pushed onto the stack. The address of the first
instruction in the subroutine is stored in the IP and execution carries on from that point. When
the RET instruction is encountered in the subroutine, the top of the stack is popped into the IP
and the calling program can start executing at the instruction following the call. From this
description the importance of returning from a subroutine to the calling program via the RET
instruction should be clear. One should never jump out of a subroutine back to the calling
program without restoring the state of the stack. Fortunately, this is done automatically for us
as part of the RET instruction.
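The mechanism can be observed with a very small program such as the sketch below. The subroutine double_it and the value in AX are hypothetical and serve only to show the flow of control.

```nasm
bits 16
org 0x100
        jmp main            ; Jump to main program
; Subroutine: doubles the value in AX. The result is returned in AX.
double_it:
        add ax,ax
        ret                 ; Pop the return address into IP
main:
        mov ax,21           ; Assumed value
        call double_it      ; Push the address of the next instruction, jump to double_it
        mov bx,ax           ; Execution resumes here with AX = 42 (002Ah)
        int 20h             ; Return to DOS
```

Tracing this under DEBUG shows SP decreasing by 2 at the CALL and increasing by 2 at the RET.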
When a program is very large it can be subdivided into separate subroutines that are
assembled individually and eventually linked together. This can be done using NASM, but this
will not be covered in COS2621.
Parameters can be passed to the subroutine in a number of different ways. We look at the
following possibilities in this section:
1. By defining a block in memory, called a parameter block, where the parameters are stored.
2. By storing the parameters in registers before calling the subroutine.
3. By pushing the parameters onto the stack.
In x86 assembly language, parameter passing can be done in any of the ways described
above. It is very important for reusability that the comments in the subroutine heading describe
very clearly where and how the subroutine expects to find the parameters and where the
result, if one is passed back to the caller, will be stored by the subroutine.
Calling by name
If parameters are passed using a parameter block in memory (calling by name), the address
of the block is stored in a specific register before the subroutine is called. Let us look at an
example.
Accept_string:
ret ; Return
This parameter-passing mechanism is suitable to use when we have one or more parameters
that may include data structures such as arrays or strings.
If parameters are passed in registers (calling by value), the calling program must simply store
the parameters in the relevant registers before calling the subroutine.
This mechanism is fast but can only be used when we have to pass a few values that will
fit into registers.
Parameters can also be passed on the stack although this may involve tricky
programming. Remember that the return address is pushed onto the top of the stack when
the subroutine is called.
This mechanism is suitable for passing a small number of individual values/addresses but only
in cases where there is a restriction on the number of available registers. It is not suitable for
passing parameters that involve relatively large data structures. Note that it will be slower than
passing parameters in registers since we have to manipulate the stack which resides in
memory.
We use the exercises in the following section to illustrate the different possibilities.
Consider a subroutine that returns the largest of three numbers that are passed to it as
parameters. The following code forms the body of the subroutine, i.e. that part of the
subroutine that does the work and will be the same no matter how the parameters are passed.
;
; We assume that the three numbers are in AX, BX and CX
; respectively. The result is returned in AX
;
body: cmp ax,bx ; AX > BX?
jg next1 ; Yes, compare next
mov ax,bx ; BX greater
next1: cmp ax,cx ; AX > CX?
jg epilog ; Yes, AX holds the largest
mov ax,cx ; CX greater
epilog: ...
Assumptions: Let us assume that the three values are stored in memory in A, B and C
respectively. The result must be stored in A. We also assume that the contents of registers
AX, BX, CX and DX will be undefined after the subroutine call.
Activity 6.3
The parameters are passed on the stack and the result is returned on the stack.
The three parameters are pushed onto the stack before the call. The result (i.e. the largest number)
is returned on the stack.
Start-up sequence
push word [A] ;
push word [B] ; Push the three values onto the stack
push word [C] ;
Prolog
Epilog
Clean-up sequence
pop word [D] ; Pop the result into D
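One possible complete version of this activity is sketched below. The values of A, B and C, the use of BP to reach the parameters and the use of RET 4 to discard two of the three parameters are assumptions about how the prolog and epilog could be written, not necessarily the approach intended in this guide.

```nasm
bits 16
org 0x100
        jmp main            ; Jump to main program
A:      dw 17               ; Assumed values
B:      dw 42
C:      dw 8
D:      dw 0                ; Receives the result
largest:
; Prolog: fetch the parameters from the stack
        push bp
        mov bp,sp           ; [bp] = saved BP, [bp+2] = return address
        mov ax,[bp+8]       ; First value pushed (A)
        mov bx,[bp+6]       ; Second value pushed (B)
        mov cx,[bp+4]       ; Third value pushed (C)
body:   cmp ax,bx           ; AX > BX?
        jg next1
        mov ax,bx
next1:  cmp ax,cx           ; AX > CX?
        jg epilog
        mov ax,cx
; Epilog: leave the result on the stack
epilog: mov [bp+8],ax       ; Overwrite the first parameter with the result
        pop bp
        ret 4               ; Return and discard the other two parameters
main:
        push word [A]       ; Start-up sequence
        push word [B]
        push word [C]
        call largest
        pop word [D]        ; Clean-up: pop the result into D (42 here)
        int 20h             ; Return to DOS
```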
Activity 6.4
The parameters are passed in AX, BX and CX respectively. The result is returned in AX.
Start-up sequence
mov ax,[A] ;
mov bx,[B] ; Get values into registers
mov cx,[C] ;
Prolog
Nothing needs to be done in the prolog. The parameters are already in the required registers.
Body
Epilog
Nothing further needs to be done in the epilog. The result is already in AX.
ret
Clean-up sequence
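Putting the pieces of this activity together gives a program along the following lines. This is a sketch: the values of A, B and C are assumptions, and the clean-up step simply stores the result in A, as required by the assumptions stated earlier.

```nasm
bits 16
org 0x100
        jmp main            ; Jump to main program
A:      dw 17               ; Assumed values
B:      dw 42
C:      dw 8
largest:                    ; Parameters in AX, BX and CX; result in AX
        cmp ax,bx           ; AX > BX?
        jg next1
        mov ax,bx
next1:  cmp ax,cx           ; AX > CX?
        jg epilog
        mov ax,cx
epilog: ret
main:
        mov ax,[A]          ; Start-up sequence
        mov bx,[B]
        mov cx,[C]
        call largest
        mov [A],ax          ; Clean-up: store the result (42 here) in A
        int 20h             ; Return to DOS
```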
Activity 6.5
The address of the parameter block (in memory) is passed to the subroutine in the DX
register. The result is returned in the parameter block.
Start-up sequence
Prolog
Body
Epilog
; block
ret
Clean-up sequence
Calling program, get the result: Nothing needs to be done in the clean-up sequence. The result
is already stored in memory at the required address.
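A sketch of this variant is given below. The layout of the parameter block (three consecutive words, with the result written back to the first word) and the values in it are assumptions. Because DX cannot be used as a pointer in a 16-bit memory operand, the prolog copies the address to SI.

```nasm
bits 16
org 0x100
        jmp main            ; Jump to main program
block:  dw 17, 42, 8        ; Parameter block: A, B, C (assumed values)
largest:
; Prolog: fetch the parameters from the block
        mov si,dx           ; Copy the block address to a pointer register
        mov ax,[si]         ; A
        mov bx,[si+2]       ; B
        mov cx,[si+4]       ; C
; Body
        cmp ax,bx           ; AX > BX?
        jg next1
        mov ax,bx
next1:  cmp ax,cx           ; AX > CX?
        jg epilog
        mov ax,cx
; Epilog: store the result in the first word of the block
epilog: mov [si],ax
        ret
main:
        mov dx,block        ; Start-up: DX = address of the parameter block
        call largest
        int 20h             ; Return to DOS
```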
The subroutine calls we have looked at so far are all near calls. This means that the main
program as well as the subroutine reside within the same segment. The value of the CS
register will be the same for both.
If the main program and the subroutine are not within the same segment, or if the subroutine
was assembled and stored as a separate unit, the main program has to make a far call to the
subroutine. When a far call is made, the value of the IP as well as the current value of the CS
register is stored on the stack. On return from the subroutine, the CS register is restored to
the code segment address of the caller and the IP register is restored to point to the
instruction following the call where the execution of the caller must be resumed. We are only
going to use near calls in our programs.
6.4.7 Recursion
This section does not form part of the study material but is included for interest’s sake. You
have probably already encountered recursion in COS2611. DEBUG is an excellent tool to
use to see exactly what happens when we call a recursive routine. Use the
following example to step through the program and study the stack after each call and after
each return.
Activity 6.6
bits 16
org 0x100
jmp main ; Jump to main program
;
; This recursive subroutine calculates the sum of
; the numbers 1 to n (CX initially contains n).
; AX must contain 0 on first call. Result is returned in
; AX.
recur:
or cx,cx ; Is CX = 0?
jz return ; Yes, return
            Return address   CX   AX (decimal)   AX (hex)
1st call:   L1               6    0              0
2nd call:   L2               5    6              6
3rd call:   L2               4    11             B
4th call:   L2               3    15             F
5th call:   L2               2    18             12
6th call:   L2               1    20             14
7th call:   L2               0    21             15
In our example 1 + 2 + 3 + 4 + 5 + 6 = 21 (decimal) = 15h.
We suggest that you test the above program by using the T option of DEBUG. Inspect the
stack after each call, and after each return instruction has been executed.
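A complete program that is consistent with the table above could look like the sketch below. The instructions after the JZ and the main program are assumptions; the labels L1 and L2 mark the two return addresses shown in the table.

```nasm
bits 16
org 0x100
        jmp main            ; Jump to main program
recur:
        or cx,cx            ; Is CX = 0?
        jz return           ; Yes, return
        add ax,cx           ; Add the current number to the sum
        dec cx              ; Next number
        call recur          ; Recursive call pushes L2 onto the stack
L2:                         ; Return address of every recursive call
return:
        ret
main:
        mov cx,6            ; n = 6
        mov ax,0            ; The sum starts at 0
        call recur          ; First call pushes L1 onto the stack
L1:     int 20h             ; AX = 0015h (21 decimal)
```

At the deepest point the stack holds seven return addresses: L1 once and L2 six times.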
Things to remember:
The return address: In x86 assembly language, the CALL instruction pushes the offset of the
next instruction to be executed onto the stack and transfers control to the subroutine. The
RET instruction pops the address on top of the stack into the IP and execution resumes from
this location.
Stallings: revise section 12.5. The main reason for using assembly language is to improve
the efficiency of a program if the execution time of a program is crucial.
There is also method in the madness of our teaching assembly language to our students! It
helps one to understand the underlying operations of a computer system and also to
understand the way in which a compiler operates.
Pseudo-instructions are also called assembler directives. These are instructions to the
assembler and do not form part of the program code that will be executed during run time.
DB and DW are examples of assembler directives that we have come across in our
programs so far. Assembler directives tell the assembler where and how to reserve
memory and how it should be initialised. There are several other assembler directives that
are important when we are implementing large programs but we will not consider these
here.
6.5.2 Macros
It is important to note the difference between a macro and a subroutine. A macro is not
the same as a subroutine. We have already looked in detail at subroutines in the previous
section. When a subroutine is called, we branch to an area in memory where the
subroutine is stored and execute the code. Then we branch back to our main or caller
program. This happens at run time.
When a macro is called, the piece of code comprising the macro is duplicated in the
program. We call this process macro expansion. This happens during assembly time.
Activity 6.7
A macro gives a name to a piece of code. Whenever we refer to the macro, the code is
duplicated in the program during the assembly process.
db 'Hello World',10,13,'$'
mes2:
%endmacro
;
14 mov ah,09
15 int 21h
16 %endmacro
17
18 main:
19 disp mes1
20 00000029 BA[0300] <1> mov dx,%1
21 0000002C B409 <1> mov ah,09
Note how the code within the macro definition is duplicated in the main program. The macro definition
does not exist in the assembled program.
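A compact source file that shows the same idea is sketched below. The names disp, mes1 and mes2 follow the fragments of the listing above; the text of mes2 is an assumption.

```nasm
bits 16
org 0x100
        jmp main                        ; Jump to main program
mes1:   db 'Hello World',10,13,'$'
mes2:   db 'Goodbye',10,13,'$'          ; Assumed text
%macro disp 1                           ; Macro with one parameter
        mov dx,%1                       ; %1 is replaced by the argument
        mov ah,09                       ; DOS display string function
        int 21h
%endmacro
main:
        disp mes1           ; Expands to the three instructions above
        disp mes2           ; Expands to a second copy of the same code
        int 20h             ; Return to DOS
```

Assembling with a listing file (nasm -f bin -l followed by a listing file name) shows each expansion, with the expanded lines marked <1>.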
NB: You should be able to manipulate the stack on an x86 machine. See section
4.5.1.4 in the study guide. We will consider the stack again in the next study unit.
6.7 Summary
Stallings: section 13.7. Make sure that you understand the different categories of data, the use of all
the different kinds of instruction, and the use of subroutines, macros and pseudo-instructions. You
should also know the meaning of all the key terms that are listed in section 13.7.
----oooOooo----
17 Study unit 7
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
• give examples of instructions in which specific x86 addressing modes are used
Stallings: study section 14.1. The addressing mode used in an instruction indicates the way in
which the operand is accessed. We have two operands in binary operations such as addition, for
example, but generally, at least one of the operands is stored in a register.
Consequently, we refer to the addressing mode of the operand that is not obtained directly from a
register as the addressing mode of the instruction. The examples below illustrate this principle.
Stallings: study section 14.2 excluding the section on the ARM addressing modes.
Read the rest of section 14.2, i.e. the section on ARM addressing modes.
We examine the addressing modes of the Intel x86 in more detail below.
A number of different addressing modes can be identified in the Intel x86 instruction set. The addressing
mode is determined by the operands in the instruction.
Register addressing:
add ax,bx ; AX = AX + BX
Both operands are in registers. The first operand is always the destination operand.
Immediate addressing:
; into BX
Note: In contrast to this, mov bx, temp stores the address of the label temp in BX.
One of the operands is the address of the actual operand. This address is the offset (displacement)
from the start of the Data Segment.
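The addressing modes discussed so far can be compared side by side in a short sketch. The label temp follows the note above; its contents and the value 25 are assumptions.

```nasm
bits 16
org 0x100
        jmp main            ; Jump over the data
temp:   dw 1234h            ; Assumed contents
main:
        mov ax,bx           ; Register addressing: both operands are registers
        mov cx,25           ; Immediate addressing: the value 25 into CX
        mov bx,temp         ; Immediate addressing: the address of temp into BX
        mov dx,[temp]       ; Direct addressing: the contents of temp (1234h) into DX
        mov ax,[bx]         ; Register indirect: BX holds the address, so AX = 1234h
        int 20h             ; Return to DOS
```

The square brackets make the difference: temp on its own is an address, whereas [temp] is the contents stored at that address.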
Stallings: revise and study section 13.3. We use examples to illustrate the use of the following two
addressing modes, in which indirect addressing is used and which are a bit more complicated. Both of
these are examples of register indirect addressing.
Indirect addressing means that the operand given in the instruction contains the address of the actual
operand. On the Intel x86 we can only have register indirect addressing, using SI, DI, BP and BX. Hence
indexed addressing and base-indexed addressing, which we consider in the examples given in section
7.2.1, are both forms of register indirect addressing.
Activity 7.1
Consider the case where we want to access each individual element in a character string. We
can regard the string as an array of characters and use a pointer to access each individual
character. Suppose we have the following string that starts at memory position 102h:
String   S  C  I  E  N  C  E     I  S     F  U  N
Offset   0  1  2  3  4  5  6  7  8  9  10 11 12 13
If we set SI (the source index register) to the start of the string, we can use SI as a pointer to access
the characters one at a time. The character 'S' is stored at offset 0 from the start of the string and
the character 'N' is stored at offset (13)10 from the start of the string.
Example:
Suppose we want to access the character 'F' which is stored at offset (11)10 from the start of the string.
db 'SCIENCE IS FUN'
.
mov si,102h ; Move addr of start of string to SI
add si,11 ; SI now points to the character 'F'
Let us look at another example to illustrate how we can use indexed addressing to access each
element in the string.
Activity 7.2
Read each element of a string (array of characters), which we assume is in upper case, from
memory, convert it to lower case and store it in an output string.
Assume that the input string, 'SCIENCE IS FUN', is stored in memory. We use indexed
addressing as follows:
We use SI to point to the input string, so SI is initialised to the starting address of the input string.
We use DI to point to the output string, so DI (Destination Index) is initialised to the starting
address of the output string.
Convert to lower case (This is done by adding 20h to the contents of AL since
the difference between an upper and lower case ASCII character is 20h)
Increment SI
Increment DI
End do;
Solution
Note that we do not test whether the character is in fact an upper case character, so the program will
give incorrect results if the input string contains characters that are not upper case (which is indeed
the case for the spaces in our string).
4 0000000C 532046554E
; output string
6 out_len: equ $-out_str ; Define length of output
; string
7 main:
11 loop_1:
16 0000002F 47 inc di
The input string starts at offset 103h and the output string starts at offset 111h.
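A source file along the following lines would produce the layout described above. It is a sketch: the labels out_str, out_len, main and loop_1 follow the listing fragments, while in_str and the exact instructions in the loop are assumptions.

```nasm
bits 16
org 0x100
        jmp main                        ; Jump to main program (3 bytes)
in_str:  db 'SCIENCE IS FUN'            ; Input string, starts at 103h
out_str: times 14 db 0                  ; Output string, starts at 111h
out_len: equ $-out_str                  ; Define length of output string
main:
        mov si,in_str       ; SI points to the input string
        mov di,out_str      ; DI points to the output string
        mov cx,out_len      ; Number of characters to convert
loop_1:
        mov al,[si]         ; Read a character (indexed addressing)
        add al,20h          ; Convert to lower case
        mov [di],al         ; Store it in the output string
        inc si
        inc di
        loop loop_1         ; Repeat until CX = 0
        int 20h             ; Return to DOS
```

As noted in the solution, the two spaces (20h) are also incremented by 20h and therefore end up as '@' (40h) in the output string.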
Base-indexed addressing can also be used to access array elements. BX and BP can be used
as base registers. When we use this type of addressing mode, the base register is normally
set to the start of the array and the index register is used as an offset from the start of the
array.
Activity 7.3
org 0x100
;
; Program to select a classic movie from a list of 9
;
jmp main ; jump to main program
;
; Prompt message
;
message: db 'Please enter a theatre number',0ah,0dh,'$'
;
; List of movies you can select from
;
;
; Each of the values in m_addr occupies 2 bytes.
; This means that the first movie address is at offset 0 in
; m_addr,
; the second movie address is at offset 2,
; the third movie address is at offset 4,
; the fourth,…, ninth movie addresses are at offsets
; 6,8,10,12,14,16 respectively.
;
; Display message on the screen
; DX points to start of message
;
display:
mov ah,09 ; Display message function
int 21h ; DOS system call
ret
;
; Accept a character from the keyboard
; Character (in ASCII) is returned in AL
; Character is not echoed to the screen
;
input:
main:
Activity 7.4
Two‐dimensional arrays:
Calculate the sum of all the elements in the second column of a 3 x 5 array.
db 60h,70h,80h,90h,0xa0
;
; Add the elements in the second column
main:
mov bx,first_row ; BX points to the first row
mov si,1 ; SI points to column 2
Run this program under DEBUG. The AX register should contain 0091h on termination.
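A complete program of this kind is sketched below. Only the last row of the array appears above; the first two rows are assumed values, chosen so that the second column adds up to 0091h. The loop and the label next_row are also assumptions.

```nasm
bits 16
org 0x100
        jmp main                        ; Jump to main program
first_row:  db 00h,01h,02h,03h,04h      ; Assumed values
            db 10h,20h,30h,40h,50h      ; Assumed values
            db 60h,70h,80h,90h,0xa0     ; Row shown above
main:
        mov bx,first_row    ; BX points to the first row
        mov si,1            ; SI points to column 2
        mov ax,0            ; AX = running sum
        mov cx,3            ; Three rows
next_row:
        add al,[bx+si]      ; Base-indexed: element in column 2 of this row
        adc ah,0            ; Carry into AH so that the sum can exceed 8 bits
        add bx,5            ; Each row is 5 bytes long
        loop next_row
        int 20h             ; AX = 0091h with the values above
```

The base register BX selects the row and the index register SI selects the column within that row.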
Stack Pointer (SP): The SP register contains the offset from the beginning of the stack to the top of
the stack. We use the instructions PUSH and POP to push values onto the stack and to pop them off
the stack respectively. This addressing mode is called stack addressing.
PUSH: An item is stored on top of the stack and the stack pointer adjusted accordingly. SP is
decremented before a new item is placed on the stack. This means that the stack grows from high
memory to low memory. We say that the stack grows backward in memory. SP points to the last item
that was pushed onto the stack.
POP: An item is removed from the top of the stack and the stack pointer adjusted accordingly. SP is
incremented after an item has been removed from the stack. SP points to the new top of stack.
Note that the operations of PUSH and POP are slightly different on some older machines. This may
not concern you. If you have an older machine though, this might explain differences between the
results obtained by you on your machine and the results given in the next example which was tested
on a relatively new Intel machine. If your results are different, the following is the cause: On some of
the older machines, when the PUSH instruction is executed, SP is decremented after an item has been
pushed on the stack. This means that SP points to the next available position on the stack. With the
POP instruction, SP is incremented before the item on top of the stack is popped.
Activity 7.5
The table given below shows the state of the stack and the registers after each instruction listed
has been executed. All values are in hexadecimal.
SP is decremented by 2. (The
stack grows from high memory
to low memory.)
POP AX 0002 0001 0002 FFF8 0001 The value on top of the stack (ie
0002) is popped into AX.
00A5 SP is incremented by 2.
After POP AX has been executed, AX contains the value that was on top of the stack, namely
0002. The stack pointer is now pointing to the value 0001. The popped value (0002) no longer
forms part of the stack. Only the values below the stack pointer form part of the stack.
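The behaviour of PUSH and POP can be traced with a short sequence such as the sketch below. The values 0001 and 0002 match those in the discussion above; the rest of the sequence is an assumption.

```nasm
bits 16
org 0x100
        mov ax,0001h
        mov bx,0002h
        push ax             ; SP = SP - 2, then 0001 is stored at the new top
        push bx             ; SP = SP - 2, then 0002 is stored at the new top
        pop ax              ; AX = 0002, then SP = SP + 2
        pop bx              ; BX = 0001, then SP = SP + 2
        int 20h             ; Return to DOS
```

Because the values are popped in the reverse order, the contents of AX and BX end up exchanged.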
Things to remember:
7.5 Summary
Stallings: section 14.7. You should be able to identify addressing modes used in Intel x86 instructions
and to give examples of instructions where a specific addressing mode is used. Make sure that you
understand the issues involved when choosing appropriate instruction formats and that you know the
meaning of the key terms listed in section 14.7.
----oooOooo----
Study Unit 8
This study unit supplements chapter 5 of Stallings. Concepts pertaining to cache memory
principles, design and organisation are discussed in this study unit.
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
Cache memory
Stallings: study section 5.1. The characteristics of memory systems are discussed. These include
memory capacity, the basic unit of transfer, the method of accessing and performance. The memory
hierarchy as well as the principle of locality of reference are also explained.
8.2 Cache memory principles
Stallings: study section 5.2. The structure of cache memory is explained. Stallings also describes
how a read from cache memory takes place to illustrate why this is much faster than a read from main
memory.
Stallings: carefully read section 5.3 and study the last page. The concepts explained here are often
referred to in the media in advertisements of laptops, for example.
Stallings: study the section on the cache organisation of the Pentium 4 in section 5.4.
Stallings: read the rest of section 5.4 and read section 5.5. Note the difference between the cache
organisations of the Pentium 4 and the ARM processor.
8.5 Summary
Make sure you understand the meaning of the key terms listed in this section.
---ooo0ooo---
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
Stallings
Stallings: study section 6.1. The operation of a memory cell is explained. Stallings also describes
the operation of dynamic RAM (DRAM) and static RAM. Finally, the different types of ROM are
discussed. Pay special attention to the discussion on flash memory.
Stallings: study section 6.2. Make sure that you understand the difference between a hard failure
and a soft error. The principles on which error-correcting codes function are discussed and the
Hamming error-correcting code is described.
Stallings: study section 6.3. Make sure that you understand the difference between SDRAM and
traditional DRAM.
9.4 Summary
Make sure you understand the meaning of all the key terms in this section.
---oooOooo---
This study unit supplements chapter 7 of Stallings. Magnetic disks are discussed as well
as optical storage devices.
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
Stallings
Stallings: study section 7.1. The way data is stored on and retrieved from magnetic disks is
discussed. The physical characteristics of a magnetic disk are described as well as the factors that
play a role in the performance of a disk.
10.2 RAID
Stallings: read section 7.2. The basic principles of a RAID scheme are discussed. This section
also contains detailed discussions of the various levels of RAID.
Stallings: study section 7.3. The principles on which different solid state storage devices function
are described.
Stallings: study section 7.4. The principles on which different optical storage devices function are
described.
Stallings: read section 7.5. The principles on which magnetic tape storage devices function are
described.
10.6 Summary
Make sure that you understand the meaning of the key terms listed in this section.
---oooOooo---
This study unit supplements chapter 8 of Stallings. Various aspects of Input/Output are
discussed including the operation of some external devices, the operation of I/O modules and
different I/O methods. Stallings also discusses external I/O interfaces like FireWire and
InfiniBand.
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
Stallings
Study sections 8.1 - 8.3, the first part of section 8.4, and sections 8.5 -
Stallings: study section 8.1. The different categories in which external devices can be classified
are discussed. We look in some detail at the operation of the keyboard and monitor and also disk
drives. Note that ASCII is the US version of IRA referred to in this section.
Stallings: study section 8.2. The most important functions, as well as the structure of an I/O
module are discussed.
Stallings: study section 8.3. The way in which Programmed I/O operates is discussed.
Stallings also gives a short description of the I/O commands as well as I/O instructions.
It is important to study this section together with the following two sections to clearly understand the
advantages and disadvantages of each method of communication.
Stallings: study section 8.4 up to (but not including) the section on the Intel Interrupt Controller.
Make sure that you understand how interrupts are used to implement this type of I/O communication
and that you can discuss the relevant design issues involved when this method is used.
Stallings: study section 8.5. The principles of Direct Memory Access I/O are discussed.
Note the comments regarding Programmed I/O and Interrupt-driven I/O.
Stallings: study section 8.6. Stallings gives a summary of the evolution of the I/O function in
computers. The characteristics of I/O channels are also considered.
Stallings: study section 8.7. Different types of interface are first considered and then Stallings
The principles of Thunderbolt technology are explained and we look at the configurations involved.
You need not go into the protocol details. You can read on the Internet about Thunderbolt version 2
as well.
Stallings also discusses InfiniBand architecture. You need not go into the protocol details (InfiniBand
operation).
11.8 Summary
Stallings: section 8.10. Make sure you understand the meaning of all the key terms in this section.
----oooOooo----
Notes
Study unit 12
Reduced Instruction Set Computers
This study unit supplements chapter 17 of Stallings. RISC and CISC design principles, as well
as the RISC vs CISC controversy are discussed. Stallings also considers compiler-based
register optimisation.
Learning outcomes
Once you have mastered the study material in this study unit, you will be able to
Stallings
Stallings: study the introduction to this chapter as well as section 17.1. The way in which an
instruction is executed is once again revised to help us understand what CISC and RISC design
principles are all about.
Stallings: study section 17.2. One of the most important design principles of RISC machines is
the use of a large number of registers. The concept of register windows and the use of a large
register file versus the use of cache memory are discussed.
Stallings: study section 17.3. Compiler-based optimisation is discussed and the trade-off between
using this and using a large set of registers is also considered.
Stallings: study section 17.8. The merits of both the RISC and the CISC approaches are
discussed.
12.6 Summary
Make sure you understand the meaning of all the key terms in this section.
---oooOooo---
19 Appendix A
Contents
A.1 Introduction
A.3.1 A (Assemble)
A.3.2 R (Register)
A.3.3 T (Trace)
A.3.4 G (Go)
A.3.5 P (Proceed)
A.3.7 U ("Unassemble")
A.3.8 E (Enter)
A.3.9 N (Name), L (Load) and W (Write)
A.3.10 H (Hexadecimal)
A.3.11 I (Input)
A.3.12 Q (Quit)
Note
DEBUG does not allow variable names or labels. We have to work with actual
memory locations.
A.1 Introduction
DEBUG, which forms part of DOS within the Windows environment, is very useful for writing and
debugging small machine language and assembly language programs. We suggest that you
execute all the commands on your machine as you work through this appendix. You will not learn
much by just reading this section without using DEBUG.
We suggest that you create a directory (folder) called C2621 to store the programs that you write
for this module.
Open a DOS-window, go to the C2621 directory by typing cd \c2621, and type debug at the DOS
prompt. DEBUG responds with a hyphen (-) called the ‘DEBUG prompt’.
c:\c2621>debug <Enter>
A.3.1 A (Assemble)
The "a" command translates assembly language source statements into machine code. One can
use the "a" command to write and test small assembly language programs.
We will now create a short program consisting of only six instructions. We want to store this
program from memory address 100h (h signifies a hexadecimal number).
Key in the program as listed below and press the Enter key TWICE after you have finished
entering the program. We will use this program in the following sections to explain some of the
DEBUG commands. Do not get upset if you make a typing error as you can always change a
specific instruction. We will explain how to do this shortly. DEBUG will warn you if you key in an
illegal instruction and you will be allowed to key in the correct one.
Example: (Your entries are shown in italics and DEBUG's responses in bold - DO NOT TYPE
the hyphen (-).) Type in only the instructions shown in italics.
(Ignore the first step (-a 100 <enter>) if you have already entered it.)
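- a 100 <Enter>
nnnn:0100 mov ax,0015 <Enter>
nnnn:0103 mov cx,0023 <Enter>
nnnn:0106 sub cx,ax <Enter>
nnnn:0108 mov [0120],al <Enter>
nnnn:010B mov cl,[0120] <Enter>
nnnn:010F nop <Enter>
nnnn:0110 <Enter>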
nnnn might differ from session to session. This is the address of the code segment, which
need not concern you at this stage. The numbers following after nnnn are memory addresses
specifying the address (called the "offset") at which each instruction is stored.
The program is stored from position 0100h to position 010Fh in memory (remember that, in
DEBUG, all references to memory and all values are assumed to be in hexadecimal). The
length of the program in memory is: (010F - 0100 + 1) = 10h bytes. You may have noticed
that instructions may be one, two or three bytes in length. The first byte contains the opcode
and the other byte(s) contain(s) the operand(s). The NOP instruction is an example of a
one-byte instruction. The first two MOV instructions occupy three bytes each, and the
instruction SUB CX,AX occupies two bytes. The last two MOV instructions occupy three and
four bytes respectively. (You may find it strange that these two MOV instructions are of
different lengths. This is due to the instruction format of the x86 machines. Instructions that
involve the AX register are sometimes shorter than similar instructions involving other
registers.) The AX-register is also called the accumulator.
You can change any of the instructions at any stage. Suppose that you made a typing error
and have typed SUB AX,AX instead of SUB CX,AX. You need not retype the whole program.
You may correct the instruction in the relevant memory locations provided that the new
instruction is of the same length as the one that it replaces, otherwise it will overwrite the
instruction that follows it. Type a (at the DEBUG prompt), followed by the memory address
where you want to store the corrected instruction.
- a 106
A.3.2 R (Register)
The "r" command displays the contents of the registers and allows you to change their values.
register
- r ip <Enter>
IP 0180 This means that the current value of IP is 0180h. (You can now enter a
new value to be loaded into IP or press Enter to leave the register unchanged.)
A.3.3 T (Trace)
The "t" command allows you to execute a program step by step. We are now ready to execute
the program that we have entered and to inspect the registers after the execution of each
instruction. Enter the program as described under the Assemble command (if you have not yet
entered it). We want to execute instructions from address 100h and must ensure that the IP
register, which acts as the program counter, contains the value 100h. To do this we use the "r
ip" command and reset IP to 100h.
We will then enter the "t" command repeatedly until we get to the NOP instruction.
N.B. Do not execute the instructions following the "nop". We do not know which instructions or data might
be stored there and something unforeseen might happen. Reboot your machine and start again if things
go wrong.
(Remember that your entries are printed in italics, DEBUG's responses in bold, and comments
in brackets):
- r ip <Enter>
:100 <Enter>
We will now execute the program by tracing it instruction by instruction. We do this by giving
DEBUG the "t" (for trace) command repeatedly. The instruction starting at the memory location
pointed to by IP (i.e. 100h) will be executed first. In the explanation that follows, we give a step-
by-step explanation of the consequences of the execution of each instruction.
Type:
The instruction MOV AX,0015 stored at memory location 100h is executed. Examine the
contents of the AX, CX and IP registers after the execution of each instruction.
IP=0103 NV UP EI PL NZ NA PO NC
Your display will look slightly different. We show only the information that interests us at this stage.
Second line of display above: IP points to address 0103h, which is the next instruction to be executed. NV,
UP, EI, PL, NZ, NA, PO and NC represent the status flags. The ones that we are interested in are the following:
The first one - the Overflow flag. NV means No Overflow. OV means Overflow occurred.
The fourth one - the Sign flag. PL means that the sign of the previous arithmetic operation was PLus (positive).
NG means it was NeGative.
The fifth one - the Zero flag. NZ means that the result of the previous arithmetic operation was Not
Zero. ZR means that it was ZeRo.
The last one - the Carry flag. NC means that No final Carry resulted from the previous arithmetic operation.
CY means that there was a final CarrY.
Third line of the display above: The hexadecimal number B92300h is stored at segment nnnn
and offset (address) 0103h. This value, i.e. B92300h, is the hexadecimal representation of the machine
code for the assembly language instruction MOV CX,0023. This instruction is to be executed next since IP
points to 0103h. Type t again to execute the instruction pointed to by IP (the one in 0103h).
- t <Enter>
The contents of AX remain unchanged and 0023h is stored in CX as expected. IP contains 0106h and
points to the next instruction, i.e. SUB CX,AX, which is stored from memory address 0106h. The machine
code for this instruction is 29C1. Type t again.
- t <Enter>
IP=0108 NV UP EI PL NZ AC PO NC
The contents of AX remain unchanged. CX now contains 000Eh which is the result of the subtraction. IP
points to 0108h which contains the instruction MOV [0120],AL. (Remember that AL is the low-order byte of
AX). The machine code of the instruction MOV [0120],AL is A22001h.
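The flag settings that DEBUG shows after SUB CX,AX can be checked by hand. The following Python sketch is only a model of a 16-bit subtraction, not part of DEBUG; the function name sub16_flags is our own. It also reports the auxiliary carry (NA/AC) and parity (PE/PO) codes, which appear in the display although we do not use them in this guide. The UP and EI codes are left out because a subtraction does not change them.

```python
def sub16_flags(dest, src):
    """Model a 16-bit SUB dest,src and return (result, flags).

    The flags use DEBUG's two-letter codes. This is an illustrative model only.
    """
    result = (dest - src) & 0xFFFF
    carry = dest < src                                   # borrow out of bit 15
    zero = result == 0
    sign = bool(result & 0x8000)                         # bit 15 of the result
    # Signed overflow: the operands differ in sign and the result's sign differs from dest's
    overflow = bool((dest ^ src) & (dest ^ result) & 0x8000)
    aux = (dest & 0xF) < (src & 0xF)                     # borrow out of bit 3
    parity_even = bin(result & 0xFF).count("1") % 2 == 0  # low-order byte only
    flags = {
        "overflow": "OV" if overflow else "NV",
        "sign": "NG" if sign else "PL",
        "zero": "ZR" if zero else "NZ",
        "aux": "AC" if aux else "NA",
        "parity": "PE" if parity_even else "PO",
        "carry": "CY" if carry else "NC",
    }
    return result, flags


result, flags = sub16_flags(0x0023, 0x0015)              # CX = 0023h, AX = 0015h
print(f"CX={result:04X}", " ".join(flags.values()))
```

For our program this prints CX=000E followed by NV PL NZ AC PO NC, which agrees with the DEBUG display above once UP and EI are added.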
Note that additional information appears on the right-hand side of the third line of the DEBUG display above.
This tells us that memory location 0120h will be addressed during the next instruction. At the moment it
can contain anything (we indicated "anything" by ??) that was stored there previously. Type t again.
- t <Enter>
nnnn:010B 8A0E2001
The contents of AX and CX remain unchanged. IP points to the next instruction, i.e. MOV
CL,[0120], which is stored at memory address 010Bh (CL is the low-order byte of CX). The machine code for
this instruction is 8A0E2001h. The memory location at address 120h now contains 15h. Type t again.
-t <Enter>
AX=0015 CX=0015
nnnn:010F 90 NOP
The contents of AX remain unchanged. The contents of memory address 0120h were loaded into CL and it now
contains 15h as expected. CH remains unchanged. IP points to the next instruction, i.e. NOP,
which is stored at memory address 010Fh. The machine code for this instruction is 90h.
A.3.4 G (Go)
The "g" command executes a program up to a certain specified point (called a breakpoint). Let us suppose
that we are debugging a huge program and that we know the exact location in memory (hundreds of
statements from the beginning) at which things start going wrong. It will be tedious to use the trace option to
reach this point. By specifying a breakpoint we can tell DEBUG to GO until this breakpoint is encountered.
Suppose we want to execute up to location 108h. The instruction in 108h will be the one which IP will point to
when execution is halted. Reset IP to 100h and type g 108. All the instructions of the program will be executed
up to but not including the instruction at address 108h.
NOTE: When we enter DEBUG commands, we do not type "h" to indicate hexadecimal numbers since all
numbers are assumed to be in hexadecimal.
A.3.5 P (Proceed)
[NB: Use Proceed instead of Trace or Go when the next instruction to be executed is
an INT.]
The "p" command is similar to Trace except when instructions such as CALL, INT and LOOP are encountered.
If "p" is used, DEBUG will perform ALL the machine code instructions associated with the instruction and will
stop at the instruction following the CALL, INT or LOOP. Proceed can be used to avoid tracing all the
instructions of loops or subroutines of which we do not want to see the detail. It will also avoid tracing system
calls like interrupts (INT instructions).
We will now add a new instruction, INT 20, to our program. We assume that you still have the program in
memory and that you are still in DEBUG. (If not, key in the program again.) Type:
- a 110 <Enter>
nnnn:0110 int 20 <Enter>
nnnn:0112 <Enter>
The INT 20 instruction causes the program to terminate and return to DEBUG.
Rerun your program and type p when INT 20 has to be executed. DEBUG will respond with:
-
A.3.6 D (Dump or Display)
This command can be used to examine in more detail the way our assembly language instructions, and the
data that is defined within the program, are stored in memory.
The "d" command displays a portion of memory in hexadecimal. d 100 10F will display (list) the contents of
memory locations from address 100h up to 10Fh.
nnnn:0100 B8 15 00 B9 23 00 29 C1-A2 20 01 8A 0E 20 01 90
This is how the program is stored in memory - each instruction following immediately after the previous one.
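To see how the dump lines up with the six instructions, we can split the sixteen bytes according to the instruction lengths discussed in section A.3.1. The Python sketch below is only a bookkeeping aid and not a disassembler: the lengths are typed in by hand, and the variable names are our own.

```python
# The sixteen bytes shown by "d 100 10F"
dump = bytes.fromhex("B8 15 00 B9 23 00 29 C1 A2 20 01 8A 0E 20 01 90")

# (instruction, length in bytes) in program order, as described in section A.3.1
program = [
    ("MOV AX,0015", 3),
    ("MOV CX,0023", 3),
    ("SUB CX,AX", 2),
    ("MOV [0120],AL", 3),
    ("MOV CL,[0120]", 4),
    ("NOP", 1),
]

offset = 0x100            # the program was assembled from offset 100h
listing = []
for mnemonic, length in program:
    start = offset - 0x100
    code = dump[start:start + length].hex().upper()
    listing.append((offset, code, mnemonic))
    print(f"{offset:04X} {code:<8} {mnemonic}")
    offset += length

total = offset - 0x100
print(f"length = {total:X}h bytes")
```

The offsets printed (0100, 0103, 0106, 0108, 010B and 010F) are the same ones DEBUG displayed while we entered the program, and the total length is 10h bytes.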
A.3.7 U ("Unassemble")
The u command "unassembles" (rather "disassembles") machine code instructions. We can use this
command to list the assembly language instructions associated with a program stored in memory. U 100 10F
will "unassemble" the instructions from memory addresses 100h up to 10Fh. (It attempts to translate machine
code instructions to assembly language instructions.)
A.3.8 E (Enter)
The "E" command can be used for different purposes. We can use this command to enter data or machine
code instructions directly into memory. Type "e" (at the DEBUG prompt) followed by the address and then
the data or machine code that you want to store in the memory locations, starting at the specified address.
Examples:
- e 106 23 67 2a <Enter>
Enter byte values. The values 23h, 67h and 2Ah will be stored in the three memory locations starting at address 106h.
Enter a character string. Each ASCII character in this string occupies 1 byte. The first
character, i.e. ‘I’, is stored in position 109h, the second in 10Ah and so on.
- e 108 29 C1 <Enter>
Enter the instruction SUB CX,AX (machine code 29C1h) directly into memory positions
108h and 109h.
We can also use e to display the contents of memory locations.
- e 108 <Enter>
- e 109 <Enter>
This means that 108h contains 29h and 109h contains C1h.
A.3.9 N (Name), L (Load) and W (Write)
Suppose we created a program under DEBUG and we want to store this program for future use. The "n"
command is used to give a name to the executable program file in which we want to store the program. The
"w" command is used to write the file to disk, but we must first specify how many bytes must be written. The
combination of the BX and CX registers (BX:CX), which together form a 32-bit register, is used to specify the
number of bytes to be written. Our programs are going to be relatively small, so BX will normally contain
0000 and CX the size (in bytes) of the program.
Suppose we want to write 66h bytes, starting from memory location 100h. We want to name the program
beep.com. (Remember that we can only assemble .com files with DEBUG.) Type the following at the
DEBUG prompt:
- n beep.com <Enter>
- r bx <Enter>
BX ????
:0000 <Enter>
- r cx <Enter>
CX ????
:0066 <Enter>
- w <Enter>
Writing 66 bytes
The "l" command will load the contents of the current file into memory from address CS:100 onwards. Thus,
to load a file from disk into memory starting at location 100h, we will type:
-n beep.com
-l
Note that we must first name the file.
CX will contain the length of the file after the load has been completed. In general, as we have explained
above, the 32-bit register which is formed by the combination of the BX and CX registers, gives the length of
the file being loaded/written. BX contains the high-order and CX the low-order part of the file length. For small
files, BX = 0 and CX contains the length.
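The way a file length is divided between BX and CX can be illustrated with a little arithmetic. The Python sketch below is our own illustration (the function names split_length and join_length are not DEBUG commands), and the length 12345h is a made-up example of a file that is too large for CX alone.

```python
def split_length(n_bytes):
    """Split a file length into (BX, CX): BX is the high-order word, CX the low-order word."""
    if not 0 <= n_bytes <= 0xFFFFFFFF:
        raise ValueError("length must fit in 32 bits")
    return (n_bytes >> 16) & 0xFFFF, n_bytes & 0xFFFF


def join_length(bx, cx):
    """Combine BX and CX into the 32-bit length BX:CX."""
    return (bx << 16) | cx


bx, cx = split_length(0x66)              # the beep.com example: 66h bytes
print(f"BX={bx:04X} CX={cx:04X}")

big_bx, big_cx = split_length(0x12345)   # hypothetical larger file
print(f"BX={big_bx:04X} CX={big_cx:04X}")
```

For beep.com this gives BX=0000 and CX=0066, which are the values we entered before the "w" command.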
A.3.10 H (Hexadecimal)
The "h" option is quite handy and can be used to do addition and subtraction of two hexadecimal numbers.
Example: -h 65 23
0088 0042
This means that 65h + 23h = 0088h and that 65h - 23h = 42h.
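The same calculation can be imitated in a few lines of Python if you want to check an answer without DEBUG. This is only a sketch: the function name debug_h is our own, and we assume that both results are kept to 16 bits (four hexadecimal digits), so that a negative difference appears in two's complement form.

```python
def debug_h(a, b):
    """Return the sum and difference of two values as four-digit hexadecimal strings."""
    total = (a + b) & 0xFFFF        # keep the low-order 16 bits
    difference = (a - b) & 0xFFFF   # a negative result wraps around (two's complement)
    return f"{total:04X} {difference:04X}"


print(debug_h(0x65, 0x23))
print(debug_h(0x23, 0x65))
```

The first line printed is 0088 0042, as in the example above. The second is 0088 FFBE, because 23h - 65h = -42h, which is FFBEh in 16-bit two's complement.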
A.3.11 I (Input)
The "i" option is used to input and display one byte from a port, and the "o" option to send one byte to a
port.
A.3.12 Q (Quit)
Type q at the DEBUG prompt to exit from DEBUG and return to DOS.
21 Appendix B
CONTENTS
B.1 Introduction
B.5 Pseudo-instructions
B.6 Constants
B.1 Introduction
NASM (The Netwide Assembler) is an assembler for x86 assembly language. We will use NASM to assemble
x86 assembly language programs and create executable (.com) files.
However, we will use DEBUG to step through and debug our programs.
In this section we are going to show you how to use NASM. Note that this appendix was compiled from the
NASM User Manual available on the web. The assembly language instructions for the Intel x86 are listed in
Appendix E.
You need to have a copy of NASM.exe in the directory (folder) in which you want to work, say C2621. Create
your source file and give it any name with an .asm file extension (we will explain in section B.3 how this should
be done). Then type:
where name.asm is the name of your source file containing assembly language instructions, name.com
is the executable file that is created by NASM, and name.lst is the output listing file that is produced by NASM.
(Note: -o is the alphabetic character "o" and not the digit zero, and -l is the alphabetic character "l" and not the
digit one.)
An .lst file will only be created once your source file is free of syntax errors. However, while there are syntax
errors in the source file, error messages with appropriate line numbers will appear on the screen.
Making your life easier: When you install NASM on your machine from the CD that is provided when you register for
this module, a file called as.bat is also copied to your nasm directory. We include this file to make the assembly
process much easier. Instead of typing the assembly line as given above, you need only type
as name
where "name" is the name of your program. The as.bat file will generate the assembly line and your program
will be assembled.
B.3 Layout of a NASM source line
As is the case with most assemblers, each NASM source line is a combination of four fields.
Most of these fields are optional; the presence or absence of any combination of a label, an instruction and
a comment is allowed. The type of instruction will determine the number of operands (0 or more) the relevant
instruction requires. (Appendix E contains a list of the Intel x86 instructions.)
Spaces are allowed within a line. Labels may have spaces before them. The colon after a label is optional, but
we use it for the sake of clarity.
Labels: Valid characters in labels are letters, numbers, _, $, #, @, ~, . (full stop), and ? (question mark).
Generally, the first character of an identifier (a label or a variable/constant name) must be a letter. There are
two exceptions to this rule:
• An identifier may be prefixed with a $-sign to indicate that it is intended to be read as an identifier
and not a reserved word. Thus, you can use an identifier $EAX in your program; the $ as the
first character distinguishes the identifier from the EAX register.
• If a period is used as the first character of a label it means that the label is local. (See Section
B.9 for a discussion of local labels.)
Instructions: The opcode field contains the mnemonic (abbreviation) of the operation to be performed by
the instruction. Examples of mnemonics are the following:
MOV for moving data between registers, or between a register and memory,
and
Enter the assembly language program given below using a text editor and save it as prog1.asm. You have
to use an ASCII editor. Notepad and Edit are suitable editors. The program displays the message 'Hello World'
on the screen. (Note that the line number must not be entered. We only give the line numbers for the
explanation that follows.)
Open two DOS windows, one to run NASM and the other one for the editor. Work through the following steps
if you are not familiar with the way to do this. Keep all your NASM programs in one folder (directory). We called
our folder C2621. Create this folder and copy the program files NASM.exe and AS.BAT to it.
Step 1 Right click on the DOS icon (or Command prompt for newer Windows versions)
on your PC. Select properties on the drop-down window. Select screen and
choose Window under Usage.
ey in your program.
be saved in the the
is error-free. When
DOS command that
r keyboard to display
Prog1.asm (Do not enter the line numbers; we only use line numbers in order to refer to individual lines
in the explanation that follows. Your listing file (ie the prog1.lst file) will have line numbers):
1 bits 16
Line 1: Compiler directive for using 16 bits. This is sufficient for our purposes.
Line 2: All .com programs start execution at offset 100h. For this reason, we start our assembly at address
100h.
Line 3: The convention we use is to have all data declarations at the beginning of the program. Because
execution always starts at address 100h, we jump to the first executable instruction of the program.
Line 4: We declare a string ‘Hello World’ which we want to display on the screen. The values 0ah and
0dh, which form part of the string to be displayed, are the ASCII characters for a line feed and carriage
return respectively. After displaying the message, the cursor will thus go to the beginning of the next
line. The ‘$’ sign at the end of the string indicates the end of the output message.
Line 5: The main program starts here. Most of the I/O is done by means of the INT instruction. The I/O is
handled either by DOS or by BIOS. (Refer to Appendix D for details.) We want to display a message on
the screen and for this we make a request to DOS (also called a DOS system call) by issuing the
instruction int 21h. For such a system call, DOS obtains details of the I/O request from certain
registers. In lines 5 and 6, we set up the registers
for the INT 21h call to display a message on the screen. DX points to the start address of the output
message.
Line 6: For INT 21h, the AH register is used to tell DOS what I/O we want it to handle. AH = 09 means
that we want to display a message on the screen. (See Appendix D.3.)
Line 7: We use a software interrupt to hand control over to DOS which will handle the I/O that we
requested. DOS will display the message on the screen and return control to our program.
Line 8: INT 20h terminates the program and goes back to DOS.
You can now use the text editor to inspect the NASM output, i.e. the prog1.lst file.
1 bits 16
5 0000000C 6C640A0D24
Note the following points when you study the .lst file:
Column 1: Line numbers (in decimal) as allocated by NASM. These numbers do not necessarily
correspond to the lines in your assembly language program. See lines 4, 5 and 6, for example.
Column 2: The addresses (in hexadecimal) corresponding to the instructions. Note that these addresses
are relative to address 100h. Look at line 4 above, for example. The relative address is 3, but our
program starts at 100h. This means that the offset at which this message is stored is 103h.
Column 3: The hexadecimal representation of the machine code for each instruction or data item.
Debugging: To debug the program, type debug prog1.com at the DOS prompt.
DEBUG will load your program and you can execute it step by step to trace the execution of the individual
instructions as we have illustrated in Appendix A. Do an Unassembly of the program. You will see that,
in the DEBUG listing, the offset of 100h has already been added to the addresses. Look at the machine
code for the mov dx,mess instruction.
These two machine code instructions refer to the same memory position. The DEBUG address refers to
offset 103h in memory. The NASM instruction refers to relative address 3 which, when added to the offset
of 100h, is equal to 103h.
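The conversion between the addresses in the NASM listing and the offsets that DEBUG shows is a simple addition. The Python sketch below illustrates this; the name runtime_offset is our own. It also shows the order in which the two bytes of a 16-bit address are stored in memory (low-order byte first).

```python
ORG = 0x100   # .com programs are loaded at offset 100h


def runtime_offset(listing_address):
    """Convert an address from column 2 of the .lst file to the offset seen in DEBUG."""
    return ORG + listing_address


mess = runtime_offset(0x0003)            # the message starts at relative address 3
print(f"{mess:04X}")

# A 16-bit address is stored in memory with the low-order byte first
stored = mess.to_bytes(2, "little")
print(" ".join(f"{b:02X}" for b in stored))

print(f"{runtime_offset(0x000C):04X}")   # relative address 0000000C in the listing
```

The values printed are 0103, then the bytes 03 01, then 010C.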
B.5 Pseudo‐instructions
In addition to machine instructions, an assembly language program can also contain commands to the
assembler itself. One may, for example, ask the assembler to allocate storage space. These commands
are called pseudo-instructions. We are interested in the following pseudo-instructions:
• DB, DW, DD, DQ and DT which are used to allocate storage space and initialise the space to
specified values
• RESB, RESW, RESD, RESQ and REST which allocate (reserve) storage space but do not initialise
it
• the EQU command for defining constants
• the TIMES prefix for repeating instructions or data
We will return to these pseudo-ops (in Section B.7) once we know what format NASM expects data to be
in.
B.6 Constants
Four types of constant can be defined: numeric, character, string and floating-point.
A numeric constant is simply an integer. NASM allows integers to be specified in binary, decimal, octal or
hexadecimal representation.
Examples: mov ax,100 ; Move the decimal number 100 to the AX register
; NASM requires the prefix 0x if the hex number starts with a letter
A character constant consists of up to four characters enclosed in either single or double quotes. The
type of quotes used makes no difference to NASM. However, the use of single quotes allows double
quotes to appear within the constant and vice versa. A character constant which consists of more than
one character will be stored in little-endian byte order (refer to Stallings, pages 381 - 385). In the case of
mov ax,'ab', the constant generated is 0x6261 and not 0x6162. However, if you store the value in
memory, it will read 'ab' and not 'ba'. Create a source file as follows:
org 0x100
mov ax,'ab'
mov [120h],ax
Listing file:
1 org 0x100
2 00000000 B86162 mov ax,'ab'
3 00000003 A32001 mov [120h],ax
Use DEBUG to execute the program (i.e. the .com file that is created by NASM) and inspect the AX register
and memory locations 120h and 121h. You will find that AX = 6261h, which represents 'ba', while memory
positions 120h and 121h contain 'a' and 'b' respectively. We generally restrict the use of character
constants to those consisting of only one character.
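If you do not have DEBUG at hand, the byte order can be checked with a short Python sketch. It only models the little-endian conversion and does not run the assembly program; the variable names are our own.

```python
text = b"ab"                                # ASCII codes: 'a' = 61h, 'b' = 62h

# The character constant 'ab' is read as a little-endian number:
# the first character becomes the low-order byte.
ax = int.from_bytes(text, "little")
print(f"AX = {ax:04X}h (AH = {ax >> 8:02X}h, AL = {ax & 0xFF:02X}h)")

# mov [120h],ax stores the low-order byte at 120h and the high-order byte at 121h
memory = ax.to_bytes(2, "little")
for address, byte in zip((0x120, 0x121), memory):
    print(f"{address:03X}h: {byte:02X}h {chr(byte)!r}")
```

AX holds 6261h, but the bytes at 120h and 121h are 61h ('a') and 62h ('b'), so the string reads 'ab' in memory.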
NASM distinguishes between character constants and string constants. Character constants have a
maximum length of four (more than four characters cannot fit into a 32-bit register). A string constant
(character string) looks like a character constant but may be longer than four characters. Character strings
can be defined using the DB and DW pseudo-instructions. This means that db 'ab' will reserve two
bytes. According to the NASM manual:
"When used as an operand to db, a constant like 'ab' is treated as a string constant despite being short
enough to be a character constant, because otherwise db 'ab' would have the same effect as db 'a', which
would be silly. Similarly, three-character or four-character constants are treated as strings when they are
operands to dw. "
Let us look at a few examples: The following two declarations are equivalent:
Note that dw 'hello' is not equivalent to the above. The DW directive reserves words
(multiples of two bytes). In this case 6 bytes are reserved with the number 0 concatenated at the end.
To get the equivalent of the above using the db directive, we have to code
db 'ninechars',0,0,0
The following is the output listing of a small program to illustrate the above:
(DB reserves 1, 2, 3, 4, etc. bytes; DW reserves 2, 4, 6, 8, etc. bytes; DD reserves 4, 8, 12, 16, etc
bytes.)
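The number of bytes reserved for a string follows from rounding its length up to a multiple of the unit size (1 for DB, 2 for DW, 4 for DD). The Python sketch below illustrates the rule; the function name reserved_bytes is our own, and the sketch assumes that the padding consists of zero bytes added at the end, as described above.

```python
def reserved_bytes(text, unit):
    """Return the bytes stored for a string padded with zeros to a multiple of `unit`."""
    data = text.encode("ascii")
    padding = -len(data) % unit          # bytes needed to reach the next multiple of unit
    return data + bytes(padding)


examples = [("db", 1, "hello"), ("dw", 2, "hello"), ("dd", 4, "ninechars")]
for directive, unit, text in examples:
    stored = reserved_bytes(text, unit)
    print(f"{directive} '{text}': {len(stored)} bytes, {stored.hex().upper()}")
```

This shows 5 bytes for db 'hello', 6 bytes for dw 'hello' (one zero byte added) and 12 bytes for dd 'ninechars' (three zero bytes added).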
B.6.4 Floating‐point constants
We will not use floating-point constants in the COS2621 module, but we include them for interest’s sake.
Floating-point constants are acceptable only as arguments to DD, DQ and DT. They are expressed in
the traditional form: digits, then a period, then optionally more digits, then optionally an E followed by
an exponent. The period is mandatory, so that NASM can distinguish between dd 1, which declares
an integer constant, and dd 1.0, which declares
a floating-point constant.
Some examples:
DB, DW and DD are used to reserve space and to initialise these storage positions.
Activity B-1:
Set up a source file (let us call it db_dw.asm) with the following instructions:
bits 16
org 0x100 ; start program at offset 100h
L1: db 0x55
L2: db 0x55,0x56,0x57
L3: db 'a',0x55
L4: db 'hello',13,10,'$'
L5: dw 0x1234
L6: dw 'a'
L7: dw 'ab'
L8: dw 'abc'
L9: dd 0x12345678
L10: dd 1.234567
Use NASM to assemble this program. Then type debug db_dw.com to load the executable file, and
display from memory position 100h (refer to Appendix A, section A.3.6). Let us investigate how the above
data is stored in memory.
The dump is as follows (remember that all the values in the display are in hexadecimal):
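0100: 55 55 56 57 61 55 68 65 - 6C 6C 6F 0D 0A 24 34 12
corresponding to: 55h 55h 56h 57h a 55h h e - l l o 13 10 $ 0x1234 (13 and 10 are decimal values)
0110: 61 00 61 62 61 62 63 00 - 78 56 34 12 4B 06 9E 3F
corresponding to: a 00h a b a b c 00h - 0x12345678 1.234567
Note the following: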
- The insertion of 00h in cases where DW was used and an uneven number of characters was
specified.
- The fact that dw 0x1234 stores the value 1234h in reverse byte order, while dw 'ab' does not.
RESB, RESW and RESQ can also be used to reserve space but these pseudo-instructions do not
initialise the reserved storage positions.
Examples:
buffer: resb 64 ; Reserve 64 bytes
The pseudo-instruction EQU can be used to assign a name to a given constant value. This is called a
symbolic constant. When EQU is used, the source line has to contain a label. The action of EQU is to
associate the given label name with the value of the operand. This definition is absolute, and cannot
change later. (This happens at assembly time.) The following program illustrates the concept:
Use DEBUG to load the resulting .com file. Do an "unassembly" of the program and you will notice that
the instruction mov ax,msglen has been assembled to mov ax,0005.
The value of msglen is evaluated once only and is calculated as follows: msglen is equal to the current
memory address (ie 0008) minus the address of message (ie 0003). This means that EQU associates
$ with the current value of the program counter. In the example given above, $ evaluates to the memory
position at the beginning of the line containing msglen (0008). Thus:
$ - memory_position_of_message = 8 - 3 = 5
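A program consistent with the description above might look like the following sketch. It is a reconstruction under assumptions, not the original listing: the label main and the five-character string are chosen so that msglen evaluates to 5. The absolute offsets shown in your own listing depend on how your version of NASM encodes the jmp, but the difference $ - message is 5 in any case.

```nasm
        bits 16
        org 0x100
        jmp main                ; skip over the data

message: db 'hello'             ; five bytes of data
msglen  equ $-message           ; $ is the position just past the final 'o'

main:
        mov ax,msglen           ; assembled as mov ax,0005
        int 20h                 ; terminate program
```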
Note that the operand to an EQU is a critical expression (see section B.10) and msglen cannot be
redefined later.
The TIMES pseudo-instruction causes the instruction (or pseudo-instruction) immediately following it to
be assembled a specified number of times.
For example:
times 10 db 'z'
will reserve 10 bytes and initialise these to the ASCII character 'z'. This means that the pseudo-instruction
db 'z' is repeated 10 times.
The argument to TIMES does not need to be a numeric constant. It can also be an expression, as can be
seen from the following example:
The second pseudo-instruction given above will store exactly enough extra spaces to extend the total
length of buffer to 64.
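A sketch of such a pair of pseudo-instructions, modelled on the NASM manual (the string 'hello, world' is an assumption; any text shorter than 64 bytes would do):

```nasm
buffer: db 'hello, world'           ; 12 bytes of text
        times 64-$+buffer db ' '    ; 64 - 12 = 52 spaces, so buffer is 64 bytes long
```

The expression 64-$+buffer is evaluated at the position of the TIMES line, where $ - buffer is the number of bytes already stored.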
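TIMES can also be applied to assembly language instructions, so we can code trivial unrolled loops with
it, for example times 100 movsb. Note that there is no effective difference between times 100 resb 1
and resb 100, except that the latter will be assembled about 100 times faster due to the internal
structure of the assembler.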
NASM gives special treatment to symbols beginning with a period. A label starting with a period is treated
as a local label, which means that it is associated with the previous non-local label.
Example:
In the above code fragment, each JNE instruction (if taken) jumps to the .loop immediately preceding
it. The two definitions of .loop are kept separate by virtue of each being associated with the previous
non-local label. The first .loop is associated with label1 whereas the second .loop is associated
with label2.
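The situation described above can be sketched as follows. Only label1, label2 and .loop come from the text; the loop bodies are assumptions chosen to make the fragment a complete program.

```nasm
        bits 16
        org 0x100

label1:
        mov cx,3
.loop:                          ; full name: label1.loop
        dec cx
        jne .loop               ; jumps to label1.loop
label2:
        mov cx,5
.loop:                          ; full name: label2.loop
        dec cx
        jne .loop               ; jumps to label2.loop
        int 20h                 ; terminate program
```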
One limitation of NASM is that it is a two-pass assembler. It will always do exactly two assembly passes.
Because of this, it is unable to cope with source files that are complex enough to require three or more
passes. The first pass is used to determine the size of the assembled code and data. This means that all
the symbolic addresses the code refers to during the second pass, when generation of code occurs, are
known. NASM cannot handle code the size of which depends on the value of a symbol declared after
the code in question.
NASM will not accept this: since the address of label is not known yet, it cannot evaluate the expression
in the times pseudo-instruction when it is first encountered. It will just as firmly reject the slightly
paradoxical code
undefined!
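The sketch below contrasts the kind of TIMES line that is rejected with one that is accepted. The rejected lines follow the NASM manual and are shown as comments so that the fragment still assembles. The padding length of 16 is an assumption made purely for illustration.

```nasm
; Rejected: label is defined after the TIMES line that needs it.
;       times (label-$) db 0
;label: db 'Where am I?'

; Accepted: every symbol in the TIMES argument is already defined.
label:  db 'Where am I?'            ; 11 bytes
        times 16-($-label) db 0     ; pad with 5 zero bytes to a total of 16
```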
NASM does not allow these because of a concept called a critical expression. A critical expression is
defined as an expression the value of which is required to be computable in the first pass of the assembler.
This means that the evaluation of the expression must depend only on symbols defined preceding it.
Because of this, the arguments to the TIMES family of pseudo-instructions are also critical expressions.
On the first pass, NASM cannot determine the value of symbol1 because it is defined to be equal to
symbol2 which NASM hasn't seen yet. On the second pass, therefore, when it encounters the line
mov ax,symbol1, it is unable to generate the code for it because it still doesn't know the value of
symbol1. On the next line, it would see the equ again and be able to determine the value of symbol1,
but by then it would be too late.
NASM avoids this problem by defining the right-hand side of an equ pseudo-instruction to be a critical
expression, so the definition of symbol1 would be rejected during the first pass.
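The fragment discussed above, in the form given in the NASM manual, is shown here as comments, followed by a reordered version in which every symbol is defined before it is used. The reordering is a sketch of one possible fix, not something prescribed by the original text.

```nasm
; Form described above as rejected:
;        mov ax,symbol1
;symbol1 equ symbol2
;symbol2:

; Reordered so that nothing refers forward:
        bits 16
        org 0x100
symbol2:
symbol1 equ symbol2             ; symbol2 is already known here (0x100)
        mov ax,symbol1          ; assembled as mov ax,0100
        int 20h                 ; terminate program
```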
Note that not all forward references are rejected, of course. The following is an example of an acceptable
forward reference:
mov ax,[bx+offset]
offset equ 10
During pass one, NASM has to calculate the size of the instruction mov ax,[bx+offset] without
knowing the value of offset. It has no way of knowing that offset is small enough to fit into a one-
byte field and that it could therefore get away with generating a shorter form of the effective-address
encoding; for all it knows, in pass one, offset could be a symbol in the code segment, and it might
need the full four-byte form. So it is forced to compute the size of the instruction to accommodate a
possible four-byte address part. In pass two, having made this decision, it is now forced to honour it
and keep the instruction large, so the code generated in this case is not as small as it could have been.
This problem can be solved by defining offset before using it, or by forcing a byte size address part
in the effective address by coding
[byte bx+offset].
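Both remedies can be sketched as follows. The names disp1 and disp2 are hypothetical and are used instead of offset only to keep the two cases apart.

```nasm
        bits 16
        org 0x100

disp1   equ 10                      ; defined before it is used
        mov ax,[bx+disp1]           ; NASM already knows that a one-byte displacement is enough

        mov ax,[byte bx+disp2]      ; BYTE forces the one-byte displacement
disp2   equ 10                      ; defined after it is used

        int 20h                     ; terminate program
```

Comparing the listing file for these two instructions with the listing for a plain forward reference shows the difference in instruction length.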
Suppose we want to write a program that asks the user to enter his (her) name, Peter say, on the
keyboard and then displays the message "Hello Peter" on the screen. We can break this up into three
distinct parts:
1. Display a message on the screen (used for input prompt and for display of the "Hello"
message)
2. Read a string of characters from the keyboard
3. A main program
The program could be written as follows (call the source code file prog2.asm for example):
;
; Input_buffer
ret
prompt
CR and LF
mov dx,user ; Address of user name entered
mov bx,user ; Get start of input string
add
The .lst file generated by NASM is as follows (because of the length of some of the lines in the listing file,
we do not show all the comments):
1 bits 16
2 org 0x100
5 ;
9 ;
11 ; prompt
13 0000000C 72206E616D6520706C-
14 00000015 65617365202024
15 ;
16 ; Input_buffer
17 0000001C 14 in_buf: db 20
18 0000001D 00 len_in: db 00
20 ;
24
25 ;
27 ;
32 00000054 C3 ret
34 ;
35 get_chars:
39 00000059 C3 ret
40 ; -------------------- Main program ----------------------------
41 main:
53
54
(i) The maximum number of characters that will be accepted by DOS. This is
set to 20 for this program.
(ii) DOS will insert the actual number of characters that was entered in this byte (i.e.
at relative address 1Dh).
(iii) The buffer for the actual user input at label user (i.e. at relative
address 1Eh).
DOS will only accept the string of input characters once you press the ‘Enter’ key. The ASCII
code for this key is 0Dh. This character is also written to the input buffer. Step through the
.COM program using DEBUG, i.e. type debug prog2.com at the DOS prompt and execute
the instructions one at a time using the T or P command. (Remember to use the P command
when you trace an INT instruction, otherwise you will go into the DOS interrupt handler.)
Execute all the instructions up to the instruction at relative address 52h. (You can also use
the command G 152 to accomplish this.) Inspect the data area starting at 01Eh by typing D
11E. You will see the string that you entered, terminated by a carriage return (0Dh). We
need to remove the carriage return character in order to display the name entered,
otherwise it will be displayed and immediately cleared from the screen. We do this by
overwriting the 0Dh (ASCII carriage return) with a space (ASCII 20h) (see lines 55 and 56
in the listing file).
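For reference, a complete program along these lines is sketched below. It is not the original prog2.asm: the labels in_buf, len_in and user follow the listing, but the subroutine bodies, the hello string and the way the carriage return is located are assumptions. The user area is pre-filled with '$' so that the string is already terminated for INT 21h function 09h.

```nasm
        bits 16
        org 0x100
        jmp main

hello:  db 13,10,'Hello $'
prompt: db 'Enter your name please  $'
in_buf: db 20                   ; (i) maximum number of characters accepted
len_in: db 0                    ; (ii) DOS stores the actual count here
user:   times 21 db '$'         ; (iii) the input itself, pre-filled with '$'

disp:                           ; display the '$'-terminated string at DS:DX
        mov ah,09h
        int 21h
        ret

get_chars:                      ; buffered keyboard input into the buffer at DS:DX
        mov ah,0Ah
        int 21h
        ret

main:
        mov dx,prompt
        call disp               ; show the prompt
        mov dx,in_buf
        call get_chars          ; read the name
        xor bh,bh
        mov bl,[len_in]         ; BX = number of characters typed
        mov byte [user+bx],' '  ; overwrite the carriage return with a space
        mov dx,hello
        call disp               ; new line, then "Hello "
        mov dx,user
        call disp               ; the name, up to the first '$'
        int 20h                 ; terminate program
```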
----oooOooo----
Appendix C
ASSEMBLY LANGUAGE PROGRAMS - EXAMPLES
Write an assembly language program to compare the unsigned values in AL, BL and CL respectively,
and move the smallest to BH.
Solution:
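bits 16
org 0x100
main:
mov al,7 ; In a big program one would read in the
mov bl,8 ; values for AL, BL and CL. Here we initialise
mov cl,5 ; the values for testing purposes
mov bh,al ; Move AL to BH
cmp bh,bl ; Compare BH to BL
jbe label_1 ; If BH <= BL, jump to label_1
; else
mov bh,bl ; move BL to BH
label_1:
cmp bh,cl ; Compare the smaller value so far to CL
jbe label_2 ; If BH <= CL, BH already holds the smallest value
; else
mov bh,cl ; move CL to BH
label_2:
int 20h ; Terminate program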
Example 2: Find the factorial of a number (IMUL, LOOP)
The following program calculates the factorial of a number using an iterative loop.
org 0x100
jmp main ; Jump to main program
number: dw 6 ; n (the value 6 is used in this example)
main:
mov ax,[number] ; AX = n (6 in this example)
mov cx,ax ; CX is used as a loop counter
dec cx ; CX = CX - 1 (next number)(n - 1)
repeat:
imul cx ; DX:AX = AX*CX (n * (n-1))
; We have two 16-bit operands and a 32-bit result
loop repeat ; Decrement CX. Repeat loop if CX <> 0
int 20h ; Terminate program
(c) Read one character from the keyboard, and display it (one character) on the screen (i.e. with
echo).
(d) Write one character, ‘Q’ say, to the printer. (Use the default printer.)
(e) Use INT 21h (function 6) to read one character from the keyboard. Echo it to the screen.
Solution:
(a)
org 0x100
jmp main
; Define input parameter block for string.
buffer: db 20 ; Maximum number of characters = 20
db 0 ; Actual number of characters entered
; is stored in this byte
resb 20 ; Reserve 20 bytes for input string
main:

int 21h ; Display string
int 20h ; Terminate program
(c)
org 0x100
main:
(d) The printing of a carriage return is to force immediate printing. Many printers keep each
character in an internal buffer until either a "carriage return" or "line feed" is printed or the
buffer is full. It is not necessary to set AH to 05 before calling INT 21h for a second time.
The contents of AH are not affected by INT 21h. The character to be printed must be in DL
before the system call.
bits 16
org 0x100
mov ah,05 ; Printer output
mov dl,51h ; ASCII character 'Q'
int 21h ; Display character
mov dl,0dh ; Carriage return
int 21h ; DOS System call
int 20h ; Terminate program
(e) The function does not wait for input from the user but returns to the program immediately
after the INT 21h has been executed. We have to test whether a key has indeed been
pressed on the keyboard. (See Appendix D.3.) This is called "programmed I/O" as you will
learn from the text book.
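org 0x100
main:
mov ah,06 ; Service - char I/O. (DOS does not wait for input)
mov dl,0xff ; DL = FFh means that we want input
read:
int 21h ; ZF = 1 if no character is ready
jz read ; No key pressed yet: try again
mov dl,al ; The character that was read is returned in AL
mov ah,02 ; Service - display the character in DL
int 21h ; Echo the character to the screen
int 20h ; Terminate program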
Example 4: BIOS system call - INT 17h (Printer output)
Use INT 17h to do the following:
(a) Write one character, 'Q' say, to the printer.
(b) Write a string of characters to the printer one at a time. The end of the string is
indicated by "$".
Refer to Appendix D.2 for details of BIOS (Basic Input/Output System) printer services.
Solution:
(a)
bits 16
org 0x100
mov ah,00 ; Print a character
mov al,51h ; 51h = ASCII char 'Q'
mov dx,00 ; Printer port (default port = 0)
int 17h ; Print character
mov ah,00 ; Reload the function number (INT 17h returns a status code in AH)
mov al,0dh ; 0Dh = ASCII for carriage return
int 17h ; Carriage return
int 20h ; Terminate program
(b)
bits 16
org 0x100
jmp main
Solution:
(a) Scrolling begins at row 0, column 0. The screen scrolls past until the bottom is reached at
row 18h (= 24) and column 4Fh (= 79). The whole screen will thus be blanked out.
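bits 16
org 0x100
mov ax,0600h ; Service - scroll up screen
mov bh,07 ; Colour attributes: white (7) on black (0)
mov cx,0000 ; Starting row:column
mov dx,184Fh ; Ending row:column
int 10h ; BIOS video call
int 20h ; Terminate program
(b)
bits 16
org 0x100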
Write an assembly language routine to load a string of bytes onto the stack and store them from
there in an array. The order of the string of bytes must stay unchanged.
Solution:
This program assumes the length of the string to be 20 characters. Remember that the last byte that
was pushed onto the stack will be on top of the stack when we start popping.
This means that we get the last one first.
main:
mov si,string ; SI contains the address of 'string'
mov cx,20 ; CX is the loop counter (20 characters)
mov ah,0 ; Clear AH
loop1:
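A complete sketch of the technique is given below. It is an illustration under assumptions rather than the original solution: the label array, the loop labels and the contents of string are invented. Because the last byte pushed is the first one popped, the array is filled from its last position backwards, which leaves the bytes in their original order.

```nasm
        bits 16
        org 0x100
        jmp main

string: db 'ABCDEFGHIJKLMNOPQRST'   ; 20 bytes to be copied
array:  times 20 db 0               ; destination

main:
        mov si,string           ; SI points to the first byte of string
        mov cx,20               ; loop counter
        mov ah,0                ; only AL carries data
push_loop:
        mov al,[si]
        push ax                 ; PUSH works on words, so each byte travels in AL
        inc si
        loop push_loop

        mov di,array+19         ; DI points to the LAST byte of array
        mov cx,20
pop_loop:
        pop ax                  ; the last byte pushed comes off first
        mov [di],al
        dec di                  ; fill the array backwards
        loop pop_loop

        int 20h                 ; terminate program
```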
Consider the following DEBUG listing (remember that all values used in DEBUG are in hexadecimal):
-a 100
119C:0100 jmp 134 ; Jump to main program
119C:0102 db 00 00 '$' 00
119C:0106 ;
119C:0106 ;Clear_Screen
119C:0106 mov ax,0600 ; Service - scroll up screen.
119C:0109 mov bh,07 ; Colour attributes: white (7) on black (0).
119C:010B mov cx,00 ; Starting row:column.
119C:011E nop
119C:011F ret ; Return from subroutine.
119C:0120 ;
119C:0120 ;Display_String
119C:0120 mov cx,100 ; Set loop counter, CX to 100h (256 decimal).
119C:0123 lea dx,[102] ; Load Effective Address
c) Type G 0115 and use the Dump option of DEBUG to inspect the two-byte value on
top of the stack. What is the value of this number and what does it mean?
f) What would the value of CX be just before the instruction at address 0133h is
executed?
h) What would the effect be if the contents of address 0102h is changed from 00 to
24h?
i) Which displayable ASCII character can never be displayed on the screen by this
program?
j) Will the program still work if the instruction at address 0123h is replaced by MOV
DX,0102 by typing the following?
A 0123
----- :0123 MOV DX,102
----- :0126
Solution:
a) The program occupies memory locations 0100h to 013Eh, thus a total of 003Fh
memory locations:
013Eh - 0100h + 1 = 003Fh (63 decimal)
c) Run the program up to memory position 0115h by using the G (GO) option of
DEBUG. Remember that the instruction at address 0115h will not be executed.
Use the "R" command of DEBUG to display the contents of the registers and the status of the
flags. Locate (on your screen) the value of SP. When we ran the program we found that
SP=FFEAh. Therefore, the stack pointer points to offset FFEAh. (This address might be
different when you run the program. Use the value of SP as displayed on your machine.)
Use the D (DUMP) option to display the contents of the memory position pointed to by SP.
In our case:
Contents of FFEA: 3A
Contents of FFEB: 01
Remember that bytes are stored in reverse byte order in memory. Therefore the value stored
at the top of the stack is 013Ah.
When a CALL instruction is encountered, the address of the instruction following the call is
pushed onto the stack. When the RET instruction is executed in the subroutine (i.e. RETurn to the
calling program), the address of the next instruction to be executed is popped from the stack.
Do not worry about this: it is all handled by the processor. However, you have to be
aware of the fact that, when you enter a subroutine, the return address is always on top of the
stack.
In this case, 013Ah is the address of the instruction that the program will return to after
execution of the subroutine.
d) It loads the address 102h into the DX register (Load Effective Address).
e) Remember that DEBUG always starts executing at address 100h. The computer
will try to execute the data in memory position 0102h if the first instruction does not
jump to the main program.
f) CX acts as a loop counter and is decremented by 1 each time the LOOP instruction
is executed. As soon as CX = 0, the loop terminates and the instruction in the next
memory location is executed. Therefore, the instruction in 0133h will only be
executed when CX = 0.
g) CX specifies the starting row:column numbers where the cursor must be positioned
(CH = row position; CL = column position). In this case the cursor will be positioned
at row 0, column 0 (top left-hand corner).
h) Address 0102h contains the ASCII code of the first character that we want to
display. If we change the contents of 0102h from 0 to 24h, the display will start with
a different character.
i) The character "$". This character is used as a string delimiter, i.e. it indicates the
end of the string when INT 21h is used to display the string.
j) MOV DX,102 will have the same effect as LEA DX,[102]. However, the
LEA DX,[102] instruction occupies four bytes (0123h, 0124h, 0125h, 0126h) whereas
MOV DX,102 will occupy only three bytes (0123h, 0124h, 0125h). The contents of
byte 0126h will then not "make sense". We can rectify the situation by storing NOP in
0126h.
Accept an ASCII character string that represents a decimal number from the keyboard
(maximum of four characters). Convert this string to a binary integer value that can be used
in arithmetic calculations. Multiply this number by 3 and print the answer in decimal.
BEGIN
REPEAT
Mult. num-var with 10 (move digits one decimal position to the left)
END_IF
END
Procedure Convert-to-ascii (num-var)
BEGIN
DO
num-var = num-var / 10
END_DO
END
Multiply num-var by 3
It is important to remember that ALL output to the screen and to the printer must be in ASCII
code and that ALL input from the keyboard is received in ASCII.
This means that, in order to display or print a decimal value, say 67 (decimal), one has to convert the
integer value to a string consisting of two ASCII characters. Simply moving the integer value
to DL and writing it to the screen does not yield the desired result. For example:
We give the program to convert the positive binary integer value stored in AX to an ASCII string
and display it on the screen.
bits 16
org 0x100
jmp main
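A compact sketch of this conversion follows. It is not the original program: it assumes the value 67 and uses the stack to reverse the digits, because repeated division by 10 produces the least significant digit first.

```nasm
        bits 16
        org 0x100

main:
        mov ax,67               ; value to display (assumed for this sketch)
        mov bx,10               ; divisor
        xor cx,cx               ; CX counts the digits produced
next_digit:
        xor dx,dx               ; DX:AX is the dividend for a 16-bit divisor
        div bx                  ; AX = quotient, DX = remainder (0 to 9)
        add dl,'0'              ; convert the remainder to its ASCII code
        push dx                 ; digits come out in reverse order, so stack them
        inc cx
        test ax,ax              ; anything left to divide?
        jnz next_digit
print_digit:
        pop dx                  ; most significant digit is popped first
        mov ah,02h              ; Service - display the character in DL
        int 21h
        loop print_digit
        int 20h                 ; terminate program
```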
Write the following routines using assembly language. Define data areas where necessary.
(a) Convert the ASCII character in AL to lower case if it is a valid upper case character.
(b) StrToNum is a subroutine that uses DS:DX as input. DS:DX points to an input buffer
that contains a string of characters representing a decimal integer entered from the
keyboard using INT 21h with AH = 0Ah. StrToNum must return the integer value of the
string of digits (ASCII string) in the AX register.
Solution:
(a) The only difference between the ASCII codes for upper and lower case is bit 5 (set to 1 for
lower case and set to 0 for upper case), for example, the ASCII code for "a" is 01100001
(61h) and for "A" it is 01000001 (41h). (The difference is 20h.)
org 0x100
jmp main ; Jump to main program
str1: db 'Invalid character.' , '$'
str2: db 'Enter a character:','$'
str3: db 0x0a,0x0d,'$'
;
disp_str:
mov ah,09 ; Service - display a string of characters
int 21h ; Display string
ret
;
; Read a character from the keyboard
; Character is returned in AL
;
read_char:
mov ah,01 ; Service - Read character and
; echo it to the screen
int 21h ; Read character from keyboard
ret
;
; Display line feed , carriage return
;
line_carr:
mov dx,str3 ; Address of CR, LF string
call disp_str ; Display string
ret
main:

dis_err:
call line_carr
mov dx,str1 ; Address of error message
call disp_str ; Display message
int 20h ; Terminate program
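The heart of solution (a), setting bit 5 when AL holds an upper case letter, can be sketched as below. The routine name to_lower and the small test harness around it are assumptions and are not part of the original solution.

```nasm
        bits 16
        org 0x100

main:
        mov ah,01h              ; Service - read a character with echo
        int 21h                 ; character is returned in AL
        call to_lower
        mov dl,al
        mov ah,02h              ; Service - display the character in DL
        int 21h
        int 20h                 ; terminate program

; to_lower: converts AL to lower case if it is in the range 'A' to 'Z'.
to_lower:
        cmp al,'A'
        jb .done                ; below 'A': not an upper case letter
        cmp al,'Z'
        ja .done                ; above 'Z': not an upper case letter
        or al,20h               ; set bit 5: 41h ('A') becomes 61h ('a')
.done:
        ret
```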
(b) Implementation of the ASCII string to binary integer conversion algorithm given in example
8.
org 0x100
;
; The subroutine str_to_num converts an ASCII string (of digits) to
; a binary integer value. The address of the string is passed in DX.
; The end of the string is indicated by a carriage return.
; The result is returned in AX.
; We assume that the 0 <= number < 65536.
; The contents of the registers are not preserved.
; If an error occurred, AL is set to 'E'.
;
str_to_num:
xor ax,ax ; Initial value of AX = 0
xor bh,bh ; BH = 0
mov cx,10 ; To build integer in AX (multiply by 10)
mov si,dx ; DX points to start of input buffer
next_char:
mov bl,[si] ; Move contents of memory pointed to by SI to BL
cmp bl,0dh ; A carriage return indicates the end of the string
je finis ; End of string reached
cmp bl,39h ; ASCII for the character '9' is 39h
jg error ; > '9', invalid character
sub bl,30h ; Convert to numeric value (ASCII '0' = 30h).
jl error ; < 0, invalid character
imul cx ; DX:AX = AX * 10 (32-bit result)
add ax,bx ; Add next digit
inc si ; Pointer to next char
jmp next_char ; Repeat for next character
error: mov al,'E' ; Flag an error
finis:
ret ; Return to calling program
Write an assembly language program that will count the number of vowels in a name and
surname entered from the keyboard (upper case) and display it on the screen. Make use of
subroutines.
;
; Program Count_Vowels
;
; This program accepts a name from the keyboard, counts the number
; of vowels in the name, and displays this number on the screen.
; Name must be entered in upper case.
;
bits 16
org 0x100
jmp main
prompt:
db 'Enter your name and surname in UPPERCASE'
db 13,10,'$'
input_buf:
db 40
db 0
resb 40
out_mess: db 'Your name contains
db 13,10,'$' ; carriage return and line feed
;
; This routine displays a prompt and reads a string of characters
; from the keyboard
;
disp_prompt:
mov ah,09 ; Service - display message
mov dx,prompt ; Address of message
int 21h ; DOS system call
mov ah,0ah ; Service - read string
; remainder in AH
; Main program
;
main:
loop1:
; Repeat loop
; Display count of vowels
; Terminate program
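Since most of the original listing is not reproduced above, the following is a self-contained sketch of one way to write such a program. The subroutine count_vowels, the wording of out_mess and the two-digit output routine are assumptions. The count is always printed as two digits (for example 05), which is sufficient because at most 39 characters can be entered.

```nasm
        bits 16
        org 0x100
        jmp main

prompt:     db 'Enter your name and surname in UPPERCASE',13,10,'$'
input_buf:  db 40               ; maximum number of characters
            db 0                ; actual number entered (filled in by DOS)
            times 40 db 0       ; the characters themselves
out_mess:   db 13,10,'Number of vowels: $'

; count_vowels: SI points to the characters, CX = their number.
; Returns the number of vowels in BL.
count_vowels:
        xor bl,bl
        jcxz .done              ; empty input: nothing to count
.next:
        mov al,[si]
        cmp al,'A'
        je .hit
        cmp al,'E'
        je .hit
        cmp al,'I'
        je .hit
        cmp al,'O'
        je .hit
        cmp al,'U'
        jne .skip
.hit:
        inc bl
.skip:
        inc si
        loop .next
.done:
        ret

main:
        mov ah,09h              ; Service - display message
        mov dx,prompt
        int 21h
        mov ah,0Ah              ; Service - read string
        mov dx,input_buf
        int 21h
        mov si,input_buf+2      ; the string starts at the third byte
        xor ch,ch
        mov cl,[input_buf+1]    ; CX = number of characters entered
        call count_vowels
        mov ah,09h
        mov dx,out_mess
        int 21h
        mov al,bl               ; AX = count of vowels
        xor ah,ah
        mov bl,10
        div bl                  ; quotient (tens) in AL, remainder (units) in AH
        add ax,3030h            ; convert both digits to ASCII
        mov bx,ax               ; BL = tens digit, BH = units digit
        mov dl,bl
        mov ah,02h              ; Service - display the character in DL
        int 21h
        mov dl,bh
        mov ah,02h
        int 21h
        int 20h                 ; terminate program
```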
----oooOooo----
Appendix D
INTERRUPTS
INT 10h is used for screen handling. The specific function to be executed is specified in AH. The
interrupt may change the contents of AX. We list a few of the functions:
AH = 00 and Set standard graphics mode - 320x200 resolution. The cursor disappears
DH = row;
DL = column;
AH = 06: Scroll up screen. This function could be used to clear the entire screen.
Set:
CH = starting row;
CL = starting column;
DH = ending row;
DL = ending column;
We can also set the colour attributes when we scroll the screen. The attributes
are specified in the BH register as follows:
Bit 7: blinking
Bits 6 to 4: background colour
Bit 3: intensity
Bits 2 to 0: foreground colour
AL = number of lines, 00 for scrolling down the entire screen;
CH = starting row;
CL = starting column;
DH = ending row;
DL = ending column.
Interrupt 17h provides three functions for printing, specified by the AH register. The following
values specify the printer ports for INT 17h:
0 for LPT1 (the default port)
1 for LPT2
2 for LPT3
AH = 00: Print a character. Load the character to be printed in AL and the printer number
in DX. DOS will set AH to 01 if the character could not be printed, otherwise AH =
00.
AH = 01: Initialise the printer port. Resets the printer and initialises it for output. The system
call returns a status code (similar to the status code for AH = 02) in AH. The call
returns 90h if the printer is selected and not busy. You can also use this function
to do a form feed.
AH = 02: Get printer port status. This function determines the status of the printer. INT
17h returns a status code in AH which "describes" the status of the printer. The
printer number is specified in DX.
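Bit number
7: printer not busy
6: acknowledgement from printer
5: out of paper
4: printer selected
3: input/output error
0: time-out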
D.3 DOS Interrupt 21h
INT 21h is called a DOS function call or DOS system call. There are 87 different functions
supported by this interrupt. The functions are identified by a function number placed in the AH
register. Some of these are listed below.
AH Purpose Description
3 Auxiliary input Waits for a character from the communications port and puts it in AL
6 Keyboard input/output Performs both input and output. It can also determine the input
status. Unlike the other input functions, it does not wait for an input character. Also
note that the input character is not automatically displayed and that the
Ctrl-Break command does not terminate the operation.
As usual, the DL register contains the output character and the AL register receives the input
character. You ask for input by placing the value FFh in the DL register. On return from the
function, the zero flag is set if no character is ready. If the zero flag is clear, it means a
character was read into the AL register. If the DL register contains any value other than FFh,
that character is sent to the standard output device. Also compare with functions 7 and 8.
7 Keyboard input Waits for a character from the keyboard and puts it in AL.
8 Keyboard input Waits for a character from the keyboard, returns it in AL and
9 Display string We have seen that functions 2 and 6 can display single
characters, but function 9 is much easier to use for more
than one character. Of course, non-displayable characters
(such as carriage return, line feed, and Esc) can also be
included in the string.
Input DX points to the input buffer: 1st byte = max. number of characters
allowed and 2nd byte = actual number of characters entered. The string
is stored from the third byte onwards.
Appendix E
Contents
E.2 Instructions
The general format of an instruction is as follows (the square brackets indicate that an item is
optional in some statements):
where
mov ax,07
val1: db 45
label1: mov bx,ax
clc
Digits: 0 to 9
The first character must be an alphabetic or special character ($ or .). The maximum length
of a name or label is 31 characters.
Register names, instruction mnemonics and compiler directives are reserved words and may not
be used as names or labels.
E.2 Instructions
The description of instructions as set out in this Appendix was mainly compiled from the following
book:
Peter Abel. IBM PC Assembly Language and Programming, 3rd edition, Prentice-Hall, 1995.
NOTATION USED:
PF: Parity Flag (1 for even parity, 0 for odd parity)
CF: Carry Flag
operand
Adjusts the result in AL after two ASCII digits have been added together.
Converts unpacked BCD digits in AH and AL to a single binary value in preparation for
the DIV instruction.
Adjusts the result in AX after two unpacked BCD digits have been multiplied together.
Adjusts the result in AX after a subtraction operation where two ASCII digits are involved.
Adds the source and destination operands, and adds the contents of the CF to the sum,
which is then stored in the destination.
ADC reg,mem ADC mem,immed
ADC AL,immed
Adds the source operand to the destination operand and stores the result in the
destination operand.
AND: AND's the bits in the source and destination operands. The result is stored in the destination.
AND AL,immed
Scans a bit string for the first 1-bit. BSF scans from right to left and BSR scans from left
to right. Operand_2 contains the string to be scanned and the position of the first 1-bit
(if any) is placed in operand_1.
Flags: ZF
BT/BTC/BTR/BTS mem,reg
BT/BTC/BTR/BTS reg,immed
BT/BTC/BTR/BTS mem,immed
Flags: CF
CALL: CALLs a near or far procedure
If the procedure is NEAR (in the same segment): Pushes the current location (only the
offset) of the next instruction onto the stack and transfers control to the destination.
If the procedure is FAR (in a different segment): Pushes the current location (both
segment and offset) of the next instruction onto the stack and transfers control to the
destination.
CALL mem32_pointer
Flags: None
Format: CBW
Flags: None
Format: CLC
Flags: CF=0
Format: CLD
Flags: DF = 0
Format: CLI
Flags: IF=0
Complements the CF flag, i.e. reverses the CF bit value.
Format: CMC
Flags: CF is complemented.
Compares the destination to the source by doing an implied subtraction of the source
from the destination. The values of the operands do not change.
AL,immed
A REPn prefix normally precedes these instructions, along with a maximum value in
CX. REPn decrements CX by 1 for each repetition. The operation terminates when the
compared value is found (REPNE) or not found (REPE) or if CX = 0. Note: DI and SI
are advanced past the byte that caused termination.
CMPS source,dest
Fills the DX with the sign bit (bit 15) of the AX register.
Format: CWD
Flags: None
Format: CWDE
Flags: None
Adjusts the binary sum in AL after two packed BCD values have been added.
The sum is converted to two BCD digits in AL.
Format: DAA
Converts the binary result of a subtraction operation to two packed BCD digits
in AL.
Format: DAS
Pseudo-ops give information to the assembler regarding the reservation and initialisation of
memory.
DB Allocates one byte of storage
DEC: DECrement
Divides an unsigned (the leftmost bit is treated as part of the data and not as a sign)
dividend by an unsigned divisor. If the divisor is 8 bits, the dividend is assumed to be in
AX, the quotient is stored in AL and the remainder in AH. If the divisor is 16 bits, the
dividend is assumed to be in DX:AX, the quotient is stored in AX and the remainder in
DX.
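Both forms of unsigned division can be sketched as follows. The operand values are arbitrary and chosen only for illustration.

```nasm
        bits 16
        org 0x100

        ; 8-bit divisor: the dividend is taken from AX
        mov ax,67
        mov bl,10
        div bl                  ; AL = 6 (quotient), AH = 7 (remainder)

        ; 16-bit divisor: the dividend is taken from DX:AX
        mov dx,0                ; high-order word of the dividend
        mov ax,1000             ; low-order word of the dividend
        mov cx,7
        div cx                  ; AX = 142 (quotient), DX = 6 (remainder)

        int 20h                 ; terminate program
```

Forgetting to clear DX before a 16-bit division is a common mistake: whatever happens to be in DX is treated as the high-order word of the dividend.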
ESC: ESCape
For use with coprocessors. Provides the coprocessor with an instruction and an
optional operand for execution.
Flags: None
HLT: HaLT
Stops the CPU until a hardware interrupt occurs. The CS and IP registers
point to the instruction following the HLT. As soon as the interrupt occurs,
these two registers are pushed onto the stack and the interrupt service
routine is executed. On return from the interrupt, processing resumes
following the HLT.
Format: HLT
Flags: None
Note that we only specify the 2nd operand in the instruction. The first operand is
assumed to be in AX or AL.
Note that we only specify the 2nd operand in the instruction. The 1st operand is
assumed to be in DX:AX or in AX.
Flags: OF, CF
SF, ZF, AF and PF (all undefined)
Inputs a byte or word from an input port into AL or AX. Operand_2 is a port address
expressed as either an 8-bit constant or a 16-bit address in DX.
IN AX,immed IN AX,DX
Flags: None
INC: INCrement
Adds 1 to a register or memory position. The operation does not affect the carry flag.
INT: INTerrupt
Generates a software interrupt, which in turn calls a BIOS or DOS routine. INT performs
the following:
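• Decrements the SP by 2 and pushes the flags register onto the stack
• Clears the TF and IF flags
• Decrements the SP by 2 and pushes the CS register onto the stack
• Fills CS with the high-order word of the interrupt service routine address
• Decrements SP by 2 and pushes the IP register onto the stack
• Fills IP with the low-order word of the interrupt service routine address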
Flags: TF, IF
Format: INTO
Flags: IF, TF
IRET: Interrupt RETurn
Returns from an interrupt service routine. IRET retraces the steps that the interrupt
originally took and performs a return.
IRET performs the following:
• Pops the word at the top of the stack into the IP register.
• Increments SP by 2 and pops the top of the stack into the CS register.
• Increments the SP by 2 and pops the top of the stack into the flags register.
Format: IRET
Flags: All
Conditional Jumps: Jcondition
JC          Jump if Carry
JCXZ/JECXZ  Jump if CX is Zero, or if ECX is Zero
JE/JZ       Jump if Equal, or Jump if Zero
JG/JNLE     Jump if Greater, or Jump if Not (Less than or Equal)
JNC         Jump if No Carry
JNE/JNZ     Jump if Not Equal, or Jump if Not Zero
JNO         Jump if No Overflow
JNP/JPO     Jump if No Parity, or Jump if Parity Odd
JNS         Jump if No Sign - jumps if an operation set the sign to positive
Format: Jump_condition short_label
Jump_condition mem
Loads the lowest 8 bits of the flags register into the AH register.
Format: LAHF
Flags: None
LSS/LFS
Initialises a far address and offset of a data item so that succeeding
instructions can access it.
Flags: None
Calculates and loads the 16-bit effective address (offset) of a memory position into a
register.
Flags: None
Format: LODS
LODSB
LODSW
Flags: None
LOOP executes a routine a specified number of times. The CX register must be loaded with the
iteration count before the start of the loop. The LOOP instruction appears at the end of the
loop body. It decrements CX and jumps back to the start of the loop if CX is not equal to zero.
If CX=0, the next instruction after LOOP is executed.
Flags: None
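The behaviour of LOOP can be modelled in a few lines of Python. This is a sketch for study purposes only; the function name run_loop and its parameters are our own. It shows that CX is decremented before it is tested, which has a well-known consequence: starting with CX=0 does not skip the loop but runs it 65536 times.

```python
def run_loop(count, body):
    """Model of: MOV CX,count / start: <body> / LOOP start."""
    cx = count & 0xFFFF
    iterations = 0
    while True:
        body(cx)                 # the loop body runs before LOOP is reached
        iterations += 1
        cx = (cx - 1) & 0xFFFF   # LOOP decrements CX first (16-bit wrap-around)
        if cx == 0:              # CX = 0: continue with the next instruction
            break
    return iterations

seen = []
n = run_loop(5, seen.append)            # body sees CX = 5, 4, 3, 2, 1
wrapped = run_loop(0, lambda cx: None)  # CX wraps from 0 to FFFFh
```

In assembly programs, a JCXZ before the loop is the usual guard against the CX=0 case.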
LOOPE/LOOPZ are similar to LOOP except that they terminate if CX=0 or if the ZF
flag is 0 (i.e. a non-zero condition).
LOOPNE/LOOPNZ are similar to LOOP except that they terminate if CX=0 or if the ZF flag is 1 (zero
condition).
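As an illustration, the following Python sketch models a search loop that ends with LOOPNE: a comparison sets ZF, and LOOPNE repeats while CX is not 0 and ZF is 0. The function name find_with_loopne is hypothetical, and the list stands in for an array in memory (it must not be empty).

```python
def find_with_loopne(data, target):
    """Model of: MOV CX,len / next: INC index / CMP data[index],target / LOOPNE next."""
    cx = len(data)
    index = -1
    while True:
        index += 1
        zf = 1 if data[index] == target else 0   # CMP sets ZF when the values match
        cx -= 1                                  # LOOPNE decrements CX, flags unchanged
        if cx == 0 or zf == 1:                   # stop when CX=0 or ZF=1
            break
    return index, cx, zf

found = find_with_loopne([3, 7, 9, 4], 9)
missing = find_with_loopne([3, 7], 5)
```

After the loop the program must still test ZF, because the loop also ends when CX reaches 0 without a match.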
MOV segreg,reg_16 MOV mem,immed
Flags: None
MOVS/MOVSB/MOVSW/MOVSD
Copies a byte, word or double word from memory addressed by DS:SI to memory
addressed by ES:DI. Normally used with the REP prefix and the length in CX.
MOVS requires both operands to be specified. If DF=0, the operation moves data from left to
right and increments DI and SI by 1, 2 or 4. If DF=1, the operation moves data from
right to left and decrements DI and SI by 1, 2 or 4. The REP prefix decrements CX by
1 for each repetition. The operation terminates when CX=0.
Flags: None
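The interaction between REP, CX and DF can be sketched in Python. This is a simplified model: the function name rep_movsb is our own, and a single bytearray stands in for memory, so the DS and ES segments are ignored.

```python
def rep_movsb(mem, si, di, cx, df=0):
    """Model of REP MOVSB on one flat bytearray (segments ignored)."""
    step = -1 if df else 1          # DF=0 increments SI/DI, DF=1 decrements them
    while cx != 0:                  # REP stops as soon as CX reaches 0
        mem[di] = mem[si]           # MOVSB copies one byte from [SI] to [DI]
        si = (si + step) & 0xFFFF
        di = (di + step) & 0xFFFF
        cx -= 1                     # REP decrements CX once per repetition
    return si, di, cx

mem = bytearray(b"HELLO" + bytes(5))
forward = rep_movsb(mem, si=0, di=5, cx=5)            # left to right

mem2 = bytearray(b"ABCD" + bytes(4))
backward = rep_movsb(mem2, si=3, di=7, cx=4, df=1)    # right to left
```

With DF=1 the registers must start at the last byte of each block, as in the second call.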
Multiplicand: AL (8 bits)   Multiplier: reg/mem   Product: AX
Note that we only specify the 2nd operand in the instruction. The first operand is
assumed to be in AX or AL.
NEG: NEGate
NOP: NO Operation
Format: NOP
Flags: None
Flags: None
OR: Performs a logical OR on two operands. The result is stored in operand_1 (the destination operand).
OR mem,reg OR mem,immed
OR reg,mem OR AL,immed
OR AX,immed
AF is undefined
OUT transfers a byte or word from AL or AX to an output port. The port address is either a
constant in the range 0 to FFh or a variable in DX if the value of the port address is
greater than FFh.
Flags: None
POP: POPs a word or double word from the stack to a specified destination and adds 2 or 4
to the SP.
Format: POP reg_16 POP mem_16 POP seg_reg
Flags: None
POPA pops the top 8 words from the stack into DI, SI, BP, SP, BX, DX, CX, AX, in that order
and increments the SP by 16. The word popped for SP is discarded. (Normally used when a
PUSHA was used to push the registers.)
Format: POPA
Flags: None
POPF pops a word from the stack to the flags register and increments the SP by 2.
Format: POPF/POPFD
Flags: All
Flags: None
PUSHA pushes AX, CX, DX, BX, SP, BP, SI and DI, in that order, onto the stack and decrements
the SP by 16. The value pushed for SP is its value before the first push. (Normally, a POPA
later pops the registers.)
Format: PUSHA
Flags: None
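The register order used by PUSHA and POPA can be checked with a small Python model. The names pusha, popa, regs and stack are our own, and the stack is a dictionary indexed by SP. The sketch assumes the usual 80x86 behaviour that PUSHA stores the SP value from before the first push and that POPA discards the saved SP word.

```python
PUSHA_ORDER = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

def pusha(regs, stack):
    original_sp = regs["SP"]                 # SP as it was before the first push
    for name in PUSHA_ORDER:
        value = original_sp if name == "SP" else regs[name]
        regs["SP"] = (regs["SP"] - 2) & 0xFFFF
        stack[regs["SP"]] = value

def popa(regs, stack):
    for name in reversed(PUSHA_ORDER):       # DI, SI, BP, SP, BX, DX, CX, AX
        value = stack[regs["SP"]]
        regs["SP"] = (regs["SP"] + 2) & 0xFFFF
        if name != "SP":                     # the saved SP word is skipped
            regs[name] = value

regs = {"AX": 1, "CX": 2, "DX": 3, "BX": 4, "SP": 0x0200, "BP": 5, "SI": 6, "DI": 7}
saved = dict(regs)
stack = {}
pusha(regs, stack)
sp_after_push = regs["SP"]                   # 16 bytes below the starting SP
for name in ("AX", "CX", "DX", "BX", "BP", "SI", "DI"):
    regs[name] = 0xFFFF                      # the routine overwrites the registers
popa(regs, stack)                            # all registers are restored
```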
PUSHF decrements the SP by 2 and copies the flags register to the top of the stack.
Format: PUSHF
Flags: None
RCL rotates (shifts) the destination operand to the left a number of times as specified
by the source operand. The carry flag is copied into the least significant bit and the most
significant bit is copied into the carry flag with each shift. The CL register or a constant
may be used to control the number of rotations.
RCR rotates (shifts) the destination operand to the right a number of times as specified
by the source operand. The carry flag is copied into the most significant bit and the least
significant bit is copied into the carry flag with each shift. The CL register or a constant
may be used to control the number of rotations.
Flags: OF, CF
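Rotation through the carry flag treats CF as an extra bit attached to the operand. The Python sketch below models RCL and RCR on an 8-bit operand; the function names and the bits parameter are our own, and only CF is modelled (OF is ignored).

```python
def rcl(value, cf, count, bits=8):
    """Rotate left through carry: CF enters bit 0, the old MSB becomes CF."""
    mask = (1 << bits) - 1
    for _ in range(count):
        msb = (value >> (bits - 1)) & 1
        value = ((value << 1) | cf) & mask
        cf = msb
    return value, cf

def rcr(value, cf, count, bits=8):
    """Rotate right through carry: CF enters the MSB, the old bit 0 becomes CF."""
    for _ in range(count):
        lsb = value & 1
        value = (value >> 1) | (cf << (bits - 1))
        cf = lsb
    return value, cf

left = rcl(0b10000001, 0, 1)     # the MSB moves into CF, the old CF (0) into bit 0
right = rcr(left[0], left[1], 1) # undoes the rotation above
full = rcl(0x81, 0, 9)           # 9 rotations of the 9-bit (CF + byte) value
```

Because CF takes part in the rotation, an 8-bit operand returns to its original value after 9 rotations, not 8.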
Repeats a string operation a specified number of times. These are optional repeat
prefixes coded before string instructions like CMPS, MOVS, SCAS and STOS. CX must
be loaded with a count prior to execution. The operation decrements CX by 1 for each
execution of the string instruction. For REP, the operation repeats until CX is 0. For
REPE/REPZ, the operation repeats until CX is 0 or until ZF is 0 (nonzero condition). For
REPNE/REPNZ, the operation repeats until CX is 0 or until ZF is 1 (zero condition).
REP repeats a string operation a specified number of times. CX is used as a counter and is
decremented each time the instruction is repeated. The operation repeats until CX=0.
NEAR: RET/RETN (return near) pops the word at the top of the stack into IP and
increments SP by 2.
FAR: RET/RETF pops the 2 words at the top of the stack into IP and CS and
increments SP by 4.
Flags: None
ROL rotates (shifts) the destination operand to the left a number of times as specified by the
source operand. The leftmost (most significant) bit is rotated into the least significant
position. The most significant bit also moves into the carry flag for each shift. The CL
register or a constant may be used to control the number of rotations.
ROR rotates (shifts) the destination operand to the right a number of times as specified
by the source operand. The rightmost (least significant) bit rotates into the most
significant position. The least significant bit also moves to the carry flag for each shift.
The CL register or a constant may be used to control the number of rotations.
Flags: OF, CF
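For comparison with the rotate-through-carry instructions, ROL and ROR can be modelled as follows. The function names are our own, the operand is assumed to be 8 bits wide, and only CF is modelled. CF is returned as None when the count is 0, to show that it is left unchanged in that case.

```python
def rol(value, count, bits=8):
    """Rotate left: the MSB wraps round to bit 0 and is also copied to CF."""
    mask = (1 << bits) - 1
    cf = None
    for _ in range(count):
        cf = (value >> (bits - 1)) & 1
        value = ((value << 1) | cf) & mask
    return value, cf

def ror(value, count, bits=8):
    """Rotate right: bit 0 wraps round to the MSB and is also copied to CF."""
    cf = None
    for _ in range(count):
        cf = value & 1
        value = (value >> 1) | (cf << (bits - 1))
    return value, cf

left = rol(0x81, 1)
right = ror(0x81, 1)
full = rol(0x96, 8)   # 8 rotations restore an 8-bit operand
```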
Format: SAHF
SAL shifts the destination operand to the left a number of times as specified by the
source operand. Zero is shifted into the least significant bit and the most significant bit
is shifted into the carry flag for each shift. The CL register or a constant may be used as
the source operand.
SAR shifts the destination operand to the right a number of times as specified by the source
operand. The most significant bit retains its previous value which means that the sign
bit is duplicated with each shift. The least significant bit is shifted into the carry flag with
each shift. The CL register or a constant may be used as the source operand.
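The difference between the two arithmetic shifts is easiest to see on signed values. The following Python sketch models SAL and SAR on an 8-bit operand; the function names are our own and only CF is modelled.

```python
def sal(value, count, bits=8):
    """Shift left: 0 enters bit 0, the old MSB goes to CF."""
    mask = (1 << bits) - 1
    cf = None
    for _ in range(count):
        cf = (value >> (bits - 1)) & 1
        value = (value << 1) & mask
    return value, cf

def sar(value, count, bits=8):
    """Shift right: the sign bit is duplicated, the old bit 0 goes to CF."""
    sign = value & (1 << (bits - 1))
    cf = None
    for _ in range(count):
        cf = value & 1
        value = (value >> 1) | sign
    return value, cf

doubled = sal(0x41, 1)    # 65 * 2 = 130 as an unsigned byte
carried = sal(0xC0, 1)    # the MSB is shifted out into CF
quarter = sar(0xF0, 2)    # -16 / 4 = -4, i.e. FCh in two's complement
halved = sar(0x05, 1)     # 5 / 2 = 2, the remainder bit goes to CF
```

SAR therefore divides a signed number by a power of 2 (rounding towards minus infinity), while SAL multiplies by a power of 2.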
SBB subtracts the source operand from the destination operand and then subtracts the value
of CF from the destination. SBB is used in multiword subtraction to propagate a borrow
from one word into the next stage of the arithmetic.
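A 32-bit subtraction built from two 16-bit steps shows why the borrow is needed. In this Python sketch the helper names sub16 and sub32 are our own: sub16 with borrow_in=0 plays the role of SUB, and with borrow_in=CF it plays the role of SBB.

```python
def sub16(a, b, borrow_in=0):
    """16-bit subtract: borrow_in=0 models SUB, borrow_in=CF models SBB."""
    result = a - b - borrow_in
    cf = 1 if result < 0 else 0      # CF=1 signals a borrow out of the word
    return result & 0xFFFF, cf

def sub32(a, b):
    lo, cf = sub16(a & 0xFFFF, b & 0xFFFF)   # SUB on the low-order words
    hi, cf = sub16(a >> 16, b >> 16, cf)     # SBB on the high-order words
    return (hi << 16) | lo, cf

borrow_case = sub32(0x00010000, 0x00000001)
general = sub32(0x12345678, 0x0000FFFF)
underflow = sub32(0x00000000, 0x00000001)
```

In the first call the low words give 0 - 1 = FFFFh with CF=1, and the SBB step then takes that borrow out of the high word.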
SCAS scans a string in memory pointed to by ES:DI for a value that matches a value in AL or
AX.
SET: Set bytes conditionally
SHR shifts bits to the right a specified number of times and fills the bits on the left with 0's.
The CL register or a constant (for some assemblers the constant may only be a 1)
may be used to control the number of shifts.
STC sets the CF to 1.
STD sets the DF flag to 1. This is used for string operations: while DF=1, string operations
decrement SI and/or DI and therefore process strings from right to left.
Format: STD
Flags: DF=1.
Format: STI
Flags: IF=1.
STOSD
STOS stores either AL (byte) or AX (word) in the memory position addressed by ES:DI. DI is
incremented if DF=0 and decremented if DF=1 (compare with STD).
Flags: None
SUB: SUBtract
Subtracts the source operand from the destination operand. Result is stored in the
destination operand.
SUB AX,immed
Tests individual bits in the destination operand against those in the source operand.
TEST performs a logical AND, but the destination operand is not affected.
TEST AX,immed
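The effect of TEST on the flags can be sketched in Python. The function name flags_after_test is our own, the operands are assumed to be 8 bits wide, and only ZF, SF, CF and OF are modelled. Because nothing is written back, the destination keeps its value.

```python
def flags_after_test(dest, src, bits=8):
    """Model of TEST dest,src: AND the operands, keep only the flags."""
    mask = (1 << bits) - 1
    result = dest & src & mask           # the result is discarded afterwards
    return {
        "ZF": 1 if result == 0 else 0,   # ZF=1 when no tested bit is set
        "SF": (result >> (bits - 1)) & 1,
        "CF": 0,                         # TEST always clears CF and OF
        "OF": 0,
    }

al = 0b10100000
bit5 = flags_after_test(al, 0b00100000)   # bit 5 is set in AL, so ZF=0
bit0 = flags_after_test(al, 0b00000001)   # bit 0 is clear in AL, so ZF=1
```

A TEST is typically followed by JZ or JNZ to branch on the tested bit.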
WAIT keeps the processor in a wait state until an external interrupt occurs, so that
it can synchronize with an external device or coprocessor.
XCHG: EXCHanGes operands
XCHG AX,reg_16
Flags: None
XLAT: Translate bytes
Translates bytes into a different format, such as ASCII or EBCDIC. A table has
to be defined and pointed to by DS:BX. AL is used as an offset into the table.
XLAT selects the byte at that offset in the table and stores it in AL.
XLAT mem
Flags: None
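A table lookup of this kind can be sketched in Python. The names xlat, HEX_TABLE and byte_to_hex are our own; the bytes object stands for the table at DS:BX and the function argument for AL. The example translates a 4-bit value into its ASCII hexadecimal digit.

```python
def xlat(table, al):
    """Model of XLAT: return the table byte at offset AL (the new AL)."""
    return table[al & 0xFF]

HEX_TABLE = b"0123456789ABCDEF"   # stands for the translation table at DS:BX

def byte_to_hex(value):
    high = xlat(HEX_TABLE, (value >> 4) & 0x0F)   # translate the high nibble
    low = xlat(HEX_TABLE, value & 0x0F)           # translate the low nibble
    return bytes([high, low]).decode("ascii")

digit = xlat(HEX_TABLE, 0x0A)     # offset 10 selects the character "A"
text = byte_to_hex(0x3F)
```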
XOR: EXclusive OR
Performs a logical exclusive OR on the bits in two operands.
Format: XOR reg,reg XOR reg,immed XOR reg,mem XOR mem,immed
XOR AX,immed
---oooOooo---
COS2621/102/0/2025
31 Appendix F
32 THE ASCII TABLE
ASCII-code Character ASCII-code Character ASCII-code Character
0000 0000 NUL 0011 0000 0 0110 0000 `