Assembly Language: Processor Instructions
Advanced Programming
Section (1)
Assembly Language:-
One of the first hurdles to learning assembly language programming is understanding just what assembly
language is. Unlike other programming languages, there is no one standard format that all assemblers use.
Different assemblers use different syntax for writing program statements.
Many beginning assembly language programmers get caught up in trying to figure out the myriad of different
possibilities in assembly language programming.
The first step in learning assembly language programming is defining just what type of assembly language
programming you want to (or need to) use in your environment. Once you define your flavor of assembly
language, it is easy to get started learning and using assembly language in both standalone and high-level
language programs.
Processor Instructions
At the lowest layer of operation, all computer processors (microcomputers, minicomputers, and mainframe
computers) manipulate data based on binary codes defined internally in the processor chip by the
manufacturer. These codes define what functions the processor should perform, utilizing
the data provided by the programmer. These preset codes are referred to as instruction codes.
Different types of processors contain different types of instruction codes. Processor chips are often categorized
by the quantity and type of instruction codes they support.
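To make the idea of instruction codes concrete, the sketch below decodes a few raw bytes the way an 8086-family processor would read them. It is only a toy, not a real disassembler: the names OPCODES and decode are made up for this illustration, and only three encodings are covered (B8 = MOV AX,imm16; B4 = MOV AH,imm8; CD = INT imm8).

```python
# Toy decoder for three 8086 instruction encodings (illustrative only).
# opcode byte -> (mnemonic template, number of operand bytes)
OPCODES = {
    0xB8: ("mov ax, {:#06x}", 2),  # MOV AX, imm16 (operand stored little-endian)
    0xB4: ("mov ah, {:#04x}", 1),  # MOV AH, imm8
    0xCD: ("int {:#04x}", 1),      # INT imm8
}


def decode(code):
    """Translate raw instruction bytes into readable mnemonics."""
    out = []
    i = 0
    while i < len(code):
        template, size = OPCODES[code[i]]
        operand = int.from_bytes(code[i + 1:i + 1 + size], "little")
        out.append(template.format(operand))
        i += 1 + size  # skip the opcode byte and its operand bytes
    return out


print(decode(bytes([0xB8, 0x00, 0x00, 0xB4, 0x4C, 0xCD, 0x21])))
# ['mov ax, 0x0000', 'mov ah, 0x4c', 'int 0x21']
```

The same seven bytes mean nothing to a processor from a different family, which is why instruction codes are tied to the processor type.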
High-Level Languages
If it looks like programming in pure processor instruction code is difficult, it is. Even the simplest of programs
requires the programmer to specify a lot of opcodes and data bytes. Trying to manage a huge program full of
just instruction codes would be a daunting task. To help save the sanity of programmers, high-level languages
(HLLs) were created.
HLLs enable programmers to create functions using simpler terms, rather than raw processor instruction codes.
Special reserved keywords are used to define variables (memory locations for data), create loops (jump over
instruction codes), and handle input and output from the program. However, the processor does not have any
knowledge about how to handle the HLL code. The code must be converted by some mechanism to simple
instruction code format for the processor to handle.
Compiled languages
Most production applications are created using compiled HLLs. The programmer creates a program using
common statements for the language which carry out the logic of the application. The text program statements
are then converted into a set of instruction codes that can be run on the processor.
Usually, what is commonly called compiling a program is actually a two-step process:
❑ Compiling the HLL statements into raw instruction codes
❑ Linking the raw instruction codes to produce an executable program.
The compile step produces an intermediate file, called an object code file.
The object code file contains the instruction codes that represent the core of the application functions.
The object code file itself cannot be run by the operating system. Often the host operating
system requires special file formats for executable files (program files that can be run on the system), and the
HLL program may require program functions from other object files. Another step is required to add these
components.
After the code is compiled into an object file, a linker is used to link the application object code file with any
additional object files required by the application and to create the final executable output file. The output of
the linker is an executable file that can only be run on the operating system for which the program is written.
Unfortunately, each operating system uses a different format for executable files, so an application compiled
on a Microsoft Windows workstation will not work as is on a Linux workstation, and vice versa. Object files that
contain commonly used functions can be combined into a single file, called a library file. The library file can then
be linked into multiple applications either at compile time (called static libraries), or at the time the application
is run on the system (called dynamic libraries).
Interpreted languages
The downside to using interpreted languages is speed. Instead of the program being compiled
directly to instruction codes that are run on the processor, an intermediary program (the interpreter) reads each line of program
code and processes the required functions. The time the interpreter takes to read the code and
execute it adds delay to the execution of the application.
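A minimal sketch of this read-a-line, carry-it-out cycle is shown here. The three-command language (set, add, print) and the function name interpret are invented for the example; real interpreters are far more elaborate, but the per-line overhead is the same in kind.

```python
# Minimal line-by-line interpreter for a made-up three-command language.
def interpret(source):
    """Read each line of program text and carry it out immediately."""
    variables = {}
    output = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command, args = parts[0], parts[1:]
        if command == "set":      # set NAME VALUE
            variables[args[0]] = int(args[1])
        elif command == "add":    # add NAME VALUE
            variables[args[0]] += int(args[1])
        elif command == "print":  # print NAME
            output.append(str(variables[args[0]]))
        else:
            raise ValueError("unknown command: " + command)
    return output


program = """
set total 5
add total 7
print total
"""
print(interpret(program))  # ['12']
```

Every line is split and matched against the command names each time it runs, which is work a compiled program does only once, at compile time.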
With the resulting reduction in speed when using interpreted languages, you may be wondering why anyone
still uses them. One answer is convenience. With compiled programs, every time a change is made to the
program, the program must be recompiled and relinked with the proper code libraries. With interpreted
programs, changes can be quickly made to the source code file and the program rerun to check for errors. In
addition, with interpreted languages, the interpreter application automatically determines which supporting functions
need to be included with the core code.
Hybrid languages
Hybrid languages are a recent trend in programming that combine the features of a compiled program with the
versatility and ease of an interpreted program. A perfect example is the popular Java programming language.
The Java programming language is compiled into what is called byte code. The byte code is similar to the
instruction code you would see on a processor, but is itself not compatible with any current processor family
(although there have been plans to create a processor that can run Java byte code as instruction sets).
Instead, the Java byte code must be interpreted by a Java Virtual Machine (JVM), running separately on the
host computer. The Java byte code is portable, in that it can be run by any JVM on any type of host computer.
The advantage is that different platforms can have their own specific JVMs, which are used to interpret the
same Java byte code without it having to be recompiled from the original source code.
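The byte code idea can be sketched with a toy stack machine. This is not real Java byte code and not how a JVM is built; the opcode numbers PUSH, ADD, MUL and the function execute are assumptions made up for the illustration. The point is only that the same list of numbers runs unchanged wherever a matching virtual machine exists.

```python
# Toy stack-based virtual machine (NOT real Java byte code).
PUSH, ADD, MUL = 0, 1, 2  # hypothetical opcode numbers


def execute(bytecode):
    """Interpret a flat list of opcodes and operands; return the top of the stack."""
    stack = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == PUSH:
            stack.append(bytecode[pc + 1])  # the operand follows the opcode
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        else:
            raise ValueError("bad opcode %d at %d" % (op, pc))
    return stack.pop()


# (2 + 3) * 4, already "compiled" to byte code
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL]
print(execute(program))  # 20
```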
One feature that many of the new processors on the market offer is advanced mathematics handling
instruction codes. These instruction codes help speed up complex mathematical expression processing by using
larger-than-normal byte sizes to represent numbers (either 64 or 128 bits). Unfortunately, many compilers
don’t take advantage of these advanced instruction codes. Fortunately, there is a simple solution for the
programmer. In environments where execution speed is critical, assembly language programming can come to
the rescue. Of course, the first step to improving execution speed is to ensure that the best algorithm is used in
the first place. Optimizing a poor algorithm does not compensate for not using
a fast algorithm in the first place.
Registers: Registers are fast memory, almost always connected to circuitry that allows various arithmetic,
logical, control, and other manipulations, as well as possibly setting internal flags.
Most early computers had only one data register that could be used for arithmetic and logic instructions.
Often there would be additional special purpose registers set aside either for temporary fast internal storage or
assigned to logic circuits to implement certain instructions. Some early computers had one or two address
registers that pointed to a memory location for memory accesses (a pair of address registers typically would
act as source and destination pointers for memory operations). Computers soon had multiple data registers,
address registers, and sometimes other special purpose registers. Some computers have general purpose
registers that can be used for both data and address operations. Every digital computer using a von Neumann
architecture has a register (called the program counter) that points to the next executable instruction. Many
computers have additional control registers for implementing various control capabilities. Often some or all of
the internal flags are combined into a flag or status register.
AX (Accumulators): are registers that can be used for arithmetic, logical, shift, rotate, or other similar
operations. The first computers typically only had one accumulator. Many times there were related special
purpose registers that contained the source data for an accumulator. Accumulators were replaced with data
registers and general purpose registers. Accumulators reappeared in the first microprocessors.
BX, CX, DX (General purpose registers): can be used as either data or address registers.
SI, DI (Index registers): are used to provide more flexibility in addressing modes, allowing the programmer to
create a memory address by combining the contents of an address register with the contents of an index
register (with displacements, increments, decrements, and other options). In some processors, there are
specific index registers (or just one index register) that can be used only for that purpose. In some
processors, any data register, address register, or general register (or some combination of the three) can be
used as an index register.
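As a rough sketch of indexed addressing, the function below combines a base address, an index register value, and a displacement into one offset. It assumes a 16-bit offset that wraps at 64 KB, as on the 8086; the function name effective_address and the table address 0x0200 are hypothetical.

```python
def effective_address(base, index, displacement=0):
    """Combine base, index register and displacement into a 16-bit offset."""
    return (base + index + displacement) & 0xFFFF  # wrap like a 16-bit register


# Walking a table of 2-byte entries that starts at offset 0x0200,
# with the index register (SI here) stepping by 2 each time.
table = 0x0200
for si in (0, 2, 4):
    print(hex(effective_address(table, si)))
# 0x200
# 0x202
# 0x204
```

Stepping the index register while the base stays fixed is what makes loops over arrays and strings cheap to express.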
CS, DS, SS, ES (Base registers or segment registers): are used to segment memory. Effective addresses are
computed by adding the contents of the base or segment register to the rest of the effective address
computation. In some processors, any register can serve as a base register. In some processors, there are
specific base or segment registers (one or more) that can only be used for that purpose. In some processors
with multiple base or segment registers, each base or segment register is used for different kinds of memory
accesses (such as a segment register for data accesses and a different segment register for program accesses).
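For the CS, DS, SS, and ES registers of the 8086 in real mode, the calculation can be sketched as follows: the segment value is shifted left by 4 bits (multiplied by 16) and the offset is added, giving a 20-bit physical address. The function name physical_address is made up for the example, and other processors compute segmented addresses differently.

```python
def physical_address(segment, offset):
    """8086 real mode: shift the segment left 4 bits, add the offset, keep 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350 (same byte, different pair)
```

Because segments overlap every 16 bytes, many segment:offset pairs name the same memory location.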
IP or PC (program counter): Almost every digital computer ever made uses a program counter. The program
counter points to the memory location that stores the next executable instruction. Branching is implemented
by making changes to the program counter. Some processor designs allow software to directly change the
program counter, but usually software only indirectly changes the program counter (for example, a JUMP
instruction will insert the operand into the program counter). An assembler has a location counter, which is an
internal pointer to the address (first byte) of the next location in storage (for instructions, data areas,
constants, etc.) while the source code is being converted into object code.
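The role of the program counter can be sketched with a toy processor loop. The instruction set (LOAD, ADD, JUMP, HALT) and the function run_cpu are hypothetical and exist only for this illustration; the trace records which addresses were actually fetched.

```python
# Toy processor: each instruction is a (name, operand) pair.
def run_cpu(program, max_steps=100):
    """Fetch the instruction at pc, advance pc, then execute; JUMP overwrites pc."""
    pc = 0
    acc = 0
    trace = []
    for _ in range(max_steps):
        name, operand = program[pc]  # fetch
        trace.append(pc)
        pc += 1                      # by default, point at the next instruction
        if name == "LOAD":
            acc = operand
        elif name == "ADD":
            acc += operand
        elif name == "JUMP":
            pc = operand             # branch: the operand goes into the program counter
        elif name == "HALT":
            break
    return acc, trace


program = [
    ("LOAD", 1),   # address 0
    ("JUMP", 3),   # address 1: skip the next instruction
    ("ADD", 100),  # address 2: never executed
    ("ADD", 2),    # address 3
    ("HALT", 0),   # address 4
]
print(run_cpu(program))  # (3, [0, 1, 3, 4])
```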
Processor flags store information about specific processor functions. The processor flags are usually kept in a
flag register or a general status register. This can include result flags that record the results of certain kinds of
testing, information about data that is moved, certain kinds of information about the results of computations or
transformations, and information about some processor states. Closely related and often stored in the same
processor word or status register (although often in a privileged portion) are control flags that control
processor actions or processor states or the actions of certain instructions.
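As an illustration of result flags, the sketch below adds two 8-bit values and reports the four flags most processors derive from an addition. The function name add8 and the flag names are chosen for the example; actual flag names and bit positions depend on the processor.

```python
def add8(a, b):
    """Add two 8-bit values and report the result flags an ALU would typically set."""
    total = a + b
    result = total & 0xFF
    flags = {
        "carry": total > 0xFF,        # unsigned result did not fit in 8 bits
        "zero": result == 0,
        "sign": bool(result & 0x80),  # copy of the top bit of the result
        # signed overflow: both inputs share a sign that the result does not
        "overflow": bool((a ^ result) & (b ^ result) & 0x80),
    }
    return result, flags


print(add8(0x7F, 0x01))  # result 0x80: sign and overflow set
print(add8(0xFF, 0x01))  # result 0x00: carry and zero set
```

Conditional jump instructions test flags like these to decide whether to change the program counter.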
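Basic skeleton of a 16-bit DOS assembly program (MASM/TASM-style directives):
.model <type>        ; memory model, for example small
.data                ; data segment: variables are declared here
.code                ; code segment: instructions start here
<label>:
    mov ax,@data     ; load the address of the data segment into AX
    mov ds,ax        ; a segment register cannot take an immediate value, so go through AX
    mov ax,00h       ; clear AX (removes @data from the register)
    mov ah,4ch       ; DOS function 4Ch: terminate program (AL = return code)
    int 21h          ; call DOS
end <label>          ; end of source; execution starts at <label>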