Emu8086 PDF
Decimal System
Most people today use decimal representation to count. In the
decimal system there are 10 digits:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9
These digits can represent any value, for example:
754.
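The value is formed by the sum of each digit multiplied by the base
(10 in this case, because the decimal system has 10 digits) raised to
the power of the digit's position (counting from zero, starting at the right):
754 = 7 * 10^2 + 5 * 10^1 + 4 * 10^0 = 700 + 50 + 4
The position of each digit is very important! For example, if you move the "7"
to the end you get a different value:
547 = 5 * 10^2 + 4 * 10^1 + 7 * 10^0 = 500 + 40 + 7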
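The positional rule can be checked with a few lines of Python. This is only a minimal sketch; the helper name positional_value is hypothetical and not part of emu8086.

```python
def positional_value(digits, base=10):
    """Sum each digit times the base raised to its position (rightmost position is 0)."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += digit * base ** position
    return total

print(positional_value([7, 5, 4]))  # 7*100 + 5*10 + 4*1
print(positional_value([5, 4, 7]))  # same digits, different positions
```

The same function works for any base: pass base=2 or base=16 to evaluate binary or hexadecimal digit lists.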
Arquitectura de Computadoras 2011 UTN FRMza Ingeniera en Sistemas
Binary System
Computers are not as smart as humans (not yet, at least), but it is easy to
make an electronic machine with two states: on and off, or 1 and 0.
Computers use the binary system, which has 2 digits:
0, 1
And thus the base is 2.
Each digit in a binary number is called a BIT, 4 bits form a NIBBLE,
8 bits form a BYTE, two bytes form a WORD, and two words form a
DOUBLE WORD (rarely used).
Hexadecimal System
Hexadecimal System uses 16 digits:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
And thus the base is 16.
Hexadecimal numbers are compact and easy to read.
It is very easy to convert numbers from binary system to
hexadecimal system and vice-versa, every nibble (4 bits) can be
converted to a hexadecimal digit using this table:
Decimal     Binary      Hexadecimal
(base 10)   (base 2)    (base 16)
0           0000        0
1           0001        1
2           0010        2
3           0011        3
4           0100        4
5           0101        5
6           0110        6
7           0111        7
8           1000        8
9           1001        9
10          1010        A
11          1011        B
12          1100        C
13          1101        D
14          1110        E
15          1111        F
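The nibble-by-nibble conversion can be sketched in Python as follows. The function names are hypothetical helpers chosen for illustration; they simply apply the table one 4-bit group at a time.

```python
def binary_to_hex(bits):
    """Convert a binary string to hexadecimal one nibble (4 bits) at a time."""
    # Left-pad with zeros so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    nibbles = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join("0123456789ABCDEF"[int(n, 2)] for n in nibbles)

def hex_to_binary(hex_digits):
    """Expand each hexadecimal digit into its 4-bit group."""
    return "".join(format(int(d, 16), "04b") for d in hex_digits)

print(binary_to_hex("10111111"))  # nibbles 1011 and 1111
print(hex_to_binary("3F"))        # digits 3 and F
```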
Signed Numbers
There is no way to say for sure whether the hexadecimal byte 0FFh
is positive or negative; it can represent both the decimal value "255" and
"-1".
8 bits can be used to create 256 combinations (including zero), so we
simply presume that the first 128 combinations (0..127) represent
positive numbers and the next 128 combinations (128..255)
represent negative numbers.
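A small Python sketch of this convention, assuming the usual two's complement reading in which 0FFh stands for -1 (the helper name as_signed_byte is hypothetical):

```python
def as_signed_byte(value):
    """Interpret an 8-bit pattern (0..255) as a signed number (-128..127)."""
    value &= 0xFF
    # Patterns 128..255 are taken to be negative.
    return value - 256 if value >= 128 else value

for pattern in (0x00, 0x7F, 0x80, 0xFF):
    print(f"{pattern:02X}h unsigned={pattern} signed={as_signed_byte(pattern)}")
```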
Base converter allows you to convert numbers from any system and
to any system. Just type a value in any text-box, and the value will
be automatically converted to all other systems. You can work both
with 8 bit and 16 bit values.
Multi base calculator can be used to make calculations between
numbers in different systems and convert numbers from one system
to another. Type an expression and press enter, result will appear in
chosen numbering system. You can work with values up to 32 bits.
When Signed is checked evaluator assumes that all values (except
decimal and double words) should be treated as signed. Double
words are always treated as signed values, so 0FFFFFFFFh is
converted to -1.
For example, suppose you want to calculate 0FFFFh * 10h + 0FFFFh (the maximum
memory location that can be accessed by the 8086 CPU). If you check
Signed and Word you will get -17, because it is evaluated as (-1) * 16
+ (-1). To calculate with unsigned values uncheck Signed, so
that the evaluation is 65535 * 16 + 65535, and you should get
1114095.
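The two results can be reproduced outside the calculator. The sketch below is not the calculator's implementation; it only assumes that "Signed and Word" means every operand is read as a 16 bit two's complement value (as_signed_word is a hypothetical helper).

```python
def as_signed_word(value):
    """Interpret a 16-bit pattern as a signed number (-32768..32767)."""
    value &= 0xFFFF
    return value - 0x10000 if value >= 0x8000 else value

# Unsigned evaluation: 65535 * 16 + 65535
unsigned_result = 0xFFFF * 0x10 + 0xFFFF

# Signed word evaluation: (-1) * 16 + (-1)
signed_result = as_signed_word(0xFFFF) * as_signed_word(0x10) + as_signed_word(0xFFFF)

print(unsigned_result, signed_result)
```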
You can also use the base converter to convert non-decimal digits
Because registers are located inside the CPU, they are much faster
than memory. Accessing a memory location requires the use of a
system bus, so it takes much longer. Accessing data in a register
takes almost no time by comparison. Therefore, you should try to keep variables in
the registers. Register sets are very small and most registers have
special purposes which limit their use as variables, but they are still
an excellent place to store temporary data of calculations.
segment registers
CS - points at the segment containing the current program.
DS - generally points at segment where variables are defined.
ES - extra segment register, it's up to a coder to define its
usage.
SS - points at the segment containing the stack.
Although it is possible to store any data in the segment registers, this
is never a good idea. The segment registers have a very special
purpose: pointing at accessible blocks of memory.
Segment registers work together with a general purpose register to
access any memory value. For example, if we would like to access
memory at the physical address 12345h (hexadecimal), we should
set DS = 1230h and SI = 0045h. This is good, since this way
we can access much more memory than with a single register that is
limited to 16 bit values.
The CPU calculates the physical address by multiplying the
segment register by 10h and adding the general purpose register to it
(1230h * 10h + 45h = 12345h).
[SI]
[DI]
d16 (variable offset only)
[BX]
[BX + SI + d8]
[BX + DI + d8]
[BP + SI + d8]
[BP + DI + d8]
[SI + d8]
[DI + d8]
[BP + d8]
[BX + d8]
[BX + SI + d16]
[BX + DI + d16]
[BP + SI + d16]
[BP + DI + d16]
[SI + d16]
[DI + d16]
[BP + d16]
[BX + d16]
You can form all valid combinations by taking only one item from
each column or skipping the column by not taking anything from it.
As you see, BX and BP never go together. SI and DI also don't go
together. Here are examples of valid addressing modes:
[BX+5], [BX+SI], [DI+BX-4]
The value in a segment register (CS, DS, SS, ES) is called a segment,
and the value in a general purpose register (BX, SI, DI, BP) is called an offset.
When DS contains the value 1234h and SI contains the value 7890h, it
can also be written as 1234:7890. The physical address will be
1234h * 10h + 7890h = 19BD0h.
If a zero is appended to a decimal number, the number is multiplied by 10. However,
10h = 16, so if a zero is appended to a hexadecimal value, it is multiplied
by 16, for example:
7h = 7
70h = 112
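The segment:offset arithmetic can be verified with a short Python sketch. The function name physical_address is a hypothetical helper; it only restates the formula segment * 10h + offset.

```python
def physical_address(segment, offset):
    """Combine a segment and an offset: segment * 10h + offset."""
    return segment * 0x10 + offset

# The two examples from the text.
print(f"{physical_address(0x1230, 0x0045):05X}h")
print(f"{physical_address(0x1234, 0x7890):05X}h")
```

Multiplying by 10h is the same as appending one hexadecimal zero to the segment value, which is why 1230h becomes 12300h before the offset is added.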
MOV instruction
Copies the second operand (source) to the first operand
(destination).
The source operand can be an immediate value, a general-purpose register or a memory location.
The destination operand can be a general-purpose register or a
memory location.
Both operands must be the same size, which can be a byte or a
word.
These types of operands are supported:
MOV REG, memory
MOV memory, REG
MOV REG, REG
MOV memory, immediate
MOV REG, immediate
REG: AX, BX, CX, DX, AH, AL, BL, BH, CH, CL, DH, DL, DI, SI, BP, SP.
memory: [BX], [BX+SI+7], variable, etc...
immediate: 5, -24, 3Fh, 10001101b, etc...
The MOV instruction cannot be used to set the value of the CS and
IP registers.
here is a short program that demonstrates the use of MOV instruction:
ORG 100h            ; this directive required for a simple 1 segment .com program.
MOV AX, 0B800h      ; set AX to hexadecimal value of B800h.
MOV DS, AX          ; copy value of AX to DS.
MOV CL, 'A'         ; set CL to ASCII code of 'A', it is 41h.
MOV CH, 1101_1111b  ; set CH to binary value.
MOV BX, 15Eh        ; set BX to 15Eh.
MOV [BX], CX        ; copy contents of CX to memory at B800:015E
RET                 ; returns to operating system.
You can copy & paste the above program into the emu8086 code editor
and press the [Compile and Emulate] button (or press the F5 key on your
keyboard).
The emulator window should open with this program loaded. Click the
[Single Step] button and watch the register values.
How to copy & paste:
1. Select the above text using the mouse: click before the text and
drag down until everything is selected.
2. Press Ctrl + C to copy.
3. Go to the emu8086 source editor and press Ctrl + V to
paste.
As you may guess, ";" is used for comments; anything after the ";"
symbol is ignored by the compiler.
ORG 100h
MOV AL, var1
MOV BX, var2
RET
VAR1 DB 7
var2 DW 1234h
Copy the above code to emu8086 source editor, and press F5 key to
compile and load it in the emulator. You should get something like:
As you see, this looks a lot like our example, except that variables are
replaced with actual memory locations. When the compiler makes
machine code, it automatically replaces all variable names with their
offsets. By default the segment is taken from the DS register (when a COM file
is loaded, the value of the DS register is set to the same value as the CS
register, the code segment).
In the memory list the first column is an offset, the second column is a hexadecimal
value, the third column is a decimal value, and the last column is an ASCII
character value.
Compiler is not case sensitive, so "VAR1" and "var1" refer to the
same variable.
The offset of VAR1 is 0108h, and full address is 0B56:0108.
The offset of var2 is 0109h, and full address is 0B56:0109, this
variable is a WORD so it occupies 2 BYTES. It is assumed that low
byte is stored at lower address, so 34h is located before 12h.
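The byte order can be illustrated with Python's struct module, where "<" selects little-endian packing (low byte first). This is only a sketch of the memory layout; the starting offset 0108h is the one quoted in the text.

```python
import struct

var1 = struct.pack("<B", 7)        # VAR1 DB 7      -> one byte
var2 = struct.pack("<H", 0x1234)   # var2 DW 1234h  -> two bytes, low byte first

memory = var1 + var2
base_offset = 0x0108  # offset of VAR1 as given in the text
for i, byte in enumerate(memory):
    print(f"{base_offset + i:04X}h: {byte:02X}h")
```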
You can see that there are some other instructions after the RET
instruction, this happens because disassembler has no idea about
where the data starts, it just processes the values in memory and it
understands them as valid 8086 instructions (we will learn them
later).
You can even write the same program using DB directive only:
Copy the above code to emu8086 source editor, and press F5 key to
compile and load it in the emulator. You should get the same
disassembled code, and the same functionality!
As you may guess, the compiler just converts the program source to
the set of bytes, this set is called machine code, processor
understands the machine code and executes it.
ORG 100h is a compiler directive (it tells compiler how to handle the
source code). This directive is very important when you work with
variables. It tells compiler that the executable file will be loaded at
the offset of 100h (256 bytes), so compiler should calculate the
correct address for all variables when it replaces the variable names
with their offsets. Directives are never converted to any real
machine code.
Why is the executable file loaded at an offset of 100h? The operating system
keeps some data about the program in the first 256 bytes of the CS
(code segment), such as command line parameters.
This is true for COM files only; EXE files are loaded at an offset
of 0000, and generally use a special segment for variables. Maybe we'll
talk more about EXE files later.
Arrays
Arrays can be seen as chains of variables. A text string is an example
of a byte array, each character is presented as an ASCII code value
(0..255).
You can access the value of any element in array using square
brackets, for example:
MOV AL, a[3]
You can also use any of the memory index registers BX, SI, DI, BP,
for example:
MOV SI, 3
MOV AL, a[SI]
If you need to declare a large array you can use DUP operator.
The syntax for DUP:
number DUP ( value(s) )
number - number of duplicate to make (any constant value).
value - expression that DUP will duplicate.
for example:
c DB 5 DUP(9)
is an alternative way of declaring:
c DB 9, 9, 9, 9, 9
one more example:
d DB 5 DUP(1, 2)
is an alternative way of declaring:
d DB 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
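For readers who know Python, DUP behaves much like list repetition. The sketch below is an analogy only; dup is a hypothetical helper, not an assembler feature.

```python
def dup(count, *values):
    """Mimic `count DUP(values)`: repeat the value list `count` times."""
    return list(values) * count

c = dup(5, 9)      # c DB 5 DUP(9)
d = dup(5, 1, 2)   # d DB 5 DUP(1, 2)
print(c)
print(d)
```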
Of course, you can use DW instead of DB if it's required to keep
values larger than 255, or smaller than -128. DW cannot be used to
declare strings.
Reminder:
ORG 100h
MOV AL, VAR1         ; check value of VAR1 by moving it to AL.
LEA BX, VAR1         ; get address of VAR1 in BX.
; modify the
ORG 100h
MOV AL, VAR1         ; check value of VAR1 by moving it to AL.
MOV BX, OFFSET VAR1  ; get address of VAR1 in BX.
; modify the
Constants
Constants are just like variables, but they exist only until your
program is compiled (assembled). After definition of a constant its
value cannot be changed. To define constants EQU directive is used:
name EQU < any expression >
For example:
k EQU 5
MOV AX, k
You can edit a variable's value when your program is running, simply
double click it, or select it and click Edit button.
ORG
file.
AH, 0Eh
; select sub-function.
Copy & paste the above program to emu8086 source code editor, and
press [Compile and Emulate] button. Run it!
See list of supported interrupts for more information about
interrupts.
To use any of the above macros, simply type its name somewhere in
your code, followed by parameters if required, for example:
include emu8086.inc
ORG
100h
include 'emu8086.inc'
ORG
100h
AX, CX
First the compiler processes the declarations (these are just regular
macros that are expanded to procedures). When the compiler gets to a
CALL instruction it replaces the procedure name with the address of
the code where the procedure is declared. When the CALL instruction is
executed, control is transferred to the procedure. This is quite useful,
since even if you call the same procedure 100 times in your code you
will still have a relatively small executable size. Seems complicated,
doesn't it? That's ok, with time you will learn more; for now it's
enough that you understand the basic principle.
As you may see, there are 16 bits in this register; each bit is called a
flag and can take a value of 1 or 0.
Carry Flag (CF) - this flag is set to 1 when there is an
unsigned overflow. For example, when you add bytes 255 +
1 (result is not in range 0...255). When there is no overflow
this flag is set to 0.
Zero Flag (ZF) - set to 1 when the result is zero. For a non-zero
result this flag is set to 0.
Sign Flag (SF) - set to 1 when the result is negative. When the result
is positive it is set to 0. Actually this flag takes the value of the
most significant bit.
Overflow Flag (OF) - set to 1 when there is a signed
overflow. For example, when you add bytes 100 + 50 (result
is not in range -128...127).
Parity Flag (PF) - this flag is set to 1 when there is an even
number of one bits in the result, and to 0 when there is an odd
number of one bits. Even if the result is a word, only the 8 low bits are
analyzed!
Auxiliary Flag (AF) - set to 1 when there is an unsigned
overflow for the low nibble (4 bits).
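The flag rules can be tried out with a small Python model of an 8 bit addition. It is a sketch under the definitions given above, not the emulator's code; add_bytes_with_flags is a hypothetical name.

```python
def add_bytes_with_flags(a, b):
    """Add two 8-bit values and derive the six flags described in the text."""
    a &= 0xFF
    b &= 0xFF
    raw = a + b
    result = raw & 0xFF
    flags = {
        "CF": int(raw > 0xFF),                          # unsigned overflow
        "ZF": int(result == 0),                         # result is zero
        "SF": (result >> 7) & 1,                        # most significant bit
        # Signed overflow: both operands have a sign bit that differs from the result's.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        "PF": int(bin(result).count("1") % 2 == 0),     # even number of one bits
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),      # carry out of the low nibble
    }
    return result, flags

print(add_bytes_with_flags(255, 1))   # the carry flag example
print(add_bytes_with_flags(100, 50))  # the overflow flag example
```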
0 AND 1 = 0
0 AND 0 = 0
As you see we get 1 only when both bits are 1.
TEST - The same as AND but for flags only.
OR - Logical OR between all bits of two operands. These rules
apply:
1 OR 1 = 1
1 OR 0 = 1
0 OR 1 = 1
0 OR 0 = 0
As you see we get 1 every time when at least one of the bits is
1.
XOR - Logical XOR (exclusive OR) between all bits of two
operands. These rules apply:
1 XOR 1 = 0
1 XOR 0 = 1
0 XOR 1 = 1
0 XOR 0 = 0
As you see we get 1 every time when bits are different from
each other.
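The same truth tables apply to every bit position at once. A minimal Python sketch using its bitwise operators, with two arbitrary 4 bit example values (the variable names are illustrative only):

```python
a = 0b1100
b = 0b1010

and_result = a & b   # 1 only where both bits are 1
or_result = a | b    # 1 where at least one bit is 1
xor_result = a ^ b   # 1 where the bits differ

# TEST computes the same value as AND but keeps only the flags;
# here just the Zero Flag is modelled.
test_zf = int((a & b) == 0)

for name, value in (("AND", and_result), ("OR", or_result), ("XOR", xor_result)):
    print(f"{name}: {value:04b}")
```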
org 100h
mov ax, 5       ; set ax to 5.
mov bx, 2       ; set bx to 2.
jmp calc        ; go to 'calc'.
back: jmp stop  ; go to 'stop'.
calc:
add ax, bx      ; add bx to ax.
jmp back        ; go 'back'.
stop:
ret
Instruction      Description         Condition   Opposite Instruction
JZ, JE                               ZF = 1      JNZ, JNE
JC, JB, JNAE                         CF = 1
JS               Jump if Sign.       SF = 1      JNS
JO               Jump if Overflow.   OF = 1      JNO
JPE, JP                              PF = 1      JPO
JNZ, JNE                             ZF = 0      JZ, JE
JNC, JNB, JAE                        CF = 0
JNS                                  SF = 0      JS
JNO                                  OF = 0      JO
JPO, JNP                             PF = 0      JPE, JP
ORG 100h
CALL m1
MOV AX, 2
RET          ; return to operating system.
m1 PROC
MOV BX, 5
RET          ; return to caller.
m1 ENDP
END
The above example calls procedure m1, does MOV BX, 5, and
returns to the next instruction after CALL: MOV AX, 2.
There are several ways to pass parameters to a procedure. The easiest
way is by using registers. Here is another
example of a procedure that receives two parameters in the AL and BL
registers, multiplies these parameters and returns the result in the AX
register:
ORG 100h
MOV AL, 1
MOV BL, 2
CALL m2
CALL m2
CALL m2
CALL m2
RET
m2 PROC
MUL BL       ; AX = AL * BL.
RET          ; return to caller.
m2 ENDP
END
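To see what the four calls leave in AX, the procedure can be modelled in Python. This sketch assumes AH is 0 at the start and that MUL BL multiplies the low byte of AX (AL) by BL; mul_bl is a hypothetical helper.

```python
def mul_bl(ax, bl):
    """Model MUL BL: AX = AL * BL, where AL is the low byte of AX."""
    al = ax & 0xFF
    return (al * (bl & 0xFF)) & 0xFFFF

ax = 1   # MOV AL, 1 (AH assumed to be 0)
bl = 2   # MOV BL, 2
for _ in range(4):   # four CALL m2 instructions
    ax = mul_bl(ax, bl)

print(ax, f"{ax:02X}h")
```

Each call doubles the value, so the register goes through 2, 4, 8 and finally 16 (10h).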
ORG 100h
CALL print_me
RET
; ==========================================================
; this procedure prints a string, the string should be null
; terminated (have zero in the end),
; the string address should be in SI register:
print_me PROC
next_char:
CMP b.[SI], 0    ; check for zero to stop
JE stop          ;
MOV AL, [SI]
stop:
RET              ; return to caller.
print_me ENDP
; ==========================================================
msg
END
The "b." prefix before [SI] means that we need to compare bytes, not
words. When you need to compare words add the "w." prefix instead.
When one of the compared operands is a register the prefix is not required,
because the compiler knows the size of each register.
Notes:
PUSH and POP work with 16 bit values only!
Note: PUSH immediate works only on 80186 CPU and later!
ORG 100h
RET
END
ORG 100h
MOV AX, 1212h
MOV BX, 3434h
PUSH AX
PUSH BX
POP AX
POP BX
RET
END
The exchange happens because stack uses LIFO (Last In First Out)
algorithm, so when we push 1212h and then 3434h, on pop we will
first get 3434h and only after it 1212h.
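The LIFO behaviour is easy to reproduce with a Python list used as a stack. This is an illustrative sketch only; the variables ax and bx stand in for the registers.

```python
stack = []
ax, bx = 0x1212, 0x3434

stack.append(ax)   # PUSH AX
stack.append(bx)   # PUSH BX
ax = stack.pop()   # POP AX  (gets the value pushed last)
bx = stack.pop()   # POP BX  (gets the value pushed first)

print(f"AX={ax:04X}h BX={bx:04X}h")
```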
name MACRO [parameters,...]
<instructions>
ENDM
MyMacro MACRO p1, p2, p3
MOV AX, p1
MOV BX, p2
MOV CX, p3
ENDM
ORG 100h
MyMacro 1, 2, 3
MyMacro 4, 5, DX
RET
MOV CX, DX
MyMacro2 MACRO
LOCAL label1, label2
CMP AX, 2
JE label1
CMP AX, 3
JE label2
label1:
INC AX
label2:
ADD AX, 2
ENDM
ORG 100h
MyMacro2
MyMacro2
RET
INT 16h
0FFFFh:0000h
; reboot!
Copy the above example to the source editor and press emulate. The
emulator automatically loads the .bin file to 0000h:7C00h (it uses a
supplementary .binf file to know where to load).
You can run it just like a regular program, or you can use the virtual
drive menu to write 512 bytes at 7c00h to the boot sector of a
virtual floppy drive (it's the "FLOPPY_0" file in c:\emu8086). After your
program is written to the virtual floppy drive, you can select boot
from floppy from the virtual drive menu.
.bin files for boot records are limited to 512 bytes (sector size). If
your new operating system is going to grow over this size, you will
need to use a boot program to load data from other sectors (just like
micro-os_loader.asm does). An example of a tiny operating system
can be found in c:\emu8086\examples and "online":
micro-os_loader.asm
micro-os_kernel.asm
To create extensions for your Operating System (over 512 bytes),
you can use additional sectors of a floppy disk. It's recommended to
use ".bin" files for this purpose (to create ".bin" file select "BIN
Template" from "File" -> "New" menu).
To write ".bin" file to virtual floppy, select "Write .bin file to
floppy..." from "Virtual drive" menu of emulator, you should write
it anywhere but the boot sector (which is Cylinder: 0, Head: 0,
Sector: 1).
You can use this utility to write .bin files to the virtual floppy disk
("FLOPPY_0" file), instead of the "write 512 bytes at 7c00h to boot
sector" menu. However, you should remember that a .bin file that is
designed to be a boot record should always be written to cylinder: 0,
head: 0, sector: 1
Boot Sector Location:
Cylinder: 0
Head: 0
Sector: 1
To write .bin files to a real floppy disk use writebin.asm: just compile it
to a com file and run it from the command prompt. To write a boot record
type: writebin loader.bin ; to write a kernel module type: writebin
kernel.bin /k
The /k parameter tells the program to write the file at sector 2 instead
of sector 1. It does not matter in what order you write the files onto the
floppy drive, but it does matter where you write them.
Note: this boot record is not an MS-DOS/Windows compatible boot
sector, and it's not Linux or Unix compatible either. The operating system may
not allow you to read or write files on this diskette until you re-format
it, therefore make sure the diskette you use doesn't contain any
important information. However, you can write and read anything to
and from this disk using low level disk access interrupts. It's even
possible to protect valuable information from others this way:
even if someone gets the disk he will probably think that it's empty
and will reformat it because it's the default option in windows
sectors instead of using file system, and in some cases it is also the
most reliable way, if you know how to use it.
to read sectors from floppy drive use INT 13h / AH = 02h.
Traffic Lights
next:
mov ax, [si]
out 4, ax
; wait 5 seconds (5 million microseconds)
mov cx, 4Ch ; 004C4B40h = 5,000,000
mov dx, 4B40h
mov ah, 86h
int 15h
;              FEDC_BA98_7654_3210
situation   dw 0000_0011_0000_1100b
s1          dw 0000_0110_1001_1010b
s2          dw 0000_1000_0110_0001b
s3          dw 0000_1000_0110_0001b
s4          dw 0000_0100_1101_0011b
sit_end = $
all_red     equ 0000_0010_0100_1001b
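The wait interval in the listing is passed to INT 15h / AH = 86h as a 32 bit number of microseconds split across CX (high word) and DX (low word). A small Python sketch of that split, using a hypothetical helper name:

```python
def split_microseconds(microseconds):
    """Split a 32-bit interval into its high word (CX) and low word (DX)."""
    return (microseconds >> 16) & 0xFFFF, microseconds & 0xFFFF

cx, dx = split_microseconds(5_000_000)   # 5 seconds
print(f"CX={cx:04X}h DX={dx:04X}h")
```

The same helper gives the register values for any other delay, for example split_microseconds(1_000_000) for one second.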
Stepper-Motor
Robot
Result
R-Button click
Moves text
Copies text
Selects line
Undo
Document End             Control + End
Document End Extend      Control + Shift + End
Document Start           Control + Home
Document Start Extend    Control + Shift + Home
Find                     Control + F, Alt + F3
Find Next                F3
Find Next Word           Control + F3
Find Prev                Shift + F3
Find Prev Word           Control + Shift + F3
Find and Replace         Control + H, Control + Alt + F3
Go To Line               Control + G
Go To Match Brace        Control + ]
Select All               Control + A
Select Line              Control + Alt + F8
Select Swap Anchor       Control + Shift + X
Insert New Line Above    Control + Shift + N
Indent Selection         Tab
Outdent Selection        Shift + Tab
Tabify Selection         Control + Shift + T
Untabify Selection       Control + Shift + Space
Lowercase Selection
Uppercase Selection
Left Word
Right Word
Left Sentence
Right Sentence
Control + L
Control + U, Control + Shift + U
Control + Left
Control + Right
Control + Alt + Left
Control + Alt + Right
Toggle Overtype          Insert
Display Whitespace       Control + Alt + T
Scroll Window Up         Control + Down
Scroll Window Down       Control + Up
Scroll Window Left       Control + PageUp
Scroll Window Right      Control + PageDown
Control + Delete
Control + Backspace
Ctrl + Q
Ctrl + W
If there are problems with the source editor you may need to
manually copy "cmax20.ocx" from program's folder into
Windows\System or Windows\System32 replacing any existing
version of that file (restart may be required before system allows to
replace existing file).
Type your code inside the text area and click the compile button. You
will be asked where to save the compiled file.
After successful compilation you can click the emulate button to load the
compiled file in the emulator.
The output file type directives:
#make_com#
#make_bin#
#make_boot#
#make_exe#
You can insert these directives in the source code to specify the
required output type for the file. Only if the compiler cannot determine
the output type automatically and cannot find any of these
directives will it ask you for the output type before creating the file.
There is virtually no difference between how .com and .bin files are
assembled because these files are raw binary files, but an .exe file has a
special header at the beginning of the file that is used by the
operating system to determine some properties of the executable file.
#make_bin#
#LOAD_SEGMENT=1234#
#LOAD_OFFSET=0000#
#AL=12#
#AH=34#
#BH=00#
#BL=00#
#CH=00#
#CL=00#
#DH=00#
#DL=00#
#DS=0000#
#ES=0000#
#SI=0000#
#DI=0000#
#BP=0000#
#CS=1234#
#IP=0000#
#SS=0000#
#SP=0000#
#MEM=0100:FFFE,00FF-0100:FF00,F4#
8000  ; load to segment.
0000  ; load to offset.
55    ; AL
66    ; AH
77    ; BL
88    ; BH
99    ; CL
AA    ; CH
BB    ; DL
CC    ; DH
DDEE  ; DS
ABCD  ; ES
EF12  ; SI
3456  ; DI
7890  ; BP
8000  ; CS
0000  ; IP
C123  ; SS
D123  ; SP
error processing
The assembly language compiler (or assembler) reports errors in a
separate information window:
When saving an assembled file, the compiler also saves 2 other files that
are later used by the emulator to show the original source code when you
run the binary executable, and to select the corresponding lines. Very often
the original code differs from the disassembled code because there are no
comments, no segment and no variable declarations. Compiler
directives produce no binary code, but everything else is converted to
pure machine code. Sometimes a single original instruction is
assembled into several machine code instructions, this is done mainly
                  short description
00000 - 00400
00400 - 00500
00500 - A0000
A0000 - B1000
B1000 - B8000     Reserved. Not used by the emulator.
B8000 - C0000
C0000 - F4000     Reserved.
F4000 - 10FFEF
                 address of BIOS sub-program
00   00x4 = 00   F400:0170 - CPU-generated, divide error.
04   04x4 = 10   F400:0180 - CPU-generated, INTO detected overflow.
10   10x4 = 40
11   11x4 = 44
12   12x4 = 48
13   13x4 = 4C
15   15x4 = 54
16   16x4 = 58
17   17x4 = 5C   F400:0400 - printer.
19   19x4 = 64   FFFF:0000 - reboot.
1A   1Ax4 = 68
1E   1Ex4 = 78
20   20x4 = 80
21   21x4 = 84
33   33x4 = CC
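The second column of the table follows from the fact that each interrupt vector occupies 4 bytes, so the entry for interrupt n starts at offset n * 4 (all values hexadecimal). A minimal Python check, with a hypothetical helper name:

```python
def vector_offset(interrupt_number):
    """Each vector takes 4 bytes, so INT n is stored at offset n * 4."""
    return interrupt_number * 4

for n in (0x00, 0x04, 0x10, 0x13, 0x21, 0x33):
    print(f"INT {n:02X}h -> offset {vector_offset(n):02X}h")
```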
address      size      description
0040h:0010   WORD      BIOS equipment list.
0040h:0013   WORD
0040h:004A   WORD
0040h:004E   WORD
0040h:0050   8 WORDs
0040h:0062   BYTE
0040h:0084   BYTE
- 10
- 11
- 12 (unused)
- 13 (unused)
- 14 (unused)
15 (unused)
binary value   action
00000000       do nothing.
00000001       move forward.
00000010       turn left.
00000011       turn right.
00000100
00000101       switch on a lamp.
00000110
decimal value   binary value   meaning
255             11111111       wall
0               00000000       nothing
7               00000111       switched-on lamp
8               00001000       switched-off lamp
The third byte (port 11) is a status register. Read values from
this port to determine the state of the robot. Each bit has a
specific property:
bit number   description
bit #0       zero when robot is ready for next command, one when robot is busy doing some task.
bit #2
example:
MOV AL, 1 ; move forward.
OUT 9, AL ;
MOV AL, 3 ; turn right.
OUT 9, AL ;
MOV AL, 1 ; move forward.
OUT 9, AL ;
MOV AL, 2 ; turn left.
OUT 9, AL ;
MOV AL, 1 ; move forward.
OUT 9, AL ;