05 Machine Basics
Carnegie Mellon
History of Intel processors and architectures
C, assembly, machine code
Assembly Basics: Registers, operands, move
Intro to x86-64
8086
  First 16-bit Intel processor. Basis for IBM PC & DOS
  1MB address space
386 (1985, 275K transistors, 16-33 MHz)
  First 32-bit Intel processor, referred to as IA32
  Added "flat addressing", capable of running Unix
Machine Evolution
  Name          Date   Transistors
  386           1985   0.3M
  Pentium       1993   3.1M
  Pentium/MMX   1997   4.5M
  PentiumPro    1995   6.5M
  Pentium III   1999   8.2M
  Pentium 4     2001   42M
  Core 2 Duo    2006   291M
  Core i7       2008   731M
Added Features
  Instructions to support multimedia operations
  Instructions to enable more efficient conditional operations
  Transition from 32 bits to 64 bits
  More cores
Features:
  4 cores
  Max 4.0 GHz clock
  84 Watts
Historically
  AMD has followed just behind Intel
  A little bit slower, a lot cheaper
Then
  Recruited top circuit designers from Digital Equipment Corp. and other downward trending companies
  Built Opteron: tough competitor to Pentium 4
  Developed x86-64, their own extension to 64 bits
Our Coverage
IA32
  The traditional x86
  shark> gcc -m32 hello.c
x86-64
  The emerging standard
  shark> gcc hello.c
  shark> gcc -m64 hello.c
Presentation
  Book presents IA32 in Sections 3.1-3.12
  Covers x86-64 in 3.13
  We will cover both simultaneously
  Some labs will be based on x86-64, others on IA32
C, assembly, machine code
Definitions
Architecture (also ISA: instruction set architecture): The parts of a processor design that one needs to understand to write assembly code.
  Examples: instruction set specification, registers.
[Figure: memory holding code, data, and stack]
Programmer-Visible State
  PC: Program counter
    Address of next instruction
    Called "EIP" (IA32) or "RIP" (x86-64)
  Memory
    Byte addressable array
    Code and user data
    Stack to support procedures
  Register file
  Condition codes
    Store status information about most recent arithmetic or logical operation
    Used for conditional branching
[Figure: compilation pipeline. Asm program (p1.s p2.s, text) -> Assembler (gcc or as) -> Object program (p1.o p2.o, binary) -> Linker (gcc or ld), combined with static libraries (.a) -> binary]
Obtain (on shark machine) with command
  gcc -O1 -m32 -S p1.c
Produces file p1.s
Warning: Will get very different results on non-Shark machines (Andrew Linux, Mac OS-X, ...) due to different versions of gcc and different compiler settings.
Perform arithmetic function on register or memory data
Transfer data between memory and register
  Load data from memory into register
  Store register data into memory
Transfer control
  Unconditional jumps to/from procedures
  Conditional branches
Object Code
Code for sum
  0x080483f4 <sum>:
    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x5d 0xc3
Assembler
  Translates .s into .o
  Binary encoding of each instruction
  Nearly-complete image of executable code
  Missing linkages between code in different files
Linker
  Resolves references between files
  Combines with static run-time libraries
    E.g., code for malloc, printf
  Some libraries are dynamically linked
    Linking occurs when program begins execution
C Code
  Add two signed integers
Assembly
  addl 8(%ebp),%eax
  Add 2 4-byte integers
    "Long" words in GCC parlance
    Same instruction whether signed or unsigned
  Operands:
    x: Register %eax
    y: Memory M[%ebp+8]
    t: Register %eax
      Return function value in %eax
  Similar to expression: x += y
  More precisely:
    int eax;
    int *ebp;
    eax += ebp[2]
Object Code
  0x80483fa: 03 45 08
  3-byte instruction
  Stored at address 0x80483fa
Disassembler
  objdump -d p
  Useful tool for examining object code
  Analyzes bit pattern of series of instructions
  Produces approximate rendition of assembly code
  Can be run on either a.out (complete executable) or .o file
Alternate Disassembly
Object
  0x080483f4:
    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x5d 0xc3
Disassembled
  Dump of assembler code for function sum:
  0x080483f4 <sum+0>:  push %ebp
  0x080483f5 <sum+1>:  mov  %esp,%ebp
  0x080483f7 <sum+3>:  mov  0xc(%ebp),%eax
  0x080483fa <sum+6>:  add  0x8(%ebp),%eax
  0x080483fd <sum+9>:  pop  %ebp
  0x080483fe <sum+10>: ret
No symbols in "WINWORD.EXE".
Disassembly of section .text:

30001000 <.text>:
30001000: 55
30001001: 8b ec
30001003: 6a ff
30001005: 68 90 10 00 30
3000100a: 68 91 dc 4c 30

Anything that can be interpreted as executable code
Disassembler examines bytes and reconstructs assembly source
Assembly Basics: Registers, operands, move
Moving Data
  movl Source, Dest
Operand Types
  Immediate: Constant integer data
    Example: $0x400, $-533
    Like C constant, but prefixed with '$'
    Encoded with 1, 2, or 4 bytes
  Register: One of 8 integer registers
    Example: %eax, %edx
    But %esp and %ebp reserved for special use
    Others have special uses for particular instructions
  Memory: 4 consecutive bytes of memory at address given by register
    Simplest example: (%eax)
    Various other "address modes"
movl Operand Combinations
  Source  Dest  Example
  Imm     Reg   movl $0x4,%eax
  Imm     Mem   movl $-147,(%eax)
  Reg     Reg   movl %eax,%edx
  Reg     Mem   movl %eax,(%edx)
  Mem     Reg   movl (%eax),%edx
Normal (R) Mem[Reg[R]]
  Register R specifies memory address
  Aha! Pointer dereferencing in C
  movl (%ecx),%eax
Displacement D(R) Mem[Reg[R]+D]
  Register R specifies start of memory region
  Constant displacement D specifies offset
  movl 8(%ebp),%edx
swap:
  pushl %ebp             # Set Up
  movl  %esp,%ebp
  pushl %ebx

  movl  8(%ebp), %edx    # Body
  movl  12(%ebp), %eax
  movl  (%edx), %ecx
  movl  (%eax), %ebx
  movl  %ebx, (%edx)
  movl  %ecx, (%eax)

  popl  %ebx             # Finish
  popl  %ebp
  ret
Understanding Swap
void swap(int *xp, int *yp)
{
  int t0 = *xp;
  int t1 = *yp;
  *xp = t1;
  *yp = t0;
}

Stack (offset from %ebp)
  12  yp
   8  xp
   4  Rtn adr
   0  Old %ebp   <- %ebp
  -4  Old %ebx   <- %esp

Register  Value
  %edx    xp
  %eax    yp
  %ecx    t0
  %ebx    t1

movl 8(%ebp), %edx    # edx = xp
movl 12(%ebp), %eax   # eax = yp
movl (%edx), %ecx     # ecx = *xp (t0)
movl (%eax), %ebx     # ebx = *yp (t1)
movl %ebx, (%edx)     # *xp = t1
movl %ecx, (%eax)     # *yp = t0
Understanding Swap: step-by-step trace
Initial state: %ebp = 0x104; xp (offset 8) = 0x124; yp (offset 12) = 0x120; M[0x124] = 123; M[0x120] = 456

  Instruction             State after the instruction
  movl 8(%ebp), %edx      %edx = 0x124
  movl 12(%ebp), %eax     %eax = 0x120
  movl (%edx), %ecx       %ecx = 123
  movl (%eax), %ebx       %ebx = 456
  movl %ebx, (%edx)       M[0x124] = 456
  movl %ecx, (%eax)       M[0x120] = 123
Mem[Reg[Rb]+S*Reg[Ri]+D]
  D:  Constant "displacement" 1, 2, or 4 bytes
  Rb: Base register: Any of 8 integer registers
  Ri: Index register: Any, except for %esp
      Unlikely you'd use %ebp, either
  S:  Scale (multiplies Reg[Ri])
Intro to x86-64
Sizes in bytes
  C Data Type   Generic 32-bit   Intel IA32   x86-64
  unsigned      4                4            4
  int           4                4            4
  long int      4                4            8
  char          1                1            1
  short         2                2            2
  float         4                4            4
  double        8                8            8
  long double   8                10/12        10/16
  char *        4                4            8
    Or any other pointer
Extend existing registers. Add 8 new ones.
Make %ebp/%rbp general purpose
Instructions
swap:
  pushl %ebp             # Set Up
  movl  %esp,%ebp
  pushl %ebx

  movl  8(%ebp), %edx    # Body
  movl  12(%ebp), %eax
  movl  (%edx), %ecx
  movl  (%eax), %ebx
  movl  %ebx, (%edx)
  movl  %ecx, (%eax)

  popl  %ebx             # Finish
  popl  %ebp
  ret
64-bit data
  Data held in registers %rdx and %rax
  movq operation
Intro to x86-64
  A major departure from the style of code seen in IA32