Chapter 6

Digital Design and Computer Architecture: ARM® Edition


Sarah L. Harris and David Money Harris

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <1>
Chapter 6 :: Topics

• Introduction
• Assembly Language
• Machine Language
• Programming
• Addressing Modes

Introduction
• Jumping up a few levels of
abstraction
– Architecture: programmer’s
view of computer
• Defined by instructions &
operand locations
– Microarchitecture: how to
implement an architecture in
hardware (covered in Chapter 7)

Instructions
• Commands in a computer’s language
– Assembly language: human-readable
format of instructions
– Machine language: computer-readable
format (1’s and 0’s)

ARM Architecture
• Developed in the 1980’s by Advanced RISC
Machines – now called ARM Holdings
• Nearly 10 billion ARM processors sold/year
• Almost all cell phones and tablets have multiple
ARM processors
• Over 75% of humans use products with an ARM
processor
• Used in servers, cameras, robots, cars, pinball
machines, etc.

Once you’ve learned one architecture, it’s easier to learn others

Architecture Design Principles
Underlying design principles, as articulated by
Hennessy and Patterson:
1. Regularity supports design simplicity
2. Make the common case fast
3. Smaller is faster
4. Good design demands good compromises

Instruction: Addition

C Code ARM Assembly Code


a = b + c; ADD a, b, c

• ADD: mnemonic, indicates the operation to perform
• b, c: source operands
• a: destination operand

Instruction: Subtraction
Similar to addition - only mnemonic changes

C Code ARM assembly code


a = b - c; SUB a, b, c

• SUB: mnemonic
• b, c: source operands
• a: destination operand

Design Principle 1
Regularity supports design simplicity
• Consistent instruction format
• Same number of operands (two sources and
one destination)
• Ease of encoding and handling in hardware

Multiple Instructions
More complex code handled by multiple ARM
instructions
C Code ARM assembly code
a = b + c - d; ADD t, b, c ; t = b + c
SUB a, t, d ; a = t - d

Design Principle 2
Make the common case fast
• ARM includes only simple, commonly used instructions
• Hardware to decode and execute instructions kept
simple, small, and fast
• More complex instructions (that are less common)
performed using multiple simple instructions

• ARM is a Reduced Instruction Set Computer (RISC),
with a small number of simple instructions
• Other architectures, such as Intel’s x86, are
Complex Instruction Set Computers (CISC)

Operand Location
Physical location in computer
– Registers
– Constants (also called immediates)
– Memory

Operands: Registers
• ARM has 16 registers
• Registers are faster than memory
• Each register is 32 bits
• ARM is called a “32-bit architecture”
because it operates on 32-bit data

Design Principle 3
Smaller is Faster
• ARM includes only a small number of
registers

ARM Register Set

Name Use
R0 Argument / return value / temporary variable
R1-R3 Argument / temporary variables
R4-R11 Saved variables
R12 Temporary variable
R13 (SP) Stack Pointer
R14 (LR) Link Register: stores the return address from a function
R15 (PC) Program Counter

Operands: Registers
• Registers:
– R before number, all capitals
– Example: “R0” or “register zero” or “register R0”

Operands: Registers
• Registers used for specific purposes:
– Saved registers: R4-R11 hold variables
– Temporary registers: R0-R3 and R12, hold
intermediate values
– Discuss others later

Instructions with Registers
Revisit ADD instruction

C Code ARM Assembly Code


; R0 = a, R1 = b, R2 = c

a = b + c ADD R0, R1, R2

Operands: Constants/Immediates
• Many instructions can use constants, also called immediate operands
• For example: ADD and SUB
• The value is immediately available from the instruction
C Code           ARM Assembly Code
                 ; R0 = a, R1 = b
a = a + 4;       ADD R0, R0, #4
b = a - 12;      SUB R1, R0, #12

Generating Constants
Generating small constants using move (MOV):
C Code ARM Assembly Code
//int: 32-bit signed word ; R0 = a, R1 = b
int a = 23; MOV R0, #23
int b = 0x45; MOV R1, #0x45

Constant must have at most 8 bits of precision

Note: MOV can also use 2 registers: MOV R7, R9

Generating Constants
Generate larger constants using move (MOV) and
or (ORR):
C Code ARM Assembly Code
; R0 = a
int a = 0x7EDC8765; MOV R0, #0x7E000000
ORR R0, R0, #0xDC0000
ORR R0, R0, #0x8700
ORR R0, R0, #0x65
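
The MOV/ORR sequence above can be mirrored in C as a minimal sketch. The function name `build_constant` is hypothetical and only illustrates the idea: each `|=` plays the role of one ORR that contributes a single 8-bit field.

```c
#include <stdint.h>

/* Hypothetical helper: assembles 0x7EDC8765 one byte field at a time,
   mirroring a MOV followed by three ORR instructions. */
uint32_t build_constant(void)
{
    uint32_t r0 = 0x7E000000u;  /* MOV R0, #0x7E000000   */
    r0 |= 0x00DC0000u;          /* ORR R0, R0, #0xDC0000 */
    r0 |= 0x00008700u;          /* ORR R0, R0, #0x8700   */
    r0 |= 0x00000065u;          /* ORR R0, R0, #0x65     */
    return r0;
}
```

Because the four fields do not overlap, OR-ing them together gives the same result as adding them.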

Operands: Memory
• Too much data to fit in only 16 registers
• Store more data in memory
• Memory is large, but slow
• Commonly used variables still kept in registers

Byte-Addressable Memory
• Each data byte has unique address
• 32-bit word = 4 bytes, so word address
increments by 4

Reading Memory
• Memory read called load
• Mnemonic: load register (LDR)
• Format:
LDR R0, [R1, #12]
Address calculation:
– add base address (R1) to the offset (12)
– address = (R1 + 12)
Result:
– R0 holds the data at memory address (R1 + 12)
Any register may be used as base address

Reading Memory
• Example: Read a word of data at memory
address 8 into R3
– Address = (R2 + 8) = 8
– R3 = 0x01EE2842 after load

ARM Assembly Code


MOV R2, #0
LDR R3, [R2, #8]

Writing Memory
• Memory writes are called stores
• Mnemonic: store register (STR)

Writing Memory
• Example: Store the value held in R7 into
memory word 21.
• Memory address = 4 x 21 = 84 = 0x54
ARM assembly code
MOV R5, #0
STR R7, [R5, #0x54]

The offset can be written in decimal or hexadecimal

Recap: Accessing Memory
• The byte address of a memory word is its word number multiplied by 4
• Examples:
– Address of memory word 2 = 2 × 4 = 8
– Address of memory word 10 = 10 × 4 = 40
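
As a rough illustration of the recap, the C sketch below models a small word-organized memory and the base-plus-offset address calculation used by LDR and STR. The names `word_address`, `str_word`, `ldr_word`, and the 64-word `memory` array are assumptions made for this example, and addresses are assumed to be word-aligned and within range.

```c
#include <stdint.h>

#define MEM_WORDS 64
static uint32_t memory[MEM_WORDS];   /* toy memory, indexed by word number */

/* Byte address of a word in byte-addressable memory with 4-byte words. */
uint32_t word_address(uint32_t word_number)
{
    return word_number * 4u;
}

/* Sketch of STR Rt, [Rn, #offset]: store value at byte address base + offset. */
void str_word(uint32_t value, uint32_t base, uint32_t offset)
{
    memory[(base + offset) / 4u] = value;
}

/* Sketch of LDR Rt, [Rn, #offset]: load the word at byte address base + offset. */
uint32_t ldr_word(uint32_t base, uint32_t offset)
{
    return memory[(base + offset) / 4u];
}
```

For example, `str_word(x, 0, word_address(21))` writes to memory word 21, the same location addressed by offset 0x54 in the store example.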

Big-Endian & Little-Endian Memory
• How to number bytes within a word?
– Little-endian: byte numbers start at the little
(least significant) end
– Big-endian: byte numbers start at the big (most
significant) end

Big-Endian & Little-Endian Memory
• Jonathan Swift’s Gulliver’s Travels: the Little-Endians
broke their eggs on the little end of the egg and the
Big-Endians broke their eggs on the big end
• It doesn’t really matter which addressing type is used
– except when two systems share data

Big-Endian & Little-Endian Example
Suppose R2 and R5 hold the values 8 and 0x23456789
• After the following code runs on a big-endian system, what value is in R7?
• In a little-endian system?
STR R5, [R2, #0]
LDRB R7, [R2, #1]

Big-endian: R7 = 0x00000045
Little-endian: R7 = 0x00000067

              Big-Endian       Little-Endian
Byte Address  8  9  A  B       B  A  9  8
Data Value    23 45 67 89      23 45 67 89
              MSB      LSB     MSB      LSB
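
The same experiment can be sketched in C without depending on the host machine's byte order. The helper names below are hypothetical; the 4-byte array stands in for the memory word at address 8, and `load_byte` plays the role of LDRB (a zero-extended byte load).

```c
#include <stdint.h>

/* Store a 32-bit word with the most significant byte at the lowest address. */
void store_big_endian(uint8_t bytes[4], uint32_t word)
{
    bytes[0] = (uint8_t)(word >> 24);
    bytes[1] = (uint8_t)(word >> 16);
    bytes[2] = (uint8_t)(word >> 8);
    bytes[3] = (uint8_t)word;
}

/* Store a 32-bit word with the least significant byte at the lowest address. */
void store_little_endian(uint8_t bytes[4], uint32_t word)
{
    bytes[0] = (uint8_t)word;
    bytes[1] = (uint8_t)(word >> 8);
    bytes[2] = (uint8_t)(word >> 16);
    bytes[3] = (uint8_t)(word >> 24);
}

/* LDRB-like load: read one byte at the given offset and zero-extend it. */
uint32_t load_byte(const uint8_t bytes[4], unsigned offset)
{
    return bytes[offset];
}
```

Storing 0x23456789 and then loading the byte at offset 1 yields 0x45 with the big-endian store and 0x67 with the little-endian store.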

Programming
High-level languages:
– e.g., C, Java, Python
– Written at higher level of abstraction

Ada Lovelace, 1815-1852
• British mathematician
• Wrote the first computer
program
• Her program calculated
the Bernoulli numbers on
Charles Babbage’s
Analytical Engine
• She was a child of the
poet Lord Byron

Programming Building Blocks
• Data-processing Instructions
• Conditional Execution
• Branches
• High-level Constructs:
▪ if/else statements
▪ for loops
▪ while loops
▪ arrays
▪ function calls

Data-processing Instructions
• Logical operations
• Shifts / rotate
• Multiplication

Logical Instructions
• AND
• ORR
• EOR (XOR)
• BIC (Bit Clear)
• MVN (MoVe and NOT)

Logical Instructions: Examples

BIC here computes R1 AND (NOT R2)

Logical Instructions: Uses
• AND or BIC: useful for masking bits
Example: Masking all but the least significant byte
of a value
0xF234012F AND 0x000000FF = 0x0000002F
0xF234012F BIC 0xFFFFFF00 = 0x0000002F

• ORR: useful for combining bit fields


Example: Combine 0xF2340000 with 0x000012BC:
0xF2340000 ORR 0x000012BC = 0xF23412BC
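
The masking and combining examples can be checked with a small C sketch. The function names are hypothetical stand-ins for the ARM mnemonics; BIC is modeled as AND with the complement of the second operand.

```c
#include <stdint.h>

uint32_t and_op(uint32_t a, uint32_t b) { return a & b; }   /* AND */
uint32_t orr_op(uint32_t a, uint32_t b) { return a | b; }   /* ORR */
uint32_t eor_op(uint32_t a, uint32_t b) { return a ^ b; }   /* EOR (XOR) */
uint32_t bic_op(uint32_t a, uint32_t b) { return a & ~b; }  /* BIC: clear the bits set in b */
uint32_t mvn_op(uint32_t a)             { return ~a; }      /* MVN: bitwise NOT */
```

Note that AND keeps the bits selected by its mask, while BIC clears them, so the two masks in the example are complements of each other.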

Shift Instructions
• LSL: logical shift left
Example: LSL R0, R7, #5 ; R0 = R7 << 5

• LSR: logical shift right


Example: LSR R3, R2, #31 ; R3 = R2 >> 31

• ASR: arithmetic shift right


Example: ASR R9, R11, R4 ; R9 = R11 >>> R4[7:0]

• ROR: rotate right


Example: ROR R8, R1, #3 ; R8 = R1 ROR 3

Shift Instructions: Example 1
• Immediate shift amount (5-bit immediate)
• Shift amount: 0-31
ASR shifts the bits right and fills the vacated upper bits
with copies of the original most significant (sign) bit

Shift Instructions: Example 2
• Register shift amount (uses low 8 bits of register)
• Shift amount: 0-255
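
The four shift operations can be sketched in portable C as follows. The function names are hypothetical. C leaves shifts by 32 or more undefined, so the sketch handles those amounts explicitly; the chosen results for amounts of 32 or more (zero for LSL and LSR, all sign bits for ASR, amount taken modulo 32 for ROR) are an assumption about register-specified shifts and should be checked against the ARM reference manual.

```c
#include <stdint.h>

/* Logical shift left: zeros enter from the right. */
uint32_t lsl(uint32_t x, unsigned n)
{
    return n < 32 ? x << n : 0u;
}

/* Logical shift right: zeros enter from the left. */
uint32_t lsr(uint32_t x, unsigned n)
{
    return n < 32 ? x >> n : 0u;
}

/* Arithmetic shift right: copies of the sign bit enter from the left. */
uint32_t asr(uint32_t x, unsigned n)
{
    uint32_t sign_fill = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (n >= 32) return sign_fill;
    if (n == 0)  return x;
    return (x >> n) | (sign_fill << (32 - n));
}

/* Rotate right: bits shifted out on the right re-enter on the left. */
uint32_t ror(uint32_t x, unsigned n)
{
    n &= 31u;
    return n == 0 ? x : (x >> n) | (x << (32 - n));
}
```

ASR is written with unsigned arithmetic because right-shifting a negative signed value is implementation-defined in C.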

Multiplication
• MUL: 32 × 32 multiplication, 32-bit result
MUL R1, R2, R3
Result: R1 = (R2 × R3)[31:0]
• UMULL: Unsigned multiply long: 32 × 32 multiplication, 64-bit result
UMULL R1, R2, R3, R4
Result: {R2,R1} = R3 × R4 (R3, R4 unsigned)
• SMULL: Signed multiply long: 32 × 32 multiplication, 64-bit result
SMULL R1, R2, R3, R4
Result: {R2,R1} = R3 × R4 (R3, R4 signed)
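
The difference between the 32-bit and 64-bit results can be sketched in C. The function names are hypothetical; each returns the full product as one value, where the ARM instructions would split a 64-bit product across a low and a high destination register.

```c
#include <stdint.h>

/* MUL-like: keeps only the low 32 bits of the product. */
uint32_t mul32(uint32_t a, uint32_t b)
{
    return a * b;               /* unsigned arithmetic wraps modulo 2^32 */
}

/* UMULL-like: full 64-bit product of two unsigned 32-bit operands. */
uint64_t umull(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* SMULL-like: full 64-bit product of two signed 32-bit operands. */
int64_t smull(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}
```

Widening one operand before the multiplication is what preserves the upper 32 bits; multiplying first and widening afterwards would lose them.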

Conditional Execution
Don’t always want to execute code sequentially
• For example:
▪ if/else statements, while loops, etc.: only
want to execute code if a condition is true
▪ branching: jump to another portion of code
if a condition is true
• ARM includes condition flags that can be:
▪ set by an instruction
▪ used to conditionally execute an instruction

ARM Condition Flags
Flag  Name      Description
N     Negative  Instruction result is negative
Z     Zero      Instruction results in zero
C     Carry     Instruction causes an unsigned carry out
V     oVerflow  Instruction causes an overflow

• Set by ALU (recall from Chapter 5)
• Held in Current Program Status Register (CPSR)

Review: ARM ALU

Setting the Condition Flags: NZCV
• Method 1: Compare instruction: CMP
Example: CMP R5, R6
▪ Performs: R5-R6
▪ Does not save result
▪ Sets flags. If result:
• Is 0, Z=1
• Is negative, N=1
• Causes a carry out, C=1
• Causes a signed overflow, V=1

• Method 2: Append S to the instruction mnemonic
Example: ADDS R1, R2, R3
▪ Performs: R2 + R3
▪ Sets flags: If result is 0 (Z=1), negative (N=1), etc.
▪ Saves result in R1

Condition Mnemonics
• Instruction may be conditionally executed
based on the condition flags
• Condition of execution is encoded as a
condition mnemonic appended to the
instruction mnemonic
Example: CMP R1, R2
SUBNE R3, R5, R8
▪ NE: not equal condition mnemonic
▪ SUB will only execute if R1 ≠ R2
(i.e., Z = 0)

Condition Mnemonics
cond  Mnemonic      Name                                  CondEx
0000  EQ            Equal                                 Z
0001  NE            Not equal                             NOT Z
0010  CS / HS       Carry set / Unsigned higher or same   C
0011  CC / LO       Carry clear / Unsigned lower          NOT C
0100  MI            Minus / Negative                      N
0101  PL            Plus / Positive or zero               NOT N
0110  VS            Overflow / Overflow set               V
0111  VC            No overflow / Overflow clear          NOT V
1000  HI            Unsigned higher                       (NOT Z) AND C
1001  LS            Unsigned lower or same                Z OR (NOT C)
1010  GE            Signed greater than or equal          NOT (N ⊕ V)
1011  LT            Signed less than                      N ⊕ V
1100  GT            Signed greater than                   (NOT Z) AND NOT (N ⊕ V)
1101  LE            Signed less than or equal             Z OR (N ⊕ V)
1110  AL (or none)  Always / unconditional                ignored

Conditional Execution
Example:
CMP R5, R9 ; performs R5-R9
; sets condition flags

SUBEQ R1, R2, R3 ; executes if R5==R9 (Z=1)


ORRMI R4, R0, R9 ; executes if R5-R9 is
; negative (N=1)

Suppose R5 = 17, R9 = 23:


CMP performs: 17 – 23 = -6 (Sets flags: N=1, Z=0, C=0, V=0)
SUBEQ doesn’t execute (they aren’t equal: Z=0)
ORRMI executes because the result was negative (N=1)
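
The flag values in this example can be reproduced with a short C sketch. The `Flags` struct and the function names are hypothetical. The sketch assumes the ARM convention that, for a subtraction, C is set when no borrow occurs (that is, when the first operand is unsigned greater than or equal to the second), which agrees with C=0 for 17 - 23.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct { bool n, z, c, v; } Flags;

/* Flags produced by CMP a, b, which computes a - b and discards the result. */
Flags cmp_flags(uint32_t a, uint32_t b)
{
    uint32_t result = a - b;
    Flags f;
    f.n = (result >> 31) != 0;                    /* sign bit of the result */
    f.z = (result == 0);
    f.c = (a >= b);                               /* no borrow (assumed ARM convention) */
    f.v = (((a ^ b) & (a ^ result)) >> 31) != 0;  /* signed overflow */
    return f;
}

/* A few condition checks based on the flags. */
bool cond_eq(Flags f) { return f.z; }
bool cond_ne(Flags f) { return !f.z; }
bool cond_mi(Flags f) { return f.n; }
bool cond_ge(Flags f) { return f.n == f.v; }
bool cond_lt(Flags f) { return f.n != f.v; }
```

With a = 17 and b = 23, `cond_eq` is false and `cond_mi` is true, matching the behavior of SUBEQ and ORRMI described above.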

Branching
• Branches enable out of sequence instruction
execution
• Types of branches:
– Branch (B)
• branches to another instruction
– Branch and link (BL)
• discussed later
• Both can be conditional or unconditional

The Stored Program

Unconditional Branching (B)
ARM assembly
MOV R2, #17 ; R2 = 17
B TARGET ; branch to target
ORR R1, R1, #0x4 ; not executed

TARGET
SUB R1, R1, #78 ; R1 = R1 - 78

Labels (like TARGET) indicate instruction location.
Labels can’t be reserved words (like ADD, ORR, etc.)

The Branch Not Taken
ARM Assembly
MOV R0, #4 ; R0 = 4
ADD R1, R0, R0 ; R1 = R0+R0 = 8
CMP R0, R1 ; sets flags with R0-R1
BEQ THERE ; branch not taken (Z=0)
ORR R1, R1, #1 ; R1 = R1 OR 1 = 9
THERE
ADD R1, R1, #78 ; R1 = R1 + 78 = 87

if Statement
C Code            ARM Assembly Code
                  ;R0=f, R1=g, R2=h, R3=i, R4=j
if (i == j)       CMP R3, R4        ; set flags with R3-R4
  f = g + h;      BNE L1            ; if i!=j, skip if block
                  ADD R0, R1, R2    ; f = g + h
                L1
f = f - i;        SUB R0, R0, R3    ; f = f - i

Assembly tests opposite case (i != j) of high-level code (i == j)

if Statement: Alternate Code
C Code            ARM Assembly Code
                  ;R0=f, R1=g, R2=h, R3=i, R4=j
if (i == j)       CMP R3, R4        ; set flags with R3-R4
  f = g + h;      ADDEQ R0, R1, R2  ; if (i==j) f = g + h
f = f - i;        SUB R0, R0, R3    ; f = f - i

if Statement: Alternate Code
Original          Alternate Assembly Code
                  ;R0=f, R1=g, R2=h, R3=i, R4=j
CMP R3, R4        CMP R3, R4        ; set flags with R3-R4
BNE L1            ADDEQ R0, R1, R2  ; if (i==j) f = g + h
ADD R0, R1, R2    SUB R0, R0, R3    ; f = f - i
L1
SUB R0, R0, R3

Useful for short conditional blocks of code

if/else Statement

C Code            ARM Assembly Code
                  ;R0=f, R1=g, R2=h, R3=i, R4=j
if (i == j)       CMP R3, R4        ; set flags with R3-R4
  f = g + h;      BNE L1            ; if i!=j, skip if block
                  ADD R0, R1, R2    ; f = g + h
                  B L2              ; branch past else block
else            L1
  f = f - i;      SUB R0, R0, R3    ; f = f - i
                L2

if/else Statement: Alternate Code

C Code            ARM Assembly Code
                  ;R0=f, R1=g, R2=h, R3=i, R4=j
if (i == j)       CMP R3, R4        ; set flags with R3-R4
  f = g + h;      ADDEQ R0, R1, R2  ; if (i==j) f = g + h
else
  f = f - i;      SUBNE R0, R0, R3  ; else f = f - i

if/else Statement: Alternate Code

Original          Alternate Assembly Code
                  ;R0=f, R1=g, R2=h, R3=i, R4=j
CMP R3, R4        CMP R3, R4        ; set flags with R3-R4
BNE L1            ADDEQ R0, R1, R2  ; if (i==j) f = g + h
ADD R0, R1, R2    SUBNE R0, R0, R3  ; else f = f - i
B L2
L1
SUB R0, R0, R3
L2

while Loops
C Code                       ARM Assembly Code
// determines the power x    ; R0 = pow, R1 = x
// such that 2^x = 128         MOV R0, #1      ; pow = 1
int pow = 1;                   MOV R1, #0      ; x = 0
int x = 0;
                             WHILE
                               CMP R0, #128    ; R0-128
while (pow != 128) {           BEQ DONE        ; if (pow==128) exit loop
  pow = pow * 2;               LSL R0, R0, #1  ; pow=pow*2
  x = x + 1;                   ADD R1, R1, #1  ; x=x+1
}                              B WHILE         ; repeat loop
                             DONE

Assembly tests for the opposite case (pow == 128) of the C code (pow != 128).
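
To make the branch structure of the assembly easier to see, the loop can be rewritten in C with explicit labels and `goto`. This is only an illustrative sketch with a hypothetical function name; ordinary C code would keep the `while` form.

```c
/* Returns x such that 2^x == 128, following the control flow
   of the assembly version label by label. */
int log2_of_128(void)
{
    int pow = 1;                  /* MOV R0, #1 */
    int x = 0;                    /* MOV R1, #0 */
while_top:
    if (pow == 128) goto done;    /* CMP R0, #128 then BEQ DONE */
    pow = pow << 1;               /* LSL R0, R0, #1 */
    x = x + 1;                    /* ADD R1, R1, #1 */
    goto while_top;               /* B WHILE */
done:
    return x;
}
```

The exit test appears at the top of the loop and uses the opposite condition of the `while` statement, just as BEQ does in the assembly.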

for Loops
for (initialization; condition; loop operation)
statement

• initialization: executes before the loop begins


• condition: is tested at the beginning of each iteration
• loop operation: executes at the end of each iteration
• statement: executes each time the condition is met

for Loops
C Code                       ARM Assembly Code
// adds numbers from 1-9     ; R0 = i, R1 = sum
int sum = 0;                   MOV R0, #1      ; i = 1
                               MOV R1, #0      ; sum = 0
for (i=1; i!=10; i=i+1)      FOR
  sum = sum + i;               CMP R0, #10     ; R0-10
                               BEQ DONE        ; if (i==10) exit loop
                               ADD R1, R1, R0  ; sum=sum + i
                               ADD R0, R0, #1  ; i = i + 1
                               B FOR           ; repeat loop
                             DONE

for Loops: Decremented Loops
In ARM, decremented loop variables are more efficient
C Code                       ARM Assembly Code
// adds numbers from 1-9     ; R0 = i, R1 = sum
int sum = 0;                   MOV R0, #9      ; i = 9
                               MOV R1, #0      ; sum = 0
for (i=9; i!=0; i=i-1)       FOR
  sum = sum + i;               ADD R1, R1, R0  ; sum=sum + i
                               SUBS R0, R0, #1 ; i = i - 1 and set flags
                               BNE FOR         ; if (i!=0) repeat loop
Saves 2 instructions per iteration:
• Decrement loop variable & compare: SUBS R0, R0, #1
• Only 1 branch instead of 2

Arrays
• Access large amounts of similar data
▪ Index: access to each element
▪ Size: number of elements

Arrays
• 5-element array
▪ Base address = 0x14000000 (address of first
element, scores[0])
▪ Array elements accessed relative to base address

Accessing Arrays
C Code
int array[5];
array[0] = array[0] * 8;
array[1] = array[1] * 8;

ARM Assembly Code


; R0 = array base address
MOV R0, #0x60000000 ; R0 = 0x60000000
LDR R1, [R0] ; R1 = array[0]
LSL R1, R1, #3 ; R1 = R1 << 3 = R1*8
STR R1, [R0] ; array[0] = R1
LDR R1, [R0, #4] ; R1 = array[1]
LSL R1, R1, #3 ; R1 = R1 << 3 = R1*8
STR R1, [R0, #4] ; array[1] = R1

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <110>
Arrays using for Loops
C Code
int array[200];
int i;
for (i=199; i >= 0; i = i - 1)
array[i] = array[i] * 8;

ARM Assembly Code


; R0 = array base address, R1 = i
MOV R0, #0x60000000 ; R0 = array base address
MOV R1, #199 ; i = 199
FOR
LDR R2, [R0, R1, LSL #2] ; R2 = array[i]
LSL R2, R2, #3 ; R2 = R2 << 3 = R2*8
STR R2, [R0, R1, LSL #2] ; array[i] = R2
SUBS R1, R1, #1 ; i = i - 1
; and set flags
BPL FOR ; if (i>=0) repeat loop

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <112>
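The address arithmetic behind [R0, R1, LSL #2] can be sketched in C as follows. The helper names element_address and scale_by_8 are assumptions made for this illustration; the sketch assumes 4-byte array elements, as in the slides.

```c
#include <stdint.h>

/* Byte address of element i of a word array: base + (i << 2).
   This is the address that [R0, R1, LSL #2] computes. */
uint32_t element_address(uint32_t base, uint32_t i)
{
    return base + (i << 2);
}

/* Multiply every element by 8, walking the index down to 0
   like the SUBS/BPL loop. The assembly uses LSL #3 for the multiply. */
void scale_by_8(int32_t *array, int n)
{
    for (int i = n - 1; i >= 0; i = i - 1) {
        array[i] = array[i] * 8;
    }
}
```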
ASCII Code
• American Standard Code for Information
Interchange
• Each text character has unique byte value
– For example, S = 0x53, a = 0x61, A = 0x41
– Lower-case and upper-case differ by 0x20 (32)

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <113>
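Because lower-case and upper-case letters differ by 0x20, case conversion is a single subtraction. A minimal sketch, assuming plain ASCII input; the function name to_upper_ascii is hypothetical:

```c
/* Convert a lower-case ASCII letter to upper case by subtracting 0x20.
   Any other character is returned unchanged. */
char to_upper_ascii(char c)
{
    if (c >= 'a' && c <= 'z') {
        return (char)(c - 0x20);
    }
    return c;
}
```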
Cast of Characters

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <114>
Function Calls
• Caller: calling function (in this case, main)
• Callee: called function (in this case, sum)
C Code
void main()
{
int y;
y = sum(42, 7);
...
}

int sum(int a, int b)
{
return (a + b);
}

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <116>
Function Conventions
• Caller:
– passes arguments to callee
– jumps to callee
• Callee:
– performs the function
– returns result to caller
– returns to point of call
– must not overwrite registers or memory needed by
caller

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <118>
ARM Function Conventions
• Call Function: branch and link
BL
• Return from function: move the link register
to PC: MOV PC, LR
• Arguments: R0-R3
• Return value: R0

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <119>
Function Calls
C Code
int main() {
  simple();
  a = b + c;
}

void simple() {
  return;
}

ARM Assembly Code
0x00000200 MAIN   BL  SIMPLE
0x00000204        ADD R4, R5, R6
                  ...

0x00401020 SIMPLE MOV PC, LR

void means that simple doesn’t return a value
BL branches to SIMPLE
LR = PC + 4 = 0x00000204
MOV PC, LR makes PC = LR
(the next instruction executed is at 0x00000204)

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <122>
Input Arguments and Return Value
ARM conventions:
• Argument values: R0 - R3
• Return value: R0

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <123>
Input Arguments and Return Value
C Code
int main()
{
int y;
...
y = diffofsums(2, 3, 4, 5); // 4 arguments
...
}

int diffofsums(int f, int g, int h, int i)
{
int result;
result = (f + g) - (h + i);
return result; // return value
}

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <124>
Input Arguments and Return Value
ARM Assembly Code
; R4 = y
MAIN
...
MOV R0, #2 ; argument 0 = 2
MOV R1, #3 ; argument 1 = 3
MOV R2, #4 ; argument 2 = 4
MOV R3, #5 ; argument 3 = 5
BL DIFFOFSUMS ; call function
MOV R4, R0 ; y = returned value
...
; R4 = result
DIFFOFSUMS
ADD R8, R0, R1 ; R8 = f + g
ADD R9, R2, R3 ; R9 = h + i
SUB R4, R8, R9 ; result = (f + g) - (h + i)
MOV R0, R4 ; put return value in R0
MOV PC, LR ; return to caller

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <125>
Input Arguments and Return Value
ARM Assembly Code
; R4 = result
DIFFOFSUMS
ADD R8, R0, R1 ; R8 = f + g
ADD R9, R2, R3 ; R9 = h + i
SUB R4, R8, R9 ; result = (f + g) - (h + i)
MOV R0, R4 ; put return value in R0
MOV PC, LR ; return to caller

• diffofsums overwrote 3 registers: R4, R8, R9
• diffofsums can use the stack to temporarily store registers

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <126>
The Stack
• Memory used to temporarily
save variables
• Like stack of dishes, last-in-
first-out (LIFO) queue
• Expands: uses more memory
when more space needed
• Contracts: uses less memory
when the space no longer
needed
Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <127>
The Stack
• Grows down (from higher to lower memory
addresses)
• Stack pointer: SP points to top of the stack

Stack expands by 2 words

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <128>
How Functions use the Stack
• Called functions must have no unintended
side effects
• But diffofsums overwrites 3 registers: R4,
R8, R9
ARM Assembly Code
; R4 = result
DIFFOFSUMS
ADD R8, R0, R1 ; R8 = f + g
ADD R9, R2, R3 ; R9 = h + i
SUB R4, R8, R9 ; result = (f + g) - (h + i)
MOV R0, R4 ; put return value in R0
MOV PC, LR ; return to caller

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <129>
Storing Register Values on the Stack
ARM Assembly Code
; R4 = result
DIFFOFSUMS
SUB SP, SP, #12 ; make space on stack for 3 registers
STR R4, [SP, #8] ; save R4 on stack
STR R8, [SP, #4] ; save R8 on stack
STR R9, [SP] ; save R9 on stack
ADD R8, R0, R1 ; R8 = f + g
ADD R9, R2, R3 ; R9 = h + i
SUB R4, R8, R9 ; result = (f + g) - (h + i)
MOV R0, R4 ; put return value in R0
LDR R9, [SP] ; restore R9 from stack
LDR R8, [SP, #4] ; restore R8 from stack
LDR R4, [SP, #8] ; restore R4 from stack
ADD SP, SP, #12 ; deallocate stack space
MOV PC, LR ; return to caller

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <130>
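The save/restore pattern can be modeled in C with a small descending stack. This is a sketch under stated assumptions: the Stack type, STACK_WORDS, and the function names are hypothetical, sp is a word index rather than a byte address, and there are no overflow or underflow checks.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Model of a descending stack: sp starts one past the highest slot
   and moves toward lower indices as the stack expands. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int sp;
} Stack;

void stack_init(Stack *s)
{
    s->sp = STACK_WORDS;
}

/* Expand the stack by one word, then store the value at the new top. */
void push(Stack *s, uint32_t value)
{
    s->sp -= 1;
    s->mem[s->sp] = value;
}

/* Load the value at the top, then contract the stack by one word. */
uint32_t pop(Stack *s)
{
    uint32_t value = s->mem[s->sp];
    s->sp += 1;
    return value;
}
```

Values come back in the reverse of the order they were pushed, which is why the restores in the assembly mirror the saves.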
The Stack during diffofsums Call

Before call During call After call

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <131>
Registers
Preserved (Callee-Saved)    Nonpreserved (Caller-Saved)
R4-R11                      R12
R14 (LR)                    R0-R3
R13 (SP)                    CPSR
stack above SP              stack below SP

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <132>
Storing Saved Registers only on Stack
ARM Assembly Code
; R4 = result
DIFFOFSUMS
STR R4, [SP, #-4]! ; save R4 on stack
ADD R1, R0, R1 ; R1 = f + g (R0-R3 are nonpreserved)
ADD R3, R2, R3 ; R3 = h + i
SUB R4, R1, R3 ; result = (f + g) - (h + i)
MOV R0, R4 ; put return value in R0
LDR R4, [SP], #4 ; restore R4 from stack
MOV PC, LR ; return to caller

Notice code optimization for expanding/contracting stack

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <134>
Nonleaf Function
ARM Assembly Code
STR LR, [SP, #-4]! ; store LR on stack
BL PROC2 ; call another function
...
LDR LR, [SP], #4 ; restore LR from stack
MOV PC, LR ; return to caller

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <135>
Nonleaf Function Example
C Code
int f1(int a, int b) {
int i, x;
x = (a + b)*(a - b);
for (i=0; i<a; i++)
x = x + f2(b+i);
return x;
}
int f2(int p) {
int r;
r = p + 5;
return r + p;
}

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <136>
Nonleaf Function Example
ARM Assembly Code
; R0=a, R1=b, R4=i, R5=x
F1
  PUSH {R4, R5, LR}  ; save regs
  ADD  R5, R0, R1    ; x = (a+b)
  SUB  R12, R0, R1   ; temp = (a-b)
  MUL  R5, R5, R12   ; x = x*temp
  MOV  R4, #0        ; i = 0
FOR
  CMP  R4, R0        ; i < a?
  BGE  RETURN        ; no: exit loop
  PUSH {R0, R1}      ; save regs
  ADD  R0, R1, R4    ; arg is b+i
  BL   F2            ; call f2(b+i)
  ADD  R5, R5, R0    ; x = x+f2(b+i)
  POP  {R0, R1}      ; restore regs
  ADD  R4, R4, #1    ; i++
  B    FOR           ; repeat loop
RETURN
  MOV  R0, R5        ; return x
  POP  {R4, R5, LR}  ; restore regs
  MOV  PC, LR        ; return

; R0=p, R4=r
F2
  PUSH {R4}          ; save regs
  ADD  R4, R0, #5    ; r = p+5
  ADD  R0, R4, R0    ; return r+p
  POP  {R4}          ; restore regs
  MOV  PC, LR        ; return

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <138>
Stack during Nonleaf Function

At beginning of f1 Just before calling f2 After calling f2

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <139>
Recursive Function Call
C Code
int factorial(int n) {
if (n <= 1)
return 1;
else
return (n * factorial(n-1));
}

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <140>
Recursive Function Call
ARM Assembly Code
0x94 FACTORIAL STR R0, [SP, #-4]! ; store R0 on stack
0x98           STR LR, [SP, #-4]! ; store LR on stack
0x9C           CMP R0, #2         ; set flags with R0-2
0xA0           BHS ELSE           ; if (R0>=2) branch to else
0xA4           MOV R0, #1         ; otherwise return 1
0xA8           ADD SP, SP, #8     ; restore SP
0xAC           MOV PC, LR         ; return
0xB0 ELSE      SUB R0, R0, #1     ; n = n - 1
0xB4           BL FACTORIAL       ; recursive call
0xB8           LDR LR, [SP], #4   ; restore LR
0xBC           LDR R1, [SP], #4   ; restore R0 (n) into R1
0xC0           MUL R0, R1, R0     ; R0 = n*factorial(n-1)
0xC4           MOV PC, LR         ; return

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <141>
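What the recursive assembly does with the stack can be made explicit in C. The sketch below is not the slides' code: factorial_explicit is a hypothetical name, the 32-entry array stands in for the stack, and the result is only meaningful for n up to 12 because larger factorials overflow a 32-bit int. Each value of n is saved on the way down, then popped and multiplied on the way back up.

```c
/* Factorial with an explicit stack, mirroring how each recursive call
   saves n before calling factorial(n-1). Assumes n <= 12. */
int factorial_explicit(int n)
{
    int stack[32];
    int sp = 32;                 /* descending stack, initially empty */

    while (n >= 2) {             /* same test as CMP R0, #2 / BHS ELSE */
        stack[--sp] = n;         /* save n */
        n = n - 1;               /* argument for the next call */
    }

    int result = 1;              /* base case returns 1 */
    while (sp < 32) {
        result = stack[sp++] * result;   /* n * factorial(n-1) */
    }
    return result;
}
```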
Stack during Recursive Call

Before call During call After call

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <143>
Function Call Summary
• Caller
– Puts arguments in R0-R3
– Saves any needed registers (LR, maybe R0-R3, R12)
– Calls function: BL CALLEE
– Restores registers
– Looks for result in R0
• Callee
– Saves registers that might be disturbed (R4-R11)
– Performs function
– Puts result in R0
– Restores registers
– Returns: MOV PC, LR

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <144>
How to Encode Instructions?
• Design Principle 1: Regularity supports
design simplicity
– 32-bit data, 32-bit instructions
– For design simplicity, would prefer a single
instruction format but…
– Instructions have different needs

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <147>
Design Principle 4
Good design demands good compromises
• Multiple instruction formats allow flexibility
- ADD, SUB: use 3 register operands
- LDR, STR: use 2 register operands and a constant
• Number of instruction formats kept small
- to adhere to design principles 1 and 3
(regularity supports design simplicity and
smaller is faster)

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <148>
Machine Language
• Binary representation of instructions
• Computers only understand 1’s and 0’s
• 32-bit instructions
– Regularity supports design simplicity: 32-bit data & instructions
• 3 instruction formats:
– Data-processing
– Memory
– Branch

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <149>
Instruction Formats
• Data-processing
• Memory
• Branch

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <150>
Data-processing Instruction Format
• Operands:
– Rn: first source register
– Src2: second source – register or immediate
– Rd: destination register
• Control fields:
– cond: specifies conditional execution
– op: the operation code or opcode
– funct: the function/operation to perform

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <151>
Data-processing Control Fields
• op = 00₂ for data-processing (DP) instructions
• funct is composed of cmd, I-bit, and S-bit
▪ cmd: specifies the specific data-processing instruction. For
example,
▪ cmd = 0100₂ for ADD
▪ cmd = 0010₂ for SUB
▪ I-bit
▪ I = 0: Src2 is a register
▪ I = 1: Src2 is an immediate
▪ S-bit: 1 if sets condition flags
▪ S = 0: SUB R0, R5, R7
▪ S = 1: ADDS R8, R2, R4 or CMP R3, #10

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <153>
Data-processing Src2 Variations
• Src2 can be:
▪ Immediate
▪ Register
▪ Register-shifted register

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <154>
Immediate Src2
• Immediate encoded as:
▪ imm8: 8-bit unsigned immediate
▪ rot: 4-bit rotation value
• 32-bit constant is: imm8 ROR (rot × 2)
• ROR by X = ROL by (32-X)
  Ex: ROR by 30 = ROL by 2
• Example: imm8 = abcdefgh
rot    32-bit constant
0000   0000 0000 0000 0000 0000 0000 abcd efgh
0001   gh00 0000 0000 0000 0000 0000 00ab cdef
…      …
1111   0000 0000 0000 0000 0000 00ab cdef gh00

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <158>
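The rule "32-bit constant = imm8 ROR (rot × 2)" can be checked with a short C sketch. The function names ror32, decode_imm, and encode_imm are assumptions for this illustration. encode_imm simply tries all 16 rotations and reports the first one that fits, which is one possible search strategy, not necessarily what an assembler does.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n taken modulo 32). */
uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Expand the (imm8, rot) pair into the 32-bit constant. */
uint32_t decode_imm(uint32_t imm8, uint32_t rot)
{
    return ror32(imm8 & 0xFF, 2 * (rot & 0xF));
}

/* Search for an (imm8, rot) pair that produces value.
   Returns 1 and fills the outputs on success, 0 if value is not encodable. */
int encode_imm(uint32_t value, uint32_t *imm8, uint32_t *rot)
{
    for (uint32_t r = 0; r < 16; r++) {
        /* ROL by 2r (that is, ROR by 32 - 2r) undoes a ROR by 2r. */
        uint32_t candidate = ror32(value, (32 - 2 * r) & 31);
        if (candidate <= 0xFF) {
            *imm8 = candidate;
            *rot = r;
            return 1;
        }
    }
    return 0;
}
```

A constant such as 0x101 has set bits that do not fit in any 8-bit window, so encode_imm reports that it cannot be encoded as an immediate.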
DP Instruction with Immediate Src2
ADD R0, R1, #42
• cond = 1110₂ (14) for unconditional execution
• op = 00₂ (0) for data-processing instructions
• cmd = 0100₂ (4) for ADD
• Src2 is an immediate so I = 1
• Rd = 0, Rn = 1
• imm8 = 42, rot = 0

0xE281002A
Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <161>
DP Instruction with Immediate Src2
SUB R2, R3, #0xFF0
• cond = 1110₂ (14) for unconditional execution
• op = 00₂ (0) for data-processing instructions
• cmd = 0010₂ (2) for SUB
• Src2 is an immediate so I = 1
• Rd = 2, Rn = 3
• imm8 = 0xFF
• imm8 must be rotated right by 28 to produce 0xFF0, so rot = 14
  (ROR by 28 = ROL by (32-28) = 4)

0xE2432EFF
Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <165>
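The two immediate examples can be reproduced by packing the fields into a 32-bit word. A minimal sketch: encode_dp_imm is a hypothetical helper, and it assumes the standard field positions (cond 31:28, op 27:26, I 25, cmd 24:21, S 20, Rn 19:16, Rd 15:12, rot 11:8, imm8 7:0).

```c
#include <stdint.h>

/* Pack a data-processing instruction with an immediate Src2.
   op is fixed at 00 and I is fixed at 1 for this form. */
uint32_t encode_dp_imm(uint32_t cond, uint32_t cmd, uint32_t s,
                       uint32_t rn, uint32_t rd,
                       uint32_t rot, uint32_t imm8)
{
    return ((cond & 0xF) << 28)
         | (0u << 26)               /* op = 00: data-processing */
         | (1u << 25)               /* I = 1: Src2 is an immediate */
         | ((cmd & 0xF) << 21)
         | ((s & 0x1) << 20)
         | ((rn & 0xF) << 16)
         | ((rd & 0xF) << 12)
         | ((rot & 0xF) << 8)
         | (imm8 & 0xFF);
}
```

With the field values listed on the slides, encode_dp_imm(14, 4, 0, 1, 0, 0, 42) gives the word for ADD R0, R1, #42 and encode_dp_imm(14, 2, 0, 3, 2, 14, 0xFF) gives the word for SUB R2, R3, #0xFF0.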
DP Instruction with Register Src2
• Src2 can be:
▪ Immediate
▪ Register
▪ Register-shifted register

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <166>
DP Instruction with Register Src2
• Rm: the second source operand
• shamt5: the amount Rm is shifted
• sh: the type of shift (i.e., >>, <<, >>>, ROR)

First, consider unshifted versions of Rm (shamt5=0, sh=0)

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <168>
DP Instruction with Register Src2
ADD R5, R6, R7
• cond = 1110₂ (14) for unconditional execution
• op = 00₂ (0) for data-processing instructions
• cmd = 0100₂ (4) for ADD
• Src2 is a register so I=0
• Rd = 5, Rn = 6, Rm = 7
• shamt5 = 0, sh = 0

0xE0865007
Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <171>
DP Instruction with Register Src2
• Rm: the second source operand
• shamt5: the amount Rm is shifted
• sh: the type of shift

Shift Type   sh
LSL          00₂
LSR          01₂
ASR          10₂
ROR          11₂

Now, consider shifted versions.

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <172>
DP Instruction with Register Src2
ORR R9, R5, R3, LSR #2
• Operation: R9 = R5 OR (R3 >> 2)
• cond = 1110₂ (14) for unconditional execution
• op = 00₂ (0) for data-processing instructions
• cmd = 1100₂ (12) for ORR
• Src2 is a register so I=0
• Rd = 9, Rn = 5, Rm = 3
• shamt5 = 2, sh = 01₂ (LSR)

1110 00 0 1100 0 0101 1001 00010 01 0 0011


0xE1859123

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <173>
DP with Register-shifted Reg. Src2
• Src2 can be:
▪ Immediate
▪ Register
▪ Register-shifted register

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <174>
DP with Register-shifted Reg. Src2
EOR R8, R9, R10, ROR R12
• Operation: R8 = R9 XOR (R10 ROR R12)
• cond = 1110₂ (14) for unconditional execution
• op = 00₂ (0) for data-processing instructions
• cmd = 0001₂ (1) for EOR
• Src2 is a register so I=0
• Rd = 8, Rn = 9, Rm = 10, Rs = 12
• sh = 11₂ (ROR)

1110 00 0 0001 0 1001 1000 1100 0 11 1 1010


0xE0298C7A

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <175>
Shift Instructions Encoding

Shift Type   sh
LSL          00₂
LSR          01₂
ASR          10₂
ROR          11₂

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <176>
Shift Instructions: Immediate shamt
ROR R1, R2, #23
• Operation: R1 = R2 ROR 23
• cond = 1110₂ (14) for unconditional execution
• op = 00₂ (0) for data-processing instructions
• cmd = 1101₂ (13) for all shifts (LSL, LSR, ASR, and ROR)
• Src2 is an immediate-shifted register so I=0
• Rd = 1, Rn = 0, Rm = 2
• shamt5 = 23, sh = 11₂ (ROR)
• Uses (immediate-shifted) register Src2 encoding

1110 00 0 1101 0 0000 0001 10111 11 0 0010

0xE1A01BE2

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <178>
Shift Instructions: Register shamt
ASR R5, R6, R10
• Operation: R5 = R6 >>> R10[7:0]
• cond = 1110₂ (14) for unconditional execution
• op = 00₂ (0) for data-processing instructions
• cmd = 1101₂ (13) for all shifts (LSL, LSR, ASR, and ROR)
• Src2 is a register so I=0
• Rd = 5, Rn = 0, Rm = 6, Rs = 10
• sh = 10₂ (ASR)
• Uses register-shifted register Src2 encoding

1110 00 0 1101 0 0000 0101 1010 0 10 1 0110

0xE1A05A56

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <180>
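The register and register-shifted register forms differ only in bits 11:4 of Src2, which a pair of C helpers makes concrete. The names encode_dp_reg and encode_dp_rsr are hypothetical; the bit positions follow the field layouts used in the worked examples (shamt5 11:7, sh 6:5, bit 4 = 0 for an immediate shift amount; Rs 11:8, bit 7 = 0, sh 6:5, bit 4 = 1 for a register shift amount).

```c
#include <stdint.h>

/* Fields shared by both register forms: cond, op = 00, I = 0, cmd, S, Rn, Rd. */
static uint32_t dp_common(uint32_t cond, uint32_t cmd, uint32_t s,
                          uint32_t rn, uint32_t rd)
{
    return ((cond & 0xF) << 28) | ((cmd & 0xF) << 21) | ((s & 0x1) << 20)
         | ((rn & 0xF) << 16) | ((rd & 0xF) << 12);
}

/* Src2 = Rm shifted by the constant shamt5 (bit 4 = 0). */
uint32_t encode_dp_reg(uint32_t cond, uint32_t cmd, uint32_t s,
                       uint32_t rn, uint32_t rd,
                       uint32_t shamt5, uint32_t sh, uint32_t rm)
{
    return dp_common(cond, cmd, s, rn, rd)
         | ((shamt5 & 0x1F) << 7) | ((sh & 0x3) << 5) | (rm & 0xF);
}

/* Src2 = Rm shifted by the amount held in Rs (bit 4 = 1). */
uint32_t encode_dp_rsr(uint32_t cond, uint32_t cmd, uint32_t s,
                       uint32_t rn, uint32_t rd,
                       uint32_t rs, uint32_t sh, uint32_t rm)
{
    return dp_common(cond, cmd, s, rn, rd)
         | ((rs & 0xF) << 8) | ((sh & 0x3) << 5) | (1u << 4) | (rm & 0xF);
}
```

Shift instructions such as ROR and ASR reuse these same two forms with cmd = 13 and Rn = 0.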
Review: Data-processing Format
• Src2 can be:
▪ Immediate
▪ Register
▪ Register-shifted register

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <181>
Memory Instruction Format
Encodes: LDR, STR, LDRB, STRB
• op = 01₂
• Rn = base register
• Rd = destination (load), source (store)
• Src2 = offset
• funct = 6 control bits

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <183>
Offset Options
Recall: Address = Base Address + Offset
Example: LDR R1, [R2, #4]
Base Address = R2, Offset = 4
Address = (R2 + 4)
• Base address always in a register
• The offset can be:
▪ an immediate
▪ a register
▪ or a scaled (shifted) register

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <184>
Offset Examples
ARM Assembly                    Memory Address
LDR R0, [R3, #4]                R3 + 4
LDR R0, [R5, #-16]              R5 - 16
LDR R1, [R6, R7]                R6 + R7
LDR R2, [R8, -R9]               R8 - R9
LDR R3, [R10, R11, LSL #2]      R10 + (R11 << 2)
LDR R4, [R1, -R12, ASR #4]      R1 - (R12 >>> 4)
LDR R0, [R9]                    R9

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <185>
Indexing Modes
Mode        Address                   Base Reg. Update
Offset      Base register ± Offset    No change
Preindex    Base register ± Offset    Base register ± Offset
Postindex   Base register             Base register ± Offset

Examples
• Offset:    LDR R1, [R2, #4]    ; R1 = mem[R2+4]
• Preindex:  LDR R3, [R5, #16]!  ; R3 = mem[R5+16]
                                 ; R5 = R5 + 16
• Postindex: LDR R8, [R1], #8    ; R8 = mem[R1]
                                 ; R1 = R1 + 8

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <187>
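The three indexing modes differ only in which address is used and whether the base register is updated. A small C model follows; the IndexMode type and the function effective_address are hypothetical names chosen for this sketch.

```c
#include <stdint.h>

typedef enum { MODE_OFFSET, MODE_PREINDEX, MODE_POSTINDEX } IndexMode;

/* Return the address accessed and update *base as the mode requires.
   A negative offset models the "minus" variants. */
uint32_t effective_address(IndexMode mode, uint32_t *base, int32_t offset)
{
    uint32_t sum = *base + (uint32_t)offset;
    uint32_t address = sum;

    switch (mode) {
    case MODE_OFFSET:              /* address = base + offset, base unchanged */
        break;
    case MODE_PREINDEX:            /* address = base + offset, base updated */
        *base = sum;
        break;
    case MODE_POSTINDEX:           /* address = base, then base updated */
        address = *base;
        *base = sum;
        break;
    }
    return address;
}
```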
Memory Instruction Format
• funct:
▪ I: Immediate bar (inverted: I = 0 selects an immediate offset, I = 1 a register offset)
▪ P: Preindex
▪ U: Add
▪ B: Byte
▪ W: Writeback
▪ L: Load

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <188>
Memory Format funct Encodings
Type of Operation
L   B   Instruction
0   0   STR
0   1   STRB
1   0   LDR
1   1   LDRB

Indexing Mode
P   W   Indexing Mode
0   1   Not supported
0   0   Postindex
1   0   Offset
1   1   Preindex

Add/Subtract and Immediate/Register Offset
Value   I                          U
0       Immediate offset in Src2   Subtract offset from base
1       Register offset in Src2    Add offset to base

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <191>
Memory Instruction Format
Encodes: LDR, STR, LDRB, STRB
• op = 01₂
• Rn = base register
• Rd = destination (load), source (store)
• Src2 = offset: immediate or register (optionally shifted)
• funct = I (immediate bar), P (preindex), U (add),
B (byte), W (writeback), L (load)

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <192>
Memory Instr. with Immediate Src2
STR R11, [R5], #-26
• Operation: mem[R5] <= R11; R5 = R5 - 26
• cond = 1110₂ (14) for unconditional execution
• op = 01₂ (1) for memory instruction
• funct = 000000₂ (0)
I = 0 (immediate offset), P = 0 (postindex),
U = 0 (subtract), B = 0 (store word), W = 0 (postindex),
L = 0 (store)
• Rd = 11, Rn = 5, imm12 = 26

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <194>
Memory Instr. with Register Src2
LDR R3, [R4, R5]
• Operation: R3 <= mem[R4 + R5]
• cond = 1110₂ (14) for unconditional execution
• op = 01₂ (1) for memory instruction
• funct = 111001₂ (57)
I = 1 (register offset), P = 1 (offset indexing),
U = 1 (add), B = 0 (load word), W = 0 (offset indexing),
L = 1 (load)
• Rd = 3, Rn = 4, Rm = 5 (shamt5 = 0, sh = 0)
1110 01 111001 0100 0011 00000 00 0 0101 = 0xE7943005

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <195>
Memory Instr. with Scaled Reg. Src2
STR R9, [R1, R3, LSL #2]
• Operation: mem[R1 + (R3 << 2)] <= R9
• cond = 1110₂ (14) for unconditional execution
• op = 01₂ (1) for memory instruction
• funct = 111000₂ (56)
I = 1 (register offset), P = 1 (offset indexing),
U = 1 (add), B = 0 (store word), W = 0 (offset indexing),
L = 0 (store)
• Rd = 9, Rn = 1, Rm = 3, shamt5 = 2, sh = 00₂ (LSL)
1110 01 111000 0001 1001 00010 00 0 0011 = 0xE7819103

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <196>
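The memory-format examples can be reproduced with a field-packing sketch in C. The helper names encode_mem and reg_offset are hypothetical. The funct bits are placed in the order I, P, U, B, W, L from bit 25 down to bit 20, matching the funct values in the examples. The word computed for STR R11, [R5], #-26 is derived from the listed field values and is not a value quoted on the slides.

```c
#include <stdint.h>

/* Build a register-offset Src2: Rm shifted by shamt5 with shift type sh. */
uint32_t reg_offset(uint32_t shamt5, uint32_t sh, uint32_t rm)
{
    return ((shamt5 & 0x1F) << 7) | ((sh & 0x3) << 5) | (rm & 0xF);
}

/* Pack a memory instruction. src2 is either imm12 (when i_bar = 0)
   or the result of reg_offset() (when i_bar = 1). */
uint32_t encode_mem(uint32_t cond,
                    uint32_t i_bar, uint32_t p, uint32_t u,
                    uint32_t b, uint32_t w, uint32_t l,
                    uint32_t rn, uint32_t rd, uint32_t src2)
{
    return ((cond & 0xF) << 28)
         | (1u << 26)                    /* op = 01: memory */
         | ((i_bar & 1) << 25) | ((p & 1) << 24) | ((u & 1) << 23)
         | ((b & 1) << 22) | ((w & 1) << 21) | ((l & 1) << 20)
         | ((rn & 0xF) << 16) | ((rd & 0xF) << 12)
         | (src2 & 0xFFF);
}
```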
Branch Instruction Format
Encodes B and BL
• op = 10₂
• imm24: 24-bit immediate
• funct = 1L₂: L = 1 for BL, L = 0 for B

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <199>
Encoding Branch Target Address
• Branch Target Address (BTA): Next PC when
branch taken
• BTA is relative to current PC + 8
• imm24 encodes BTA
• imm24 = # of words BTA is away from PC+8

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <200>
Branch Instruction: Example 1
ARM assembly code
0xA0        BLT THERE          <- PC
0xA4        ADD R0, R1, R2
0xA8        SUB R0, R0, R9     <- PC+8
0xAC        ADD SP, SP, #8
0xB0        MOV PC, LR
0xB4 THERE  SUB R0, R0, #1     <- BTA
0xB8        BL TEST

• PC = 0xA0
• PC + 8 = 0xA8
• THERE label is 3 instructions past PC+8
• So, imm24 = 3

0xBA000003

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <203>
Branch Instruction: Example 2
ARM assembly code
0x8040 TEST  LDRB R5, [R0, R3]   <- BTA
0x8044       STRB R5, [R1, R3]
0x8048       ADD R3, R3, #1
0x804C       MOV PC, LR
0x8050       BL TEST             <- PC
0x8054       LDR R3, [R1], #4
0x8058       SUB R4, R3, #9      <- PC+8

• PC = 0x8050
• PC + 8 = 0x8058
• TEST label is 6 instructions before PC+8
• So, imm24 = -6

0xEBFFFFFA

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <206>
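Both branch examples follow from one calculation: imm24 is the signed word distance from PC+8 to the branch target, truncated to 24 bits. A C sketch, with encode_branch as a hypothetical helper name:

```c
#include <stdint.h>

/* Encode B (link = 0) or BL (link = 1) located at address pc
   that branches to address target. */
uint32_t encode_branch(uint32_t cond, uint32_t link,
                       uint32_t pc, uint32_t target)
{
    /* Distance from PC+8 to the target, in bytes, then in words. */
    int64_t diff_bytes = (int64_t)target - ((int64_t)pc + 8);
    int32_t words = (int32_t)(diff_bytes / 4);

    /* Keep the low 24 bits of the two's complement word offset. */
    uint32_t imm24 = (uint32_t)words & 0x00FFFFFFu;

    return ((cond & 0xF) << 28)
         | (2u << 26)               /* op = 10: branch */
         | (1u << 25)               /* funct = 1L: upper bit is always 1 */
         | ((link & 1) << 24)       /* L = 1 for BL, 0 for B */
         | imm24;
}
```

A backward branch produces a negative word count, which is why the second example ends in 0xFFFFFA.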
Review: Instruction Formats

Branch

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 6 <207>
Conditional Execution
The condition is encoded in the cond field (bits 31:28) of the machine instruction.
For example,
ANDEQ R1, R2, R3     (cond = 0000)
ORRMI R4, R5, #0xF   (cond = 0100)
SUBLT R9, R3, R8     (cond = 1011)

Review: Condition Mnemonics
cond  Mnemonic      Name                                  CondEx
0000  EQ            Equal                                 $Z$
0001  NE            Not equal                             $\overline{Z}$
0010  CS / HS       Carry set / Unsigned higher or same   $C$
0011  CC / LO       Carry clear / Unsigned lower          $\overline{C}$
0100  MI            Minus / Negative                      $N$
0101  PL            Plus / Positive or zero               $\overline{N}$
0110  VS            Overflow / Overflow set               $V$
0111  VC            No overflow / Overflow clear          $\overline{V}$
1000  HI            Unsigned higher                       $\overline{Z}C$
1001  LS            Unsigned lower or same                $Z$ OR $\overline{C}$
1010  GE            Signed greater than or equal          $\overline{N \oplus V}$
1011  LT            Signed less than                      $N \oplus V$
1100  GT            Signed greater than                   $\overline{Z}\,\overline{(N \oplus V)}$
1101  LE            Signed less than or equal             $Z$ OR $(N \oplus V)$
1110  AL (or none)  Always / unconditional                ignored

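The CondEx column can be read as a Boolean function of the N, Z, C, and V flags. The following Python sketch is an illustration only, not something from the slides; `cond_passes` is a hypothetical name, and flags are passed as plain booleans.

```python
def cond_passes(cond, n, z, c, v):
    """Return True if an instruction with this 4-bit cond executes for flags N, Z, C, V."""
    table = {
        0b0000: z,                       # EQ
        0b0001: not z,                   # NE
        0b0010: c,                       # CS / HS
        0b0011: not c,                   # CC / LO
        0b0100: n,                       # MI
        0b0101: not n,                   # PL
        0b0110: v,                       # VS
        0b0111: not v,                   # VC
        0b1000: (not z) and c,           # HI
        0b1001: z or (not c),            # LS
        0b1010: n == v,                  # GE: NOT (N xor V)
        0b1011: n != v,                  # LT: N xor V
        0b1100: (not z) and (n == v),    # GT
        0b1101: z or (n != v),           # LE
        0b1110: True,                    # AL: flags ignored
    }
    return bool(table[cond])

# Flags after a subtraction such as 2 - 5: negative result, borrow (C = 0), no overflow.
flags = dict(n=True, z=False, c=False, v=False)
print(cond_passes(0b1011, **flags))   # LT
print(cond_passes(0b1010, **flags))   # GE
```
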
Conditional Execution: Machine Code

Field Values
Assembly Code       cond   op     I   cmd    S   rn     rd     shamt5  sh   bit 4  rm
                    31:28  27:26  25  24:21  20  19:16  15:12  11:7    6:5  4      3:0
SUBS  R1, R2, R3    14     0      0   2      1   2      1      0       0    0      3
ADDEQ R4, R5, R6    0      0      0   4      0   5      4      0       0    0      6
ANDHS R7, R5, R6    2      0      0   0      0   5      7      0       0    0      6
ORRMI R8, R5, R6    4      0      0   12     0   5      8      0       0    0      6
EORLT R9, R5, R6    11     0      0   1      0   5      9      0       0    0      6

Machine Code
cond   op     I   cmd    S   rn     rd     shamt5  sh   bit 4  rm
31:28  27:26  25  24:21  20  19:16  15:12  11:7    6:5  4      3:0
1110   00     0   0010   1   0010   0001   00000   00   0      0011   (0xE0521003)
0000   00     0   0100   0   0101   0100   00000   00   0      0110   (0x00854006)
0010   00     0   0000   0   0101   0111   00000   00   0      0110   (0x20057006)
0100   00     0   1100   0   0101   1000   00000   00   0      0110   (0x41858006)
1011   00     0   0001   0   0101   1001   00000   00   0      0110   (0xB0259006)
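
Packing the field values into a 32-bit word is a matter of shifting each field to its bit position. The Python sketch below is an illustration, not the authors' tooling; `encode_dp_reg` is a hypothetical helper that covers only the register form of Src2 (I = 0, bit 4 = 0) shown in the table.

```python
def encode_dp_reg(cond, cmd, s, rn, rd, rm, shamt5=0, sh=0):
    """Pack data-processing fields (register Src2, I = 0, bit 4 = 0) into a 32-bit word."""
    return ((cond << 28) | (0b00 << 26) | (0 << 25) | (cmd << 21) | (s << 20)
            | (rn << 16) | (rd << 12) | (shamt5 << 7) | (sh << 5) | (0 << 4) | rm)

# (assembly, cond, cmd, S, rn, rd, rm) taken from the field-value rows
rows = [
    ("SUBS  R1, R2, R3", 14, 2, 1, 2, 1, 3),
    ("ADDEQ R4, R5, R6", 0, 4, 0, 5, 4, 6),
    ("ANDHS R7, R5, R6", 2, 0, 0, 5, 7, 6),
    ("ORRMI R8, R5, R6", 4, 12, 0, 5, 8, 6),
    ("EORLT R9, R5, R6", 11, 1, 0, 5, 9, 6),
]
for asm, cond, cmd, s, rn, rd, rm in rows:
    print(f"{asm}  {encode_dp_reg(cond, cmd, s, rn, rd, rm):#010X}")
```

Only the cond field and the S bit distinguish a conditional or flag-setting instruction from its plain counterpart; the remaining fields are unchanged.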

Interpreting Machine Code
• Start with op: tells how to parse the rest
    op = 00 (Data-processing)
    op = 01 (Memory)
    op = 10 (Branch)
• I-bit: tells how to parse Src2
• Data-processing instructions:
    If I-bit is 1, Src2 is an immediate (rot and imm8)
    If I-bit is 0, bit 4 determines whether Src2 is a register, optionally shifted by shamt5 (bit 4 = 0), or a register-shifted register (bit 4 = 1)
• Memory instructions:
    Examine funct bits for indexing mode, instruction, and add or subtract offset

Interpreting Machine Code: Example 1
0xE0475001
• Start with op: 00₂, so data-processing instruction
• I-bit: 0, so Src2 is a register
• bit 4: 0, so Src2 is a register (optionally shifted by shamt5)
• cmd: 0010₂ (2), so SUB
• Rn = 7, Rd = 5, Rm = 1, shamt5 = 0, sh = 0
• So, instruction is: SUB R5, R7, R1
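
The same field extraction can be written as shifts and masks. This Python sketch is illustrative only: `decode_dp_register` and `CMD_NAMES` are hypothetical names, the command table is a small subset, and the cond and S fields are ignored. It handles just the case walked through above (op = 00, I = 0, bit 4 = 0, no shift).

```python
# Subset of data-processing cmd codes (illustrative, not a full table).
CMD_NAMES = {0b0000: "AND", 0b0001: "EOR", 0b0010: "SUB", 0b0100: "ADD", 0b1100: "ORR"}

def decode_dp_register(word):
    """Decode a data-processing word whose Src2 is an unshifted register."""
    op = (word >> 26) & 0x3
    i_bit = (word >> 25) & 0x1
    bit4 = (word >> 4) & 0x1
    if op != 0b00 or i_bit != 0 or bit4 != 0:
        raise ValueError("not a data-processing instruction with a register Src2")
    cmd = (word >> 21) & 0xF
    rn = (word >> 16) & 0xF
    rd = (word >> 12) & 0xF
    shamt5 = (word >> 7) & 0x1F
    sh = (word >> 5) & 0x3
    rm = word & 0xF
    if shamt5 != 0 or sh != 0:
        raise ValueError("shifted Src2 is not handled in this sketch")
    return f"{CMD_NAMES[cmd]} R{rd}, R{rn}, R{rm}"

print(decode_dp_register(0xE0475001))
```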

Interpreting Machine Code: Example 2
0xE5949010
• Start with op: 01₂, so memory instruction
• funct: B = 0, L = 1, so LDR; P = 1, W = 0, so offset indexing; I = 0, so immediate offset; U = 1, so add offset
• Rn = 4, Rd = 9, imm12 = 16
• So, instruction is: LDR R9, [R4, #16]
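
A memory instruction can be decoded the same way, reading the six funct bits one at a time. The Python sketch below is an illustration under stated assumptions: `decode_memory` is a hypothetical name, the cond field is ignored, and only the immediate-offset, offset-indexing case from this example is handled.

```python
def decode_memory(word):
    """Decode a memory word with an immediate offset and offset indexing (P=1, W=0)."""
    if (word >> 26) & 0x3 != 0b01:
        raise ValueError("op is not 01 (memory)")
    i_bar = (word >> 25) & 1   # 0 means immediate offset in the memory format
    p = (word >> 24) & 1
    u = (word >> 23) & 1       # 1 = add offset, 0 = subtract offset
    b = (word >> 22) & 1       # 1 = byte access
    w = (word >> 21) & 1
    l = (word >> 20) & 1       # 1 = load, 0 = store
    if i_bar != 0 or p != 1 or w != 0:
        raise ValueError("only immediate offset with offset indexing is handled")
    rn = (word >> 16) & 0xF
    rd = (word >> 12) & 0xF
    imm12 = word & 0xFFF
    mnemonic = ("LDR" if l else "STR") + ("B" if b else "")
    sign = "" if u else "-"
    return f"{mnemonic} R{rd}, [R{rn}, #{sign}{imm12}]"

print(decode_memory(0xE5949010))
```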

Addressing Modes
How do we address operands?
• Register
• Immediate
• Base
• PC-Relative

Register Addressing
• Source and destination operands found in
registers
• Used by data-processing instructions
• Three submodes:
– Register-only
– Immediate-shifted register
– Register-shifted register

Register Addressing Examples
• Register-only
Example: ADD R0, R2, R7
• Immediate-shifted register
Example: ORR R5, R1, R3, LSL #1
• Register-shifted register
Example: SUB R12, R9, R0, ASR R1
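
The three register submodes differ only in how Src2 is formed before the operation. The Python sketch below models that with made-up register values; `lsl` and `asr` are hypothetical helper names, and the register-shifted case assumes the shift amount comes from the low byte of the shift register.

```python
MASK32 = 0xFFFFFFFF

def lsl(value, amount):
    """Logical shift left, truncated to 32 bits."""
    return (value << amount) & MASK32

def asr(value, amount):
    """Arithmetic shift right of a 32-bit value (sign bit is replicated)."""
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> amount) & MASK32

# Made-up register contents for illustration.
regs = {"R0": 0xFFFFFF00, "R1": 4, "R2": 10, "R3": 3, "R7": 5, "R9": 100}

# Register-only: ADD R0, R2, R7
add_result = (regs["R2"] + regs["R7"]) & MASK32
# Immediate-shifted register: ORR R5, R1, R3, LSL #1
orr_result = regs["R1"] | lsl(regs["R3"], 1)
# Register-shifted register: SUB R12, R9, R0, ASR R1
sub_result = (regs["R9"] - asr(regs["R0"], regs["R1"] & 0xFF)) & MASK32

print(add_result, orr_result, sub_result)
```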

Immediate Addressing
• Operands found in registers and immediates
Example: ADD R9, R1, #14
• Uses data-processing format with I=1
– Immediate is encoded as
    • 8-bit immediate (imm8)
    • 4-bit rotation (rot)
– 32-bit immediate = imm8 ROR (rot × 2)
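
The rotation scheme is easiest to see by expanding and searching encodings directly. The Python sketch below is illustrative; `ror32`, `decode_imm`, and `encode_imm` are hypothetical names, and `encode_imm` simply returns the first rotation that fits, which is an assumption about tie-breaking rather than a rule stated in the slides.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def decode_imm(src2):
    """Expand a 12-bit Src2 field (rot in bits 11:8, imm8 in bits 7:0)."""
    rot = (src2 >> 8) & 0xF
    imm8 = src2 & 0xFF
    return ror32(imm8, rot * 2)

def encode_imm(value):
    """Return a 12-bit rot:imm8 field for value, or None if it cannot be encoded."""
    for rot in range(16):
        imm8 = ror32(value, 32 - 2 * rot)   # rotate left by 2*rot undoes the ROR
        if imm8 < 256:
            return (rot << 8) | imm8
    return None

print(decode_imm(0x00E))                 # the #14 in ADD R9, R1, #14
print(hex(decode_imm(0x4FF)))
print(encode_imm(0x101))
```

Not every 32-bit constant is representable: the value must fit in 8 contiguous bits after some even rotation, so a constant such as 0x101 has no encoding.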

Base Addressing
• Address of operand is:
base register + offset
• Offset can be a:
– 12-bit Immediate
– Register
– Immediate-shifted Register

Base Addressing Examples
• Immediate offset
Example: LDR R0, [R8, #-11]
(R0 = mem[R8 - 11])
• Register offset
Example: LDR R1, [R7, R9]
(R1 = mem[R7 + R9])
• Immediate-shifted register offset
Example: STR R5, [R3, R2, LSL #4]
(mem[R3 + (R2 << 4)] = R5)
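
The address arithmetic for the three offset forms can be sketched in Python. Everything here is illustrative: `effective_address` is a hypothetical helper, the register and memory contents are made up, and memory is modeled as a dictionary keyed by address rather than as byte-addressable storage.

```python
MASK32 = 0xFFFFFFFF

def effective_address(base, offset=0, shift=0):
    """Return base + (offset << shift), wrapped to 32 bits."""
    return (base + (offset << shift)) & MASK32

# Made-up register and memory contents for illustration.
regs = {"R2": 3, "R3": 0x2000, "R5": 0xABCD, "R7": 0x1000, "R8": 0x100, "R9": 0x20}
mem = {0xF5: 111, 0x1020: 222}

# Immediate offset: LDR R0, [R8, #-11] loads from R8 - 11
regs["R0"] = mem[effective_address(regs["R8"], -11)]
# Register offset: LDR R1, [R7, R9] loads from R7 + R9
regs["R1"] = mem[effective_address(regs["R7"], regs["R9"])]
# Immediate-shifted register offset: STR R5, [R3, R2, LSL #4] stores R5 to memory
mem[effective_address(regs["R3"], regs["R2"], 4)] = regs["R5"]

print(regs["R0"], regs["R1"], hex(mem[0x2030]))
```

Loads write a register from memory, while the store writes memory from the register; the address calculation is identical in both directions.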

PC-Relative Addressing
• Used for branches
• Branch instruction format:
– Operands are PC and a signed 24-bit immediate (imm24)
– Changes the PC
– New PC is relative to the old PC
– imm24 indicates the number of words away from PC+8
• PC = (PC+8) + (SignExtended(imm24) × 4)
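
The branch target formula can be evaluated directly. This Python sketch is an illustration with hypothetical function names (`sign_extend_24`, `branch_target`); it applies the formula above to a branch at address `pc` with a raw 24-bit immediate field.

```python
def sign_extend_24(imm24):
    """Interpret a 24-bit field as a signed two's complement integer."""
    return imm24 - (1 << 24) if imm24 & 0x800000 else imm24

def branch_target(pc, imm24):
    """New PC = (PC + 8) + SignExtended(imm24) * 4, wrapped to 32 bits."""
    return ((pc + 8) + sign_extend_24(imm24) * 4) & 0xFFFFFFFF

print(hex(branch_target(0xA0, 0x000003)))     # forward branch, imm24 = 3
print(hex(branch_target(0x8050, 0xFFFFFA)))   # backward branch, imm24 = -6
```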

Power of the Stored Program
• 32-bit instructions & data stored in memory
• Sequence of instructions: only difference
between two applications
• To run a new program:
– No rewiring required
– Simply store new program in memory
• Program Execution:
– Processor fetches (reads) instructions from memory
in sequence
– Processor performs the specified operation

The Stored Program
Assembly Code      Machine Code
MOV R1, #100       0xE3A01064
MOV R2, #69        0xE3A02045
ADD R3, R1, R2     0xE0813002
STR R3, [R1]       0xE5813000

Stored Program (Main Memory)
Address     Instructions
0000000C    E5813000
00000008    E0813002
00000004    E3A02045
00000000    E3A01064    ← PC
Program Counter (PC): keeps track of current instruction

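The fetch-and-execute cycle can be sketched as a short loop over memory. The Python below is a toy model, not a real ARM simulator: `run` is a hypothetical name, the cond field is ignored, and only MOV, ADD, and STR with an immediate offset are handled. The four words are assembled from the listing using the formats in this chapter (register-form ADD with I = 0, and STR with L = 0).

```python
def run(program):
    """Fetch and execute words from {address: word} until the PC leaves memory."""
    memory = dict(program)          # instructions and data share one memory
    regs = [0] * 16
    pc = 0
    while pc in memory:
        instr = memory[pc]                              # fetch
        op = (instr >> 26) & 0x3
        rn, rd = (instr >> 16) & 0xF, (instr >> 12) & 0xF
        if op == 0b00:                                  # data-processing
            cmd = (instr >> 21) & 0xF
            if (instr >> 25) & 1:
                src2 = instr & 0xFF                     # immediate, rot assumed 0
            else:
                src2 = regs[instr & 0xF]                # register, shift assumed 0
            if cmd == 0b1101:                           # MOV
                regs[rd] = src2
            elif cmd == 0b0100:                         # ADD
                regs[rd] = (regs[rn] + src2) & 0xFFFFFFFF
            else:
                raise NotImplementedError(hex(instr))
        elif op == 0b01 and ((instr >> 20) & 1) == 0:   # STR, immediate offset
            memory[regs[rn] + (instr & 0xFFF)] = regs[rd]
        else:
            raise NotImplementedError(hex(instr))
        pc += 4                                         # advance to next instruction
    return regs, memory

program = {0x0: 0xE3A01064, 0x4: 0xE3A02045, 0x8: 0xE0813002, 0xC: 0xE5813000}
regs, memory = run(program)
print(regs[1], regs[2], regs[3], memory[100])
```

Running a different program requires only a different dictionary of words; the loop itself does not change, which is the point of the stored-program model.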
Up Next
How to implement the ARM Instruction Set
Architecture in Hardware

Microarchitecture
