02 - CH_2_ARM_Processor Architecture(6) (1)

Chapter 2 provides an overview of the ARM Cortex-M processor architecture, focusing on programming languages, microprocessor components, and instruction set architecture. It discusses the differences between machine, assembly, and high-level languages, as well as the ARM Cortex-M series designed for embedded systems. The chapter also covers the features and functionalities of the Cortex-M4 processor, including its architecture, instruction sets, and performance metrics.

Uploaded by

minegadget52

Chapter 2:

ARM Cortex-M Processor


(Architecture)

Spring 2023

1
References
 Tahir, Muhammad, and Kashif Javed. ARM Microprocessor Systems: Cortex-M Architecture, Programming, and Interfacing. CRC Press, 2017, Chapters 1 and 2.
 Cortex-M3 Devices Generic User Guide: https://developer.arm.com/documentation/dui0552/a
 Arm University Program Education Kits

2
Review/ Discussion
 What are examples of traditional
microprocessors?
 Computer performance metrics?
 RISC vs CISC architecture?

3
Topics
 An introduction on:
 Programming languages
 Inside a microprocessor (a simple explanation)
 ARM architecture

4
Programming Languages

 A language with strict grammar rules, symbols, and special words used to construct a computer program.
 Every language has its own set of rules.
 Machine languages
 Assembly languages
 High level languages

5
Machine Languages
 Computers only understand 0 and 1!
 Early computers were programmed in machine
language
 Defined by its hardware architecture

 Example: to calculate wages = rate * hours in machine language:
 100100 010001 //Load
 100110 010010 //Multiply
 100010 010011 //Store
6
Assembly Languages
 A language whose instructions are in the form
of mnemonic codes and variable names.

 Examples of instructions:

Assembly      Machine
ADD x,y       100101     (ADD: opcode; x, y: operands)
LD x,1        100100
SUB x,y       100011

 How does the computer understand assembly language?

7
High-level Languages
 Developed to speed up the programming process
 A programming language that combines algebraic
expressions and English words.
 Instructions written in these languages are known as
statements.
 It’s very close to human language.
 Include Basic, FORTRAN, COBOL, Pascal, C, C++, C#, and
Java
 The equation wages = rate * hours can be written in C++
as:
wages = rate * hours;

 How does the computer understand a high-level language?

8
Compiler/ assembler
 Software that translates a high-level language into machine language is called a compiler.
 Software that translates assembly language into machine language is called an assembler.

[Figure: your code in assembly or high-level language → compiler/assembler → your code in machine language]

9
What is a Program? Why Write a Program?
 A program is a set of instructions that is translated into 0s and 1s so that the machine can understand it.
 The machine performs a set of tasks based on these instructions.
 The purpose of programming is to control the work of the computer on all levels or to solve problems with the computer.
 This is done with the help of "orders" and "commands" from the programmer, also known as programming instructions.
10
ARM Profiles
 Cortex-M (Cortex-M0, M1, M3, and M4 families)
 These are designed for microcontroller-based
embedded systems
 Cortex-A
 Addressing high-performance applications mainly
covering the cellular market
 Cortex-R
 Addressing the demands of real-time applications

 In this course we refer to the Cortex-M profile (mainly Cortex-M3) unless stated otherwise.

11
ARM Microcontroller components
1. Microprocessor core
2. Bus system and bus matrix
3. Memory and peripherals
4. Debug system
5. Nested vectored interrupt controller (NVIC)

[Figure: Cortex-M3/M4 microcontroller block diagram showing the microprocessor core, NVIC, debug interface, bus, peripherals, I/Os, and memory]

12
What’s inside of a microprocessor?

13
What’s inside of a microprocessor?

ALU
Arithmetic and Logical Unit

14
What’s inside of a microprocessor?

ALU

 Where do the inputs come from?
 Where does the output go?

15
What’s inside of a microprocessor?

16
How is data transferred?
 Register access is faster than memory access
 Higher performance

17
The Processor Datapath
 Datapath: Elements that
process the data and
move and address the
data in the processor
 Operands are stored in the
registers
 Each register has 32 bits
 Data bus is also 32 bits

 Where does the ALU receive its instructions from?

18
The Processor: Instruction Memory
(Harvard architecture)
 Harvard architecture: separate memories for data and instructions
 Von Neumann architecture: one memory for data and instructions (Example: 8085)

[Figure: processor datapath with an instruction memory connected through the instruction bus (Instr bus)]
19
The Processor: Instruction Memory
(Harvard architecture)

[Figure: datapath with instruction memory and instruction bus (Instr bus)]

20
The Processor: Instruction Memory
(Harvard architecture)

[Figure: datapath with instruction memory and instruction bus (Instr bus)]

21
The Processor: Instruction execution cycles
 Step 1: Read the program counter (PC)
 Step 2: Fetch the Instruction
 Step 3: Increment the PC (PC = PC + 4)
 Step 4: Access required registers from register file
 Step 5: Perform the operation in ALU
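
The five steps above can be sketched as one function that performs a single instruction cycle. This is only an illustration under stated assumptions: the 32-bit instruction layout (opcode, rd, rn, rm in four byte-wide fields), the `Cpu` type, and the names `cpu_step`, `OP_ADD`, and `OP_SUB` are hypothetical and do not correspond to any real ARM or Thumb encoding.

```c
#include <stdint.h>

/* Hypothetical toy encoding, NOT a real ARM encoding:
   bits 31..24 opcode, 23..16 rd, 15..8 rn, 7..0 rm. */
enum { OP_ADD = 1, OP_SUB = 2 };

typedef struct {
    uint32_t pc;        /* program counter (byte address) */
    uint32_t regs[16];  /* register file, 16 x 32-bit */
} Cpu;

/* Runs one instruction cycle.
   Returns 0 on success, -1 if the opcode is unknown. */
int cpu_step(Cpu *cpu, const uint32_t *imem)
{
    uint32_t addr  = cpu->pc;          /* Step 1: read the PC            */
    uint32_t instr = imem[addr / 4u];  /* Step 2: fetch the instruction  */
    cpu->pc = addr + 4u;               /* Step 3: PC = PC + 4            */

    uint32_t op = instr >> 24;
    uint32_t rd = (instr >> 16) & 0xFu;
    uint32_t rn = (instr >> 8) & 0xFu;
    uint32_t rm = instr & 0xFu;

    uint32_t a = cpu->regs[rn];        /* Step 4: read source registers  */
    uint32_t b = cpu->regs[rm];

    switch (op) {                      /* Step 5: operate in the ALU     */
    case OP_ADD: cpu->regs[rd] = a + b; return 0;  /* result written to rd */
    case OP_SUB: cpu->regs[rd] = a - b; return 0;
    default:     return -1;
    }
}
```

Calling `cpu_step` repeatedly on a small array of encoded words walks the PC through the program four bytes at a time, which mirrors the fixed 32-bit instruction size assumed in Step 3.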

22
What is an instruction?
 A binary pattern which specifies which task must be performed by the
microprocessor
 All ARM instructions have a fixed bit-width size (32 bits in ARM
state, 16 bits in Thumb state)
 These binary patterns are different for every microprocessor
 The collection of instructions is called instruction set
 Each instruction is made of an opcode and one to three operands.

 Examples of instructions:
 ADD R1, R2
 SUB R1, R2, R3
 Here ADD and SUB are opcodes (instruction mnemonics) and R1-R3 are operands.

23
Instruction Set Architecture (ISA)
An ISA can be classified by where its instructions are allowed to access their operands.
 Memory-Memory: This type of ISA allows more than one
operand of most instructions to be specified in memory.
 Example: PDP series
 Register-Memory: These architectures allow one operand of an
instruction to be specified in memory, while the other operand is in
CPU register.
 Example: x86, Motorola 68k
 Register-Register: Also known as load-store architecture.
 Direct access to the memory is NOT allowed to most of the
instructions in this ISA.
 Only specific instructions, called load and store instructions, are
responsible for any data movement between registers and memory.
 Example: ARM, MIPS, RISC-V

24
ARM Instruction Set Architecture (ISA)
 ARM supports two different instruction sets:
 ARM instructions: 32 bits
 Thumb instructions: 16 bits (A subset of ARM instructions)

 Offers flexibility
 Suitable for applications with limited memory
 Thumb-2 instruction set is an enhancement to 16-bit Thumb instruction set.
 It adds 32-bit instructions that can be freely intermixed with 16-bit instructions in a
program.
 The additional 32-bit instructions enable Thumb-2 to cover the functionality of the ARM instructions. However, 32-bit Thumb instructions are not exact copies of 32-bit ARM instructions.
 Cortex-M3 supports Thumb-2 and Thumb, but does not support ARM instructions.

25
Arm Architectures and Processors
 Arm architecture is a family of RISC-based processor architectures
 Well known for its power efficiency
 Widely used in mobile devices, e.g. smartphones and tablets
 Designed and licensed by Arm to a wide ecosystem of partners

 ARM Holdings
 The company that designs Arm-based processors
 Arm does not manufacture, but it licenses designs to semiconductor
partners, who add their own intellectual property (IP) on top of Arm’s IP,
which they then fabricate and sell to customers
 Arm also offers IP other than processors, such as physical IPs, interconnect
IPs, graphics cores and development tools

26
Arm Processor Families
 Cortex-A series (Application)
 High performance processors capable of full operating system (OS) support
 Applications include smartphones, digital TV, smart books
 Processors: Cortex-A5, A7, A8, A9, A15, A17, A32, A34, A35, A53, A55, A57, A65, A65AE, A72, A73, A75, A75AE, A76
 Cortex-R series (Real-time)
 High performance and reliability for real-time applications
 Applications include automotive braking systems, powertrains
 Processors: Cortex-R4, R5, R7, R8, R52
 Cortex-M series (Microcontroller)
 Cost sensitive solutions for deterministic microcontroller applications
 Applications include microcontrollers, smart sensors
 Processors: Cortex-M0, M0+, M1, M3, M4, M7, M23, M33, M35P
 SecurCore series for high security applications
 Processors: SC000, SC300
 Earlier classic processors including Arm7, Arm9, Arm11 families

27
How to Design an Arm-based SoC
1. Select a set of IP cores from Arm and/or other
third-party IP vendors
2. Integrate IP cores into a single-chip design
3. Give design to semiconductor foundries for chip
fabrication

[Figure: design flow in three stages. Licensable IPs (IP libraries: Cortex-A9, Cortex-R5, Cortex-M4, Arm7, Arm9, Arm11; DRAM, FLASH, and SRAM controllers; AXI, AHB, and APB buses; GPIO, I/O blocks, timer) → SoC design (Arm processor, ROM, RAM, system bus, peripherals, external interface) → chip manufacture (Arm-based SoC)]

28
Arm Cortex-M Series
[Figure: Cortex-M processors: Cortex-M0, M0+, M1, M3, M4, M7, M23, M33, M35P]
 Energy-efficiency
 Low energy cost, long battery life
 Smaller code
 Lower silicon costs
 Ease of use
 Faster software development and reuse
 Embedded applications
 Smart metering, human interface devices, automotive and industrial control systems, white goods, consumer products and medical instrumentation

29
Arm Cortex-M Series Family
Processor    Arm           Core           Thumb   Thumb-2  Hardware       Hardware  Saturated  DSP         Floating
             Architecture  Architecture                    Multiply       Divide    Math       Extensions  Point
Cortex-M0    Armv6-M       Von Neumann    Most    Subset   1 or 32 cycle  No        No         No          No
Cortex-M0+   Armv6-M       Von Neumann    Most    Subset   1 or 32 cycle  No        No         No          No
Cortex-M3    Armv7-M       Harvard        Entire  Entire   1 cycle        Yes       Yes        No          No
Cortex-M4    Armv7E-M      Harvard        Entire  Entire   1 cycle        Yes       Yes        Yes         Optional
Cortex-M7    Armv7E-M      Harvard        Entire  Entire   1 cycle        Yes       Yes        Yes         Optional

30
Cortex-M4 Processor Overview
 Cortex-M4 processor
 Introduced in 2010
 Designed with a large variety of highly efficient signal processing features
 Features extended single-cycle multiply accumulate instructions, optimized SIMD arithmetic, saturating arithmetic, and an optional floating-point unit
 High performance efficiency
 1.25 DMIPS/MHz (Dhrystone million instructions per second per MHz) with power consumption on the order of µW/MHz
 Low power consumption
 Longer battery life, especially critical in mobile products
 Enhanced determinism
 The critical tasks and interrupt routines can be served quickly in a known number of cycles

31
Cortex-M4 Processor Features
 32-Bit reduced instruction set computing (RISC) processor

 Harvard architecture
 Separated data bus and instruction bus

 Instruction set
 Includes the entire Thumb-1 (16-bit) and Thumb-2 (16/32-bit)
instruction sets

 3-stage + branch speculation pipeline

 Supported interrupts
 Non-maskable interrupt (NMI) + 1 to 240 physical interrupts
 8 to 256 interrupt priority levels
32
Cortex-M4 Processor Features
 Supports sleep modes
 Up to 240 wake-up interrupts
 Integrated wait for interrupt (WFI) and wait for event
(WFE) instructions and sleep on exit capability
 Sleep and deep sleep signals
 Optional retention mode with the Arm Power Management Kit

 Enhanced instructions
 Hardware divide (2-12 cycles)
 Single-cycle 16, 32-bit MAC, single-cycle dual 16-bit
MAC
 8, 16-bit SIMD arithmetic
33
Cortex-M4 Processor Features
 Debug
 Optional JTAG & serial-wire debug (SWD) ports
 Up to eight breakpoints and four watchpoints

 Memory protection unit (MPU)


 Optional eight-region MPU with sub regions and
background regions

34
ARM Cortex M4 Blocks
1. Microprocessor core
2. Bus system and bus matrix
3. Memory and peripherals
4. Debug system
5. Nested vectored interrupt controller (NVIC)

[Figure: Cortex-M3/M4 microcontroller block diagram showing the microprocessor core, NVIC, debug interface, bus, peripherals, I/Os, and memory]

36
Cortex-M4 Block Diagram
 Processor core
 Contains internal registers, the ALU, data path, and some control logic
 Registers include 16x 32-bit registers for both general and special use

[Figure: processor core datapath: address register A[31:0], PC incrementer, register bank (Rd, Rn, Rm, PC), multiplier, barrel shifter, ALU, A bus, B bus, control unit and control signals, instruction pipe, data in and data out registers, instruction memory]

37
Cortex-M4 Block Diagram
 Processor pipeline stages
 Three-stage pipeline: fetch, decode, and execution
 Some instructions may take multiple cycles to execute,
in which case the pipeline will be stalled
 Speculatively prefetches instructions from branch target
addresses
 Up to two instructions can be fetched in one transfer
(16-bit instructions)

38
Cortex-M4 Block Diagram
 Nested vectored interrupt controller (NVIC)
 Up to 240 interrupt request signals and an NMI
 Automatically handles nested interrupts, such as comparing
priorities between interrupt requests and the current priority level

 Wake-up interrupt controller (WIC)


 For low-power applications, the microcontroller can enter sleep
mode by shutting down most of the components
 When an interrupt request is detected, the WIC can inform the
power management unit to power up the system

 Memory protection unit (MPU)
 Used to protect memory content, e.g., by making some memory regions read-only or preventing user applications from accessing privileged application data
39
Cortex-M4 Block Diagram
 Bus interconnect
 Allows data transfer to take place on different buses simultaneously
 Provides data transfer management, e.g. write buffer, bit-oriented
operations (bit-band)
 May include bus bridges (e.g. AHB-to-APB bus bridge) to connect
different buses into a network using a single global memory space
 Includes the internal bus system, the data path in the processor
core, and the AHB LITE interface unit

 Debug subsystem
 Handles debug control, program breakpoints, and data
watchpoints
 When a debug event occurs, it can put the processor core in a
halted state, so developers can analyse the status of the processor,
such as register values and flags, at that point
40
ARM Cortex-M3 registers

41
Cortex-M3 Registers
 R0-R12: general purpose registers
 Low registers (R0-R7) can be accessed by any
instruction
 High registers (R8-R12) sometimes cannot be
accessed, e.g. by some Thumb (16-bit)
instructions

 R13: Stack Pointer (SP)


 Records the current address of the stack
 Used for saving the context of a program
while switching between tasks
 Cortex-M3 has two SPs:
 Main SP, used in applications that require
privileged access e.g. OS kernel
 Process SP, used in base-level application
code (when not running an exception
handler)
42
Cortex-M3 Registers
 R14: Link Register (LR)
 The LR is used to store the return address of a subroutine or a
function call
 The PC will load the value from the LR after a function is finished

43
Cortex-M3 Registers
 xPSR, combined program status register (PSR)
 Provides information about program execution and ALU
flags
 Application PSR (APSR)
 Interrupt PSR (IPSR)
 Execution PSR (EPSR)

44
Cortex-M3 Registers
 APSR
 N: negative flag: set to one if the result from the ALU is negative
 Z: zero flag: set to one if the result from the ALU is zero
 C: carry flag: set to one if an unsigned overflow occurs
 V: overflow flag: set to one if a signed overflow occurs
 Q: sticky saturation flag: set to one if saturation has occurred in saturating arithmetic instructions, or overflow has occurred in certain multiply instructions

 IPSR
 Interrupt service routine (ISR) number: current executing ISR number

 EPSR
 T: Thumb state: always one since Cortex-M3 and Cortex-M4 only support the Thumb state
 ICI/IT: Interrupt-continuable instruction (ICI) bits and IF-THEN (IT) instruction block status bits
45
Cortex-M3 Registers
 Exception mask registers
 1-bit PRIMASK (priority masking register): When it is set, all exceptions/interrupts are blocked except Reset, the Non-Maskable Interrupt (NMI), and the HardFault exception.
 1-bit FAULTMASK (fault mask register): Setting FAULTMASK to 1 allows only Reset and NMI, and also masks the HardFault exception.
 BASEPRI (base priority masking register, up to 8 bits wide): When set to a nonzero value, it blocks all exceptions whose priority value is equal to or greater than the BASEPRI value (that is, the same or lower priority). A value of 0 disables this masking. The number of implemented bits depends on the configured number of priority levels.

 CONTROL: special register


 The CONTROL register controls the stack used and the privilege level for software execution
when the processor is in Thread mode.
 SPSEL: if 0 MSP, if 1 PSP
 nPRIV: if 0 privileged, if 1 unprivileged
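
As a minimal sketch of how the two CONTROL fields above could be decoded in C: the bit positions used here (nPRIV in bit 0, SPSEL in bit 1) are taken from the Armv7-M register layout rather than from this slide, and the macro and function names are hypothetical helpers. On real hardware the register value would have to be read with a special-register access (for example through a vendor or CMSIS intrinsic); here it is simply passed in as a number.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed Armv7-M CONTROL layout (not stated on the slide). */
#define CONTROL_NPRIV (1u << 0)  /* 0 = privileged, 1 = unprivileged */
#define CONTROL_SPSEL (1u << 1)  /* 0 = MSP, 1 = PSP                 */

/* True if Thread mode runs privileged for this CONTROL value. */
bool thread_is_privileged(uint32_t control)
{
    return (control & CONTROL_NPRIV) == 0u;
}

/* True if Thread mode uses the process stack pointer (PSP). */
bool thread_uses_psp(uint32_t control)
{
    return (control & CONTROL_SPSEL) != 0u;
}
```

With these helpers, a CONTROL value of 0 reads as "privileged, using MSP", which matches the processor state described for Thread mode after reset.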

46
Cortex-M Registers

47
Concept of Overflow
Why do microprocessors have separate carry and overflow flags?
 Carry flag represents overflow for unsigned numbers
 Overflow flag represents overflow for signed numbers
 There can be four possible outcomes when an
arithmetic operation such as addition is performed.
 no overflow,
 unsigned overflow only,
 signed overflow only,
 both signed and unsigned overflows.

48
Concept of Overflow
 Example: an unsigned overflow but not signed overflow (i.e.,
case 2). Assume 32-bit data.
 Assume that we want to perform addition of R0 = 0xFFFFFFFF and
R1= 0x00000001.
 The 32-bit answer comes out to be 0x00000000 with a carry 1 (comes
out of the most significant bit (MSB)).
 If considered as unsigned, result should be:
 4,294,967,295 + 1 = 4,294,967,296.
 Answer cannot be accommodated in 32 bits.
 Got an incorrect answer (0).
 C flag = 1 (unsigned overflow)
 If considered as signed, result is:
 -1 + 1 = 0
 Answer within range of signed numbers.
 V flag = 0 (no signed overflow)
49
Concept of Overflow
 Example: a signed overflow but not unsigned
overflow (i.e., case 3). Assume 32-bit data.
 Assume that we want to perform addition of R0 =
0x7FFFFFFF and R1= 0x7FFFFFFF.
 The 32-bit answer comes out to be 0xFFFFFFFE.
 If considered as unsigned, result is:
 2,147,483,647 + 2,147,483,647 = 4,294,967,294.
 C flag = 0 (no unsigned overflow)
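 If considered as signed, result should be:
 2,147,483,647 + 2,147,483,647 = 4,294,967,294, which exceeds the largest signed 32-bit value (2,147,483,647).
 The signed interpretation of 0xFFFFFFFE is -2, an incorrect answer.
 V flag = 1 (signed overflow)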

50
Concept of Overflow
 Example: both signed and unsigned overflow (i.e., case 4).
Assume 32-bit data.
 Assume that we want to perform addition of R0 = 0x80000000 and R1=
0x80000000.
 The 32-bit answer comes out to be 0x00000000 with a carry 1 (comes
out of the most significant bit (MSB)).
 If considered as unsigned, result should be:
 2,147,483,648 + 2,147,483,648 = 4,294,967,296.
 Answer cannot be accommodated in 32 bits.
 Got an incorrect answer (0).
 C flag = 1 (unsigned overflow)
 If considered as signed, result should be:
 -2,147,483,648 + -2,147,483,648 = -4,294,967,296
 Got an incorrect answer (0).
 V flag = 1 (signed overflow)
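
The three worked examples can be checked with a small host-side model of a 32-bit addition that also computes the N, Z, C, and V flags. This is a sketch, not processor code: `AddResult` and `add_with_flags` are hypothetical names, and the flag rules used are the standard ones for addition (C is the carry out of bit 31; V is set when both operands have the same sign and the result has the opposite sign).

```c
#include <stdint.h>

typedef struct {
    uint32_t result;  /* 32-bit sum                          */
    int n;            /* negative: bit 31 of the result      */
    int z;            /* zero: result == 0                   */
    int c;            /* carry: unsigned overflow            */
    int v;            /* overflow: signed overflow           */
} AddResult;

AddResult add_with_flags(uint32_t a, uint32_t b)
{
    AddResult r;
    uint64_t wide = (uint64_t)a + (uint64_t)b;  /* keep the carry-out */

    r.result = (uint32_t)wide;
    r.n = (int)(r.result >> 31);
    r.z = (r.result == 0u);
    r.c = (int)(wide >> 32);
    /* Same-sign operands whose result has a different sign. */
    r.v = (int)((~(a ^ b) & (a ^ r.result)) >> 31);
    return r;
}
```

For instance, `add_with_flags(0xFFFFFFFF, 0x00000001)` models case 2 (C = 1, V = 0), `add_with_flags(0x7FFFFFFF, 0x7FFFFFFF)` models case 3 (C = 0, V = 1), and `add_with_flags(0x80000000, 0x80000000)` models case 4 (C = 1, V = 1).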
51
Exceptions
 Exceptions are usually used to handle
unexpected events which arise during the
execution of a program

 Exception number: identifies which exception is being handled
 System Exception
 Interrupt Exception

52
Exceptions
Some sources of Exceptions:

 Direct effect of executing an instruction


 Undefined instructions
 Prefetch aborts (memory fault occurring during fetch)

 A side-effect of an instruction
 Data abort (a memory fault during a load or store
data access)

 Exceptions generated externally


 Reset
53
ARM Operating modes
 Thread Mode: Running normal program,
 Processor enters this mode after reset
 Code execution in Thread mode can have one of the following access
levels:
 Unprivileged (user) access level: cannot switch to privileged access level
 Privileged access level: can switch between privileged and user

 Handler Mode: The processor enters Handler mode as a result of an


exception.
 All code execution is privileged in handler mode.

 What is the purpose of privileged modes?


 provide a mechanism for safeguarding memory accesses to
critical regions as well as providing a basic security model
 Increase security and reliability

54
ARM operating modes graph

55
Memory types: RAM, SRAM, DRAM
 Random Access Memory
 Volatile memory: data is lost when the power goes off

 SRAM: Static RAM
 Faster
 More expensive
 Used for CPU cache
 Built from transistors (MOSFETs)
 DRAM: Dynamic RAM
 Needs a refresh circuit
 Used for main memory
 Capacitor-based structure
56
Memory types: ROM
 Read Only Memory
 Non-volatile memory
 Cannot be written to
 Types:
 PROM: Programmable ROM
 EPROM: Erasable Programmable ROM
 EEPROM (E2PROM): Electrically Erasable Programmable ROM
 Flash memory is a type of EEPROM
 Flash memory is electrically erasable

57
Memory address map
 Predefined memory map
 4 GB memory space

 Cortex-M3 has an
internal structure which is
optimized for this
memory map

 System level:
 Interrupt
 Debug system
 Vendor-specific information

58
Memory Endianness
 ARM Cortex-M processors are 32-bit and their memory interface is also 32 bits wide, but memory accesses are not limited to 32-bit words.
 Cortex-M processors support the following common
data types when performing operations or transferring
data to or from memory.
 Byte data type of size 8-bits
 Halfword data type of size 16-bits
 Word data type of size 32-bits

59
Memory Endianness
 Little Endian
 The processor stores the least significant byte of a word (or
halfword) at the lowest-numbered byte address, and the
most significant byte of the word (or halfword) at the
highest-numbered byte address.

 Byte Invariant Big Endian


 The processor stores the most significant byte of a word (or
halfword) at the lowest byte address, and the least
significant byte of the word (or halfword) at the highest byte
address.

 Cortex-M architecture supports both of them, however most of


the Cortex-M microcontrollers use little endian format.
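
To make the two byte orders concrete, the sketch below stores the same 32-bit word into a 4-byte buffer in each format. The function names are hypothetical, and the code only models byte placement on any host; it does not configure a processor's endianness.

```c
#include <stdint.h>

/* Little endian: least significant byte at the lowest address. */
void store_le32(uint8_t out[4], uint32_t w)
{
    out[0] = (uint8_t)(w);
    out[1] = (uint8_t)(w >> 8);
    out[2] = (uint8_t)(w >> 16);
    out[3] = (uint8_t)(w >> 24);
}

/* Big endian: most significant byte at the lowest address. */
void store_be32(uint8_t out[4], uint32_t w)
{
    out[0] = (uint8_t)(w >> 24);
    out[1] = (uint8_t)(w >> 16);
    out[2] = (uint8_t)(w >> 8);
    out[3] = (uint8_t)(w);
}

/* Reassemble a word from little-endian bytes. */
uint32_t load_le32(const uint8_t in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

For the word 0x12345678, the little-endian buffer holds 0x78, 0x56, 0x34, 0x12 from the lowest address upward, while the big-endian buffer holds 0x12, 0x34, 0x56, 0x78.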
60
Memory
Endianness

61
Bit-Band Operations
 Bit-banding is an important concept, and a very useful tool, of
Cortex-M3.

 A bit-banding operation allows a single load/store operation to access (read/write) a single data bit in memory, without special instructions.
 Bit-banding is powerful. Without bit-banding, it takes longer to modify a single bit.
 Possible applications: serial data transfers; branch decisions; multiple Boolean variables packed into one single memory location, while access to each bit is still completely separate.

62
How to modify only some bits in a
register or a memory?
Scenario: without bit-banding
 To set or clear bit-2 in an 8-bit register, you can do
 Set: register |= 0x04
 Clear: register &= 0xFB (&= ~0x04)
 Note:
 |= “Or Equal” operator
 &= “And Equal” operator
 Q: How about changing a bit in memory?
 You need to use the Read-Modify-Write method with a register, as shown in the figure. It takes some number of cycles to read, to set/clear a bit, and then to write.
 Bit-banding can save you the trouble.
 Q: How to implement bit-banding (in Assembly)?

[Figure: read-modify-write sequence; pictures from Hitex]
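
The set and clear operations on this slide can be written as two small C helpers. This is a sketch with hypothetical function names; each helper performs a read-modify-write on the byte it is given, which is exactly the sequence that bit-banding avoids.

```c
#include <stdint.h>

/* Set one bit: read the byte, OR in the mask, write it back. */
void set_bit(volatile uint8_t *reg, unsigned bit)
{
    *reg |= (uint8_t)(1u << bit);
}

/* Clear one bit: read the byte, AND with the inverted mask, write it back. */
void clear_bit(volatile uint8_t *reg, unsigned bit)
{
    *reg &= (uint8_t)~(1u << bit);
}
```

With `bit` equal to 2, `set_bit` applies the mask 0x04 and `clear_bit` applies 0xFB, the same constants shown on the slide.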

63
Bit-Band Alias
 Bit-band is supported in two predefined separate
memory regions (called bit-band regions), located in
the first 1 MB of the SRAM region and the peripheral
region, respectively.
 Access to these two regions as bit-band regions is not
direct.
 Rather, we need to access a separate memory region
called the bit-band alias regions, to perform bit-band
operations in the two predefined bit-band memory
regions.
 In other words, normal read/write operations performed in the bit-band alias regions result in single-bit read/write operations in the actual bit-band region.

64
65
Locating Bit-Band memory regions
 Two bit-band regions: One region is located at the start
of SRAM space and the other is located at the start of
peripheral space and each of them is 1 MB in size.
 Locations of bit-band regions:
 0x20000000 − 0x200FFFFF (SRAM bit-band region, 1 MB)
 0x40000000 − 0x400FFFFF (peripheral bit-band region, 1
MB)
 Locations of the corresponding bit-band alias regions:
 0x22000000 − 0x23FFFFFF (SRAM bit-band alias region, 32
MB)
 0x42000000 − 0x43FFFFFF (peripheral bit-band alias
region, 32 MB)
66
Mapping between bit-band region
and its corresponding alias.
 Access to bit-banding region.
 Each individual bit in the bit-band region is
accessed separately in the least significant bit
(LSB) of 32-bit contents at the word-aligned bit-
band alias address.

67
Mapping between bit-band region
and its corresponding bit-band alias

68
Mapping between bit-band region
and its corresponding bit-band alias
 For example, when we want to set the fifth bit at address 0x20000400, what would be the corresponding bit-band alias address to perform this operation?
 Since the MSB (bit 31) of the word at 0x20000000 in the bit-band region is mapped to the alias word at 0x2200007C (bytes 0x2200007C to 0x2200007F), the LSB of the next word, at 0x20000004, is correspondingly mapped to 0x22000080 in the bit-band alias.
 Let x represent the memory address and c the bit location in the bit-band region that we want to operate on.
 Then the corresponding bit-band alias address y (for the SRAM region) is given by:
y = 0x22000000 + (x − 0x20000000) × 32 + c × 4
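
A minimal sketch of the mapping in C, assuming the commonly documented form alias = alias_base + (byte_offset × 32) + (bit_number × 4). The function name `bitband_alias` and the macro names are hypothetical; the base addresses are the SRAM values listed earlier in this chapter.

```c
#include <stdint.h>

#define SRAM_BB_BASE    0x20000000u  /* start of SRAM bit-band region */
#define SRAM_ALIAS_BASE 0x22000000u  /* start of SRAM bit-band alias  */

/* Alias word address for bit `bit` of the location at `addr`.
   Each bit in the region occupies one 32-bit word in the alias,
   so one byte (8 bits) spans 32 alias bytes. */
uint32_t bitband_alias(uint32_t region_base, uint32_t alias_base,
                       uint32_t addr, uint32_t bit)
{
    return alias_base + (addr - region_base) * 32u + bit * 4u;
}
```

For address 0x20000400, bit 4 (the fifth bit when counting from bit 0) maps to 0x22008010; if "fifth bit" is read as bit 5, the alias address is 0x22008014. The same function covers the peripheral region when called with 0x40000000 and 0x42000000 as the two bases.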

69
Examples of Bit-band Alias
[Figure: bit-band alias words at 0x22000000, 0x22000004, 0x22000008, ...]

How many bits?

70
Examples of Bit-band Alias

71
Writing single bit to memory
With and Without Bit-band
 Example 2.1 (Writing to bit-band region for bit setting.).
 Let’s first see what bit setting using the conventional load-modify-store procedure looks like. The pseudocode listed below outlines the key steps involved in this process.
Step 1: Setup address in bit-band region
Step 2: Read data from the address to register
Step 3: Set the selected bit
Step 4: Write the result back to same address

 When the same activity is performed using the bit-band alias region, the following steps are required.

Step 1: Setup address in bit-band alias region
Step 2: Setup data for setting bit
Step 3: Write to the bit-band alias region

72
With and Without Bit-band
Task: set bit-2 of 0x20000000

How do we know that memory 0x22000008 refers to bit-2 of memory 0x20000000?

Q: What if you write a number other than #1 or #0 to bit-band memory?
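
The task on this slide (set bit-2 of 0x20000000) can be modelled on a host machine to compare the two approaches. Everything here is a simulation under stated assumptions: `sram` is a small array standing in for the start of the SRAM bit-band region, and `set_bit_rmw`, `alias_write`, and `alias_read` are hypothetical helpers, not hardware accesses. The model follows the documented Cortex-M3 behaviour that only bit 0 of the value written to an alias word is used.

```c
#include <stdint.h>

#define BB_BASE    0x20000000u  /* SRAM bit-band region base */
#define ALIAS_BASE 0x22000000u  /* SRAM bit-band alias base  */

/* Host-side stand-in for the first 16 bytes of the bit-band region. */
static uint8_t sram[16];

/* Conventional approach: read, modify, write the whole byte. */
void set_bit_rmw(uint32_t addr, unsigned bit)
{
    uint8_t value = sram[addr - BB_BASE];  /* read data to a "register" */
    value |= (uint8_t)(1u << bit);         /* set the selected bit      */
    sram[addr - BB_BASE] = value;          /* write the result back     */
}

/* Model of a word write to the alias region: one write, one bit changed.
   Only bit 0 of `data` decides the new bit value. */
void alias_write(uint32_t alias_addr, uint32_t data)
{
    uint32_t bit_index = (alias_addr - ALIAS_BASE) / 4u;
    uint32_t byte = bit_index / 8u;
    uint32_t bit  = bit_index % 8u;

    if (data & 1u)
        sram[byte] |= (uint8_t)(1u << bit);
    else
        sram[byte] &= (uint8_t)~(1u << bit);
}

/* Model of a word read from the alias region: returns 0 or 1. */
uint32_t alias_read(uint32_t alias_addr)
{
    uint32_t bit_index = (alias_addr - ALIAS_BASE) / 4u;
    return (uint32_t)((sram[bit_index / 8u] >> (bit_index % 8u)) & 1u);
}
```

In this model, `alias_write(0x22000008, 1)` has the same effect as `set_bit_rmw(0x20000000, 2)`. Writing a value such as 0x0E through the alias clears the bit, because its bit 0 is 0, which is one way to reason about the question on the slide.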
73
Reading single bit from memory
With and Without Bit-band
 Example 2.2 (Reading single bit from memory)
 The pseudo-code listing below provides the steps
involved to read a bit from memory using a
conventional approach.
Step 1: Setup address in bit-band region
Step 2: Read the entire word from the address to
register
Step 3: Extract the selected bit

 When the same activity is performed using the bit-band alias region, the following steps are required.
Step 1: Setup address in bit-band alias region
Step 2: Read the bit from the address
74
Advantages of Bit-Band Operations
 Bit-band operations can reduce the number of
instructions required when bit read or write
operations are performed in a specific memory
region.
 Bit-band operation can also help in simplifying
the branch decisions.

75
Advantages of Bit-Band Operations
 Example 2.3 (Performing branch operation using bit banding)
 Let us consider that a branch operation is to be performed after testing a status
bit in one of the registers associated with a peripheral.
 This bit can be set or reset based on the presence or absence of a certain
condition related to that peripheral.
 Under these situations the normal sequence of operations to perform the above-
mentioned task involves the following steps.
Step 1: Read the complete status register.
Step 2: Mask unwanted bits and perform any arbitrary bit
shifting if required.
Step 3: Compare with the test value and then perform
branch operation if the test condition is true.
 However, this very same activity can be performed efficiently with bit-banding by reading the status bit through the bit-band alias region, and involves the following steps.
Step 1: Read the status bit using bit-band alias region.
Step 2: Compare and perform the branch operation if the
test condition is valid.

76
System Stack Architecture
 Uses a special memory region, called stack memory region.
 Accessed using stack pointers (MSP or PSP)
 Stack memory read and write operations always follow Last-In-First-
Out (LIFO) data buffer format.
 Cortex-M processors allocate a small region from the main
system memory (RAM) as the stack memory.
 PUSH instruction and the POP instruction are used to access stack.
 Stack memory is used in the following situations:
 Storing some of the registers (currently holding application data) that may need to be freed and then retrieved later.
 Passing parameter or argument values.
 Declaring any local variables used by a software function or subroutine.
 In case of exceptions and/or interrupts, the processor status and general-purpose registers are stored before the corresponding interrupt service routine is executed.

77
Cortex-M Stack
 Cortex-M processor uses a full-descending
stack operation model, where the SP points to the
largest address (also called the stack starting
address) when stack memory is empty.
 Two-Stack Model:
 Cortex-M processor has two SPs: the MSP and the
PSP.
 CONTROL register bit 1 (CONTROL-Bit1).
 CONTROL-Bit1 = 0: the MSP is used for both thread
mode and handler mode.
 CONTROL-Bit1 = 1: the PSP is used in thread mode
and MSP is used in handler mode.
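
The full-descending model can be illustrated with a small array-based stack. This is a host-side sketch with hypothetical names (`Stack`, `stack_push`, `stack_pop`): the stack pointer is kept as a word index rather than a byte address, it starts at the top of the region when the stack is empty, it is decremented before each store, and it always refers to the last item pushed.

```c
#include <stdint.h>

#define STACK_WORDS 8u

typedef struct {
    uint32_t mem[STACK_WORDS];  /* stack memory region                   */
    uint32_t sp;                /* word index of the last pushed item;   */
                                /* equals STACK_WORDS when stack is empty */
} Stack;

void stack_init(Stack *s)
{
    s->sp = STACK_WORDS;        /* empty: SP at the highest position */
}

/* PUSH: decrement SP first, then store (full-descending). */
int stack_push(Stack *s, uint32_t value)
{
    if (s->sp == 0u)
        return -1;              /* stack region exhausted */
    s->sp--;
    s->mem[s->sp] = value;
    return 0;
}

/* POP: load from SP, then increment SP. */
int stack_pop(Stack *s, uint32_t *out)
{
    if (s->sp == STACK_WORDS)
        return -1;              /* nothing to pop */
    *out = s->mem[s->sp];
    s->sp++;
    return 0;
}
```

Pushing two values and popping twice returns them in reverse order, which is the LIFO behaviour described for the stack memory. The two-stack model would correspond to keeping two such stack pointers and choosing between them based on CONTROL bit 1 and the current mode.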

78
When CONTROL-Bit 1 is 0 both thread and handler
modes use main stack

79
Bus organization
 The Cortex-M3 can carry out an instruction fetch and a data access at the same time.
 Main bus interfaces:
 Code memory bus
 I-Code
 D-Code
 System bus
 Static RAM (SRAM), peripherals, external RAM, external devices
 Private peripheral bus

80
Bus organization

81
Bus organization
 The Advanced High-performance Bus (AHB) and
Advanced Peripheral Bus (APB) bus protocols are part
of the Advanced Microcontroller Bus Architecture
(AMBA) standard.
 AMBA standard consists of a set of multiple bus
protocol specifications.
 A bus matrix is used as the AHB interconnection network
 Bus matrix allows the data access and instruction fetch to
take place simultaneously
 An internal AHB-to-APB bus bridge is used to connect a
number of APB devices, such as debugging components,
which follow the private peripheral bus interface.

82
Bus organization
 The bus matrix allows memory and other peripherals to be
accessed using the AHB and APB buses.

83
Bus organization: I-code bus
 I-Code bus is a 32-bit
bus based on the AHB-
Lite bus protocol.
 It is used for instruction fetches in memory regions from 0x00000000 to 0x1FFFFFFF.
 Instruction fetches are performed in word size, even for 16-bit Thumb instructions.
 The CPU core can fetch
up to two Thumb
instructions at a time.
84
Bus organization: D-code bus
 The D-Code bus is
a 32-bit bus based
on the AHB-Lite
bus protocol.
 Data access in
memory regions
from 0x00000000
to 0x1FFFFFFF
can be performed
via this bus.

85
Bus organization: system-bus
 The System bus is a
32-bit bus based on
the AHB-Lite bus
protocol.
 It is used for
instruction fetch and
data access in
memory regions
from 0x20000000 to
0xDFFFFFFF and
from 0xE0100000 to
0xFFFFFFFF.

86
Bus organization: AHB-AP bus
 The processor
contains an AHB-
AP (access port)
interface for debug
accesses.
 An external
Debug Port (DP)
component
accesses this
interface.

87
Bus organization: Private Peripheral Bus
(PPB)
 The private peripheral bus (PPB) is a 32-bit bus based on the AMBA APB protocol.
 This is intended for private
peripheral accesses in memory
regions 0xE0040000 to
0xE00FFFFF.
 However, since some part of
this APB memory is already
used for different debug
interfaces, the memory region
that can be used for attaching
extra peripherals on this bus is
only 0xE0042000 to
0xE00FF000.
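
The address ranges given on the bus slides can be summarised as a lookup function. This is a sketch that uses only the ranges stated in this chapter; the enum and function names are hypothetical, and addresses from 0xE0000000 to 0xE003FFFF are reported as `BUS_OTHER` because the slides do not assign them to one of these buses.

```c
#include <stdint.h>

typedef enum {
    BUS_CODE,    /* I-Code (fetches) or D-Code (data) */
    BUS_SYSTEM,  /* System bus                        */
    BUS_PPB,     /* Private peripheral bus (APB)      */
    BUS_OTHER    /* not covered by the listed ranges  */
} Bus;

Bus bus_for_address(uint32_t addr)
{
    if (addr <= 0x1FFFFFFFu)
        return BUS_CODE;                          /* 0x00000000-0x1FFFFFFF */
    if (addr <= 0xDFFFFFFFu)
        return BUS_SYSTEM;                        /* 0x20000000-0xDFFFFFFF */
    if (addr >= 0xE0040000u && addr <= 0xE00FFFFFu)
        return BUS_PPB;                           /* 0xE0040000-0xE00FFFFF */
    if (addr >= 0xE0100000u)
        return BUS_SYSTEM;                        /* 0xE0100000-0xFFFFFFFF */
    return BUS_OTHER;                             /* 0xE0000000-0xE003FFFF */
}
```

Within the code region, whether an access goes over I-Code or D-Code depends on whether it is an instruction fetch or a data access, so the function reports both as `BUS_CODE`.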

88
Discussion
 Assume that a processor has separate data and instruction memories. The data memory is 1 GB, and the instruction memory is 512 MB. It has 8-bit registers. Assume memory cells are 1 byte. This processor has fixed-size 16-bit instructions. Assume that you are going to design a bus system for this processor.
 A convenient choice for the size of the data bus?
 Instruction bus size?
 How many bits are required to address each memory?
89
Discussion
 Assume that the address-bus of a
microprocessor is 16 bits, and we are going to
connect a byte-addressable memory to the
microprocessor.
 What will be the maximum size of the memory
that we can connect to this microprocessor?

90
Summary
 Programming languages
 ARM architecture
 ARM core (Microprocessor, its registers)
 ARM ISA
 Memory types and ARM Memory map
 Bus organization

91
Future sessions
 ARM instructions

92
