ARM9ES TRM PDF
(Rev 1)
Change history
Proprietary Notice
ARM, The ARM Powered logo, Thumb, and StrongARM are registered trademarks of ARM Limited.
The ARM logo, AMBA, Angel, ARMulator, EmbeddedICE, ModelGen, Multi-ICE, PrimeCell,
ARM7TDMI, ARM7TDMI-S, ARM9TDMI, ARM9E-S, ARM946E-S, ARM966E-S, ETM7, ETM9, TDMI,
and STRONG are trademarks of ARM Limited.
All other products or services mentioned herein may be trademarks of their respective owners.
Neither the whole nor any part of the information contained in, or the product described in, this document
may be adapted or reproduced in any material form except with the prior written permission of the copyright
holder.
The product described in this document is subject to continuous developments and improvements. All
particulars of the product and its use contained in this document are given by ARM Limited in good faith.
However, all warranties implied or expressed, including but not limited to implied warranties of
merchantability, or fitness for purpose, are excluded.
This document is intended only to assist the reader in the use of the product. ARM Limited shall not be liable
for any loss or damage arising from the use of any information in this document, or any error or omission in
such information, or any incorrect use of the product.
Figure C-2 on page C-4 reprinted with permission IEEE Std 1149.1-1990, IEEE Standard Test Access Port
and Boundary-Scan Architecture Copyright 2000, by IEEE. The IEEE disclaims any responsibility or
liability resulting from the placement and use in the described manner.
Confidentiality Status
Product Status
Web Address
http://www.arm.com
ii Copyright © 1999, 2000 ARM Limited. All rights reserved. ARM DDI 0165B
Contents
ARM9E-S Technical Reference Manual
Preface
About this document .................................................................................... xvi
Further reading ............................................................................................ xix
Feedback ...................................................................................................... xx
Chapter 1 Introduction
1.1 About the ARM9E-S .................................................................................... 1-2
1.2 ARM9E-S architecture ................................................................................ 1-5
1.3 ARM9E-S block, core, and interface diagrams ........................................... 1-7
1.4 ARM9E-S instruction set summary ........................................................... 1-10
Chapter 5 Interrupts
5.1 About interrupts .......................................................................................... 5-2
5.2 Hardware interface ..................................................................................... 5-3
5.3 Maximum interrupt latency ......................................................................... 5-7
5.4 Minimum interrupt latency .......................................................................... 5-8
Chapter 9 AC Parameters
9.1 Timing diagrams ......................................................................................... 9-2
9.2 AC timing parameter definitions .................................................................. 9-8
List of Tables
ARM9E-S Technical Reference Manual
Table 4-5 Halfword accesses ....................................................................... 4-7
Table 4-6 Cycle types ................................................................................... 4-8
Table 4-7 Burst types ................................................................................. 4-10
Table 4-8 Transfer widths ........................................................................... 4-16
Table 4-9 DnTRANS encoding ................................................................... 4-16
Table 4-10 Transfer size encoding ............................................................... 4-21
Table 4-11 Significant address bits .............................................................. 4-21
Table 4-12 Word accesses ........................................................................... 4-22
Table 4-13 Halfword accesses ..................................................................... 4-22
Table 4-14 Byte accesses ............................................................................ 4-22
Table 4-15 Cycle types ................................................................................. 4-24
Table 4-16 Burst types ................................................................................. 4-28
Table 6-1 Handshake signals ....................................................................... 6-7
Table 6-2 Handshake signal connections ................................................... 6-20
Table 7-1 Coprocessor 14 register map ..................................................... 7-16
Table 8-1 Key to tables ................................................................................. 8-3
Table 8-2 ARM instruction cycle counts ....................................................... 8-3
Table 8-3 Key to cycle timing tables ............................................................. 8-7
Table 8-4 Branch and ARM branch with link cycle timings ........................... 8-8
Table 8-5 Thumb branch with link cycle timing ............................................. 8-9
Table 8-6 Branch and exchange cycle timing ............................................. 8-10
Table 8-7 Thumb branch, link and exchange cycle timing ......................... 8-11
Table 8-8 Data operation cycle timing ........................................................ 8-12
Table 8-9 MRS cycle timing ........................................................................ 8-14
Table 8-10 MSR cycle timing ........................................................................ 8-15
Table 8-11 MUL and MLA cycle timing ......................................................... 8-17
Table 8-12 MULS and MLAS cycle timing .................................................... 8-17
Table 8-13 SMULL, UMULL, SMLAL, and UMLAL cycle timing ................... 8-18
Table 8-14 SMULLS, UMULLS, SMLALS, and UMLALS cycle timing ......... 8-18
Table 8-15 SMULxy, SMLAxy, SMULWy, and SMLAWy cycle timing ......... 8-19
Table 8-16 SMLALxy cycle timing ................................................................ 8-19
Table 8-17 QADD, QDADD, QSUB, and QDSUB cycle timing .................... 8-20
Table 8-18 Load register operation cycle timing ........................................... 8-23
Table 8-19 Cycle timing for load operations resulting in interlocks .............. 8-24
Table 8-20 Example sequence LDRB, NOP and ADD cycle timing ............. 8-24
Table 8-21 Example sequence LDRB and STMIA cycle timing ................... 8-25
Table 8-22 Store register operation cycle timing .......................................... 8-26
Table 8-23 LDM cycle timing ........................................................................ 8-28
Table 8-24 STM cycle timing ........................................................................ 8-30
Table 8-25 Data swap cycle timing ............................................................... 8-33
Table 8-26 PLD operation cycle timing ......................................................... 8-35
Table 8-27 Exception entry cycle timing ....................................................... 8-36
Table 8-28 Coprocessor data operation cycle timing ................................... 8-37
Table 8-29 Load coprocessor register cycle timing ...................................... 8-38
Table 8-30 Store coprocessor register cycle timing ..................................... 8-40
Table 8-31 MRC instruction cycle timing ...................................................... 8-42
Table 8-32 MCR instruction cycle timing ...................................................... 8-43
Table 8-33 MRRC instruction cycle timing .................................................... 8-44
Table 8-34 MCRR instruction cycle timing .................................................... 8-45
Table 8-35 Coprocessor absent instruction cycle timing ............................... 8-46
Table 8-36 Unexecuted instruction cycle timing ............................................ 8-47
Table 9-1 Target AC timing parameters ........................................................ 9-8
Table A-1 Clock interface signals .................................................................. A-2
Table A-2 Instruction memory interface signals ............................................ A-3
Table A-3 Data memory interface signals ..................................................... A-4
Table A-4 Miscellaneous signals ................................................................... A-6
Table A-5 Coprocessor interface signals ....................................................... A-7
Table A-6 Debug signals ............................................................................... A-8
Table B-1 ARM9E-S signals and ARM9TDMI hard macrocell equivalents ... B-2
Table C-1 Public instructions ........................................................................ C-7
Table C-2 Scan chain number allocation .................................................... C-12
Table C-3 Scan chain 1 bit order ................................................................ C-15
Table C-4 ARM9E-S EmbeddedICE-RT logic register map ....................... C-28
Table C-5 Watchpoint control register for data comparison functions ........ C-31
Table C-6 Watchpoint control register for instruction comparison functions C-33
Table C-7 Debug control register bit functions ........................................... C-34
Table C-8 Interrupt signal control ............................................................... C-35
Table C-9 Debug status register bit functions ............................................. C-36
List of Figures
ARM9E-S Technical Reference Manual
Figure 4-12 Use of CLKEN ............................................................................ 4-31
Figure 4-13 Alteration of next memory request during waited bus cycle ....... 4-32
Figure 5-1 Retaking the FIQ exception .......................................................... 5-4
Figure 5-2 Stopping CLK for power saving .................................................... 5-5
Figure 5-3 Using CLK and CLKEN for best interrupt latency ......................... 5-6
Figure 6-1 ARM9E-S LDC/STC cycle timing ................................................. 6-4
Figure 6-2 ARM9E-S coprocessor clocking ................................................... 6-5
Figure 6-3 ARM9E-S MCR or MRC transfer timing ....................................... 6-8
Figure 6-4 ARM9E-S MCRR or MRRC transfer timing ................................ 6-10
Figure 6-5 ARM9E-S interlocked MCR ........................................................ 6-12
Figure 6-6 ARM9E-S interlocked MCRR ..................................................... 6-13
Figure 6-7 ARM9E-S late-canceled CDP .................................................... 6-14
Figure 6-8 ARM9E-S privileged instructions ................................................ 6-16
Figure 6-9 ARM9E-S busy waiting and interrupts ........................................ 6-17
Figure 6-10 ARM9E-S coprocessor 15 MCRs ............................................... 6-18
Figure 6-11 Coprocessor connections ........................................................... 6-19
Figure 7-1 Typical debug system ................................................................... 7-3
Figure 7-2 ARM9E-S block diagram .............................................................. 7-5
Figure 7-3 The ARM9E-S, TAP controller, and EmbeddedICE-RT ............... 7-6
Figure 7-4 Breakpoint timing .......................................................................... 7-9
Figure 7-5 Watchpoint entry with data processing instruction ..................... 7-11
Figure 7-6 Watchpoint entry with branch ..................................................... 7-12
Figure 7-7 Clock synchronization ................................................................ 7-14
Figure 7-8 Debug comms channel control register ...................................... 7-17
Figure 7-9 Coprocessor 14 monitor mode debug status register format ..... 7-18
Figure 9-1 Instruction memory interface timing ............................................. 9-2
Figure 9-2 Data memory interface timing ..................................................... 9-3
Figure 9-3 Clock enable timing ...................................................................... 9-3
Figure 9-4 Coprocessor interface timing ........................................................ 9-4
Figure 9-5 Exception and configuration timing .............................................. 9-4
Figure 9-6 Debug interface timing ................................................................. 9-5
Figure 9-7 Interrupt sensitivity status timing .................................................. 9-5
Figure 9-8 JTAG interface timing ................................................................... 9-6
Figure 9-9 DBGSDOUT to DBGTDO relationship ......................................... 9-7
Figure C-1 ARM9E-S scan chain arrangements ............................................ C-2
Figure C-2 Test access port controller state transitions ................................. C-4
Figure C-3 ID code register format ............................................................... C-11
Figure C-4 Typical scan chain cell ............................................................... C-13
Figure C-5 Debug exit sequence .................................................................. C-22
Figure C-6 Debug state entry ....................................................................... C-23
Figure C-7 ARM9E-S EmbeddedICE macrocell overview ........................... C-30
Figure C-8 Watchpoint control register for data comparison ........................ C-31
Figure C-9 Watchpoint control register for instruction comparison .............. C-32
Figure C-10 Debug control register format ..................................................... C-34
Figure C-11 Debug status register ................................................................. C-35
Figure C-12 Debug control and status register structure ............................... C-37
Figure C-13 Vector catch register .................................................................. C-38
Preface
This preface introduces the ARM9E-S and its reference documentation. It contains the
following sections:
• About this document on page xiv
• Further reading on page xvii
• Feedback on page xviii.
Intended audience
This document has been written for hardware and software engineers who want to
design or develop products based upon the ARM9E-S family of processors. It assumes
no prior knowledge of ARM products.
Chapter 1 Introduction
Read this chapter for an introduction to the ARM9E-S, and for a
summary of the ARM9E-S instruction set.
Chapter 5 Interrupts
Read this chapter for a description of interrupt operation. The chapter
includes interrupt latency details.
Chapter 9 AC Parameters
Read this chapter for a description of the AC timing parameters of the
ARM9E-S.
Appendix B Differences
Read this appendix for a description of the differences between the
ARM9E-S and the ARM9TDMI hard macrocell interface.
Typographical conventions
bold Highlights ARM processor signal names, and interface elements, such as
menu names and buttons. Also used for terms in descriptive lists, where
appropriate.
typewriter Denotes text that can be entered at the keyboard, such as commands, file
and program names, and source code.
typewriter italic
Denotes arguments to commands or functions, where the argument is to
be replaced by a specific value.
typewriter bold
Denotes language keywords when used outside example code.
This manual contains a number of timing diagrams. The following key explains the
components used in these diagrams. Any variations are clearly labeled when they occur.
Therefore, you must not attach any additional meaning unless specifically stated.
[Figure: key to timing diagram conventions. Entries: Clock; HIGH to LOW; Transient;
HIGH/LOW to HIGH; Bus stable; Bus change]
Shaded bus and signal areas are undefined, so the bus or signal can assume any value
within the shaded area at that time. The actual level is unimportant and does not affect
normal operation.
Further reading
This section lists publications by ARM Limited, and by third parties.
If you would like further information on ARM products, or if you have questions not
answered by this document, please contact info@arm.com or visit our web site at
http://www.arm.com.
ARM publications
This document contains information that is specific to the ARM9E-S. Refer to the
following documents for other relevant information:
• ARM Architecture Reference Manual (ARM DDI 0100)
• ARM9TDMI Data Sheet (ARM DDI 0029)
• ARM Software Development Kit User Guide (ARM DUI 0040).
Other publications
Feedback
ARM Limited welcomes feedback both on the ARM9E-S, and on the documentation.
If you have any comments or suggestions about this product, please contact your
supplier giving:
• the product name
• a concise explanation of your comments.
If you have any comments about this document, please send email to
errata@arm.com giving:
• the document title
• the document number
• the page number(s) to which your comments refer
• a concise explanation of your comments.
Chapter 1
Introduction
The ARM9E-S supports the ARM debug architecture and features support for real-time
debug, which allows critical exception handlers to execute while debugging the system.
The ARM9E-S uses a pipeline to increase the speed of the flow of instructions to the
processor. This allows several operations to take place simultaneously, and the
processing and memory systems to operate continuously.
Note
The program counter points to the instruction being fetched rather than to the instruction
being executed.
[Figure: five-stage pipeline. Stages: F, D, E, M, W. Labels: instruction memory access;
register decode; register read; shift; ALU; data memory access; register write; first
multiply cycle; second multiply cycle. Signals: CLK; IA[31:1], InMREQ, ISEQ; INSTR[31:0];
DA[31:0], DnMREQ, DSEQ, DMORE; WDATA[31:0]; RDATA[31:0]]
The ARM9E-S has a Harvard architecture. This features separate address and data
buses for both the 32-bit instruction interface and the 32-bit data interface. This
achieves a significant decrease in Cycles Per Instruction (CPI) by allowing instruction
and data accesses to run concurrently.
Only load, store, coprocessor load, coprocessor store, and swap instructions can access
data from memory. Data can be 8-bit bytes, 16-bit halfwords or 32-bit words. Words
must be aligned to 4-byte boundaries. Halfwords must be aligned to 2-byte boundaries.
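The alignment rule above can be sketched as a simple address check. This is a minimal illustration in C, not part of the ARM9E-S documentation; the names xfer_size_t and is_aligned are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Transfer sizes named in the text: 8-bit byte, 16-bit halfword, 32-bit word.
 * The enumerator values are the sizes in bytes (hypothetical names). */
typedef enum { XFER_BYTE = 1, XFER_HALFWORD = 2, XFER_WORD = 4 } xfer_size_t;

/* Returns true when addr is a multiple of the transfer size in bytes.
 * Byte accesses are always aligned. */
static bool is_aligned(uint32_t addr, xfer_size_t size)
{
    return (addr & ((uint32_t)size - 1u)) == 0u;
}
```

For example, is_aligned(0x1002, XFER_HALFWORD) is true, while is_aligned(0x1002, XFER_WORD) is false.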
Due to the nature of the five-stage pipeline, it is possible for a value to be required for
use before it has been placed in the register bank by the actions of an earlier instruction.
The ARM9E-S control logic automatically detects these cases and stalls the core or
forwards data as applicable to overcome these hazards. No intervention is required by
software in these cases, although you can improve software performance by re-ordering
instructions in certain situations.
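As a rough illustration of the re-ordering opportunity described above, the following C sketch finds adjacent instruction pairs in which the second instruction reads a register that the first one loads from memory. It is a toy model under stated assumptions: the insn_t record and both function names are hypothetical, and the sketch does not model the actual ARM9E-S interlock or forwarding logic or its cycle counts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified instruction record for illustration only. */
typedef struct {
    bool   is_load;  /* instruction loads a value from memory into dest */
    int8_t dest;     /* destination register number, or -1 if none      */
    int8_t src[2];   /* source register numbers, or -1 if unused        */
} insn_t;

/* True when next reads the register that the load prev writes. */
static bool has_load_use_dependency(const insn_t *prev, const insn_t *next)
{
    if (!prev->is_load || prev->dest < 0)
        return false;
    return next->src[0] == prev->dest || next->src[1] == prev->dest;
}

/* Counts adjacent load-use pairs in a sequence of n instructions.
 * Each pair is a candidate for re-ordering by the programmer or compiler. */
static unsigned count_load_use_pairs(const insn_t *seq, unsigned n)
{
    unsigned count = 0;
    for (unsigned i = 1; i < n; i++)
        if (has_load_use_dependency(&seq[i - 1], &seq[i]))
            count++;
    return count;
}
```

Moving an independent instruction between a load and the first use of the loaded register removes the pair from this count. The real cost of each case is given by the instruction cycle timing tables, not by this sketch.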
A typical 32-bit architecture can manipulate 32-bit integers with single instructions, and
address a large address space much more efficiently than a 16-bit architecture. When
processing 32-bit data, a 16-bit architecture takes at least two instructions to perform
the same task as a single 32-bit instruction.
When a 16-bit architecture has only 16-bit instructions, and a 32-bit architecture has
only 32-bit instructions, overall the 16-bit architecture has higher code density, and
greater than half the performance of the 32-bit architecture.
The ARM9E-S gives you the choice of running in ARM state, or Thumb state, or a mix
of the two. This allows you to optimize both code density and performance to best suit
your application requirements.
The Thumb instruction set is a subset of the most commonly used 32-bit ARM
instructions. Thumb instructions are each 16 bits long, and have a corresponding 32-bit
ARM instruction that has the same effect on the processor model. Thumb instructions
operate with the standard ARM register configuration, allowing excellent
interoperability between ARM and Thumb states.
Thumb therefore offers a long branch range, powerful arithmetic operations, and a large
address space.
Thumb code is typically 65% of the size of the ARM code, and provides 160% of the
performance of ARM code when running on a processor connected to a 16-bit memory
system. Thumb, therefore, makes the ARM9E-S ideally suited to embedded
applications with restricted memory bandwidth, where code density is important.
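The typical figures quoted above can be turned into a back-of-envelope estimate. The C sketch below is illustrative only: the function and macro names are hypothetical, the percentages are the typical values from the text rather than guarantees, and "160% of the performance" is assumed here to mean that the same work takes 100/160 of the time.

```c
#include <stdint.h>

/* Typical figures quoted in the text; real results depend on the code. */
#define THUMB_SIZE_PERCENT            65u   /* Thumb size relative to ARM      */
#define THUMB_PERF_PERCENT_16BIT_MEM 160u   /* Thumb speed on 16-bit memory    */

/* Estimated Thumb code size for a given ARM code size, in bytes. */
static uint32_t estimate_thumb_size(uint32_t arm_bytes)
{
    return (uint32_t)(((uint64_t)arm_bytes * THUMB_SIZE_PERCENT) / 100u);
}

/* Estimated Thumb run time on a 16-bit memory system for a given ARM run
 * time, assuming performance is the reciprocal of run time. */
static uint32_t estimate_thumb_time_16bit_mem(uint32_t arm_time)
{
    return (uint32_t)(((uint64_t)arm_time * 100u) / THUMB_PERF_PERCENT_16BIT_MEM);
}
```

Under these assumptions, 100000 bytes of ARM code corresponds to roughly 65000 bytes of Thumb code.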
The availability of both 16-bit Thumb and 32-bit ARM instruction sets gives designers
the flexibility to emphasize performance or code size on a subroutine level, according
to the requirements of their applications. For example, critical loops for applications
such as fast interrupts and DSP algorithms can be coded using the full ARM instruction
set, and linked with Thumb code.
[Figure: ARM9E-S block diagram. Blocks: ARM9E-S EmbeddedICE-RT logic (scan chain 2);
ARM9E-S core (scan chain 1); ARM9E-S TAP controller; coprocessor interface signals.
Signals: DBGRNG[1:0], DBGEXT[1:0], WDATA[31:0], RDATA[31:0], InMREQ, ISEQ, ITBIT,
InTRANS, IA[31:0], INSTR[31:0], DBGTCKEN, DBGTMS, DBGnTRST, DBGTDI, DBGTDO]
[Figure: ARM9E-S core diagram. Labels: IA[31:1], INSTR[31:0], IAScan, IDScan,
incrementer, IAreg, instruction pipeline, DINFWD[31:0], shift, ACC, byte/word repl.,
MulResultMe[31:0], CLZ, ALU, ALUOutEx[31:0], DINC, DAreg, DDScan, DAScan]
[Figure: ARM9E-S interface diagram. Signal groups: clock; interrupts; miscellaneous
configuration; instruction memory interface; data memory interface; debug; coprocessor
interface; EmbeddedICE and scan interface. Signals: CLK, CLKEN, CORECLKENOUT,
CORECLKENIN, nIRQ, nFIQ, nRESET, CFGHIVECS, CFGDISLTBIT, CFGBIGEND, IA[31:1],
INSTR[31:0], IABORT, InMREQ, ISEQ, ITBIT, InTRANS, InM[4:0], DA[31:0], WDATA[31:0],
RDATA[31:0], DABORT, DnRW, DMAS[1:0], DnTRANS[1:0], DnM[4:0], DnMREQ, DSEQ, DMORE,
DLOCK, DBGIEBKPT, DBGDEWPT, EDBGRQ, DBGACK, DBGEXT[1:0], DBGEN, DBGRNG[1:0],
DBGCOMMRX, DBGCOMMTX, DBGRQI, DBGINSTREXEC, DBGINSTRVALID, PASS, LATECANCEL,
CHSD[1:0], CHSE[1:0], TAPID[31:0], DBGTAPSM[3:0], DBGSDOUT, DBGSDIN, DBGSCREG[4:0],
DBGnTDOEN, DBGIR[3:0], DBGTCKEN, DBGTMS, DBGTDI, DBGnTRST, DBGTDO]
Symbol   Description
x        Selects HIGH or LOW 16 bits of register Rm. T selects the HIGH 16 bits (T = top).
         B selects the LOW 16 bits (B = bottom).
y        Selects HIGH or LOW 16 bits of register Rs. T selects the HIGH 16 bits (T = top).
         B selects the LOW 16 bits (B = bottom).
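The x and y selectors can be illustrated with a small C model of a signed 16 x 16 multiply such as SMULxy. This is a sketch, not the hardware definition: select_half and smulxy are hypothetical names, and the conversion from uint16_t to int16_t assumes the usual two's complement behavior of common compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selects the top (T) or bottom (B) 16 bits of a register value and
 * sign-extends the result to 32 bits. */
static int32_t select_half(uint32_t reg, bool top)
{
    uint16_t half = top ? (uint16_t)(reg >> 16) : (uint16_t)(reg & 0xFFFFu);
    return (int32_t)(int16_t)half;
}

/* Model of a signed 16 x 16 -> 32 multiply. x selects the half of Rm and
 * y selects the half of Rs, as in the symbol table above. The product of
 * two 16-bit signed values always fits in 32 bits. */
static int32_t smulxy(uint32_t rm, uint32_t rs, bool x_top, bool y_top)
{
    return select_half(rm, x_top) * select_half(rs, y_top);
}
```

With Rm = 0x00020003 and Rs = 0xFFFF0004, the BB form multiplies 3 by 4, and the TT form multiplies 2 by -1.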
Operation                             Assembler
Move to ARM reg from coproc           MRC{cond} p<cpnum>, <op1>, Rd, CRn, CRm, <op2>
Move to coproc from ARM reg           MCR{cond} p<cpnum>, <op1>, Rd, CRn, CRm, <op2>
Move double to ARM reg from coproc    MRRC{cond} p<cpnum>, <op1>, Rd, Rn, CRm
Move double to coproc from ARM reg    MCRR{cond} p<cpnum>, <op1>, Rd, Rn, CRm
Software breakpoint                   BKPT <immediate>
Suffix Sets
Suffix Description
EQ Equal
NE Not equal
MI Negative
PL Positive or zero
VS Overflow
VC No overflow
HI Unsigned higher
GE Greater or equal
LT Less than
GT Greater than
AL Always
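The condition suffixes listed above are evaluated against the N, Z, C, and V condition code flags. The C sketch below shows the standard flag tests for the suffixes that appear in this table only. The type and function names are hypothetical, and the enumerator values are arbitrary; they are not the hardware condition field encodings.

```c
#include <stdbool.h>

/* Condition code flags: Negative, Zero, Carry, oVerflow. */
typedef struct { bool n, z, c, v; } flags_t;

/* Suffixes from the table above. Values are arbitrary, not encodings. */
typedef enum {
    COND_EQ, COND_NE, COND_MI, COND_PL, COND_VS, COND_VC,
    COND_HI, COND_GE, COND_LT, COND_GT, COND_AL
} cond_t;

/* Returns true when an instruction with the given suffix would execute. */
static bool cond_passes(cond_t cond, flags_t f)
{
    switch (cond) {
    case COND_EQ: return f.z;                  /* equal                */
    case COND_NE: return !f.z;                 /* not equal            */
    case COND_MI: return f.n;                  /* negative             */
    case COND_PL: return !f.n;                 /* positive or zero     */
    case COND_VS: return f.v;                  /* overflow             */
    case COND_VC: return !f.v;                 /* no overflow          */
    case COND_HI: return f.c && !f.z;          /* unsigned higher      */
    case COND_GE: return f.n == f.v;           /* signed greater/equal */
    case COND_LT: return f.n != f.v;           /* signed less than     */
    case COND_GT: return !f.z && f.n == f.v;   /* signed greater than  */
    case COND_AL: return true;                 /* always               */
    }
    return false;
}
```

For example, after a comparison of two equal values (Z and C set, N and V clear), EQ and GE pass while NE, HI, and GT fail.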
Operation Assembler
OR ORR Rd, Rs
Branch Conditional -
Unconditional B label
Chapter 2
Programmer’s Model
This chapter describes the ARM9E-S programmer’s model. It contains the following
sections:
• About the programmer’s model on page 2-2
• Processor operating states on page 2-3
• Memory formats on page 2-4
• Instruction length on page 2-6
• Data types on page 2-7
• Operating modes on page 2-8
• Registers on page 2-9
• The program status registers on page 2-16
• Exceptions on page 2-20.
ARM state 32-bit, word-aligned ARM instructions are executed in this state.
In Thumb state, the Program Counter (PC) uses bit 1 to select between alternate
halfwords.
Note
Transition between ARM and Thumb states does not affect the processor mode or the
register contents.
You can switch the operating state of the ARM9E-S core between ARM state and
Thumb state using the BX and BLX instructions, and loads to the PC. Switching state is
described in the ARM Architecture Reference Manual. For full details of the ARM9E-S
instruction set, contact ARM.
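As a minimal sketch of state switching with BX, the C model below follows the convention described in the ARM Architecture Reference Manual, where bit 0 of the branch target selects the new state. That convention is not spelled out in this section, and all names in the sketch are hypothetical. The second function models the statement above that, in Thumb state, bit 1 of the PC selects between alternate halfwords.

```c
#include <stdint.h>

typedef enum { STATE_ARM, STATE_THUMB } cpu_state_t;

typedef struct {
    cpu_state_t state;  /* operating state after the branch        */
    uint32_t    pc;     /* branch destination with bit 0 cleared   */
} branch_result_t;

/* Models BX Rm: bit 0 of the target selects Thumb state (1) or ARM
 * state (0). Bit 0 is not part of the destination address. */
static branch_result_t bx(uint32_t target)
{
    branch_result_t r;
    r.state = (target & 1u) ? STATE_THUMB : STATE_ARM;
    r.pc    = target & ~UINT32_C(1);
    return r;
}

/* In Thumb state, bit 1 of the PC selects between the two halfwords
 * of the addressed word. Returns that selector bit. */
static unsigned thumb_halfword_select(uint32_t pc)
{
    return (unsigned)((pc >> 1) & 1u);
}
```

A target of 0x8001 therefore enters Thumb state at address 0x8000, and a target of 0x8000 enters ARM state at the same address.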
All exceptions are entered, handled, and exited in ARM state. If an exception occurs in
Thumb state, the processor reverts to ARM state. The transition back to Thumb state
occurs automatically on return from the exception handler.
The ARM9E-S allows you to mix ARM and Thumb code as you wish. For details see
Chapter 7 Interworking ARM and Thumb in the Software Development Kit User Guide.
In big-endian format, the ARM9E-S stores the most significant byte of a word at the
lowest-numbered byte, and the least significant byte at the highest-numbered byte.
Therefore, byte 0 of the memory system connects to data lines 31 to 24. This is shown
in Figure 2-1.
Byte numbers on data lines:        Word
31-24   23-16   15-8   7-0         address
  4       5       6      7           4
  0       1       2      3           0       (lower address)
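The byte-to-data-line mapping can be sketched in C as follows. The big-endian case follows the text above, where byte 0 connects to data lines 31 to 24. The little-endian case uses the conventional mapping of byte 0 to data lines 7 to 0, which is an assumption here because this passage does not state it. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the bit position of the lowest data line that carries the
 * byte at addr within its 32-bit word. */
static unsigned byte_lane_lsb(uint32_t addr, bool big_endian)
{
    unsigned offset = (unsigned)(addr & 3u);   /* byte number within the word */
    return big_endian ? (3u - offset) * 8u : offset * 8u;
}

/* Extracts the byte at addr from the 32-bit value on the data bus. */
static uint8_t extract_byte(uint32_t word, uint32_t addr, bool big_endian)
{
    return (uint8_t)(word >> byte_lane_lsb(addr, big_endian));
}
```

For a bus value of 0x11223344, byte 0 reads as 0x11 in big-endian format and as 0x44 in little-endian format.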
Little-endian layout:
Byte numbers on data lines:        Word
31-24   23-16   15-8   7-0         address
  7       6       5      4           4
  3       2       1      0           0       (lower address)
• User mode is the usual ARM program execution state, and is used for executing
most application programs.
Modes other than User mode are collectively known as privileged modes. Privileged
modes are used to service interrupts or exceptions, or to access protected resources.
2.7 Registers
The ARM9E-S has a total of 37 registers:
• 31 general-purpose 32-bit registers
• six 32-bit status registers.
These registers are not all accessible at the same time. The processor state and operating
mode determine which registers are available to the programmer.
In ARM state, 16 general registers and one or two status registers are accessible at any
one time. In privileged modes, mode-specific banked registers become available.
Figure 2-3 on page 2-11 shows which registers are available in each mode.
The ARM state register set contains 16 directly-accessible registers, r0 to r15. A further
register, the Current Program Status Register (CPSR), contains condition code flags
and the current mode bits. Registers r0 to r13 are general-purpose registers used to hold
either data or address values. Registers r14, r15, and the CPSR have the following
special functions:
Link register Register r14 is used as the subroutine Link Register (LR).
Register r14 receives a copy of r15 when a Branch with Link (BL
or BLX) instruction is executed.
You can treat r14 as a general-purpose register at all other times.
The corresponding banked registers r14_svc, r14_irq, r14_fiq,
r14_abt and r14_und are similarly used to hold the return values
of r15 when interrupts and exceptions arise, or when BL or BLX
instructions are executed within interrupt or exception routines.
In privileged modes, another register, the Saved Program Status Register (SPSR), is
accessible. This contains the condition code flags and the mode bits saved as a result of
the exception that caused entry to the current mode.
Banked registers have a mode identifier that indicates which User mode register they
are mapped to. These mode identifiers are shown in Table 2-1.
Mode         Mode identifier
User         usr
Interrupt    irq
Supervisor   svc
Abort        abt
System       usr
Undefined    und
FIQ mode has seven banked registers mapped to r8–r14 (r8_fiq–r14_fiq). As a result,
many FIQ handlers do not need to save any registers.
The Supervisor, Abort, IRQ, and Undefined modes each have alternative mode-specific
registers mapped to r13 and r14, allowing a private stack pointer and link register for
each mode.
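The banking scheme described above can be modeled as a lookup from mode and register number to the bank that supplies the register. The C sketch below is illustrative only: the names are hypothetical, and the enumerator values are arbitrary rather than the CPSR mode bit encodings. Counting the distinct registers in the model gives the 31 general-purpose registers stated at the start of this section.

```c
#include <string.h>

/* Operating modes. Values are arbitrary, not CPSR mode encodings. */
typedef enum {
    MODE_USR, MODE_FIQ, MODE_IRQ, MODE_SVC, MODE_ABT, MODE_UND, MODE_SYS
} cpu_mode_t;

/* Returns the mode identifier of the bank that supplies register r<n>
 * in the given mode, or "usr" when the register is not banked. */
static const char *bank_of(cpu_mode_t mode, unsigned n)
{
    if (n >= 15)
        return "usr";               /* r15 (PC) is shared by all modes   */
    switch (mode) {
    case MODE_FIQ: return (n >= 8)  ? "fiq" : "usr";   /* r8 to r14      */
    case MODE_IRQ: return (n >= 13) ? "irq" : "usr";   /* r13 and r14    */
    case MODE_SVC: return (n >= 13) ? "svc" : "usr";
    case MODE_ABT: return (n >= 13) ? "abt" : "usr";
    case MODE_UND: return (n >= 13) ? "und" : "usr";
    default:       return "usr";    /* User and System share one set     */
    }
}

/* Counts the distinct general-purpose registers: the 16 User mode
 * registers plus every banked replacement in the exception modes. */
static unsigned count_general_registers(void)
{
    unsigned count = 16;
    for (int m = MODE_FIQ; m <= MODE_UND; m++)
        for (unsigned n = 0; n < 15; n++)
            if (strcmp(bank_of((cpu_mode_t)m, n), "usr") != 0)
                count++;
    return count;
}
```

The total is 16 + 7 (FIQ) + 4 x 2 (IRQ, Supervisor, Abort, Undefined) = 31.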
[Figure 2-3 legend: a marked register indicates that the normal register used by the User
or System mode has been replaced by an alternative register specific to the exception mode.]
The Thumb state register set is a subset of the ARM state set. The programmer has
direct access to:
• eight general registers, r0–r7 (for details of high register access in Thumb state
see Accessing high registers in Thumb state on page 2-15).
• the PC
• a stack pointer, SP (ARM r13)
• an LR (ARM r14)
• the CPSR.
There are banked SPs, LRs, and SPSRs for each privileged mode. This register set is
shown in Figure 2-4 on page 2-13.
[Figure 2-4 not reproduced. Its legend marks each banked register, indicating that the
normal register used by the User or System mode has been replaced by an alternative
register specific to the exception mode.]
2.7.3 The relationship between ARM state and Thumb state registers
The Thumb state registers relate to the ARM state registers in the following way:
• Thumb state r0–r7 and ARM state r0–r7 are identical.
• Thumb state CPSR and SPSRs and ARM state CPSR and SPSRs are identical.
• Thumb state SP maps onto ARM state r13.
• Thumb state LR maps onto ARM state r14.
• The Thumb state PC maps onto the ARM state PC (r15).
Figure 2-5 Mapping of Thumb state registers onto ARM state registers
[Figure not reproduced. It shows the Thumb state low registers r0–r7, SP, LR, PC, CPSR,
and SPSR mapping onto ARM state r0–r7, r13, r14, r15, CPSR, and SPSR, with ARM state
r8–r12 labeled as high registers.]
Note
Registers r0–r7 are known as the low registers. Registers r8–r15 are known as the high
registers.
In Thumb state, the high registers (r8–r15) are not part of the standard register set. With
assembly language programming you have limited access to them, but can use them for
fast temporary storage.
You can use special variants of the MOV instruction to transfer a value from a low register
(in the range r0–r7) to a high register, and from a high register to a low register. The CMP
instruction allows you to compare high register values with low register values. The
ADD instruction allows you to add high register values to low register values. For more
details, refer to the ARM Architecture Reference Manual.
Bit      Field     Meaning
31       N         Negative/Less than
30       Z         Zero
29       C         Carry/Borrow/Extend
28       V         Overflow
27       Q         Sticky overflow
26:8     -         Reserved
7        I         IRQ disable
6        F         FIQ disable
5        T         State bit
4:0      M[4:0]    Mode bits
Note
The unused bits of the status registers might be used in future ARM architectures, and
must not be modified by software. The unused bits of the status registers are readable,
to allow the processor state to be preserved (for example, during process context
switches) and writable, to allow the processor state to be restored. To maintain
compatibility with future ARM processors, and as good practice, you are strongly
advised to use a read-modify-write strategy when changing the CPSR.
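The read-modify-write strategy recommended in the note can be sketched in C on a plain 32-bit value that stands in for a PSR. This is only an illustration: the macro names and the psr_update helper are hypothetical, and on real hardware the read and write would be done with MRS and MSR instructions rather than with C variables.

```c
#include <assert.h>
#include <stdint.h>

/* PSR field masks, following the bit positions in the PSR format. */
#define PSR_N    (1u << 31)
#define PSR_Z    (1u << 30)
#define PSR_C    (1u << 29)
#define PSR_V    (1u << 28)
#define PSR_Q    (1u << 27)
#define PSR_I    (1u << 7)
#define PSR_F    (1u << 6)
#define PSR_T    (1u << 5)
#define PSR_MODE 0x1Fu

/* Hypothetical helper: return a copy of psr in which only the bits selected
 * by mask are replaced by the corresponding bits of value. Every other bit,
 * including the reserved bits, is carried through unchanged. */
uint32_t psr_update(uint32_t psr, uint32_t mask, uint32_t value)
{
    /* This sketch refuses to touch the T bit, which software must not
     * force to a new state by writing the CPSR. */
    assert((mask & PSR_T) == 0);
    return (psr & ~mask) | (value & mask);
}
```

For example, psr_update(cpsr, PSR_I, 0) clears only the I bit, and psr_update(cpsr, PSR_MODE, 0x1F) changes only M[4:0].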
The N, Z, C, and V bits are the condition code flags. They can be set by arithmetic and
logical operations, and also by MSR and LDM instructions. The ARM9E-S tests these
flags to determine whether to execute an instruction.
All instructions can execute conditionally on the state of the N, Z, C, and V bits in ARM
state. In Thumb state, only the Branch instruction can be executed conditionally. For
more information about conditional execution, refer to the ARM Architecture Reference
Manual.
The Q flag
The Sticky Overflow (Q) flag can be set by certain multiply and fractional arithmetic
instructions:
• QADD
• QDADD
• QSUB
• QDSUB
• SMLAxy
• SMLAWy
The Q flag is sticky in that, once set by an instruction, it remains set until explicitly
cleared by an MSR instruction writing to the CPSR. Instructions cannot execute
conditionally on the status of the Q flag. To determine the status of the Q flag you must
read the PSR into a register and extract the Q flag from that register. For details of how
the Q flag is set and cleared, see the individual instruction definitions in the ARM
Architecture Reference Manual.
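As a rough illustration of the sticky behavior, the following C sketch models a saturating 32-bit add in the style of QADD, together with the software test for Q. The function names are hypothetical, the psr argument merely stands in for the CPSR, and the ARM Architecture Reference Manual remains the authority on exactly when each instruction sets Q.

```c
#include <assert.h>
#include <stdint.h>

#define PSR_Q (1u << 27)   /* sticky overflow flag, bit 27 of the PSR */

/* Model of a saturating add: the result is clamped to the signed 32-bit
 * range, and Q is set (never cleared) when clamping occurs. */
int32_t qadd_model(int32_t a, int32_t b, uint32_t *psr)
{
    assert(psr != 0);
    int64_t sum = (int64_t)a + (int64_t)b;   /* exact sum, no wraparound */
    if (sum > INT32_MAX) {
        *psr |= PSR_Q;
        return INT32_MAX;
    }
    if (sum < INT32_MIN) {
        *psr |= PSR_Q;
        return INT32_MIN;
    }
    return (int32_t)sum;                     /* Q keeps its previous value */
}

/* Q cannot be tested by a condition code, so software reads the PSR into a
 * register and extracts bit 27. */
int psr_q_is_set(uint32_t psr)
{
    return (psr & PSR_Q) != 0;
}
```

In this model, a later non-saturating call leaves Q set. Only an explicit write to the PSR value clears it.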
The bottom eight bits of a PSR are known collectively as the control bits. They are the:
• interrupt disable bits
• T bit
• mode bits (see Mode bits on page 2-18).
The control bits change when an exception occurs. When the processor is operating in
a privileged mode, software can manipulate these bits.
T bit
Caution
Never use an MSR instruction to force a change to the state of the T bit in the CPSR. If
you do this, the processor enters an unpredictable state.
Mode bits
The M4, M3, M2, M1, and M0 bits (M[4:0]) are the mode bits. These bits determine the
processor operating mode as shown in Table 2-2.
Caution
An illegal value programmed into M[4:0] causes the processor to enter an
unrecoverable state. If this occurs, apply reset.
Not all combinations of the mode bits define a valid processor mode, so take care to use
only those bit combinations shown.
M[4:0]  Mode        Visible Thumb state registers                               Visible ARM state registers
10000   User        r0–r7, r8–r12, SP, LR, PC, CPSR                             r0–r14, PC, CPSR
10001   FIQ         r0–r7, r8_fiq–r12_fiq, SP_fiq, LR_fiq, PC, CPSR, SPSR_fiq   r0–r7, r8_fiq–r14_fiq, PC, CPSR, SPSR_fiq
10010   IRQ         r0–r7, r8–r12, SP_irq, LR_irq, PC, CPSR, SPSR_irq           r0–r12, r13_irq, r14_irq, PC, CPSR, SPSR_irq
10011   Supervisor  r0–r7, r8–r12, SP_svc, LR_svc, PC, CPSR, SPSR_svc           r0–r12, r13_svc, r14_svc, PC, CPSR, SPSR_svc
10111   Abort       r0–r7, r8–r12, SP_abt, LR_abt, PC, CPSR, SPSR_abt           r0–r12, r13_abt, r14_abt, PC, CPSR, SPSR_abt
11011   Undefined   r0–r7, r8–r12, SP_und, LR_und, PC, CPSR, SPSR_und           r0–r12, r13_und, r14_und, PC, CPSR, SPSR_und
11111   System      r0–r7, r8–r12, SP, LR, PC, CPSR                             r0–r14, PC, CPSR
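A debugger or trace tool that displays a saved PSR might decode the mode bits along the following lines. This is a minimal sketch with hypothetical function names. It treats any encoding not listed in Table 2-2 as invalid and assumes, as is standard for the ARM architecture, that every valid mode other than User is privileged.

```c
#include <stddef.h>
#include <stdint.h>

/* Map M[4:0] of a PSR value to a mode name.
 * Returns NULL for encodings that Table 2-2 does not list. */
const char *psr_mode_name(uint32_t psr)
{
    switch (psr & 0x1Fu) {
    case 0x10u: return "User";
    case 0x11u: return "FIQ";
    case 0x12u: return "IRQ";
    case 0x13u: return "Supervisor";
    case 0x17u: return "Abort";
    case 0x1Bu: return "Undefined";
    case 0x1Fu: return "System";
    default:    return NULL;
    }
}

/* Nonzero for a valid mode other than User. */
int psr_mode_is_privileged(uint32_t psr)
{
    return psr_mode_name(psr) != NULL && (psr & 0x1Fu) != 0x10u;
}
```

Rejecting unlisted encodings in software matters because the hardware itself does not recover from an illegal M[4:0] value.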
The remaining bits in the PSRs are unused, but are reserved. When changing a PSR flag
or control bits, make sure that these reserved bits are not altered. You must ensure that
your program does not rely on reserved bits containing specific values because future
processors might use some or all of the reserved bits.
2.9 Exceptions
Exceptions arise whenever the normal flow of a program has to be halted temporarily,
for example, to service an interrupt from a peripheral. Before attempting to handle an
exception, the ARM9E-S preserves the current processor state so that the original
program can resume when the handler routine has finished.
If two or more exceptions arise simultaneously, the exceptions are dealt with in the fixed
order given in Exception priorities on page 2-27.
Table 2-3 summarizes the PC value preserved in the relevant r14 on exception entry,
and the recommended instruction for exiting the exception handler.
Exception                               Previous state
or entry     Return instruction         ARM r14_x    Thumb r14_x
SWI          MOVS PC, R14_svc           PC + 4       PC + 2
UNDEF        MOVS PC, R14_und           PC + 4       PC + 2
PABT         SUBS PC, R14_abt, #4       PC + 4       PC + 4
FIQ          SUBS PC, R14_fiq, #4       PC + 4       PC + 4
IRQ          SUBS PC, R14_irq, #4       PC + 4       PC + 4
DABT         SUBS PC, R14_abt, #8       PC + 8       PC + 8

Notes
SWI, UNDEF, PABT   Where the PC is the address of the SWI, undefined instruction, or
                   instruction that had the Prefetch Abort.
FIQ, IRQ           Where the PC is the address of the instruction that was not executed
                   because the FIQ or IRQ took priority.
DABT               Where the PC is the address of the Load or Store instruction that
                   generated the Data Abort.
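The arithmetic in Table 2-3 can be checked with a small C model. The sketch below computes the value saved in r14_x for each exception and the address that the recommended return instruction writes back to the PC. The type and function names are hypothetical, and only the six table rows shown here are modeled.

```c
#include <stdint.h>

typedef enum {
    EXC_SWI, EXC_UNDEF, EXC_PABT, EXC_FIQ, EXC_IRQ, EXC_DABT
} exc_t;

/* Value saved in r14_x on entry. pc is the address described in the table
 * notes for the exception, and thumb is nonzero for entry from Thumb state. */
uint32_t exc_saved_lr(exc_t e, uint32_t pc, int thumb)
{
    switch (e) {
    case EXC_SWI:
    case EXC_UNDEF: return pc + (thumb ? 2u : 4u);
    case EXC_DABT:  return pc + 8u;
    default:        return pc + 4u;   /* PABT, FIQ, IRQ */
    }
}

/* Address that the recommended return instruction writes to the PC. */
uint32_t exc_return_address(exc_t e, uint32_t lr)
{
    switch (e) {
    case EXC_SWI:
    case EXC_UNDEF: return lr;        /* MOVS PC, R14_x     */
    case EXC_DABT:  return lr - 8u;   /* SUBS PC, R14_x, #8 */
    default:        return lr - 4u;   /* SUBS PC, R14_x, #4 */
    }
}
```

With these definitions, a SWI returns to the following instruction in either state, while Prefetch Abort, FIQ, IRQ, and Data Abort all return to the address given as pc, so the instruction is retried or executed for the first time.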
On exception entry, the ARM9E-S:
1. Preserves the address of the next instruction in the appropriate LR. When the
exception entry is from:
• ARM state, the ARM9E-S copies the address of the next instruction into the
LR (current PC + 4 or PC + 8 depending on the exception).
• Thumb state, the ARM9E-S writes the value of the PC into the LR, offset
by a value (current PC + 2, PC + 4, or PC + 8 depending on the exception)
that causes the program to resume from the correct place on return.
The exception handler does not need to determine the state when entering an
exception. For example, in the case of a SWI, MOVS PC, r14_svc always
returns to the next instruction regardless of whether the SWI was executed in
ARM or Thumb state.
2. Copies the CPSR into the appropriate SPSR.
3. Forces the CPSR mode bits to a value which depends on the exception.
4. Forces the PC to fetch the next instruction from the relevant exception vector.
The ARM9E-S can also set the interrupt disable flags to prevent otherwise
unmanageable nesting of exceptions.
Note
Exceptions are always entered, handled, and exited in ARM state. When the processor
is in Thumb state and an exception occurs, the switch to ARM state takes place
automatically when the exception vector address is loaded into the PC.
When an exception has completed, the exception handler must move the LR, minus an
offset to the PC. The offset varies according to the type of exception, as shown in
Table 2-3 on page 2-20.
If the S bit is set and rd = r15, the core copies the SPSR back to the CPSR and clears
the interrupt disable flags that were set on entry.
Note
The action of restoring the CPSR from the SPSR automatically resets the T bit to the
value it held immediately prior to the exception. The I and F bits are automatically
restored to the value they held immediately prior to the exception.
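The entry and exit sequence described in this section can be summarized by a simplified C model. It is a sketch under several assumptions: the structure holds only the banked LR and SPSR of one target mode, all names are hypothetical, and whether the F bit is set on entry is left to the caller because it depends on the exception.

```c
#include <stdint.h>

#define PSR_I    (1u << 7)
#define PSR_F    (1u << 6)
#define PSR_T    (1u << 5)
#define PSR_MODE 0x1Fu

/* Simplified core state: one exception mode's banked registers only. */
typedef struct {
    uint32_t pc;
    uint32_t cpsr;
    uint32_t lr_x;     /* banked r14 of the exception mode  */
    uint32_t spsr_x;   /* banked SPSR of the exception mode */
} core_model_t;

void exception_enter(core_model_t *c, uint32_t mode, uint32_t lr_value,
                     uint32_t vector, int disable_fiq)
{
    c->lr_x   = lr_value;   /* preserve the return address      */
    c->spsr_x = c->cpsr;    /* preserve the interrupted state   */
    /* New mode, ARM state (T clear), IRQs masked. */
    c->cpsr = (c->cpsr & ~(PSR_MODE | PSR_T)) | (mode & PSR_MODE) | PSR_I;
    if (disable_fiq)
        c->cpsr |= PSR_F;
    c->pc = vector;         /* fetch from the exception vector  */
}

/* Models MOVS (offset 0) or SUBS (offset 4 or 8) with PC as destination:
 * the PC is loaded from the LR and the CPSR is restored from the SPSR,
 * which also restores the T, I, and F bits. */
void exception_return(core_model_t *c, uint32_t offset)
{
    c->pc   = c->lr_x - offset;
    c->cpsr = c->spsr_x;
}
```

Because the whole CPSR comes back from the SPSR, the handler does not need to know whether it interrupted ARM or Thumb code.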
2.9.4 Reset
When the nRESET signal is driven LOW a reset occurs, and the ARM9E-S abandons
the executing instruction.
When nRESET goes HIGH again, the ARM9E-S:
1. Forces CPSR[4:0] to b10011 (Supervisor mode), sets the I and F bits in the CPSR,
and clears the CPSR T bit. Other bits in the CPSR are indeterminate.
2. Forces the PC to fetch the next instruction from the reset vector address.
After reset, all register values except the PC and CPSR are indeterminate.
Refer to Chapter 3 Device Reset for more details of the ARM9E-S reset behavior.
The Fast Interrupt Request (FIQ) exception supports fast interrupts. In ARM state, FIQ
mode has eight private registers (r8_fiq–r14_fiq and SPSR_fiq) to reduce, or even remove,
the requirement for register saving (minimizing the overhead of context switching).
An FIQ is externally generated by taking the nFIQ signal input LOW. The nFIQ input
is registered internally to the ARM9E-S. It is the output of this register that is used by
the ARM9E-S control logic.
Irrespective of whether exception entry is from ARM state or from Thumb state, an FIQ
handler returns from the interrupt by executing:
SUBS PC,R14_fiq,#4
You can disable FIQ exceptions within a privileged mode by setting the CPSR F flag.
When the F flag is clear, the ARM9E-S checks for a LOW level on the output of the
nFIQ register at the end of each instruction.
FIQs and IRQs are disabled when an FIQ occurs. Nested interrupts are allowed but it is
up to the programmer to save any corruptible registers and to re-enable FIQs and
interrupts.
The Interrupt Request (IRQ) exception is a normal interrupt caused by a LOW level on
the nIRQ input. IRQ has a lower priority than FIQ, and is masked on entry to an FIQ
sequence. You can disable IRQ at any time, by setting the I bit in the CPSR from a
privileged mode.
Irrespective of whether exception entry is from ARM state or Thumb state, an IRQ
handler returns from the interrupt by executing:
SUBS PC,R14_irq,#4
You can disable IRQ exceptions within a privileged mode by setting the CPSR I flag.
When the I flag is clear, the ARM9E-S checks for a LOW level on the output of the
nIRQ register at the end of each instruction.
FIQs and IRQs are disabled when an IRQ occurs. Nested interrupts are allowed but it is
up to you to save any corruptible registers and to re-enable FIQs and interrupts.
2.9.7 Aborts
An abort indicates that the current memory access cannot be completed. An abort is
signaled by one of the two external abort input pins, IABORT and DABORT.
Prefetch Abort
This is signaled by an assertion on the IABORT input pin and checked at the end of
each instruction fetch.
When a Prefetch Abort occurs, the ARM9E-S marks the prefetched instruction as
invalid, but does not take the exception until the instruction reaches the Execute stage
of the pipeline. If the instruction is not executed, for example because a branch occurs
while it is in the pipeline, the abort does not take place.
After dealing with the cause of the abort, the handler executes the following instruction
irrespective of the processor operating state:
SUBS PC,R14_abt,#4
This action restores both the PC and the CPSR, and retries the aborted instruction.
Data Abort
This is signaled by an assertion on the DABORT input pin and checked at the end of
each data access, both read and write.
The ARM9E-S implements the base restored Data Abort model, which differs from the
base updated Data Abort model implemented by the ARM7TDMI-S.
The difference in the Data Abort model affects only a very small section of operating
system code, in the Data Abort handler. It does not affect user code.
With the base restored Data Abort model, when a Data Abort exception occurs during
the execution of a memory access instruction, the base register is always restored by the
processor hardware to the value it contained before the instruction was executed. This
removes the need for the Data Abort handler to unwind any base register update, which
might have been specified by the aborted instruction. This greatly simplifies the
software Data Abort handler.
After dealing with the cause of the abort, the handler must execute the following return
instruction irrespective of the processor operating state at the point of entry:
SUBS PC,R14_abt,#8
This action restores both the PC and the CPSR, and retries the aborted instruction.
You can use the Software Interrupt Instruction (SWI) to enter Supervisor mode, usually
to request a particular supervisor function. A SWI handler returns by executing the
following instruction, irrespective of the processor operating state:
MOVS PC, R14_svc
This action restores the PC and CPSR, and returns to the instruction following the SWI.
The SWI handler reads the opcode to extract the SWI function number.
When an instruction is encountered that neither the ARM9E-S, nor any coprocessor in
the system can handle, the ARM9E-S takes the undefined instruction trap. Software can
use this mechanism to extend the ARM instruction set by emulating undefined
coprocessor instructions.
After emulating the failed instruction, the trap handler executes the following
instruction, irrespective of the processor operating state:
MOVS PC,R14_und
This action restores the CPSR and returns to the next instruction after the undefined
instruction.
IRQs are disabled when an undefined instruction trap occurs. For more information
about undefined instructions, refer to the ARM Architecture Reference Manual.
A breakpoint instruction does not cause the ARM9E-S to take the Prefetch Abort
exception until the instruction reaches the Execute stage of the pipeline. If the
instruction is not executed, for example because a branch occurs while it is in the
pipeline, the breakpoint does not take place.
After dealing with the breakpoint, the handler executes the following instruction
irrespective of the processor operating state:
SUBS PC,R14_abt,#4
This action restores both the PC and the CPSR, and retries the breakpointed instruction.
Note
If the EmbeddedICE-RT logic is configured into stopping mode, a breakpoint
instruction causes the ARM9E-S to enter debug state. See Debug control register on
page C-34.
You can configure the location of the exception vector addresses using the input
CFGHIVECS, as shown in Table 2-4.
CFGHIVECS    Exception vector base address
0            0x0000 0000
1            0xFFFF 0000
Table 2-5 shows the exception vector addresses and entry conditions for the different
exception types.
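A short C sketch shows how the CFGHIVECS base combines with the per-exception offset. The offsets used here are the standard ARM vector offsets (Reset at 0x00 through FIQ at 0x1C, with 0x14 reserved), which is an assumption for this illustration. Table 2-5 remains the authority, and the type and function names are hypothetical.

```c
#include <stdint.h>

/* Exception vectors in address order. Each entry is one word (4 bytes). */
typedef enum {
    VEC_RESET,      /* offset 0x00 */
    VEC_UNDEF,      /* offset 0x04 */
    VEC_SWI,        /* offset 0x08 */
    VEC_PABT,       /* offset 0x0C */
    VEC_DABT,       /* offset 0x10 */
    VEC_RESERVED,   /* offset 0x14 */
    VEC_IRQ,        /* offset 0x18 */
    VEC_FIQ         /* offset 0x1C */
} exc_vector_t;

/* Address fetched on exception entry for a given CFGHIVECS setting. */
uint32_t exc_vector_address(exc_vector_t v, int cfghivecs)
{
    uint32_t base = cfghivecs ? 0xFFFF0000u : 0x00000000u;
    return base + 4u * (uint32_t)v;
}
```

For example, with CFGHIVECS HIGH the IRQ vector in this model is at 0xFFFF0018.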
When multiple exceptions arise at the same time, a fixed priority system determines the
order in which they are handled:
1. Reset (highest priority).
2. Data Abort.
3. FIQ.
4. IRQ.
5. Prefetch Abort.
6. BKPT, undefined instruction, and SWI (lowest priority).
• The BKPT, undefined instruction, and SWI exceptions are mutually exclusive.
Each corresponds to a particular (non-overlapping) decoding of the current
instruction.
• When FIQs are enabled, and a Data Abort occurs at the same time as an FIQ, the
ARM9E-S enters the Data Abort handler, and proceeds immediately to the FIQ
vector.
A normal return from the FIQ causes the Data Abort handler to resume execution.
Data Aborts must have higher priority than FIQs to ensure that the transfer error
does not escape detection. You must add the time for this exception entry to the
worst-case FIQ latency calculations in a system that uses aborts to support virtual
memory.
The FIQ handler must not access any memory that can generate a Data Abort,
because the initial Data Abort exception condition is lost.
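The fixed priority order can be modeled by assigning each exception class one bit, with the lowest bit being the highest priority. The sketch below is illustrative only: the macro and function names are hypothetical, and it assumes that a set F or I bit masks a pending FIQ or IRQ.

```c
#include <stdint.h>

/* One bit per exception class, lowest bit = highest priority. */
#define PEND_RESET (1u << 0)
#define PEND_DABT  (1u << 1)
#define PEND_FIQ   (1u << 2)
#define PEND_IRQ   (1u << 3)
#define PEND_PABT  (1u << 4)
#define PEND_INSTR (1u << 5)   /* BKPT, undefined instruction, or SWI */

#define PSR_I (1u << 7)
#define PSR_F (1u << 6)

/* Return the single pending bit that is serviced first, or 0 if nothing is
 * pending after masking. */
unsigned exc_select(unsigned pending, uint32_t cpsr)
{
    if (cpsr & PSR_F)
        pending &= ~PEND_FIQ;            /* FIQ disabled */
    if (cpsr & PSR_I)
        pending &= ~PEND_IRQ;            /* IRQ disabled */
    return pending & (~pending + 1u);    /* isolate the lowest set bit */
}
```

When a Data Abort and an FIQ are pending together, the model selects the Data Abort first. The FIQ then wins the next selection, because Data Abort entry does not set the F bit.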
Chapter 3
Device Reset
This chapter describes the ARM9E-S reset behavior. It contains the following sections:
• About device reset on page 3-2
• Reset modes on page 3-3
• ARM9E-S behavior on exit from reset on page 3-5.
nRESET The nRESET signal is the main CPU reset that initializes the majority of
the ARM9E-S logic.
DBGnTRST The DBGnTRST signal is the debug logic reset that you can use to reset
the ARM9E-S TAP controller and the EmbeddedICE-RT unit.
Both nRESET and DBGnTRST are active LOW signals that asynchronously reset
logic in the ARM9E-S. You must take care when designing the logic to drive these reset
signals.
You must apply power-on or cold reset to the ARM9E-S when power is first applied to
the system. In the case of power-on reset, the leading (falling) edge of the reset signals
(nRESET and DBGnTRST) does not have to be synchronous to CLK. The trailing
(rising) edge of the reset signals must be set up and held about the rising edge of the
clock. You must do this to ensure that the entire system leaves reset in a predictable
manner. This is particularly important in multi-processor systems. Figure 3-1 shows the
application of power-on reset.
[Figure 3-1, timing diagram not reproduced. Signals shown: CLK, nRESET, and DBGnTRST.]
It is recommended that you assert the reset signals for at least three CLK cycles to
ensure correct reset behavior. Adopting a three-cycle reset eases the integration of other
ARM parts into the system, for example, ARM9TDMI based designs.
A CPU or warm reset initializes the majority of the ARM9E-S CPU, excluding the
ARM9E-S TAP controller and the EmbeddedICE-RT unit. CPU reset is typically used
for resetting a system that has been operating for some time, for example, watchdog
reset.
Sometimes you might not want to reset the EmbeddedICE-RT unit when resetting the
rest of the ARM9E-S, for example, if EmbeddedICE-RT has been configured to
breakpoint (or capture) fetches from the reset vector.
For CPU reset, both the leading and trailing edges of nRESET must be set up and held
about the rising edge of CLK. This ensures that there are no metastability issues
between the ARM9E-S and the EmbeddedICE-RT unit.
EmbeddedICE-RT reset initializes the state of the ARM9E-S TAP controller and the
EmbeddedICE-RT unit. EmbeddedICE-RT reset is typically used by the Multi-ICE
module for hot connection of a debugger to a system.
For EmbeddedICE-RT reset, both the leading and trailing edges of DBGnTRST must
be set up and held about the rising edge of CLK. This ensures that there are no
metastability issues between the ARM9E-S and the EmbeddedICE-RT unit.
Refer to Clocks and synchronization on page 7-14 for more details of synchronization
between the Multi-ICE and ARM9E-S.
During normal operation, neither CPU reset nor EmbeddedICE-RT reset is asserted.
The behavior of the memory interface coming out of reset is shown in Figure 3-2.
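[Figure 3-2, timing diagram not reproduced. It shows CLK, nRESET, InMREQ, ISEQ,
INSTR[31:0], DnMREQ, DSEQ, DMORE, DnRW, and DA[31:0] against the pipeline stages
F, D, E, M, and W.]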
Chapter 4
Memory Interface
This chapter describes the ARM9E-S memory interface. It contains the following
sections:
• About the memory interface on page 4-2
• Instruction interface on page 4-3
• Instruction interface addressing signals on page 4-4
• Instruction interface data timed signals on page 4-6
• Endian effects for instruction fetches on page 4-7
• Instruction interface cycle types on page 4-8
• Data interface on page 4-13
• Data interface addressing signals on page 4-15
• Data interface data timed signals on page 4-18
• Data interface cycle types on page 4-24
• Endian effects for data transfers on page 4-30
• Use of CLKEN to control bus cycles on page 4-31.
For both instruction and data interfaces, the ARM9E-S processor core uses pipelined
addressing. This means that the address and control signals are generated the cycle
before the data transfer takes place. All memory accesses are timed with the clock
CLK.
The ARM9E-S can operate in both big-endian and little-endian memory configurations
and this is selected by the CFGBIGEND input. The endian configuration affects both
interfaces, so you must take care when designing the memory interface logic to allow
correct operation of the processor core.
For system programming purposes, you must normally provide some mechanism for
the data interface to access instruction memory. There are two main reasons for this:
• The use of in-line data for literal pools is very common. This data is fetched using
the data interface but is normally contained in the instruction memory space.
• To enable debug using the JTAG interface it must be possible to download code
into the instruction memory. This code has to be written to memory through the
data interface, because the instruction interface is read-only. In this case it is
essential for the data interface to have access to the instruction memory.
It is not necessary for the instruction interface to have access to the data memory area
unless the processor needs to execute code from data memory.
The signals in the ARM9E-S instruction interface can be grouped into four categories:
Each of these signal groups shares a common timing relationship to the bus interface
cycle. All signals in the ARM9E-S instruction interface are generated from, or sampled
by, the rising edge of CLK.
You can extend bus cycles using the CLKEN signal (see Use of CLKEN to control bus
cycles on page 4-31). Unless otherwise stated CLKEN is permanently HIGH.
4.3.1 IA[31:1]
IA[31:1] is the 31-bit address bus that specifies the address for the transfer. All
addresses are byte addresses, so a burst of 32-bit instruction fetches results in the
address bus incrementing by four for each cycle.
Note
The ARM9E-S does not produce IA[0] as all instruction accesses are halfword-aligned
(that is, IA[0] = 0).
The address bus provides 4GB of linear addressing space. When a word access is
signaled the memory system must ignore IA[1].
4.3.2 ITBIT
The ITBIT signal encodes the size of the instruction fetch. The ARM9E-S can request
word-sized instructions (when in ARM state) or halfword-sized instructions (when in
Thumb state). This is encoded on ITBIT as shown in Table 4-1.
ITBIT    Instruction size
1        Halfword
0        Word
4.3.3 InTRANS
The InTRANS signal encodes information about the transfer. A memory management
unit uses this signal to determine if an access is from a privileged mode. Therefore, you
can use this signal to implement an access permission scheme. The encoding of
InTRANS is shown in Table 4-2.
InTRANS Mode
0 User
1 Privileged
4.3.4 InM[4:0]
InM[4:0] indicates the operating mode of the ARM9E-S. This bus corresponds to the
bottom five bits of the CPSR, but the outputs are inverted with respect to the CPSR.
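The relationship between the CPSR mode bits and these two outputs can be expressed in a few lines of C. This is a behavioral sketch only, with hypothetical function names, and it assumes that every mode other than User is reported as privileged on InTRANS.

```c
#include <stdint.h>

/* InTRANS level for a given CPSR: 0 in User mode (M[4:0] = 10000),
 * 1 in any other mode. */
unsigned intrans_level(uint32_t cpsr)
{
    return (cpsr & 0x1Fu) != 0x10u;
}

/* InM[4:0] value for a given CPSR: the mode bits, inverted. */
unsigned inm_bus(uint32_t cpsr)
{
    return (unsigned)(~cpsr & 0x1Fu);
}
```

For Supervisor mode (M[4:0] = 10011) this model gives InTRANS = 1 and InM[4:0] = 01100.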
4.4.1 INSTR[31:0]
INSTR[31:0] is the read data bus, and is used by the ARM9E-S to fetch opcodes. The
INSTR[31:0] signal is sampled on the rising edge of CLK at the end of the bus cycle.
4.4.2 IABORT
If IABORT is asserted on an opcode fetch, the abort is tracked down the pipeline, and
the Prefetch Abort trap is taken if the instruction is executed.
ITBIT    Width       Significant address bits
1        Halfword    IA[31:1]
0        Word        IA[31:2]
When a halfword instruction fetch is performed, a 32-bit memory system can return the
complete 32-bit word, and the ARM9E-S extracts the valid halfword field from it. The
field extracted depends on the state of the CFGBIGEND signal, which determines the
endianness of the system (see Memory formats on page 2-4).
ITBIT    IA[1]    Little-endian      Big-endian
                  CFGBIGEND = 0      CFGBIGEND = 1
0        X        INSTR[31:0]        INSTR[31:0]
When connecting 8-bit or 16-bit memory systems to the ARM9E-S, ensure that the data
is presented to the correct byte lanes on the ARM9E-S as shown in Table 4-5.
ITBIT    IA[1]    Little-endian      Big-endian
                  CFGBIGEND = 0      CFGBIGEND = 1
1        0        INSTR[15:0]        INSTR[31:16]
1        1        INSTR[31:16]       INSTR[15:0]
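The byte-lane selection for halfword fetches in Table 4-5 reduces to an exclusive-OR of IA[1] and CFGBIGEND. The following C sketch, with a hypothetical function name, extracts the halfword that the core would use from a 32-bit value on INSTR[31:0].

```c
#include <stdint.h>

/* Select the halfword lane for a Thumb fetch (ITBIT = 1).
 * ia1 is the value of IA[1], cfgbigend is the level on CFGBIGEND. */
uint16_t fetch_halfword(uint32_t instr, unsigned ia1, int cfgbigend)
{
    /* Little-endian: IA[1] = 1 selects INSTR[31:16].
     * Big-endian:    IA[1] = 0 selects INSTR[31:16]. */
    unsigned upper = (ia1 & 1u) ^ (cfgbigend ? 1u : 0u);
    return upper ? (uint16_t)(instr >> 16) : (uint16_t)(instr & 0xFFFFu);
}
```

A 16-bit memory system can use the same rule in reverse to decide which lane of INSTR[31:0] to drive.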
[Timing diagram not reproduced. It shows CLK, the address class signals (Address), and
INSTR[31:0] (Instruction data) over one bus cycle.]
The ARM9E-S instruction interface can perform three different types of memory cycle.
These are indicated by the state of the InMREQ and ISEQ signals. Memory cycle
types are encoded on the InMREQ and ISEQ signals as shown in Table 4-6.
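InMREQ    ISEQ    Cycle type    Description
0         0       N cycle       Nonsequential
0         1       S cycle       Sequential
1         0       I cycle       Internal
1         1       -             Reserved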
A memory controller for the ARM9E-S must commit to an instruction memory access
only on an N cycle or an S cycle.
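The commit rule can be sketched as a decode function in C. The encoding used here is an assumption stated for this example: InMREQ LOW requests a transfer, ISEQ HIGH marks the transfer as sequential, and both HIGH is the reserved combination. The type and function names are hypothetical.

```c
/* Instruction interface cycle types. */
typedef enum {
    ICYCLE_N,         /* nonsequential: InMREQ = 0, ISEQ = 0 */
    ICYCLE_S,         /* sequential:    InMREQ = 0, ISEQ = 1 */
    ICYCLE_I,         /* internal:      InMREQ = 1, ISEQ = 0 */
    ICYCLE_RESERVED   /* reserved:      InMREQ = 1, ISEQ = 1 */
} icycle_t;

icycle_t icycle_decode(unsigned inmreq, unsigned iseq)
{
    if (!inmreq)
        return iseq ? ICYCLE_S : ICYCLE_N;
    return iseq ? ICYCLE_RESERVED : ICYCLE_I;
}

/* A memory controller commits to an access only on N and S cycles. */
int icycle_commits(icycle_t t)
{
    return t == ICYCLE_N || t == ICYCLE_S;
}
```

Keeping the commit decision in one predicate makes it harder for the controller to start an access on an internal cycle by accident.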
Nonsequential cycle
                 During this cycle, the ARM9E-S core requests a transfer to or from
                 an address that is unrelated to the address used in the preceding cycle.
Sequential cycle During this cycle, the ARM9E-S core requests a transfer to or from
                 an address that is either one word or one halfword greater than the
                 address used in the preceding cycle.
Internal cycle   During this cycle, the ARM9E-S core does not require a transfer
                 because it is performing an internal function, and no useful
                 prefetching can be performed at the same time.
The address class signals and the InMREQ, ISEQ = N cycle signals are broadcast on
the instruction interface bus. At the end of the next bus cycle the instruction is
transferred to the CPU from memory. This is shown in Figure 4-2.
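[Figure 4-2, timing diagram not reproduced. It shows CLK, the address class signals
(Address), InMREQ and ISEQ signaling an N cycle, and INSTR[31:0] returning the
instruction data at the end of the N cycle.]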
Sequential instruction fetches are used to perform burst transfers on the bus. This
information can be used to optimize the design of a memory controller interfacing to a
burst memory device, such as a DRAM.
During a sequential cycle, the ARM9E-S requests a memory location that is part of a
sequential burst. If this is the first cycle in the burst, the address might be the same as
the previous internal cycle. Otherwise the address is incremented from the previous
instruction fetch that was performed:
• for a burst of word accesses, the address is incremented by 4 bytes
• for a burst of halfword accesses, the address is incremented by 2 bytes.
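Burst type       Address increment    Cause
Word read        4 bytes              ARM state instruction fetches
Halfword read    2 bytes              Thumb state instruction fetches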
All accesses in a burst are of the same width, direction, and protection type. For more
details, see Instruction interface addressing signals on page 4-4.
Bursts of byte accesses are not possible with the instruction memory interface.
A burst always starts with an N cycle, or a merged I-S cycle (see Instruction interface,
merged I-S cycles on page 4-11), and continues with S cycles. A burst comprises
transfers of the same type or size. The IA[31:1] signal increments during the burst. The
other address class signals are unaffected by a burst.
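[Timing diagram not reproduced. It shows an N cycle followed by an S cycle, with the
address class signals stepping from Address to Address + 4 and INSTR[31:0] returning
Instruction data 1 and then Instruction data 2.]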
During an internal cycle, the ARM9E-S does not require an instruction fetch, because
an internal function is being performed, and no useful prefetching can be performed at
the same time.
Where possible the ARM9E-S broadcasts the address for the next access, so that decode
can start, but the memory controller must not commit to a memory access. This is
described further in Instruction interface, merged I-S cycles.
Where possible, the ARM9E-S performs an optimization on the bus to allow extra time
for memory decode. When this happens, the address of the next memory cycle is
broadcast during an internal cycle on this bus. This allows the memory controller to
decode the address, but it must not initiate a memory access during this cycle. In a
merged I-S cycle, the next cycle is a sequential cycle to the same memory location. This
commits to the access, and the memory controller must initiate the memory access. This
is shown in Figure 4-4 on page 4-12.
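[Figure 4-4, timing diagram not reproduced. It shows CLK, the address class signals
stepping from Address to Address + 2, and INSTR[31:0] returning Instruction data 1 and
then Instruction data 2.]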
There is an exception to the merged I-S behavior in the case of a coprocessor 15 MCR.
In this case the IA bus is used to transmit data to CP15 (see Coprocessor 15 MCRs on
page 6-18).
Note
When designing a memory controller, make sure that the design also works when an
I cycle is followed by an N cycle to a different address. This sequence might occur
during exceptions, or during writes to the program counter. It is essential that the
memory controller does not commit to the memory cycle during an I cycle.
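One way to honor both the merged I-S optimization and this note is to let the controller remember the address it decoded during an I cycle, but start an access only when a later N or S cycle arrives. The C model below is a sketch under assumptions: the names are hypothetical, and the request encoding (InMREQ LOW requests a transfer, InMREQ HIGH with ISEQ LOW is an internal cycle) is assumed rather than taken from this section.

```c
#include <stdint.h>

typedef struct {
    int      decoded_valid;   /* an address was decoded during an I cycle */
    uint32_t decoded_addr;    /* the address seen in that I cycle         */
} imem_ctrl_t;

/* Advance the model by one bus cycle. Returns 1 if a memory access must be
 * started in this cycle, 0 otherwise. *early_hit is set to 1 when the access
 * goes to the address already decoded during the preceding I cycle. */
int imem_ctrl_cycle(imem_ctrl_t *mc, unsigned inmreq, unsigned iseq,
                    uint32_t ia, int *early_hit)
{
    *early_hit = 0;
    if (inmreq) {
        /* No request. On an I cycle, decode the broadcast address but do
         * not commit. The reserved combination is simply ignored. */
        mc->decoded_valid = !iseq;
        mc->decoded_addr  = ia;
        return 0;
    }
    /* N or S cycle: commit, reusing the early decode only if it matches. */
    *early_hit = mc->decoded_valid && mc->decoded_addr == ia;
    mc->decoded_valid = 0;
    return 1;
}
```

An I cycle followed by an S cycle to the same address reports an early hit, while an I cycle followed by an N cycle to a different address still commits correctly and discards the early decode.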
Data transfers take place in the Memory stage of the pipeline. The operation of the data
interface is very similar to the instruction interface.
The signals in the ARM9E-S data bus interface can be grouped into four categories:
Note
All memory accesses are conditioned by the state of the memory request signals. You
must not initiate a memory access unless the memory request signals indicate that one
is required. See Data interface cycle types on page 4-24 for more details.
Each of these signal groups shares a common timing relationship to the bus interface
cycle. All signals in the ARM9E-S data interface are generated from, or sampled by the
rising edge of CLK.
You can extend bus cycles using the CLKEN signal (see Use of CLKEN to control bus
cycles on page 4-31). Unless otherwise stated CLKEN is permanently HIGH.
4.8.1 DA[31:0]
DA[31:0] is the 32-bit address bus that specifies the address for the transfer. All
addresses are byte addresses, so a burst of word accesses results in the address bus
incrementing by 4 for each cycle.
The address bus provides 4GB of linear addressing space. When a word access is
signaled the memory system must ignore the bottom two bits, DA[1:0], and when a
halfword access is signaled the memory system must ignore the bottom bit, DA[0].
4.8.2 DnRW
DnRW specifies the direction of the transfer. DnRW indicates an ARM9E-S write
cycle when HIGH, and an ARM9E-S read cycle when LOW. A burst of S cycles is
always either a read burst, or a write burst, because the direction cannot be changed in
the middle of a burst.
Note
You must not initiate writes to memory purely on the basis of DnRW. You must use the
status of the data interface request signals to condition writes to memory. See Data
interface cycle types on page 4-24 for more details.
4.8.3 DMAS[1:0]
The DMAS[1:0] bus encodes the size of the transfer. The ARM9E-S can transfer word,
halfword, and byte quantities. This is encoded on DMAS[1:0] as shown in Table 4-8.
DMAS[1:0]  Transfer size
00         Byte
01         Halfword
10         Word
11         Reserved
The size of transfer does not change during a burst of S cycles. Bursts of halfword or
byte accesses are not possible on the ARM9E-S data interface.
Note
A writable memory system for the ARM9E-S must have individual byte write enables.
Both the C compiler and the ARM debug tool chain (for example, Multi-ICE) assume
that arbitrary bytes in the memory can be written. If individual byte write capability is
not provided, you might not be able to use these tools.
4.8.4 DnTRANS
The DnTRANS signal encodes information about the transfer. A memory management
unit uses this signal to determine if an access is from a privileged mode. Therefore, you
can use this signal to implement an access permission scheme. The encoding of
DnTRANS is shown in Table 4-9.
DnTRANS Mode
0 User
1 Privileged
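As an illustration of the access permission scheme mentioned above, the following minimal sketch decides whether an access should be rejected. The function name and the `privileged_only` region attribute are hypothetical and are not part of the ARM9E-S interface. Only the DnTRANS encoding is taken from Table 4-9.

```python
def must_reject(dntrans: int, privileged_only: bool) -> bool:
    """Return True if a memory system should reject the access.

    dntrans follows Table 4-9: 0 = User, 1 = Privileged.
    privileged_only is a hypothetical per-region attribute held by the
    memory system; it is an assumption made for this sketch.
    """
    user_access = dntrans == 0
    return privileged_only and user_access


# A User mode access to a privileged-only region is rejected.
print(must_reject(dntrans=0, privileged_only=True))
```

A memory system built along these lines could signal the rejection by asserting DABORT for the access.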
4.8.5 DLOCK
DLOCK indicates to an arbiter that an atomic operation is being performed on the bus.
DLOCK is normally LOW, but is set HIGH to indicate that a SWP or SWPB instruction
is being performed. These instructions perform an atomic read/write operation, and can
be used to implement semaphores.
If DLOCK is asserted in a cycle, then this indicates that there is another access in the
next cycle that must be locked to the first. In the case of a multi-master system, the
ARM processor must not be degranted the bus when a locked transaction is being
performed.
4.8.6 DnM[4:0]
DnM[4:0] indicates the operating mode of the ARM9E-S. This bus corresponds to the
bottom five bits of the CPSR, unless a forced User mode access is being performed, in
which case DnM[4:0] indicates User mode. These bits are inverted with respect to the
CPSR.
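The relationship between the CPSR mode bits and DnM[4:0] can be sketched as follows. The function name and the `forced_user_access` parameter are illustrative only. The User mode encoding (0b10000) is the standard CPSR value.

```python
USER_MODE = 0b10000  # CPSR M[4:0] encoding for User mode


def dnm_value(cpsr: int, forced_user_access: bool = False) -> int:
    """Return the 5-bit value expected on DnM[4:0] for a given CPSR.

    The bus carries the inverse of CPSR[4:0], or the inverse of the
    User mode encoding during a forced User mode access.
    """
    mode = USER_MODE if forced_user_access else cpsr & 0x1F
    return ~mode & 0x1F


# Supervisor mode (M[4:0] = 0b10011) appears on DnM[4:0] as 0b01100.
print(format(dnm_value(0x000000D3), "05b"))
```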
4.9.1 WDATA[31:0]
WDATA[31:0] is the write data bus. All data written out from the ARM9E-S is
broadcast on this bus. Data transfers from the ARM9E-S to a coprocessor also use this
bus during C cycles. In normal circumstances, a memory system must sample the
WDATA[31:0] bus on the rising edge of CLK at the end of a write bus cycle. The value
on WDATA[31:0] is valid only during write cycles.
4.9.2 RDATA[31:0]
RDATA[31:0] is the read data bus, and is used by the ARM9E-S to fetch data. It is
sampled on the rising edge of CLK at the end of the bus cycle, and is also used during
C cycles to transfer data from a coprocessor to the ARM9E-S.
4.9.3 DABORT
If DABORT is asserted on a data access, it causes the ARM9E-S to take the Data Abort
trap.
An example of this is shown in Figure 4-5 on page 4-19, where a load instruction
follows an aborted store.
[Figure 4-5: aborted store followed by a load. Signals shown: CLK; address class signals (Write address, Read address); DnRW; DnMREQ; DSEQ; DMORE; WDATA[31:0] (Write data); DABORT.]
This DABORT to DnMREQ, DSEQ, and DMORE path has been removed from the
ARM9E-S design because:
• a combinational input to output path is undesirable in an ASIC design flow
• the path is critical in ARM9TDMI.
Due to this modification, the memory system connected to ARM9E-S is responsible for
ignoring a data memory request made during the cycle of an aborted data transfer. This
is necessary to prevent a following memory access from corrupting memory after an
aborted access. The memory system must ignore DnMREQ, DSEQ, and DMORE in
this case.
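The filtering that the memory system must perform can be modeled in a few lines. This is a simplified sketch, not a description of any particular controller: each bus cycle is reduced to a pair holding the request made in that cycle and the DABORT level returned in the same cycle, and the request labels are invented for the example.

```python
from typing import Iterable, List, Optional, Tuple


def accepted_requests(cycles: Iterable[Tuple[Optional[str], bool]]) -> List[str]:
    """Return the requests a memory system acts on.

    Each element is (request, dabort): the data memory request made in a
    bus cycle (None if there is no request) and whether DABORT is asserted
    in that same cycle. A request made during the cycle of an aborted
    transfer is ignored.
    """
    accepted = []
    for request, dabort in cycles:
        if request is not None and not dabort:
            accepted.append(request)
    return accepted


trace = [
    ("STR write", False),       # store is requested
    ("LDM first read", True),   # store aborts here, so this request is ignored
    (None, False),              # no request while the core enters the abort
]
print(accepted_requests(trace))  # ['STR write']
```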
Figure 4-6 shows the ARM9E-S behavior for an aborted STR instruction followed by
an LDM instruction. While the STR instruction is canceled, a memory request is made
in the first cycle of the LDM before the Data Abort exception is taken.
[Figure 4-6: aborted STR followed by an LDM. Signals shown: CLK; address class signals (Write address, Read address); DnRW; DnMREQ; DSEQ; DMORE; WDATA[31:0] (Write data); DABORT.]
The ARM9E-S indicates the size of a transfer using the DMAS[1:0] signals. These are
encoded as shown in Table 4-10.
DMAS[1:0]  Transfer size
00         Byte
01         Halfword
10         Word
11         Reserved
All writable memory in an ARM9E-S based system must support the writing of
individual bytes to allow the use of the C compiler and the ARM debug tool chain (for
example, Multi-ICE).
The address produced by the ARM9E-S is always byte-aligned. However, the memory
system must ignore the insignificant bits of the address. The significant address bits are
listed in Table 4-11.
DMAS[1:0]  Width     Significant address bits
00         Byte      DA[31:0]
01         Halfword  DA[31:1]
10         Word      DA[31:2]
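A memory system can apply Table 4-11 by masking the address according to DMAS[1:0]. The following sketch shows one way to do this. The function and table names are illustrative.

```python
# Address masks derived from Table 4-11, indexed by DMAS[1:0].
SIGNIFICANT_MASK = {
    0b00: 0xFFFFFFFF,  # byte: DA[31:0]
    0b01: 0xFFFFFFFE,  # halfword: DA[31:1]
    0b10: 0xFFFFFFFC,  # word: DA[31:2]
}


def significant_address(da: int, dmas: int) -> int:
    """Return DA[31:0] with the insignificant low bits cleared."""
    if dmas not in SIGNIFICANT_MASK:
        raise ValueError("DMAS[1:0] = 0b11 is reserved")
    return da & SIGNIFICANT_MASK[dmas]


print(hex(significant_address(0x00001003, 0b10)))  # 0x1000
```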
Reads
When a halfword or byte read is performed, a 32-bit memory system can return the
complete 32-bit word, and the ARM9E-S extracts the valid halfword or byte field from
it. The fields extracted depend on the state of the CFGBIGEND signal, which
determines the endianness of the system (see Memory formats on page 2-4).
DMAS[1:0]  DA[1:0]  Little-endian (CFGBIGEND = 0)  Big-endian (CFGBIGEND = 1)
10         XX       RDATA[31:0]                    RDATA[31:0]
When performing a word load, the ARM9E-S can rotate the data returned internally if
the address used is unaligned. Refer to the ARM Architecture Reference Manual for
more details.
When connecting 8-bit or 16-bit memory systems to the ARM9E-S, you must make sure
that the data is presented to the correct byte lanes on the ARM9E-S as shown in
Table 4-13 and Table 4-14.
Table 4-13 (halfword reads)
DMAS[1:0]  DA[1:0]  Little-endian (CFGBIGEND = 0)  Big-endian (CFGBIGEND = 1)
01         0X       RDATA[15:0]                    RDATA[31:16]
01         1X       RDATA[31:16]                   RDATA[15:0]

Table 4-14 (byte reads)
DMAS[1:0]  DA[1:0]  Little-endian (CFGBIGEND = 0)  Big-endian (CFGBIGEND = 1)
00         00       RDATA[7:0]                     RDATA[31:24]
00         01       RDATA[15:8]                    RDATA[23:16]
00         10       RDATA[23:16]                   RDATA[15:8]
00         11       RDATA[31:24]                   RDATA[7:0]
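The lane selection in the read tables above can be expressed as a single function. This is a sketch for checking a memory system model against the tables. It does not model the rotation that the core can apply to unaligned word loads, and the function name is illustrative.

```python
def extract_read_data(rdata: int, dmas: int, da: int, cfgbigend: bool) -> int:
    """Return the field taken from RDATA[31:0] for a read.

    dmas is DMAS[1:0], da is the byte address (only DA[1:0] is used),
    and cfgbigend is the level of the CFGBIGEND signal.
    """
    lane = da & 0b11  # DA[1:0]
    if dmas == 0b10:  # word: all of RDATA[31:0]
        return rdata & 0xFFFFFFFF
    if dmas == 0b01:  # halfword: selected by DA[1], swapped when big-endian
        upper_half = bool(lane & 0b10) != bool(cfgbigend)
        return (rdata >> (16 if upper_half else 0)) & 0xFFFF
    if dmas == 0b00:  # byte: selected by DA[1:0], mirrored when big-endian
        byte_index = (3 - lane) if cfgbigend else lane
        return (rdata >> (8 * byte_index)) & 0xFF
    raise ValueError("DMAS[1:0] = 0b11 is reserved")


word = 0xAABBCCDD
print(hex(extract_read_data(word, 0b00, 0x1, cfgbigend=False)))  # 0xcc
print(hex(extract_read_data(word, 0b00, 0x1, cfgbigend=True)))   # 0xbb
```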
Writes
When the ARM9E-S performs a byte or halfword write, the data being written is
replicated across the bus, as illustrated in Figure 4-7. The memory system can use the
most convenient copy of the data. A writable memory system must be capable of
performing a write to any single byte in the memory system. This capability is required
by the ARM C compiler and the Debug tool chain.
[Figure 4-7: data replication on writes. Byte writes: Register[7:0] is driven onto WDATA[31:24], WDATA[23:16], WDATA[15:8], and WDATA[7:0]. Halfword writes: Register[15:0] is driven onto both halfwords of WDATA[31:0].]
[Figure: data interface bus cycle. Signals shown: CLK; address class signals (Address); DnMREQ, DSEQ, DMORE (Cycle type); WDATA[31:0] (Write data, for writes); RDATA[31:0] (Read data, for reads).]
The ARM9E-S data interface can perform four different types of memory cycle. These
are indicated by the state of the DnMREQ and DSEQ signals. Memory cycle types are
encoded on the DnMREQ and DSEQ signals as shown in Table 4-15.
A memory controller for the ARM9E-S must commit to a data memory access only on
an N cycle or an S cycle.
Nonsequential cycle
During this cycle the ARM9E-S core requests a transfer to or from
an address that is unrelated to the address used in the preceding
cycle.
Sequential cycle During this cycle the ARM9E-S core requests a transfer to or from
an address that is one word greater than the address used in the
preceding cycle.
Internal cycle During this cycle the ARM9E-S core does not require a transfer
because it is performing an internal function.
A nonsequential cycle is the simplest form of an ARM9E-S data interface cycle, and
occurs when the ARM9E-S requests a transfer to or from an address that is unrelated to
the address used in the preceding cycle. The memory controller must initiate a memory
access to satisfy this request.
The address class signals, together with DnMREQ and DSEQ indicating an N cycle, are
broadcast on the data interface. At the end of the next bus cycle the data is transferred
between the CPU and the memory. This is shown in Figure 4-9 on page 4-26.
[Figure 4-9: nonsequential (N) cycle. Signals shown: CLK; address class signals (Address); DnMREQ, DSEQ, DMORE (N cycle); WDATA[31:0] (Write data, for writes); RDATA[31:0] (Read data, for reads).]
The ARM9E-S can perform back to back, nonsequential memory cycles. This happens,
for example, when an STR instruction and an LDR instruction are executed in
succession, as shown in Figure 4-10 on page 4-27.
[Figure 4-10: back to back nonsequential cycles, a write cycle followed by a read cycle. Signals shown: CLK; address class signals (Write address, Read address); DnRW; DnMREQ, DSEQ, DMORE (N cycle, N cycle); WDATA[31:0] (Write data); RDATA[31:0] (Read data).]
If you are designing a memory controller for the ARM9E-S, and your memory system
is unable to cope with this case, use the CLKEN signal to extend the bus cycle to allow
sufficient cycles for the memory system (see Use of CLKEN to control bus cycles on
page 4-31).
Sequential cycles perform burst transfers on the bus. You can use this information to
optimize the design of a memory controller interfacing to a burst memory device, such
as a DRAM.
During a sequential cycle, the ARM9E-S requests a memory location that is part of a
sequential burst. If this is the first cycle in the burst, the address can be the same as the
previous internal cycle. Otherwise the address is incremented from the previous cycle.
For a burst of word accesses, the address is incremented by 4 bytes.
Bursts of halfword or byte accesses are not possible on the ARM9E-S data interface.
A burst always starts with an N cycle and continues with S cycles. A burst comprises
transfers of the same type. The DA[31:0] signal increments during the burst. The other
address class signals are unaffected by a burst.
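The address and cycle-type pattern of a word burst, as described above, can be sketched as follows. The function name and the "N" and "S" labels are illustrative. The sketch assumes a burst of word accesses that begins with an N cycle.

```python
from typing import List, Tuple


def word_burst(start_address: int, words: int) -> List[Tuple[str, int]]:
    """Return (cycle type, DA[31:0]) for each access in a word burst.

    The first access is an N cycle and every later access is an S cycle
    whose address is 4 bytes above the previous one.
    """
    if words < 1:
        raise ValueError("a burst transfers at least one word")
    return [
        ("N" if i == 0 else "S", (start_address + 4 * i) & 0xFFFFFFFF)
        for i in range(words)
    ]


for cycle_type, address in word_burst(0x00001000, 3):
    print(cycle_type, hex(address))
```

A memory controller model can use such a sequence to check that it only treats an access as part of a burst when the address follows on from the previous one.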
All accesses in a burst are of the same width, direction, and protection type. For more
details, see Instruction interface addressing signals on page 4-4.
[Figure: sequential write burst, an N cycle followed by an S cycle. Signals shown: CLK; address class signals (Address, Address + 4); DnMREQ; DSEQ; DMORE; WDATA[31:0] (Write data 1, Write data 2).]
The DMORE signal is active during load and store multiple instructions and only ever
goes HIGH when DnMREQ is LOW. This signal effectively gives the same
information as DSEQ, but a cycle ahead. This information is provided to allow external
logic more time to decode sequential cycles.
During an internal cycle, the ARM9E-S does not require a memory access, as an
internal function is being performed.
The ARM9E-S does not perform merged I-S cycles on the data memory interface.
During a coprocessor register transfer cycle, the ARM9E-S uses the data interface to
transfer data to or from a coprocessor. A memory cycle is not required and the memory
controller does not initiate a transaction.
4.11.1 Writes
For data writes by the processor, the write data is duplicated on the data bus. So for a
16-bit data store, one copy of the data appears on the upper half of the write data bus,
WDATA[31:16], and the same data appears on the lower half, WDATA[15:0]. For
8-bit writes four copies are output, one on each byte lane:
• WDATA[31:24]
• WDATA[23:16]
• WDATA[15:8]
• WDATA[7:0].
This considerably eases the memory control logic design and helps overcome any
endian effects.
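The replication described in this section can be sketched as follows. The function name is illustrative, and DMAS[1:0] is used to select the transfer size.

```python
def replicate_write_data(value: int, dmas: int) -> int:
    """Return the WDATA[31:0] pattern for a store of the given size."""
    if dmas == 0b00:  # byte: four copies, one per byte lane
        return (value & 0xFF) * 0x01010101
    if dmas == 0b01:  # halfword: one copy on each half of the bus
        return (value & 0xFFFF) * 0x00010001
    if dmas == 0b10:  # word: driven unchanged
        return value & 0xFFFFFFFF
    raise ValueError("DMAS[1:0] = 0b11 is reserved")


print(hex(replicate_write_data(0x12345678, 0b00)))  # 0x78787878
print(hex(replicate_write_data(0x12345678, 0b01)))  # 0x56785678
```

Because every lane carries a copy, the memory system only has to choose which byte write enables to activate. It does not need to move data between lanes.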
4.11.2 Reads
For data reads, the processor reads a specific part of the read data bus. This is
determined by:
• the endian configuration
• the size of the transfer
• bits 1 and 0 of the data address bus.
Table 4-13 on page 4-22 shows which bits of the data bus are read for 16-bit reads, and
Table 4-14 on page 4-22 shows which bits are read for 8-bit transfers.
For simplicity of design, 32 bits of data can be read from memory and the processor
ignores any unwanted bits.
The CLKEN input extends bus cycles on both the instruction and data interfaces when
it is LOW.
In the pipeline, the address class signals and the memory request signals are ahead of
the data transfer by one bus cycle. In a system using CLKEN this can be more than one
CLK cycle. This is illustrated in Figure 4-12, which shows CLKEN being used to
extend a nonsequential cycle. In the example, the first N cycle is followed by another
N cycle to an unrelated address, and the address for the second access is broadcast
before the first access completes.
[Figure 4-12: use of CLKEN to extend a nonsequential cycle. Signals shown: CLK; CLKEN; address class signals (Address 1, Address 2, Next address); DnMREQ, DSEQ, DMORE (N cycle, N cycle, Next cycle type).]
Note
When designing a memory controller, you must sample the values of InMREQ, ISEQ,
DnMREQ, DSEQ, DMORE, and the address class signals only when CLKEN is
HIGH. This ensures that the state of the memory controller is not accidentally updated
during a waited cycle. In addition, the ARM9E-S can alter the request for a subsequent
memory cycle during a waited (CLKEN LOW) cycle. See Withdrawal of memory
requests in waited cycles.
The ARM9E-S can alter the value of the memory request and address signals during
cycles in which CLKEN is LOW. This is done to improve the worst case interrupt
latency of ARM9E-S systems. For example, a pending memory request can be
withdrawn if the core is about to take an interrupt and the access is unnecessary.
The ARM9E-S does not alter or withdraw any access to which it is committed. An
access is said to be committed when the address and request signals are sampled on the
rising edge of CLK when CLKEN is HIGH.
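The commitment rule can be illustrated with a small model of what a memory controller observes. This is a sketch under simplifying assumptions: each rising edge of CLK is reduced to the CLKEN level and a text label for the request on the outputs at that edge. The labels are invented for the example.

```python
from typing import Iterable, List, Tuple


def committed_requests(edges: Iterable[Tuple[bool, str]]) -> List[str]:
    """Return the requests seen on rising CLK edges where CLKEN is HIGH.

    Requests present on edges where CLKEN is LOW are not sampled, so a
    request that is altered or withdrawn during a waited cycle never
    reaches the memory controller state.
    """
    return [request for clken, request in edges if clken]


edges = [
    (True, "N cycle, Address 1"),   # committed
    (False, "N cycle, Address 2"),  # waited cycle: ignored
    (False, "I cycle"),             # waited cycle: request has been altered
    (True, "I cycle"),              # committed
]
print(committed_requests(edges))  # ['N cycle, Address 1', 'I cycle']
```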
The ARM9E-S only attempts to alter or withdraw an uncommitted access during the
extended (or waited) bus cycle of a previous access. Alteration of the next memory
request during a waited bus cycle is shown in Figure 4-13.
[Figure 4-13: Alteration of next memory request during waited bus cycle. Signals shown: CLK; CLKEN; address class signals (Address 1, Ignored, Ignored); DnMREQ, DSEQ, DMORE, DnSPEC (Request 1 N cycle, Ignored, Ignored, Internal cycle). The diagram spans the first bus cycle and the second bus cycle.]
Note
This behavior affects the IA, InMREQ, ISEQ, DA, DnMREQ, DSEQ, DMORE, and
DnSPEC outputs of the ARM9E-S.
Chapter 5
Interrupts
This chapter describes the ARM9E-S interrupt behavior. It contains the following
sections:
• About interrupts on page 5-2
• Hardware interface on page 5-3
• Maximum interrupt latency on page 5-7
• Minimum interrupt latency on page 5-8.
The Fast Interrupt Request (FIQ) exception provides support for fast interrupts. The
Interrupt Request (IRQ) exception provides support for normal priority interrupts.
Refer to Exceptions on page 2-20 for more details about the programmer’s model for
interrupts.
You can make the ARM9E-S take the FIQ or IRQ exceptions (if interrupts are enabled
within the core) by asserting (LOW) the nFIQ or nIRQ inputs, respectively.
It is essential that once asserted, the interrupt input remains asserted until the ARM9E-S
has completed its interrupt exception entry sequence. When an interrupt input is
asserted, it must remain asserted until the ARM9E-S acknowledges to the source of the
interrupt that the interrupt has been taken. This acknowledgement normally occurs
when the interrupt service routine accesses the peripheral causing the interrupt, for
example:
• by reading an interrupt status register in the system interrupt controller
• by writing to a clear interrupt control bit
• by writing data to, or reading data from the interrupting peripheral.
5.2.2 Synchronization
The nFIQ and nIRQ inputs are synchronous inputs to the ARM9E-S, and must be set up
and held about the rising edge of the ARM9E-S clock, CLK. If interrupt events that are
asynchronous to CLK are present in a system, synchronization register(s) that are
external to the ARM9E-S are required.
You must take care when re-enabling interrupts (for example at the end of an interrupt
routine or with a reentrant interrupt handler). You must ensure that the original source
of the interrupt has been removed before interrupts are enabled again on the ARM9E-S.
If you cannot guarantee this, the ARM9E-S might retake the interrupt exception
prematurely.
When considering the timing relation of removing the source of interrupt and
re-enabling interrupts on the ARM9E-S, you must take into account the pipelined
nature of the ARM9E-S and the memory system to which it is connected. For example,
the instruction that causes the removal of the interrupt request (that is, deassertion of
nFIQ or nIRQ) typically does not take effect until after the Memory stage of that
instruction. The instruction that re-enables interrupts on the ARM9E-S can cause the
ARM9E-S to be sensitive to interrupts as early as the Execute stage of that instruction.
[Figure 5-1: timing diagram. Signals shown: CLK, nFIQ, FIQDIS.]
In Figure 5-1, the STR to the interrupt controller does not cause the deassertion of the
nFIQ input until cycle 4. The SUBS instruction causes the ARM9E-S to be sensitive to
interrupts during cycle 3.
Because of this timing relationship, the ARM9E-S retakes the FIQ exception in this
example.
The FIQDIS (and similarly IRQDIS) output from the ARM9E-S indicates when the
ARM9E-S is sensitive to the state of the nFIQ (nIRQ) input (0 for sensitive, 1 for
insensitive). If nFIQ is asserted in the same cycle that FIQDIS is LOW, the ARM9E-S
takes the FIQ exception in a later cycle, even if the nFIQ input is subsequently
deasserted.
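The rule for when an FIQ is recognized can be modeled cycle by cycle. This is a sketch only: the trace format and function name are assumptions, and the example values follow the cycle numbers given for Figure 5-1, where nFIQ is not deasserted until cycle 4 and the core becomes sensitive in cycle 3.

```python
from typing import Iterable, Optional, Tuple


def fiq_recognized_in(cycles: Iterable[Tuple[int, int]]) -> Optional[int]:
    """Return the first cycle (numbered from 1) in which an FIQ is recognized.

    Each element is (nfiq, fiqdis) for one CLK cycle. nFIQ is active LOW,
    and FIQDIS is 0 when the core is sensitive to nFIQ. Once both are 0 in
    the same cycle the exception is taken later, whatever nFIQ does next.
    """
    for cycle, (nfiq, fiqdis) in enumerate(cycles, start=1):
        if nfiq == 0 and fiqdis == 0:
            return cycle
    return None


too_early = [(0, 1), (0, 1), (0, 0), (1, 0)]  # source removed in cycle 4
safe = [(0, 1), (0, 1), (1, 0), (1, 0)]       # source removed by cycle 3
print(fiq_recognized_in(too_early))  # 3
print(fiq_recognized_in(safe))       # None
```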
There are several approaches that you can adopt to ensure that interrupts are not enabled
too early on the ARM9E-S. The best approach is highly dependent on the overall
system, and can be a combination of hardware and software.
• Analyze the system and ensure enough instructions separate the instruction that
removes the interrupt and the instruction that re-enables interrupts on the
ARM9E-S.
• Have a software polling mechanism that reads back a status bit from the system
interrupt controller until it indicates that the interrupt has been removed before
re-enabling interrupts.
• Have a hardware system that stalls the ARM9E-S until the interrupt has been
removed.
Before use, the nFIQ and nIRQ inputs are registered internally to the ARM9E-S. To
improve interrupt latency, the registers are not conditioned by CLKEN, and run freely,
off the system clock, CLK. Internally, the ARM9E-S can use the registered nFIQ or
nIRQ status to prepare for interrupt entry, even if the rest of the core is being waited by
CLKEN. The registered interrupt signals can only update if CLK is running. Because
of this, the best interrupt latency can only be achieved if CLK is not stopped. This
requirement conflicts with the power saving features of some systems (for instance,
stopping CLK while waiting for a slow memory device, or a power-down mode where
CLK is stopped). In systems like this, you can still achieve the best interrupt latency if
you replace the final disabled CLK cycle with one waited (CLKEN = 0) cycle.
Figure 5-2 shows a system where CLK is stopped by external clock-gating for a
number of cycles.
[Figure 5-2: CLK stopped by external clock gating for a number of cycles. Signals shown: CLK, CLKEN.]
Figure 5-3 on page 5-6 shows a system which achieves most of the power saving
benefits of the system shown in Figure 5-2, while at the same time achieving best
interrupt latency.
[Figure 5-3: Using CLK and CLKEN for best interrupt latency. Signals shown: CLK, CLKEN.]
The system shown in Figure 5-3 combines CLK stopping and CLKEN waiting for best
power and interrupt latency performance.
• Whenever a new instruction is in the Execute stage for the first cycle of its
execution. Here cycle refers to CLK cycles with CLKEN HIGH.
• Whenever a new instruction which interlocked in the Execute stage has just
progressed to its first active Execute cycle.
If the sampled signal is asserted at the same time as a multicycle instruction has started
its second or later cycle of execution, the interrupt exception entry does not start until
the instruction has completed.
The worst-case interrupt latency occurs when the longest possible LDM instruction
incurs a Data Abort. The processor must enter the Data Abort mode before taking the
interrupt so that the interrupt exception exit can occur correctly. This causes a
worst-case latency of 24 cycles:
• The longest LDM instruction is one that loads all of the registers, including the PC.
Counting the first Execute cycle as 1, the LDM takes 16 cycles.
• The last word to be transferred by the LDM is transferred in cycle 17, and the abort
status for the transfer is returned in this cycle.
• If a Data Abort happens, the processor detects this in cycle 18 and prepares for
the Data Abort exception entry in cycle 19.
• Cycles 20 and 21 are the Fetch and Decode stages of the Data Abort entry
respectively.
• During cycle 22, the processor prepares for FIQ entry, issuing Fetch and Decode
cycles in cycles 23 and 24.
• Therefore, the first instruction in the FIQ routine enters the Execute stage of the
pipeline in cycle 25, giving a worst-case latency of 24 cycles.
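The cycle count above can be checked with a short tally. The phase labels are paraphrased from the list, and the cycle ranges are copied from it as stated. The function only verifies that the phases are contiguous and derives the final figures.

```python
from typing import List, Tuple

# (phase, first cycle, last cycle), taken from the list above.
PHASES: List[Tuple[str, int, int]] = [
    ("LDM of all registers including the PC", 1, 16),
    ("last word transferred, abort status returned", 17, 17),
    ("Data Abort detected", 18, 18),
    ("prepare for Data Abort exception entry", 19, 19),
    ("Fetch and Decode of Data Abort entry", 20, 21),
    ("prepare for FIQ entry", 22, 22),
    ("Fetch and Decode of FIQ entry", 23, 24),
]


def worst_case_latency(phases: List[Tuple[str, int, int]]) -> Tuple[int, int]:
    """Return (cycle of first FIQ Execute, latency in cycles)."""
    expected = 1
    for name, first, last in phases:
        if first != expected or last < first:
            raise ValueError(f"phase {name!r} does not follow on from cycle {expected - 1}")
        expected = last + 1
    return expected, expected - 1


print(worst_case_latency(PHASES))  # (25, 24)
```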
Chapter 6
ARM9E-S Coprocessor Interface
This chapter describes the ARM9E-S coprocessor interface. It contains the following
sections:
• About the coprocessor interface on page 6-2
• LDC/STC on page 6-4
• MCR/MRC on page 6-8
• MCRR/MRRC on page 6-10
• Interlocked MCR on page 6-12
• Interlocked MCRR on page 6-13
• CDP on page 6-14
• Privileged instructions on page 6-16
• Busy-waiting and interrupts on page 6-17
• Coprocessor 15 MCRs on page 6-18
• Connecting coprocessors on page 6-19.
The coprocessor can be run either in step with the ARM9E-S pipeline, or one cycle
behind, depending on the timing priorities. The implications of the two approaches are
discussed in:
• Coprocessor pipeline operates in step with the ARM9E-S
• Coprocessor pipeline one cycle behind the ARM9E-S.
In this case, the pipeline follower inside the coprocessor matches that of the ARM9E-S
exactly. This complicates the timing of key signals such as the INSTR and CLKEN
inputs, because these now become more heavily loaded and therefore incur more delay.
For this reason, this method is only recommended for tightly integrated coprocessors
such as CP15, the system coprocessor.
Examples of how a coprocessor must execute these instruction classes are given in:
• LDC/STC on page 6-4
• MCR/MRC on page 6-8
• Interlocked MCR on page 6-12
• CDP on page 6-14.
Note
For the sake of clarity, all timing diagrams assume a system where the coprocessor
pipeline operates in step with the ARM9E-S.
6.2 LDC/STC
The number of words transferred is determined by how the coprocessor drives the
CHSD[1:0] and CHSE[1:0] buses. In the example ARM9E-S LDC/STC cycle timing
shown in Figure 6-1, four words of data are transferred.
[Figure 6-1: ARM9E-S LDC/STC cycle timing. Signals shown: CLK; InMREQ; INSTR[31:0]; PASS; LATECANCEL; CHSD[1:0] (GO); coprocessor CPDOUT[31:0] (STC); coprocessor CPDIN[31:0] (LDC); DnMREQ; DMORE.]
As with all other instructions, the ARM9E-S processor core performs the main decode
using the rising edge of the clock during the Decode stage. From this, the core commits
to executing the instruction, and so performs an instruction fetch. The coprocessor
instruction pipeline keeps in step with the ARM9E-S by monitoring InMREQ.
At the rising edge of CLK, if CLKEN is HIGH, and InMREQ is LOW, an instruction
fetch is taking place, and INSTR[31:0] contains the fetched instruction on the next
rising edge of the clock, when CLKEN is HIGH. This means that:
• the last instruction fetched must enter the Decode stage of the coprocessor
pipeline
• the instruction in the Decode stage of the coprocessor pipeline must enter its
Execute stage
In all other cases, the ARM9E-S pipeline is stalled, and the coprocessor pipeline must
not advance.
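The advance rule for the pipeline follower can be sketched as a small model. This is a simplified illustration, not a description of any particular coprocessor: it tracks only the Decode and Execute stages, and the `last_fetched` argument stands for the last instruction fetched at the time of the clock edge. The class and method names are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineFollower:
    """Tracks the instructions in the coprocessor Decode and Execute stages."""

    decode: Optional[str] = None
    execute: Optional[str] = None

    def rising_edge(self, clken: bool, inmreq: bool, last_fetched: Optional[str]) -> bool:
        """Apply one rising CLK edge and return True if the pipeline advanced.

        The pipeline advances only when CLKEN is HIGH and InMREQ is LOW.
        In all other cases the state is left unchanged.
        """
        if clken and not inmreq:
            self.execute = self.decode
            self.decode = last_fetched
            return True
        return False


follower = PipelineFollower()
follower.rising_edge(clken=True, inmreq=False, last_fetched="LDC")
follower.rising_edge(clken=False, inmreq=False, last_fetched="MCR")  # waited: no advance
follower.rising_edge(clken=True, inmreq=False, last_fetched="MCR")
print(follower)  # PipelineFollower(decode='MCR', execute='LDC')
```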
Figure 6-2 shows the timing for these signals, and indicates when the coprocessor
pipeline must advance its state. In this timing diagram, Coproc clock shows the
effective clock applied to the pipeline follower in the coprocessor. It is derived such that
the coprocessor state must only advance on rising CLK edges when CLKEN is HIGH.
The method of implementing this is dependent on the design style used, such as clock
gating or register recirculating.
For efficient coprocessor design, an unmodified version of CLK must be applied to the
Execute stage of the coprocessor. This allows the coprocessor to continue executing
an instruction even when the ARM9E-S pipeline is stalled.
[Figure 6-2: coprocessor clocking. Signals shown: CLK, CLKEN, Coproc clock.]
During the Execute stage, the condition codes are compared with the flags to determine
whether the instruction really executes or not. The output PASS is asserted (HIGH) if
the instruction in the Execute stage of the coprocessor pipeline:
• is a coprocessor instruction
• has passed its condition codes.
On the rising edge of the clock, the ARM9E-S processor core examines the coprocessor
handshake signals CHSD[1:0] or CHSE[1:0]:
• If a new instruction is entering the Execute stage in the next cycle, the core
examines CHSD[1:0].
WAIT If there is a coprocessor attached that can handle the instruction, but not
immediately, the coprocessor handshake signals are driven to indicate
that the ARM9E-S processor core must stall until the coprocessor can
catch up. This is known as the busy-wait condition. In this case, the
ARM9E-S processor core loops in an idle state waiting for CHSE[1:0]
to be driven to another state, or for an interrupt to occur.
If CHSE[1:0] changes to ABSENT, the undefined instruction trap is
taken. If CHSE[1:0] changes to GO or LAST, the instruction proceeds as
follows.
If an interrupt occurs, the ARM9E-S processor core is forced out of the
busy-wait state. This is indicated to the coprocessor by the PASS signal
going LOW. The instruction is restarted later and so the coprocessor must
not commit to the instruction (it must not change any of the coprocessor
state) until it has seen PASS HIGH, when the handshake signals indicate
the GO or LAST condition.
GO The GO state indicates that the coprocessor can execute the instruction
immediately, and that it requires another cycle of execution. Both the
ARM9E-S processor core and the coprocessor must also consider the
LAST An LDC or STC can be used for more than one item of data. If this is the
case, possibly after busy waiting, the coprocessor drives the coprocessor
handshake signals with a number of GO states, and in the penultimate
cycle drives LAST (LAST indicating that the next transfer is the final
one). If there is only one transfer, the sequence is
[WAIT,[WAIT,...]],LAST.
Table 6-1 shows how the handshake signals CHSD[1:0] and CHSE[1:0] are encoded.
Handshake signal  CHSD[1:0], CHSE[1:0]
ABSENT            10
WAIT              00
GO                01
LAST              11
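The encoding in Table 6-1 and the LDC/STC handshake sequence described earlier can be sketched together. The names are illustrative. The sketch assumes that a transfer of n words uses n - 1 GO states followed by LAST, which matches the single-transfer case given above and the four-word example.

```python
from enum import Enum
from typing import List


class Handshake(Enum):
    """CHSD[1:0] and CHSE[1:0] encodings from Table 6-1."""

    WAIT = 0b00
    GO = 0b01
    ABSENT = 0b10
    LAST = 0b11


def decode_handshake(bits: int) -> Handshake:
    """Decode a two-bit handshake value."""
    return Handshake(bits & 0b11)


def transfer_sequence(words: int, busy_wait_cycles: int = 0) -> List[Handshake]:
    """Return the handshake states a coprocessor drives for an LDC or STC."""
    if words < 1:
        raise ValueError("an LDC or STC transfers at least one word")
    return (
        [Handshake.WAIT] * busy_wait_cycles
        + [Handshake.GO] * (words - 1)
        + [Handshake.LAST]
    )


print(decode_handshake(0b10).name)                   # ABSENT
print([h.name for h in transfer_sequence(4)])        # ['GO', 'GO', 'GO', 'LAST']
print([h.name for h in transfer_sequence(1, 2)])     # ['WAIT', 'WAIT', 'LAST']
```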
6.3 MCR/MRC
MCR and MRC cycles look very similar to STC or LDC. An example is shown in
Figure 6-3.
[Figure 6-3: MCR/MRC cycle timing. Signals shown: CLK; INSTR[31:0] (MCR/MRC); InMREQ; PASS; CHSD[1:0] (LAST); CHSE[1:0] (Ignored); WDATA[31:0] (MCR); RDATA[31:0] (MRC).]
In the next cycle InMREQ is driven LOW to denote that the instruction has now been
issued to the Execute stage. If the condition codes pass, and the instruction is to be
executed, the PASS signal is driven HIGH and the CHSD[1:0] handshake bus is
examined by the core (it is ignored in all other cases).
For any successive Execute cycles the CHSE[1:0] handshake bus is examined. When
the LAST condition is observed, the instruction is committed. In the case of an MCR, the
WDATA[31:0] bus is driven with the register data. In the case of an MRC,
RDATA[31:0] is sampled at the end of the ARM9E-S Memory stage and written to the
destination register during the next cycle.
6.4 MCRR/MRRC
MCRR and MRRC cycles look very similar to STC or LDC. An example is shown in
Figure 6-4.
[Figure 6-4: MCRR/MRRC cycle timing. Signals shown: CLK; INSTR[31:0] (MCRR/MRRC); InMREQ; PASS; CHSD[1:0] (GO); RDATA[31:0] (Data1, Data2, for MRRC).]
In the next cycle InMREQ is driven LOW to denote that the instruction has now been
issued to the Execute stage. If the condition codes pass, and the instruction is to be
executed, the PASS signal is driven HIGH and the CHSD[1:0] handshake bus is
examined by the core (it is ignored in all other cases).
For any successive Execute cycles the CHSE[1:0] handshake bus is examined. When
the LAST condition is observed, the instruction proceeds to its final Execute cycle. In
the case of an MCRR, the WDATA[31:0] bus is driven with the first register data during
the second Execute cycle, and the second register data in the Memory cycle. In the case
of an MRRC, RDATA[31:0] is sampled at the end of the second Execute and first
Memory cycles and written to the destination registers during the next cycle.
[Figure: interlocked MCR timing. Signals shown: CLK; INSTR[31:0] (MCR); InMREQ; PASS; LATECANCEL; WDATA[31:0] (MCR); RDATA[31:0] (MRC).]
[Figure: interlocked MCRR timing. Signals shown: CLK; INSTR[31:0] (MCRR); InMREQ; PASS; LATECANCEL; CHSD[1:0] (GO (ignored), GO); RDATA[31:0] (Data1, Data2, for MRRC).]
6.7 CDP
CDP instructions normally execute in a single cycle. Like all the previous cycles,
InMREQ is driven LOW to signal when an instruction is entering the Decode stage and
again when it reaches the Execute stage of the pipeline:
• if the coprocessor can accept the instruction for execution, the PASS signal is
driven HIGH during the Execute cycle
Figure 6-7 shows a CDP which is canceled due to the previous instruction causing a Data
Abort.
[Figure 6-7: timing diagram of a canceled CDP. Signals shown: CLK, INSTR[31:0] (CPRT), InMREQ, PASS, LATECANCEL, CHSD[1:0] (LAST), CHSE[1:0] (ignored), DABORT.]
The CDP instruction enters the Execute stage of the pipeline and is signaled to execute
by PASS. In the following cycle LATECANCEL is asserted. This causes the
coprocessor to terminate execution of the CDP instruction and prevents the CDP
instruction from causing state changes to the coprocessor.
[Timing diagram. Signals shown: CLK, INSTR[31:0] (CPRT), InMREQ, InTRANS/InM[4:0] (old mode, new mode), PASS, LATECANCEL, CHSE[1:0] (ignored).]
The first two CHSD responses are ignored by the ARM9E-S because it is only the final
CHSD response, as the instruction moves from Decode into Execute, that counts. This
allows the coprocessor to change its response as InTRANS/InM changes.
For interrupt latency reasons, the coprocessor can be interrupted while busy-waiting,
causing the instruction to be abandoned. Abandoning execution is done through PASS.
The coprocessor must monitor the state of PASS during every busy-wait cycle. If it is
HIGH, the instruction must still be executed. If it is LOW, the instruction must be
abandoned. Figure 6-9 shows a busy-waited coprocessor instruction being abandoned
due to an interrupt.
[Figure 6-9: timing diagram of a busy-waited coprocessor instruction being abandoned. Signals shown: CLK, INSTR[31:0] (Instr), InMREQ, PASS, LATECANCEL, CHSD[1:0] (WAIT).]
[Timing diagram. Signals shown: CLK, INSTR[31:0] (MCR), InMREQ, ISEQ, DnMREQ, DSEQ, PASS, LATECANCEL, CHSD[1:0] (GO), WDATA[31:0] (Coproc Data for MCR).]
[Block diagram. Blocks and signals shown: ARM9E-S, memory system, coprocessor, RDATA, WDATA, CPDOUT, CPDIN, and the multiplexer select terms asel, bsel, and csel.]
Note
The RDATA enable term (asel) is specially constructed to select the coprocessor output
data during MRC and STC operations. This is to allow the connection of the ETM module
to the ARM9E-S RDATA and WDATA buses while still allowing tracing of MRC and
STC data.
If you have multiple coprocessors in your system, connect the handshake signals as
shown in Table 6-2.
Signal              Connection
PASS, LATECANCEL    Connect these signals to all coprocessors present in the system.
CHSD, CHSE          Combine the individual bit 1 of CHSD and CHSE by ANDing.
                    Combine the individual bit 0 of CHSD and CHSE by ORing.
                    Connect the results to the CHSD and CHSE inputs of the ARM9E-S.
You must also multiplex the output data from the coprocessors.
Example 6-1
In the case of two coprocessors that have handshaking signals CHSD1, and CHSE1,
and CHSD2, and CHSE2, respectively, the following connections are made:
ARM9E-S       CP1            CP2
CHSD[1] <=    CHSD1[1]  AND  CHSD2[1]
CHSD[0] <=    CHSD1[0]  OR   CHSD2[0]
CHSE[1] <=    CHSE1[1]  AND  CHSE2[1]
CHSE[0] <=    CHSE1[0]  OR   CHSE2[0]
If you are implementing a system that does not include any external coprocessors, you
must tie both CHSD and CHSE to 10 (ABSENT). This indicates that no external
coprocessors are present in the system. If any coprocessor instructions are received,
they cause the processor to take the undefined instruction trap, allowing the coprocessor
instructions to be emulated in software if required.
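The AND/OR combining rule and the ABSENT tie-off can be sketched as a small Python model. This is an illustration only, not RTL. The function name combine_handshake is hypothetical, and the encodings for WAIT, GO, and LAST (00, 01, 11) are assumed from the handshake encoding table elsewhere in this chapter; only ABSENT = 10 is stated in the text above.

```python
from functools import reduce

# Two-bit handshake encodings. ABSENT = 0b10 is stated in the text;
# WAIT, GO, and LAST are assumptions taken from the handshake encoding table.
WAIT, GO, ABSENT, LAST = 0b00, 0b01, 0b10, 0b11


def combine_handshake(responses):
    """Combine per-coprocessor CHSD (or CHSE) values into one 2-bit value.

    Bit 1 is the AND of every coprocessor's bit 1.
    Bit 0 is the OR of every coprocessor's bit 0.
    """
    bit1 = reduce(lambda acc, r: acc & ((r >> 1) & 1), responses, 1)
    bit0 = reduce(lambda acc, r: acc | (r & 1), responses, 0)
    return (bit1 << 1) | bit0
```

Under these assumed encodings, a coprocessor that answers ABSENT does not disturb the response of the coprocessor that accepts the instruction, and an empty list of coprocessors yields ABSENT, which matches the tie-off value required when no external coprocessor is fitted.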
Chapter 7
Debug Interface and EmbeddedICE-RT
This chapter describes the ARM9E-S debug interface in the following sections:
• About the debug interface
• Debug systems
• Debug interface signals
• ARM9E-S core clock domains
• Determining the core and system state.
This chapter also describes the ARM9E-S EmbeddedICE-RT logic in the following
sections:
• About EmbeddedICE-RT
• Disabling EmbeddedICE-RT
• The debug communications channel
• Monitor mode debug.
The ARM9E-S contains hardware extensions for advanced debugging features. These
make it easier to develop application software, operating systems, and the hardware
itself. ARM9E-S supports two modes of debug operation:
• Halt mode
• Monitor mode.
In halt mode debug, the debug extensions allow the core to be forced into debug state.
In debug state, the core is stopped and isolated from the rest of the system. This allows
the internal state of the core, and the external state of the system, to be examined while
all other system activity continues as normal. When debug has been completed, the core
and system state can be restored, and program execution resumed.
[Block diagram of the debug system. Recoverable label: debug target (development system containing ARM9E-S).]
The debug host is a computer running a software debugger, such as armsd. The debug
host allows you to issue high-level commands such as setting breakpoints or examining
the contents of memory.
An interface, such as an RS232 or parallel connection, connects the debug host to the
ARM9E-S development system. The messages broadcast over this connection must be
converted to the interface signals of the ARM9E-S. The protocol converter performs
this conversion.
The ARM9E-S has hardware extensions that ease debugging at the lowest level. The
debug extensions allow you to:
• stall program execution by the core
• examine the core internal state
• examine the state of the memory system
• resume program execution.
ARM9E-S core            This is the CPU core, with hardware support for debug.
EmbeddedICE-RT logic    This is a set of registers and comparators used to generate debug
                        exceptions (such as breakpoints). This unit is described in About
                        EmbeddedICE-RT on page 7-6.
TAP controller          This controls the action of the scan chains using a JTAG serial
                        interface.
[Block diagram of the ARM9E-S. Blocks shown: ARM9E-S core, EmbeddedICE-RT, ARM9E-S TAP controller, scan chain 1, scan chain 2.]
In halt mode debug a request on one of the external debug interface signals, or on an
internal functional unit known as the EmbeddedICE-RT logic, forces the ARM9E-S into
debug state. The events that activate debug are:
• a breakpoint (a given instruction fetch)
• a watchpoint (a data access)
• an external debug request
• scanned debug request (a debug request scanned into the EmbeddedICE-RT delay
control register).
The internal state of the ARM9E-S is examined using the JTAG serial interface, which
allows instructions to be serially inserted into the core pipeline without using the
external data bus. So, for example, when in debug state, a store multiple (STM) can be
inserted into the instruction pipeline, and this exports the contents of the ARM9E-S
registers. This data can be serially shifted out without affecting the rest of the system.
[Block diagram. Blocks shown: Processor, EmbeddedICE-RT, TAP. Signals shown: DBGEXT[1:0], DBGCOMMRX, DBGCOMMTX, DBGRNG[1:0], DBGACK, DBGIEBKPT, EDBGRQ, DBGDEWPT, DBGEN, DBGTCKEN, DBGTMS, DBGTDI, DBGTDO, CLK, DBGnTRST.]
The debug control register and the debug status register provide overall control of
EmbeddedICE-RT operation.
You can program one or both watchpoint units to halt the execution of instructions by
the core. Execution halts when the values programmed into EmbeddedICE-RT match
the values currently appearing on the address bus, data bus, and various control signals.
Note
You can mask any bit so that its value does not affect the comparison.
You can configure each watchpoint unit to be either a watchpoint (monitoring data
accesses) or a breakpoint (monitoring instruction fetches). Watchpoints and breakpoints
can be data-dependent in halt mode debug.
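The masked comparison performed by a watchpoint unit can be sketched as follows. This is a minimal model, not the hardware implementation. The function name watchpoint_matches is hypothetical, and the convention that a 1 in the mask marks a bit as "don't care" is an assumption; the text above only states that any bit can be masked.

```python
def watchpoint_matches(bus_value, programmed_value, mask, width=32):
    """Return True when every unmasked bit of bus_value equals programmed_value.

    Assumption: a 1 in `mask` marks the corresponding bit as "don't care".
    """
    care = ~mask & ((1 << width) - 1)  # bits that take part in the comparison
    return ((bus_value ^ programmed_value) & care) == 0
```

For example, masking the two low-order address bits makes the unit match any byte within an aligned word, and masking every bit makes the comparison always succeed.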
Caution
Hard wiring the DBGEN input LOW permanently disables all debug functionality.
• DBGACK is used by the ARM9E-S to flag back to the system that it is in debug
state.
An instruction being fetched from memory is sampled at the end of a cycle. To apply a
breakpoint to that instruction, the breakpoint signal must be asserted by the end of the
same cycle. This is shown in Figure 7-4.
[Figure 7-4: timing diagram of debug entry on a breakpointed instruction. Pipeline rows: instructions 1, 2, B (breakpointed), 3, 4, followed by Ddebug, Edebug1, Edebug2. Signals shown: CLK, IA[31:1], INSTR[31:0], DBGIEBKPT, DBGACK.]
You can build external logic, such as additional breakpoint comparators, to extend the
breakpoint functionality of the EmbeddedICE-RT logic. You must apply their output to
the DBGIEBKPT input. This signal is ORed with the internally-generated Breakpoint
signal before being applied to the ARM9E-S core control logic.
Note
The timing of the DBGIEBKPT input makes it unlikely that data-dependent external
breakpoints are possible.
A breakpointed instruction is allowed to enter the Execute stage of the pipeline, but any
state change as a result of the instruction is prevented. All instructions prior to the
breakpointed instruction complete as normal.
Note
If a breakpointed instruction does not reach the Execute stage, for instance, if an earlier
instruction is a branch, then both the breakpointed instruction and breakpoint status are
discarded and the ARM does not enter debug state.
The Decode cycle of the debug entry sequence occurs during the execute cycle of the
breakpointed instruction. The latched Breakpoint signal forces the processor to start
the debug sequence.
In Figure 7-4 on page 7-9 instruction B is breakpointed. The debug entry sequence is
initiated when instruction B enters the Execute stage. The ARM completes the debug
entry sequence and asserts DBGACK two cycles later.
A breakpointed instruction can have a Prefetch Abort associated with it. If so, the
Prefetch Abort takes priority and the breakpoint is ignored. (If there is a Prefetch Abort,
instruction data might be invalid, the breakpoint might have been data-dependent, and
as the data might be incorrect, the breakpoint might have been triggered incorrectly.)
SWI and undefined instructions are treated in the same way as any other instruction that
can have a breakpoint set on it. Therefore, the breakpoint takes priority over the SWI or
undefined instruction.
When the processor has entered debug state, it is important that further interrupts do not
affect the instructions executed. For this reason, as soon as the processor enters debug
state, interrupts are disabled, although the state of the I and F bits in the Program Status
Register (PSR) is not affected.
7.5.3 Watchpoints
Entry into debug state following a watchpointed memory access is imprecise. This is
necessary because of the nature of the pipeline.
You can build external logic, such as external watchpoint comparators, to extend the
functionality of the EmbeddedICE-RT logic. You must apply their output to the
DBGDEWPT input. This signal is ORed with the internally-generated Watchpoint
signal before being applied to the ARM9E-S core control logic.
Note
The timing of the DBGDEWPT input makes it unlikely that data-dependent external
watchpoints are possible.
After a watchpointed access, the next instruction in the processor pipeline is always
allowed to complete execution. Where this instruction is a single-cycle data-processing
instruction, entry into debug state is delayed for one cycle while the instruction
completes. The timing of debug entry following a watchpointed load in this case is
shown in Figure 7-5.
[Figure 7-5: timing diagram of debug entry after a watchpointed load followed by a single-cycle data-processing instruction. Pipeline rows: instructions 1, 2, LDR, Dp, 5, followed by Ddebug, Edebug1, Edebug2. Signals shown: CLK, InMREQ, INSTR[31:0], DA[31:0], WDATA[31:0], RDATA[31:0], DBGDEWPT, DBGACK.]
Although instruction 5 enters the Execute stage, it is not executed, and there is no state
update as a result of this instruction.
The instruction following the instruction that generated the watchpoint might have
modified the Program Counter (PC). If this happens, it is not possible to determine the
instruction that caused the watchpoint. A timing diagram showing debug entry after a
watchpoint where the next instruction is a branch is shown in Figure 7-6.
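[Figure 7-6: timing diagram of debug entry after a watchpoint where the next instruction is a branch. Signals shown: CLK, InMREQ, IA[31:1], DA[31:0], WDATA[31:0], RDATA[31:0], DBGDEWPT, DBGACK.]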
You can always restart the processor. When the processor has entered debug state, the
ARM9E-S core can be interrogated to determine its state. In the case of a watchpoint,
the PC contains a value that is five instructions on from the address of the next
instruction to be executed. Therefore, if on entry to debug state, in ARM state, the
instruction SUB PC, PC, #20 is scanned in and the processor restarted, execution flow
returns to the next instruction in the code sequence.
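The arithmetic behind the #20 immediate, and the word a debugger would scan in, can be worked through in a short sketch. The helper names are hypothetical. The encoding follows the standard ARM data-processing immediate format (condition AL, opcode SUB, Rn = Rd = r15, no rotation), which is not described in this section.

```python
ARM_INSTRUCTION_BYTES = 4  # every ARM-state instruction is 4 bytes wide


def pc_adjust_after_watchpoint(instructions_ahead=5):
    """Byte offset to subtract from the PC (ARM state) before restarting."""
    return instructions_ahead * ARM_INSTRUCTION_BYTES


def encode_sub_pc_pc_imm(imm8):
    """Encode the ARM instruction SUB PC, PC, #imm8 (condition AL, no rotate)."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("immediate must fit in 8 bits without rotation")
    cond_al = 0xE << 28        # always execute
    immediate_form = 1 << 25   # operand 2 is an immediate
    opcode_sub = 0x2 << 21     # SUB
    rn_pc = 15 << 16           # first operand register is r15 (PC)
    rd_pc = 15 << 12           # destination register is r15 (PC)
    return cond_al | immediate_form | opcode_sub | rn_pc | rd_pc | imm8
```

Five instructions of four bytes each give the offset of 20, and SUB PC, PC, #20 encodes to 0xE24FF014.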
If there is an abort with the data access as well as a watchpoint, the watchpoint condition
is latched, the exception entry sequence is performed, and then the processor enters
debug state. If there is an interrupt pending, the ARM9E-S allows the exception entry
sequence to occur and then enters debug state.
A debug request can take place through the EmbeddedICE-RT logic or by asserting the
EDBGRQ signal. The request is registered and passed to the processor. Debug request
takes priority over any pending interrupt. Following registering, the core enters debug
state when the instruction at the Execute stage of the pipeline has completely finished
executing (once Memory and Write stages of the pipeline have completed). While
waiting for the instruction to finish executing, no more instructions are issued to the
Execute stage of the pipeline.
When a debug request occurs, the ARM9E-S enters debug state even if the
EmbeddedICE-RT is configured for monitor mode debug.
Once the ARM9E-S is in debug state, both memory interfaces indicate internal cycles.
This allows the rest of the memory system to ignore the ARM9E-S and function as
normal. Because the rest of the system continues operation, the ARM9E-S ignores
aborts and interrupts.
The CFGBIGEND signal must not be changed by the system while in debug state. If it
changes, not only is there a synchronization problem, but the view of the ARM9E-S
seen by the programmer changes without the knowledge of the debugger. The nRESET
signal must also be held stable during debug. If the system applies reset to the
ARM9E-S (nRESET is driven LOW), the state of the ARM9E-S changes without the
knowledge of the debugger.
During normal operation, CLKEN conditions CLK to clock the core. When the
ARM9E-S is in debug state, DBGTCKEN conditions CLK to clock the core.
If the system and test clocks are asynchronous, they must be synchronized externally to
the ARM9E-S macrocell. The ARM Multi-ICE debug agent directly supports one or
more cores within an ASIC design. To synchronize off-chip debug clocking with the
ARM9E-S macrocell requires a three-stage synchronizer. The off-chip device (for
example, Multi-ICE) issues a TCK signal, and waits for the RTCK (Returned TCK)
signal to come back. Synchronization is maintained because the off-chip device does
not progress to the next TCK until after RTCK is received. Figure 7-7 shows this
synchronization.
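[Figure 7-7: clock synchronization. Blocks shown: Multi-ICE interface pads (TCK, TMS, TDI, TDO, RTCK), a TCK synchronizer of three D-type flip-flops clocked by CLK, input sample and hold registers with enable (EN) for TMS and TDI clocked by CLK, and the ARM9E-S (DBGTCKEN, DBGTMS, DBGTDI, DBGTDO, CLK).]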
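The effect of the three-stage synchronizer on the returned clock can be sketched with a simple shift-register model. This is an illustration under stated assumptions, not the actual circuit: the class name is hypothetical, all three stages are taken to be clocked on the same CLK edge, and RTCK is assumed to be the output of the third stage.

```python
class TckSynchronizer:
    """Three flip-flops in series, all clocked by CLK."""

    def __init__(self):
        self.stages = [0, 0, 0]

    def clock(self, tck):
        """Apply one CLK edge with the given TCK level; return the RTCK level."""
        # Each stage captures the previous stage's old output.
        self.stages = [tck, self.stages[0], self.stages[1]]
        return self.stages[2]  # assumption: RTCK is the third-stage output
```

In this model a change on TCK appears on RTCK on the third CLK edge after it is applied, which is why the off-chip device waits for RTCK before advancing to the next TCK.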
Before you can examine the core and system state, the debugger must determine
whether the processor entered debug from Thumb state or ARM state, by examining
bit 4 of the EmbeddedICE-RT debug status register. If bit 4 is HIGH, the core has
entered debug from Thumb state.
For more details about determining the core state, see Determining the core and system
state on page C-18.
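The state check described above amounts to testing a single bit of the value scanned out of the debug status register. A minimal sketch follows; the function and constant names are hypothetical.

```python
THUMB_STATE_BIT = 4  # bit 4 of the EmbeddedICE-RT debug status register


def entered_debug_from_thumb(debug_status):
    """Return True if bit 4 of the debug status register value is HIGH."""
    return bool((debug_status >> THUMB_STATE_BIT) & 1)
```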
These registers are located in fixed locations in the EmbeddedICE-RT logic register
map (as shown in EmbeddedICE-RT logic on page C-28) and are accessed from the
processor using MCR and MRC instructions to coprocessor 14.
In addition to the comms channel registers, the processor can access a 1-bit debug status
register for use in the monitor mode debug configuration.
Register name    Register number    Notes
a. You can clear bit 0 of the comms channel control register by writing to it from the debugger
(JTAG) side.
Seen from the debugger, the registers are accessed using the scan chain in the usual way.
Seen from the processor, these registers are accessed using coprocessor register transfer
instructions.
The debug comms channel control register is read-only (see footnote 1). The register controls
synchronized handshaking between the processor and the debugger. The debug comms
channel control register is shown in Figure 7-8.
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 W R
Bits 31:28 Contain a fixed pattern that denotes the EmbeddedICE version
number (in this case 0011).
Bit 1 Denotes if the comms data write register is available (from the
viewpoint of the processor). Seen from the processor, if the
comms data write register is free (W=0), new data can be written.
If the register is not free (W=1), the processor must poll until
W=0.
Seen from the debugger, when W=1, some new data has been
written that can then be scanned out.
Bit 0 Denotes if there is new data in the comms data read register. Seen
from the processor, if R=1, there is some new data that can be read
using an MRC instruction.
Seen from the debugger, if R=0, the comms data read register is
free, and new data may be placed there through the scan chain. If
R=1, this denotes that data previously placed there through the
scan chain has not been collected by the processor, and so the
debugger must wait.
1. The control register should be viewed as read-only. However, the debugger can clear the
R bit by performing a write to the debug comms channel control register. This feature
must not be used under normal circumstances.
Note
The Thumb instruction set does not support coprocessor instructions. Therefore, the
processor must be in ARM state before you can access the debug comms channel.
The coprocessor 14 monitor mode debug status register is provided for use by a debug
monitor when the ARM9E-S is configured for monitor mode debug.
The coprocessor 14 monitor mode debug status register is a 1-bit wide read/write
register having the format shown in Figure 7-9.
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
DbgAbt bit
Bit 0 of the register, the DbgAbt bit, indicates whether the processor took a Prefetch or
Data Abort in the past because of a breakpoint or watchpoint. If the ARM9E-S core
takes a Prefetch or Data Abort as a result of a breakpoint or watchpoint, then the bit is
set. If on a particular instruction or data fetch, both the debug abort and external abort
signals are asserted, the external abort takes priority and the DbgAbt bit is not set. You
can read or write the DbgAbt bit using MRC or MCR instructions.
A typical use of this bit is by a monitor mode debug aware abort handler. This examines
the DbgAbt bit to determine whether the abort was externally or internally generated. If
the DbgAbt bit is set, the abort handler initiates communication with the debugger over
the comms channel.
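The DbgAbt behavior and the handler decision can be sketched as a small model. The function names and returned action strings are hypothetical. The model also assumes that "not set" means the bit keeps its previous value when an external abort takes priority.

```python
def next_dbg_abt(current, debug_abort, external_abort):
    """Model how one instruction or data fetch updates the DbgAbt bit.

    The bit is set when a breakpoint or watchpoint causes the abort, unless
    an external abort is signaled for the same fetch. In that case the
    external abort takes priority and the bit is left unchanged (assumption).
    """
    if debug_abort and not external_abort:
        return 1
    return current


def abort_handler_action(dbg_abt):
    """Decide what a monitor-aware abort handler does next (hypothetical names)."""
    return "contact_debugger" if dbg_abt else "handle_external_abort"
```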
You can send and receive messages using the comms channel. These are described in:
• Sending a message to the debugger
• Receiving a message from the debugger on page 7-20.
Before the processor can send a message to the debugger, it must check that the comms
data write register is free for use by finding out if the W bit of the debug comms control
register is clear.
The processor reads the debug comms control register to check the status of the W bit:
• If the W bit is set, previously written data has not been read by the debugger. The
processor must continue to poll the control register until the W bit is clear.
When the W bit is clear, a message is written by a register transfer to coprocessor 14.
As the data transfer occurs from the processor to the comms data write register, the W
bit is set in the debug comms control register.
The debugger has two options available for reading data from the comms data write
register:
• Poll the debug comms channel control register before reading the comms data
written. If the W bit is set, there is valid data present in the debug comms data
write register. The debugger can then read this data and scan the data out. The
action of reading the data clears the debug comms channel control register W bit.
Then the communications process can begin again.
• Poll the comms data write register, obtaining data and valid status. The data
scanned out consists of the contents of the comms data write register (which
might or might not be valid), and a flag that indicates whether the data read is
valid or not. The status flag is present in the Addr[0] bit position of scan chain 2
when the data is scanned out. See Test data registers on page C-10 for details of
scan chain 2.
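The W bit handshake for the processor-to-debugger direction can be sketched as a behavioral model. The class and method names are hypothetical, and the model abstracts away the coprocessor 14 transfer and the scan chain. The debugger side follows the second option above, returning the data together with a valid flag.

```python
class CommsWriteChannel:
    """Processor-to-debugger half of the comms channel (W bit handshake)."""

    def __init__(self):
        self.w_bit = 0
        self.data_write_register = 0

    def processor_write(self, word):
        """Processor side: write only if W is clear; return True on success."""
        if self.w_bit:
            return False  # previous word not yet read, so keep polling
        self.data_write_register = word & 0xFFFFFFFF
        self.w_bit = 1    # the write sets W
        return True

    def debugger_scan_out(self):
        """Debugger side: return (data, valid); reading clears W."""
        valid = bool(self.w_bit)
        self.w_bit = 0
        return self.data_write_register, valid
```

A second processor write is refused until the debugger has scanned the first word out, which mirrors the polling requirement described above.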
• If the R bit is clear, the comms data read register is free, and data can be placed
there for the processor to read.
• If the R bit is set, previously deposited data has not yet been collected, so the
debugger must wait.
When the comms data read register is free, data is written there using the JTAG
interface. The action of this write sets the R bit in the debug comms control register.
The processor polls the debug comms control register. If the R bit is set, there is data
that can be read using an MRC instruction to coprocessor 14. The action of this load
clears the R bit in the debug comms control register. When the debugger polls this
register and sees that the R bit is clear, the data has been taken, and the process can now
be repeated.
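The R bit handshake for the debugger-to-processor direction can be sketched in the same way. The class and method names are hypothetical, and the JTAG write and the MRC read are reduced to plain method calls.

```python
class CommsReadChannel:
    """Debugger-to-processor half of the comms channel (R bit handshake)."""

    def __init__(self):
        self.r_bit = 0
        self.data_read_register = 0

    def debugger_write(self, word):
        """Debugger side: deposit a word only if R is clear; return True on success."""
        if self.r_bit:
            return False  # earlier word not yet collected, so the debugger waits
        self.data_read_register = word & 0xFFFFFFFF
        self.r_bit = 1    # the write sets R
        return True

    def processor_read(self):
        """Processor side: return the word if R is set (clearing R), else None."""
        if not self.r_bit:
            return None
        self.r_bit = 0    # the read clears R
        return self.data_read_register
```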
• The vector catching hardware can be used but must not be configured to catch the
Prefetch or Data Abort exceptions.
• No support is provided to mix halt mode debug and monitor mode debug
functionality.
The fact that an abort has been generated by the monitor mode is recorded in the
monitor mode debug status register in coprocessor 14 (see Comms channel monitor
mode debug status register on page 7-18).
Because the monitor mode debug bit does not put the ARM9E-S into debug state, the
contents of the watchpoint registers must now be changed while external memory
accesses are taking place, rather than while the core is in debug state. If the
watchpoint registers are written to during an access, all matches from the affected
watchpoint unit using the register being updated are disabled for the cycle of the
update.
1. Disable the watchpoint unit using the control register for that watchpoint unit.
Chapter 8
Instruction Cycle Times
This chapter gives the instruction cycle timings and illustrates interlock conditions
present in the ARM9E-S design. It contains the following sections:
• Instruction cycle count summary
• Introduction to detailed instruction cycle timings
• Branch and ARM branch with link
• Thumb branch with link
• Branch and exchange
• Thumb Branch, Link, and Exchange <immediate>
• Data operations
• MRS
• MSR operations
• Multiply and multiply accumulate
• QADD, QDADD, QSUB, and QDSUB
• Load register
• Store register
• Load multiple registers
• Store multiple registers
• Load double register
Symbol Meaning
Table 8-2 summarizes the ARM9E-S instruction cycle counts and bus activity when
executing the ARM instruction set.
Instruction                Cycles  Instruction bus  Data bus       Comment
LDM                        n       1S+(n-1)I        1N+(n-1)S      Loading n registers, n > 1, not loading the PC.
LDM                        n+1     1S+nI            1N+(n-1)S+1I   Loading n registers, n > 1, not loading the PC, last word loaded used by following instruction.
LDM                        n+4     2S+1N+(n+1)I     1N+(n-1)S+4I   Loading n registers including the PC, n > 0.
MRRC                       b+3     1S+(b+2)I        (b+1)I+2C      Following instruction uses last transferred data.
MSR                        3       1S+2I            3I             If any bits other than just the flags are updated (all masks other than mask_f).
MUL, MLA                   3       1S+2I            3I             Following instruction uses the result in its first Execute cycle or its first Memory cycle. Does not apply to a multiply accumulate using result for accumulate operand.
QADD, QDADD, QSUB, QDSUB   2       1S+1I            2I             Following instruction uses the result in its first Execute cycle.
SMULxy, SMLAxy             2       1S+1I            2I             Following instruction uses the result in its first Execute or its first Memory cycle. Does not apply to a multiply accumulate using result for accumulate operand.
SMULWx, SMLAWx             2       1S+1I            2I             Following instruction uses the result in its first Execute or its first Memory cycle. Does not apply to a multiply accumulate using result for accumulate operand.
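The LDM rows of the table can be turned into a short worked calculation. This sketch uses hypothetical helper names and covers only the three LDM cases listed; it assumes zero-wait-state memory. For each row, the bus-activity terms on either bus sum to the cycle count.

```python
def ldm_cycles(n, loads_pc=False, last_word_used_next=False):
    """Cycle count for an LDM of n registers, following the three LDM rows."""
    if loads_pc:
        if n < 1:
            raise ValueError("n must be > 0 when the PC is loaded")
        return n + 4
    if n < 2:
        raise ValueError("these rows cover n > 1 only")
    return n + 1 if last_word_used_next else n


def ldm_pc_bus_activity(n):
    """Per-type cycle counts on each bus for an LDM that loads the PC."""
    instruction_bus = {"S": 2, "N": 1, "I": n + 1}  # 2S+1N+(n+1)I
    data_bus = {"N": 1, "S": n - 1, "I": 4}         # 1N+(n-1)S+4I
    return instruction_bus, data_bus
```

For example, loading three registers including the PC takes 3 + 4 = 7 cycles, and both bus expressions also sum to 7.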
The request, address, and control signals on both the instruction and data interfaces are
pipelined so that they are generated in the cycle before the one to which they apply, and
are shown as such in the following tables.
Note
All cycle counts in this chapter assume zero-wait-state memory access. In a system
where CLKEN is used to add wait states, the cycle counts must be adjusted
accordingly.
Table 8-3 shows the key to the cycle timing tables, Table 8-4 to Table 8-36.
Symbol Meaning
- Indicates that the signal is not active, and therefore not valid in this cycle.
A blank entry in the table indicates that the status of the signal is not
determined by the instruction in that cycle. The status of the signal is
determined either by the preceding or succeeding instruction.
1. During the first cycle, a branch instruction calculates the branch destination while
performing a prefetch from the current PC. This prefetch is performed in all cases,
because by the time the decision to take the branch has been reached, it is already
too late to prevent the prefetch. If the previous instruction requested a data
memory access, the data is transferred in this cycle.
2. During the second cycle, the ARM9E-S performs a fetch from the branch
destination. If the link bit is set, the return address to be stored in r14 is calculated.
3. During the third cycle, the ARM9E-S performs a fetch from the destination + i,
refilling the instruction pipeline.
Table 8-4 Branch and ARM branch with link cycle timings
1. The first instruction acts as a simple data operation. It takes a single cycle to add
the PC to the upper part of the offset, and stores the result in r14. If the previous
instruction requested a data memory access, the data is transferred in this cycle.
2. The second instruction acts similarly to the ARM BL instruction over three cycles:
a. During the first cycle, the ARM9E-S calculates the final branch target
address while performing a prefetch from the current PC.
b. During the second cycle, the ARM9E-S performs a fetch from the branch
destination, while calculating the return address to be stored in r14.
c. During the third cycle, the ARM9E-S performs a fetch from the destination
+ 2, refilling the instruction pipeline.
1. During the first cycle, the ARM9E-S extracts the branch destination and the new
core state while performing a prefetch from the current PC. This prefetch is
performed in all cases, because by the time the decision to take the branch has
been reached, it is already too late to prevent the prefetch. In the case of BX and
BLX<register>, the branch destination new state comes from the register. For
BLX<immediate> the destination is calculated as a PC offset. The state is always
changed. If the previous instruction requested a memory access (and there is no
interlock in the case of BX, BLX <register>), the data is transferred in this
cycle.
2. During the second cycle, the ARM9E-S performs a fetch from the branch
destination, using the new instruction width, dependent on the state that has been
selected. If the link bit is set, the return address to be stored in r14 is calculated.
3. During the third cycle, the ARM9E-S performs a fetch from the destination +2 or
+4 dependent on the new specified state, refilling the instruction pipeline.
1. The first instruction acts as a simple data operation. It takes a single cycle to add
the PC to the upper part of the offset, and stores the result in r14. If the previous
instruction requested a data memory access, the data is transferred in this cycle.
The ALU combines the A bus operand with the (shifted) B bus operand according to
the operation specified in the instruction. The ARM9E-S pipelines this result and writes
it into the destination register, when required. Compare and test operations do not write
a result as they only affect the status flags.
An instruction prefetch occurs at the same time as the data operation, and the PC is
incremented.
When a register-specified shift is used, an additional Execute cycle is needed to read the
shift register operand. The instruction prefetch occurs during this first cycle.
The PC can be one or more of the register operands. When the PC is the destination, the
external bus activity is affected. When the ARM9E-S writes the result to the PC, the
contents of the instruction pipeline are invalidated, and the ARM9E-S takes the address
for the next instruction prefetch from the ALU rather than the incremented address. The
ARM9E-S refills the instruction pipeline before any further instruction execution takes
place. Exceptions are locked out while the pipeline is refilling.
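A rough cycle-count model of the behavior described above might look like the following. The function name and the refill_cycles default of 2 are assumptions made for illustration (the default mirrors the two refill fetches described for branches). The cycle timing tables remain the authoritative source.

```python
def data_op_cycles(register_shift=False, dest_is_pc=False, refill_cycles=2):
    """Estimate the cycles taken by a data operation.

    register_shift: True when a register-specified shift is used.
    dest_is_pc: True when the result is written to the PC.
    refill_cycles: assumed number of pipeline refill cycles (hypothetical default).
    """
    cycles = 1                          # basic data operation
    if register_shift:
        cycles += 1                     # extra Execute cycle to read the shift register
    if dest_is_pc:
        cycles += refill_cycles         # pipeline is invalidated and refilled
    return cycles
```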
Note
Shifted register with destination equals PC is not possible in Thumb state.
8.8 MRS
An MRS operation always takes two cycles to execute. The first cycle allows any
pending state changes to the PSR to be made. The second cycle passes the PSR register
through the ALU so that it can be written to the destination register.
Note
The MRS instruction can only be executed when in ARM state.
Note
MSR instructions can only be executed in ARM state.
During the first (Execute) stage of a multiply instruction, the multiplier and
multiplicand operands are read onto the A and B buses, to which the multiplier unit is
connected. The first stage of the multiplier performs Booth recoding and partial product
summation, using 16 bits of the multiplier operand each cycle.
During the second (Memory) stage of a multiply instruction, the partial product result
from the Execute stage is added to an optional accumulate term (read onto the C bus)
and a possible feedback term from a previous multiply step for multiplications that
require additional cycles.
Note
In Thumb state, only the MULS and MLAS operations are possible.
8.10.1 Interlocks
The multiply unit in ARM9E-S operates in both the Execute and Memory stage of the
pipeline. Because of this, the multiplier result is not available until the end of the
Memory stage of the pipeline. If the following instruction requires the use of the
multiplier result, then it must be interlocked so that the correct value is available. This
applies to all instructions that require the multiply result for the first Execute cycle or
first Memory cycle of the instruction except for multiply accumulate instructions using
the previous multiply result as the accumulator operand.
Table 8-11 shows the cycle timing for MUL and MLA instructions with and without
interlocks.
The MULS and MLAS instructions always take four cycles to execute, and cannot
generate interlocks in following instructions.
Table 8-12 shows the cycle timing for MULS and MLAS instructions.
Table 8-13 shows the cycle timing for SMULL, UMULL, SMLAL, and UMLAL instructions
with and without interlocks.
The SMULLS, UMULLS, SMLALS, and UMLALS instructions always take five cycles to
execute, and cannot generate interlocks in following instructions.
Table 8-14 shows the cycle timing for the SMULLS, UMULLS, SMLALS, and UMLALS
instructions.
Table 8-15 shows the cycle timing for SMULxy, SMLAxy, SMULWy, and SMLAWy
instructions with and without interlocks.
Table 8-16 shows the cycle timing for SMLALxy instructions with and without
interlocks.
8.11.1 Interlocks
The instructions in this class use both the Execute and Memory stages of the pipeline.
Because of this, the result of an instruction in this class is not available until the end of
the Memory stage of the pipeline. If a following instruction requires the use of the
result, then it must be interlocked so that the correct value is available. This applies to
all instructions that require the result for the first Execute cycle. Instructions that require
the result of a QADD or similar instruction for the first Memory cycle do not incur an
interlock.
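The difference between the multiply interlock rule (8.10.1) and the rule for this class can be summarized in a short sketch. The names below are hypothetical and the model only captures whether an interlock is needed, not how many cycles it lasts.

```python
# Pipeline stages of the following instruction in which a first use of the
# result forces an interlock, per producing instruction class.
INTERLOCK_STAGES = {
    "multiply": {"execute", "memory"},
    "saturating": {"execute"},          # QADD, QDADD, QSUB, QDSUB
}

def needs_interlock(producer, first_use_stage, mac_accumulator=False):
    """Return True if the following instruction must be interlocked.

    producer: "multiply" or "saturating".
    first_use_stage: stage in which the following instruction first needs the result.
    mac_accumulator: True when a multiply accumulate uses the previous
        multiply result as its accumulator operand (the stated exception).
    """
    if producer == "multiply" and mac_accumulator:
        return False
    return first_use_stage in INTERLOCK_STAGES[producer]
```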
Table 8-17 shows the cycle timing for QADD, QDADD, QSUB, and QDSUB instructions with
and without interlocks.
Note
Destination equals PC is not possible in Thumb state.
8.12.1 Interlocks
The result of an aligned word load instruction is not available until the end of the
Memory stage of the pipeline. If the following instruction requires the use of this result
then it must be interlocked so that the correct value is available. This interlock is
referred to as a single-cycle load-use interlock.
Unaligned word loads, load byte (LDRB), and load halfword (LDRH) instructions use the
byte rotate unit in the Write stage of the pipeline. This introduces a two-cycle load-use
interlock, which can affect the two instructions immediately following the load
instruction.
Once an interlock has been incurred for one instruction it does not have to be incurred
for a later instruction.
For example, the following sequence incurs a two-cycle interlock on the first ADD
instruction, but the second ADD does not incur any interlocks:
LDRB r0, [r1, #1]
ADD r2, r0, r3
ADD r4, r0, r5
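The load-use rules above can be explored with a small scoreboard-style model. This is a simplified sketch, not a cycle-accurate simulator: the function and table names are hypothetical, LDR is assumed to be an aligned word load, results of non-load instructions are assumed to be forwarded without penalty, and the C read port restriction is not modeled.

```python
# Load-use latency in cycles (LDR is assumed to be an aligned word load).
LOAD_USE_LATENCY = {"LDR": 1, "LDRB": 2, "LDRH": 2}

def interlock_cycles(program):
    """Return the interlock (stall) cycles incurred by each instruction.

    program: list of (mnemonic, destination, sources) tuples, where
    destination is a register name or None and sources lists the
    registers read in the first Execute cycle.
    """
    ready = {}      # register -> first cycle in which it can be read
    cycle = 0       # cycle in which the next instruction would execute
    stalls = []
    for mnemonic, destination, sources in program:
        start = max([cycle] + [ready.get(reg, 0) for reg in sources])
        stalls.append(start - cycle)
        if destination is not None:
            # Non-load results are assumed to be forwarded with no penalty.
            ready[destination] = start + 1 + LOAD_USE_LATENCY.get(mnemonic, 0)
        cycle = start + 1
    return stalls

example = [
    ("LDRB", "r0", ["r1"]),
    ("ADD", "r2", ["r0", "r3"]),
    ("ADD", "r4", ["r0", "r5"]),
]
print(interlock_cycles(example))
```

Under these assumptions the model reports a two-cycle interlock on the first ADD only. Once the stall has been taken, the second ADD finds r0 already available.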
There is no forwarding path from loaded data to the C read port of the register bank,
which is used for the store data of STR and STM instructions and for the accumulate
operand of multiply accumulate instructions. The result of a load must reach the Write
stage of the pipeline before it can be made available at the C read port, resulting in a
single-cycle load-use interlock from loaded data to the C read port.
Most interlock conditions are determined when the instruction being interlocked is still
in the Decode stage of the pipeline. Load multiple and Store multiple instructions can
incur a Decode stage interlock when the base register is not available due to a previous
instruction. Store multiple instructions can also incur an Execute stage interlock when
the first register to be stored is not available due to a previous instruction. This is
referred to as a second-cycle interlock.
A second-cycle interlock can be incurred on the first word of data stored by an STM
instruction or during the first cycle of a register controlled shift. The following example
does not incur an interlock:
LDR r3, [r1]
STMIA r0, {r2-r3}
Table 8-18 shows the cycle timing for basic load register operations, where:
Table 8-18 columns: Cycle, IA, InMREQ/ISEQ, INSTR, DA, DnMREQ/DSEQ, DnTRANS, RDATA
Note
Destination equals PC is not possible in Thumb state.
Table 8-19 shows the cycle timing for load operations resulting in simple interlocks.
Table 8-19 columns: Cycle, IA, InMREQ/ISEQ, INSTR, DA, DnMREQ/DSEQ, RDATA
With more complicated interlock cases you cannot consider the load instruction in
isolation. This is because in these cases the load instruction has vacated the Execute
stage of the pipeline and a later instruction has occupied it.
Table 8-20 shows the one-cycle interlock incurred for the following sequence of
instructions:
LDRB r0, [r1]
NOP
ADD r2, r0, r1
Table 8-20 Example sequence LDRB, NOP and ADD cycle timing
Columns: Cycle, IA, InMREQ/ISEQ, INSTR, DA, DnMREQ/DSEQ, RDATA
Table 8-21 shows the cycle timing for the following code sequence:
LDRB r0, [r2]
STMIA r3, {r0-r1}
Table 8-21 columns: Cycle, IA, InMREQ/ISEQ, INSTR, DA, DnMREQ/DSEQ, RDATA
Table 8-22 shows the cycle timing for a store register operation, where:
Table 8-22 columns: Cycle, IA, InMREQ/ISEQ, INSTR, DA, DnMREQ/DSEQ, DnTRANS, WDATA
1. During the first cycle, the ARM9E-S calculates the address of the first word to be
transferred, while performing an instruction prefetch.
2. During the second and subsequent cycles, ARM9E-S reads the data requested in
the previous cycle and calculates the address of the next word to be transferred.
The new value for the base register is calculated.
When a Data Abort occurs, the instruction continues to completion. The ARM9E-S
prevents all register writing after the abort. The ARM9E-S restores the modified base
pointer (which the load activity before the abort occurred might have overwritten).
When the PC is in the list of registers to be loaded, the ARM9E-S invalidates the current
contents of the instruction pipeline. The PC is always the last register to be loaded, so
an abort at any point prevents the PC from being overwritten.
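The abort behavior described above can be sketched as a behavioral model. This is an illustration only: the function name, the dictionary-based register file and memory, and the abort_at parameter are hypothetical, and base register writeback is not modeled.

```python
def ldm_ia(regs, memory, base_reg, reg_list, abort_at=None):
    """Model a load multiple (increment after) with an optional Data Abort.

    regs: dict mapping register number to value (not modified).
    memory: dict mapping word address to value.
    abort_at: index of the transfer that aborts, or None for no abort.
    Returns (new_regs, aborted).
    """
    result = dict(regs)
    original_base = regs[base_reg]
    address = original_base
    aborted = False
    # Registers are transferred lowest first, so the PC (r15) is always last.
    for index, reg in enumerate(sorted(reg_list)):
        if abort_at is not None and index == abort_at:
            aborted = True
        if not aborted:
            result[reg] = memory[address]   # no register is written after the abort
        address += 4                        # the instruction still runs to completion
    if aborted:
        result[base_reg] = original_base    # restore a base that a load overwrote
    return result, aborted
```

With the base register in the list and an abort on the final (PC) transfer, the model returns the base unchanged and leaves the PC untouched.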
Note
LDM with destination = PC cannot be executed in Thumb state. However,
POP {Rlist, PC} equates to an LDM with destination = PC.
8.14.1 Interlocks
1. During the first cycle, the ARM9E-S calculates the address of the first word to be
transferred, while performing an instruction prefetch and also calculating the new
value for the base register.
2. During the second and subsequent cycles, ARM9E-S stores the data requested in
the previous cycle and calculates the address of the next word to be transferred.
When a Data Abort occurs, the instruction continues to completion. The ARM9E-S
restores the modified base pointer (which the load activity before the abort occurred
might have overwritten).
The swap operation might be aborted in either the read or the write cycle. An aborted
swap operation does not affect the destination register.
Note
Data swap instructions are not available in Thumb state.
The DLOCK output of ARM9E-S is driven HIGH for both read and write cycles to
indicate to the memory system that it is an atomic operation.
8.18.1 Interlocks
A swap operation can cause one-cycle and two-cycle interlocks in a similar fashion to a
load register instruction.
Table 8-25 shows the cycle timing for the basic data swap operation.
Table 8-25 columns: Cycle, IA, InMREQ/ISEQ, INSTR, DA, DnMREQ/DSEQ, RDATA, WDATA
8.19 PLD
A PLD operation executes in a single cycle. During the Execute cycle, the prefetch
address is calculated and broadcast on DA[31:0]. DnMREQ and DSEQ indicate an
internal cycle, and DnSPEC is asserted.
1. During the first cycle, the ARM9E-S constructs the forced address, and a mode
change might take place.
2. During the second cycle, the ARM9E-S performs a fetch from the exception
address. The return address to be stored in r14 is calculated. The state of the CPSR
is saved in the relevant SPSR.
3. During the third cycle, the ARM9E-S performs a fetch from the exception address
+ 4, refilling the instruction pipeline.
The exception entry cycle timings are shown in Table 8-27, where:
pc Is one of:
• the address of the SWI instruction for SWIs
• the address of the instruction following the last one to be executed
before entering the exception for interrupts
• the address of the aborted instruction for Prefetch Aborts
• the address of the instruction following the one that attempted the
aborted data transfer for Data Aborts.
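The three entry cycles can be sketched as follows. This is a hedged illustration: the function name and the returned structure are hypothetical, the vector offsets are the standard ARM exception vectors, and the high_vectors flag assumes that CFGHIVECS relocates the vectors to 0xFFFF0000. The return address calculation is not modeled because it depends on the exception type.

```python
# Standard ARM exception vector offsets.
VECTOR_OFFSETS = {
    "reset": 0x00,
    "undefined": 0x04,
    "swi": 0x08,
    "prefetch_abort": 0x0C,
    "data_abort": 0x10,
    "irq": 0x18,
    "fiq": 0x1C,
}

def exception_entry(kind, cpsr, high_vectors=False):
    """Describe the three exception entry cycles for one exception type."""
    base = 0xFFFF0000 if high_vectors else 0x00000000
    xn = base + VECTOR_OFFSETS[kind]    # cycle 1: construct the forced address
    return {
        "spsr": cpsr,                   # cycle 2: CPSR is saved in the relevant SPSR
        "fetches": [xn, xn + 4],        # cycle 2: fetch Xn; cycle 3: fetch Xn + 4
    }
```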
1 Xn N cycle 1 0 - I cycle
(Xn+8) -
Note
The value on the INSTR bus can be unpredictable in the case of Prefetch Abort or Data
Abort entry.
If the coprocessor cannot perform the requested task, it leaves CHSD at ABSENT.
When the coprocessor is able to perform the task, but cannot commit immediately, the
coprocessor drives CHSD to WAIT, and in subsequent cycles drives CHSE to WAIT
until it is able to commit, at which point it drives CHSE to LAST.
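The handshake just described can be followed with a short sketch. The function name and outcome strings are hypothetical. Treating ABSENT as leading to the undefined instruction trap, and allowing ABSENT on CHSE after a WAIT, are assumptions based on the coprocessor absent case rather than statements from this section.

```python
def cdp_handshake(chsd, chse_sequence=()):
    """Follow the busy-wait handshake for a coprocessor data operation.

    chsd: value driven on CHSD in Decode ("ABSENT", "WAIT", or "LAST").
    chse_sequence: values driven on CHSE in the following cycles.
    Returns (outcome, wait_cycles).
    """
    if chsd == "ABSENT":
        return ("undefined_instruction", 0)
    if chsd == "LAST":
        return ("committed", 0)
    if chsd != "WAIT":
        raise ValueError("unexpected CHSD value: " + chsd)
    waits = 1                               # the WAIT seen on CHSD
    for value in chse_sequence:
        if value == "LAST":
            return ("committed", waits)
        if value == "ABSENT":               # assumption: coprocessor gives up in Execute
            return ("undefined_instruction", waits)
        if value != "WAIT":
            raise ValueError("unexpected CHSE value: " + value)
        waits += 1
    return ("still_waiting", waits)
```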
Note
Coprocessor operations are only available in ARM state.
The coprocessor data operation cycle timings are shown in Table 8-28.
Table 8-28 columns: Cycle, IA, IREQ, INSTR, DA, DREQ, RDATA/WDATA, P, LC, CHSD, CHSE
The coprocessor commits to the transfer only when it is ready to accept the data. The
coprocessor indicates that it is ready for the transfer to commence by driving CHSD or
CHSE to GO. The ARM9E-S produces addresses and requests data memory reads on
behalf of the coprocessor, which is expected to accept the data at sequential rates. The
coprocessor is responsible for determining the number of words to be transferred. It
indicates this using the CHSD and CHSE signals, setting the appropriate signal to
LAST in the cycle before it is ready to initiate the transfer of the last data word.
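One way to read the word-count rule above is that every GO is followed by at least one more word, and LAST marks the final word. The sketch below encodes that reading. It is an interpretation, not a definition from this manual, and the function name is hypothetical.

```python
def ldc_word_count(handshake):
    """Count the words transferred for one coprocessor load.

    handshake: values driven by the coprocessor, CHSD first and then CHSE
    in each following cycle ("WAIT", "GO", or "LAST").
    """
    words = 0
    for value in handshake:
        if value == "WAIT":
            if words:
                raise ValueError("WAIT after GO: data must be accepted at sequential rates")
            continue                        # busy-waiting before committing
        if value == "GO":
            words += 1                      # this word is followed by at least one more
        elif value == "LAST":
            return words + 1                # the next word transferred is the last
        else:
            raise ValueError("unexpected handshake value: " + value)
    raise ValueError("handshake ended without LAST")
```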
Note
Coprocessor operations are only available in ARM state.
The load coprocessor register cycle timings are shown in Table 8-29.
Cycle IA IREQa INSTR DA DREQb RDATA Pc LCd CHSD CHSE
1 register ready LAST
1 pc+3i S cycle (pc+2i) da N cycle 1 0 -
(pc+3i) (da)
1 register not ready WAIT
1 pc+3i I cycle (pc+2i) - I cycle 1 0 WAIT
(pc+3i) (da)
m registers (m > 1) ready GO
1 pc+3i I cycle (pc+2i) da N cycle 1 0 GO
(pc+3i) (da++)
m registers (m > 1) not ready WAIT
1 pc+3i I cycle (pc+2i) - I cycle 1 0 WAIT
The coprocessor commits to the transfer only when it is ready to write the data. The
coprocessor indicates that it is ready for the transfer to commence by driving CHSD or
CHSE to GO. The ARM9E-S produces addresses and requests data memory writes on
behalf of the coprocessor, which is expected to produce the data at sequential rates. The
coprocessor is responsible for determining the number of words to be transferred. It
indicates this using the CHSD and CHSE signals, setting the appropriate signal to
LAST in the cycle before it is ready to initiate the transfer of the last data word.
Note
Coprocessor operations are only available in ARM state.
The store coprocessor register cycle timings are shown in Table 8-30.
1 register ready LAST
1 pc+3i S cycle (pc+2i) da N cycle 1 0 -
(pc+3i) CPData1
1 register not ready WAIT
1 pc+3i I cycle (pc+2i) - I cycle 1 0 WAIT
(pc+3i) CPData1
m registers (m > 1) ready GO
1 pc+3i I cycle (pc+2i) da N cycle 1 0 GO
2 pc+3i I cycle - da++ S cycle CPData1 1 0 GO
(pc+3i) CPDatam
m registers (m > 1) not ready WAIT
1 pc+3i I cycle (pc+2i) - I cycle 1 0 WAIT
. pc+3i I cycle - - I cycle - 1 0 WAIT
(pc+3i) CPDatam
Data is transferred over the data bus interface, in a similar fashion to a load register
operation.
Note
Coprocessor operations are only available in ARM state.
Data is transferred over the data bus interface, in a similar fashion to a store register
operation.
Note
Coprocessor operations are only available in ARM state.
Data is transferred over the data bus interface, in a similar fashion to a load register
operation.
Note
Coprocessor operations are only available in ARM state.
Data is transferred over the data bus interface, in a similar fashion to a store register
operation.
Note
Coprocessor operations are only available in ARM state.
Note
By default, CHSD and CHSE must be driven to ABSENT unless the coprocessor
instruction is being handled by a coprocessor. Coprocessor operations are only available
in ARM state.
The cycle timings for coprocessor absent instructions are shown in Table 8-35.
Cycle IA IREQa INSTR DA DREQb RDATA/WDATA Pc LCd CHSD CHSE
coprocessor absent in decode ABSENT
1 pc+3i I cycle (pc+2i) - I cycle 1 0 - -
2 0x4 N cycle - - I cycle - 0 0 - -
(0xC) -
coprocessor absent in execute WAIT
1 pc+3i I cycle (pc+2i) - I cycle 1 0 WAIT
. pc+3i I cycle - - I cycle - 0 0 WAIT
(0xC) -
Table 8-36 shows the instruction cycle timing for an unexecuted instruction.
Chapter 9
AC Parameters
This chapter gives the AC timing parameters of the ARM9E-S. It contains the following
sections:
• Timing diagrams on page 9-2
• AC timing parameter definitions on page 9-8.
Timing diagram (signals and their timing parameters, relative to CLK):
InMREQ, ISEQ: Tovitrans, Tohitrans
IA[31:1]: Toviaddr, Tohiaddr
InTRANS, InM[4:0], ITBIT: Tovictl, Tohictl
INSTR[31:0]: Tisinstr, Tihinstr
IABORT: Tisiabort, Tihiabort
DBGIEBKPT: Tisiebkpt, Tihiebkpt
Timing diagram (signals and their timing parameters, relative to CLK):
DnMREQ, DSEQ, DMORE, DnSPEC: Tovdtrans, Tohdtrans
DA[31:0]: Tovdaddr, Tohdaddr
DnRW, DMAS[1:0], DLOCK, DnTRANS, DnM[4:0]: Tovdctl, Tohdctl
WDATA[31:0]: Tovwdata, Tohwdata
RDATA[31:0]: Tisrdata, Tihrdata
DABORT: Tisdabort, Tihdabort
DBGDEWPT: Tisdewpt, Tihdewpt
Timing diagram (signals and their timing parameters, relative to CLK):
CLKEN: Tisclken, Tihclken
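Timing diagram (signals and their timing parameters, relative to CLK):
PASS: Tovpass, Tohpass
LATECANCEL: Tovlate, Tohlate
CHSD[1:0]: Tischsd, Tihchsd
CHSE[1:0]: Tischse, Tihchse
Timing diagram (signals and their timing parameters, relative to CLK):
nFIQ, nIRQ: Tisint, Tihint
nRESET: Tisnreset, Tihnreset
CFGBIGEND, CFGDISLTBIT, CFGHIVECS: Tiscfg, Tihcfg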
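Timing diagram (signals and their timing parameters, relative to CLK):
DBGACK: Tovdbgack, Tohdbgack
DBGRNG[1:0]: Tovdbgrng, Tohdbgrng
DBGRQI: Tovdbgrqi, Tohdbgrqi
DBGINSTREXEC, DBGINSTRVALID: Tovdbgstat, Tohdbgstat
DBGCOMMRX, DBGCOMMTX: Tovdbgcomm, Tohdbgcomm
DBGEN, EDBGRQ, DBGEXT[1:0]: Tisdbgin, Tihdbgin
Timing diagram (signals and their timing parameters, relative to CLK):
FIQDIS, IRQDIS: Tovintdis, Tohintdis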
Timing diagram (signals and their timing parameters, relative to CLK):
DBGIR[3:0], DBGSCREG[4:0], DBGTAPSM[3:0]: Tovdbgsm, Tohdbgsm
DBGnTDOEN: Tovtdoen, Tohtdoen
DBGSDIN: Tovsdin, Tohsdin
DBGTDO: Tovtdo, Tohtdo
DBGnTRST: Tisntrst, Tihntrst
DBGTDI, DBGTMS: Tistdi, Tihtdi
DBGTCKEN: Tistcken, Tihtcken
TAPID: Tistapid, Tihtapid
Timing diagram (DBGSDOUT to DBGTDO path):
DBGSDOUT to DBGTDO: Ttdsh, Ttdsd
Note
Where 0% is given, this indicates the hold time to clock edge plus the maximum clock
skew for internal clock buffering.
Tohdbgcomm Comms channel output hold time from rising CLK >0% -
Appendix A
Signal Descriptions
This appendix lists and describes all the ARM9E-S interface signals. It contains the
following sections:
• Clock interface signals on page A-2
• Instruction memory interface signals on page A-3
• Data memory interface signals on page A-4
• Miscellaneous signals on page A-6
• Coprocessor interface signals on page A-7
• Debug signals on page A-8.
CORECLKENOUT Output The principal state advance signal for the ARM9E-S
core. This output must be connected directly to the
CORECLKENIN input for correct operation. This
signal has been exported from the core to ease buffer
tree insertion from the CORECLKENIN input. You
must take care when loading and routing the
CORECLKENOUT to CORECLKENIN
connection.
InMREQ (Not instruction memory request), Output
    If LOW at the end of the cycle, then the processor requires a memory access
    during the following cycle.
ISEQ (Instruction Sequential), Output
    If HIGH at the end of the cycle then any instruction memory access during the
    following cycle is sequential from the last instruction memory access.
RDATA[31:0] (Read data), Input
    This bus is used to transfer data between the memory system and the processor
    during read cycles (when DnRW is LOW).
WDATA[31:0] (Write data), Output
    This bus is used to transfer data between the memory system and the processor
    during write cycles (when DnRW is HIGH).
DMORE (Data more), Output
    If HIGH at the end of the cycle, then the data memory access in the following
    cycle is directly followed by a sequential data memory access.
DnMREQ (Not data memory request), Output
    If LOW at the end of the cycle, then the processor requires a data memory
    access in the following cycle.
DnRW (Data not read, write), Output
    If LOW at the end of the cycle, then any data memory access in the following
    cycle is a read. If HIGH then it is a write.
DSEQ (Data sequential address), Output
    If HIGH at the end of the cycle, then any data memory access in the following
    cycle is sequential from the last data memory access.
nFIQ (Not fast interrupt), Input
    This is the Fast Interrupt Request signal. This input is a synchronous input to
    the core. It is not synchronized internally to the core.
nRESET (Not reset), Input
    This active LOW reset signal is used to start the processor from a known
    address. This is a level-sensitive asynchronous reset.
DBGIR[3:0] (TAP controller instruction register), Output
    These four bits reflect the current instruction loaded into the TAP controller
    instruction register. These bits change when the TAP state machine is in the
    UPDATE-IR state.
DBGnTRST (Not test reset), Input
    This is the active LOW reset signal for the EmbeddedICE internal state. This
    signal is a level-sensitive asynchronous reset input.
DBGnTDOEN (Not DBGTDO enable), Output
    When LOW, this signal denotes that serial data is being driven out on the
    DBGTDO output. DBGnTDOEN is usually used as an output enable for a DBGTDO
    pin in a packaged part.
DBGSCREG[4:0], Output
    These five bits reflect the ID number of the scan chain currently selected by
    the TAP Scan Chain Register controller. These bits change when the TAP state
    machine is in the UPDATE-DR state.
DBGSDOUT (Input boundary scan serial output data), Input
    This is the serial data out of an external scan chain. When an external
    boundary scan chain is not connected, this input must be tied LOW.
DBGTAPSM[3:0] (TAP controller state machine), Output
    This bus reflects the current state of the TAP controller state machine.
DBGCOMMRX (Communications channel receive), Output
    When HIGH, this signal denotes that the comms channel receive buffer contains
    valid data waiting to be read by the ARM9E-S.
DBGCOMMTX (Communications channel transmit), Output
    When HIGH, this signal denotes that the comms channel transmit buffer is empty.
DBGEN (Debug enable), Input
    This input signal allows the debug features of the processor to be disabled.
    This signal must be LOW when debugging is not required.
DBGRQI (Internal debug request), Output
    This signal represents the state of bit 1 of the debug control register that is
    combined with EDBGRQ and presented to the core debug logic.
TAPID[31:0] (Boundary scan ID code), Input
    This input specifies the ID code value shifted out on DBGTDO when the IDCODE
    instruction is entered into the TAP controller.
Appendix B
Differences Between the ARM9E-S and the
ARM9TDMI
This appendix describes the differences between the ARM9E-S and ARM9TDMI
macrocell interfaces. It contains the following sections:
• Interface signals on page B-2
• ATPG scan interface on page B-5
• Timing parameters on page B-6
• ARM9E-S design considerations on page B-7
• ARM9E-S debugger considerations on page B-9.
Other signals provide the interface for the system designer, which is primarily
memory-mapped. Table B-1 shows the ARM9E-S signals with their ARM9TDMI hard
macrocell equivalent signals.
CLK (ARM9TDMI hard macrocell equivalent: GCLK; note a)
    Rising edge master clock. All inputs are sampled on the rising edge of CLK.
    All timing dependencies are from the rising edge of CLK.
DA[31:0] (ARM9TDMI hard macrocell equivalent: DA[31:0]; note c)
    32-bit data address output bus, available in the cycle preceding the memory cycle.
DBGDEWPT (ARM9TDMI hard macrocell equivalent: DEWPT; note e)
    External data watchpoint (tie LOW when not used).
Table B-1 ARM9E-S signals and ARM9TDMI hard macrocell equivalents (continued)
DBGEXT[1:0] (ARM9TDMI hard macrocell equivalent: EXTERN0, EXTERN1)
    EmbeddedICE EXTERN debug qualifiers (tie LOW when not required).
IA[31:1] (ARM9TDMI hard macrocell equivalent: IA[31:1]; note c)
    31-bit instruction address output bus, available in the cycle preceding the
    Memory cycle.
a. CLK is a rising edge clock. It is inverted with respect to the GCLK signal used on the ARM9TDMI hard macrocell.
b. CLKEN is sampled on the rising edge of CLK. The nWAIT signal on the ARM9TDMI hard macrocell must be held
throughout the high phase of GCLK. This means that the address class outputs (IA[31:1], DA[31:0], DnRW, DMAS,
InTRANS, DnTRANS, and ITBIT) can still change in a cycle in which CLKEN is taken LOW. You must take this
possibility into account when designing a memory system.
c. All the address class signals (IA[31:1], DA[31:0], DnRW, DMAS, InTRANS, DnTRANS, and ITBIT) change on the
rising edge of CLK. In a system with a low-frequency clock this means that the signals can change in the first phase of the
clock cycle. This is unlike the ARM9TDMI hard macrocell where they always change in the last phase of the cycle.
d. The ARM9TDMI featured a combinational path from DABORT to DnMREQ. This path does not exist in ARM9E-S.
e. With ARM9TDMI, the breakpoint and watchpoint inputs had to be asserted in phase 1 of the cycle following the cycle in
which the data was returned from the memory system. With ARM9E-S, external breakpoints and watchpoints must be
returned in the same cycle as the data.
f. All JTAG signals are synchronous to CLK on the ARM9E-S. There is no asynchronous TCK as on the ARM9TDMI hard
macrocell. An external synchronizing circuit can be used to generate TCLKEN when an asynchronous TCK is required.
However, CLK must be running.
g. The DBGRQI signal in ARM9TDMI features a combinational input to output path from EDBGRQ. This has been removed
in ARM9E-S.
h. EDBGRQ must be synchronized externally to the macrocell. It is not an asynchronous input as on the ARM9TDMI hard
macrocell.
i. nFIQ and nIRQ are synchronous inputs to the ARM9E-S, and are sampled on the rising edge of CLK. Asynchronous
interrupts are not supported.
j. The ARM9E-S supports only unidirectional data buses, RDATA[31:0], and WDATA[31:0]. When a bidirectional bus is
required, you must implement external bus combining logic.
All other inputs are sampled on rising edge of CLK when the clock enable is active
HIGH, for example:
• IABORT setup time is Tisiabort, hold time is Tihiabort, when CLKEN is active.
• RDATA setup time is Tisrdata, hold time is Tihrdata, when CLKEN is active.
• DBGTMS, DBGTDI setup time is Tistdi, hold time is Tihtdi, when DBGTCKEN
is active.
Outputs are all sampled on the rising edge of CLK with the appropriate clock enable
active, for example:
• IA output hold time is Tohiaddr, valid time is Toviaddr when CLKEN is active.
• InMREQ, ISEQ output hold time is Tohitrans, valid time is Tovitrans when CLKEN
is active.
Similarly, all memory, coprocessor, and debug signal expansion signals are defined with
input setup parameters of Tis..., hold parameters of Tih..., output hold parameters of
Toh..., and output valid parameters of Tov....
The master clock to the ARM9E-S, CLK, is inverted with respect to GCLK used on
the ARM9TDMI hard macrocell. The rising edge of the clock is the active edge of the
clock, on which all inputs are sampled.
All outputs are generated safely from the rising edge of CLK, with the following
exceptions:
CORECLKENOUT
This signal can change from the rising edge of CLK and has a
causal relationship with CLKEN.
DBGTDO This signal can change from the rising edge of CLK and has a
causal relationship with DBGSDOUT.
All JTAG signals on the ARM9E-S are synchronous to the master clock input, CLK.
When an external TCK is used, use an external synchronizer to the ARM9E-S.
As with all ARM9E-S signals, the interrupt signals, nIRQ and nFIQ, are sampled on
the rising edge of CLK.
When you are converting an ARM9TDMI hard macrocell design where the ISYNC
signal is asserted LOW, add a synchronizer to the design to synchronize the interrupt
signals before they are applied to the ARM9E-S.
Because the CLKEN signal is sampled only on the rising edge of the clock, the address
class outputs still change in a cycle in which CLKEN is LOW. (This is similar to the
behavior of I/DnMREQ and I/DSEQ in an ARM9TDMI hard macrocell system, when
a wait state is inserted using nWAIT.) Make sure that the memory system design takes
this into account.
Also make sure that the correct address is used for the memory cycle, even though
IA/DA[31:0] might have moved on to the address for the next memory cycle.
• From (test) reset, the ARM9E-S is configured into monitor mode debug. A
debugger requiring the ARM processor halt mode debug features must clear the
monitor mode enable bit in the debug control register. See Debug control register
on page C-34.
• A number of instructions have different cycle counts on the ARM9E-S than on the
ARM9TDMI. In particular, the MRS instruction always requires two cycles to
execute on ARM9E-S. See Chapter 8 Instruction Cycle Times for more details on
instruction cycle timing.
Appendix C
Debug in depth
This appendix describes in further detail the debug features of the ARM9E-S, and
includes additional information about the EmbeddedICE-RT logic. It contains the
following sections:
• Scan chains and JTAG interface on page C-2
• Resetting the TAP controller on page C-5
• Instruction register on page C-6
• Public instructions on page C-7
• Test data registers on page C-10
• ARM9E-S core clock domains on page C-17
• Determining the core and system state on page C-18
• Behavior of the program counter during debug on page C-24
• Priorities and exceptions on page C-27
• EmbeddedICE-RT logic on page C-28
• Vector catching on page C-39
• Single-stepping on page C-40
• Coupling breakpoints and watchpoints on page C-41
• Disabling EmbeddedICE-RT on page C-44
• EmbeddedICE-RT timing on page C-45.
The scan chains allow commands to be serially shifted into the ARM core, allowing the
state of the core and the system to be interrogated. The JTAG interface requires only
five pins on the package.
A JTAG style Test Access Port (TAP) controller controls the scan chains. For further
details of the JTAG specification, refer to IEEE Standard 1149.1 - 1990 Standard Test
Access Port and Boundary-Scan Architecture.
The two scan paths used for debug purposes are referred to as scan chain 1 and scan
chain 2, and are shown in Figure C-1.
[Figure C-1: scan chain 1 (ARM9E-S core), scan chain 2 (ARM9E-S EmbeddedICE-RT), and the ARM9E-S TAP controller]
Scan chain 1
Scan chain 1 is used for debugging the ARM9E-S core when it has entered debug state.
You can use it to:
• inject instructions into the ARM pipeline
• read and write its registers
• perform memory accesses.
Scan chain 2
Scan chain 2 allows access to the EmbeddedICE-RT registers. Refer to Test data
registers on page C-10 for details.
The process of serial test and debug is best explained in conjunction with the JTAG state
machine. Figure C-2 on page C-4 shows the state transitions that occur in the TAP
controller. The state numbers shown in the diagram are output from the ARM9E-S on
the DBGTAPSM[3:0] bits.
[Figure C-2: TAP controller state transitions, with transitions selected by tms=0 or tms=1]
State encodings legible in the figure (as output on DBGTAPSM[3:0]):
State               DR path   IR path
Test-Logic-Reset    0xF
Capture             0x6       0xE
Shift               0x2       0xA
Pause               0x3       0xB
Exit2               0x0       0x8
Update              0x5       0xD
1. From IEEE Std 1149.1-1990. Copyright 1999 IEEE. All rights reserved.
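The state transitions can be sketched as a small table-driven model. This is a minimal illustration that assumes the standard IEEE 1149.1 transition graph; the names NEXT_STATE, DBGTAPSM_CODE, and walk are hypothetical, and only the DBGTAPSM[3:0] encodings legible in Figure C-2 are listed.

```python
# Illustrative model of the IEEE 1149.1 TAP controller state graph.
# NEXT_STATE[state] = (next state when TMS=0, next state when TMS=1)
NEXT_STATE = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}

# Encodings taken from Figure C-2; states not legible there are omitted.
DBGTAPSM_CODE = {
    "Test-Logic-Reset": 0xF,
    "Capture-DR": 0x6, "Capture-IR": 0xE,
    "Shift-DR": 0x2,   "Shift-IR": 0xA,
    "Pause-DR": 0x3,   "Pause-IR": 0xB,
    "Exit2-DR": 0x0,   "Exit2-IR": 0x8,
    "Update-DR": 0x5,  "Update-IR": 0xD,
}


def walk(state, tms_bits):
    """Return the state reached after applying each TMS value in turn."""
    for tms in tms_bits:
        state = NEXT_STATE[state][tms & 1]
    return state
```

For example, walk("Test-Logic-Reset", [0, 1, 0, 0]) ends in Shift-DR, and five consecutive tms=1 clocks return the model to Test-Logic-Reset from any state.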
• to ready the boundary-scan interface for use, drive DBGnTRST LOW, and then
HIGH again
• to prevent the boundary-scan interface from being used, the DBGnTRST input
can be tied permanently LOW.
Note
A clock on CLK with DBGTCKEN HIGH is not necessary to reset the device.
1. System mode is selected. This means that the boundary-scan cells do not intercept
any of the signals passing between the external system and the core.
2. The IDCODE instruction is selected. When the TAP controller is put into the
SHIFT-DR state, and CLK is pulsed while enabled by DBGTCKEN, the
contents of the ID register are clocked out of DBGTDO.
The fixed value 0001 is loaded into the instruction register during the CAPTURE-IR
controller state.
Instruction       Binary code
EXTEST            0000
SAMPLE/PRELOAD    0011
SCAN_N            0010
INTEST            1100
IDCODE            1110
BYPASS            1111
RESTART           0100
In the following descriptions, the ARM9E-S samples DBGTDI and DBGTMS on the
rising edge of CLK with DBGTCKEN HIGH. All output transitions on DBGTDO
occur as a result of the rising edge of CLK with DBGTCKEN HIGH.
The EXTEST instruction allows a boundary scan chain to be connected between the
DBGSDIN and DBGSDOUT pins. External logic, based on the DBGTAPSM,
DBGSCREG, and DBGIR signals, is required to use the EXTEST function for such a
boundary scan chain. Using EXTEST with scan chain 1 or scan chain 2 selected is
UNPREDICTABLE.
You must use the SAMPLE/PRELOAD instruction to preload the boundary scan register
with known data prior to selecting INTEST or EXTEST instructions.
The SCAN_N instruction connects the scan path select register between DBGTDI and
DBGTDO:
• In the CAPTURE-DR state, the fixed value 10000 is loaded into the register.
• In the SHIFT-DR state, the ID number of the desired scan path is shifted into the
scan path select register.
• In the UPDATE-DR state, the scan register of the selected scan chain is connected
between DBGTDI and DBGTDO, and remains connected until a subsequent
SCAN_N instruction is issued.
The scan path select register is 5 bits long in this implementation, although no finite
length is specified.
The INTEST instruction places the selected scan chain in test mode:
• The INTEST instruction connects the selected scan chain between DBGTDI and
DBGTDO.
• When the INTEST instruction is loaded into the instruction register, all the scan
cells are placed in their test mode of operation. For example, in test mode, input
cells select the output of the scan chain to be applied to the core.
• In the CAPTURE-DR state, the value of the data applied from the core logic to
the output scan cells, and the value of the data applied from the system logic to
the input scan cells is captured.
• In the SHIFT-DR state, the previously-captured test data is shifted out of the scan
chain via the DBGTDO pin, while new test data is shifted in via the DBGTDI
pin.
The IDCODE instruction connects the device identification code register (or
ID register) between DBGTDI and DBGTDO. The ID register is a 32-bit register that
allows the manufacturer, part number, and version of a component to be read through
the TAP. See ARM9E-S device identification (ID) code register on page C-10 for details
of the ID register format.
When the IDCODE instruction is loaded into the instruction register, all the scan cells
are placed in their normal (System) mode of operation:
The BYPASS instruction connects a 1-bit shift register (the bypass register) between
DBGTDI and DBGTDO.
When the BYPASS instruction is loaded into the instruction register, all the scan cells
assume their normal (System) mode of operation. The BYPASS instruction has no
effect on the system pins:
• In the SHIFT-DR state, test data is shifted into the bypass register through
DBGTDI, and shifted out through DBGTDO after a delay of one CLK cycle.
The first bit to shift out is a zero.
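The one-cycle delay through the bypass register can be modeled in a few lines. This is only a behavioral sketch; the function name bypass_shift is hypothetical.

```python
def bypass_shift(tdi_bits):
    """Model the 1-bit bypass register while the TAP is in SHIFT-DR.

    Each DBGTDI bit appears on DBGTDO one shift later, and the first
    bit shifted out is zero.
    """
    reg = 0  # the bypass register holds 0 when shifting starts
    tdo_bits = []
    for bit in tdi_bits:
        tdo_bits.append(reg)  # value presented on DBGTDO this cycle
        reg = bit & 1         # DBGTDI is captured for the next cycle
    return tdo_bits
```

Shifting in 1, 0, 1, 1 therefore produces 0, 1, 0, 1 on DBGTDO.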
The RESTART instruction is used to restart the processor on exit from debug state. The
RESTART instruction connects the bypass register between DBGTDI and DBGTDO,
and the TAP controller behaves as if the BYPASS instruction has been loaded.
The processor exits debug state when the RUN-TEST/IDLE state is entered.
In addition, other scan chains can be added between DBGSDOUT and DBGSDIN, and
selected when in INTEST mode.
In the following descriptions, data is shifted during every CLK cycle when
DBGTCKEN is HIGH.
Length 1 bit.
[Figure: ID code register bit layout, bits 31 to 0]
Note
IEEE Standard 1149.1 requires that bit 0 of the ID register be set to 1.
Length 4 bits.
Operating mode In the SHIFT-IR state, the instruction register is selected as the
serial path between DBGTDI and DBGTDO.
During the CAPTURE-IR state, the binary value b0001 is loaded
into this register. This value is shifted out during SHIFT-IR (least
significant bit first), while a new instruction is shifted in (least
significant bit first).
During the UPDATE-IR state, the value in the instruction register
specifies the current instruction.
On reset, IDCODE specifies the current instruction.
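The capture and shift behavior of the instruction register can be illustrated with a generic shift-register model. This is a sketch only: shift_register and the INSTRUCTIONS dictionary are hypothetical names, and the opcodes are the public instruction codes listed earlier in this appendix.

```python
IR_WIDTH = 4
IR_CAPTURE_VALUE = 0b0001  # loaded during CAPTURE-IR

INSTRUCTIONS = {
    "EXTEST": 0b0000, "SCAN_N": 0b0010, "SAMPLE/PRELOAD": 0b0011,
    "RESTART": 0b0100, "INTEST": 0b1100, "IDCODE": 0b1110,
    "BYPASS": 0b1111,
}


def shift_register(captured, new_value, width):
    """Shift `width` bits, least significant bit first.

    Returns (bits seen on DBGTDO in order, final register contents).
    """
    reg = captured
    tdo_bits = []
    for i in range(width):
        tdo_bits.append(reg & 1)             # LSB leaves on DBGTDO
        tdi = (new_value >> i) & 1           # next bit of the new value
        reg = (reg >> 1) | (tdi << (width - 1))
    return tdo_bits, reg
```

Shifting in SCAN_N after CAPTURE-IR, for instance, returns the bits 1, 0, 0, 0 on DBGTDO and leaves 0b0010 in the register, ready for UPDATE-IR.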
Length 5 bits.
Operating mode SCAN_N as the current instruction in the SHIFT-DR state selects
the scan path select register as the serial path between DBGTDI
and DBGTDO.
During the CAPTURE-DR state, the value b10000 is loaded into
this register. This value is shifted out during SHIFT-DR (least
significant bit first), while a new value is shifted in (least
significant bit first). During the UPDATE-DR state, the value in
the scan path select register selects a scan chain to become the
currently active scan chain. All further instructions such as
INTEST then apply to that scan chain.
The currently selected scan chain changes only when a SCAN_N
instruction is executed, or when a reset occurs. On reset, scan
chain 3 is selected as the active scan chain.
The number of the currently-selected scan chain is reflected on the
DBGSCREG[4:0] output bus. You can use the TAP controller to
drive external chains in addition to those within the ARM9E-S
macrocell. The external scan chain is connected between
DBGSDIN and DBGSDOUT, and must be assigned a number.
The control signals are derived from DBGSCREG[4:0],
DBGIR[3:0], DBGTAPSM[3:0], the clock, CLK, and the clock
enable, DBGTCKEN.
Scan chain number   Function
0                   Reserved
1                   Debug
2                   EmbeddedICE-RT programming
3                   External boundary scan
4–15                Reserved
16–31               Unassigned
The scan chain present between DBGSDIN and DBGSDOUT is connected between
DBGTDI and DBGTDO whenever scan chain 3 is selected, or when any unassigned
scan chain number is selected. If there is more than one external scan chain, a
multiplexor must be built externally to apply the desired scan chain output to
DBGSDOUT. The multiplexor can be controlled by decoding DBGSCREG[4:0].
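A decoder for DBGSCREG[4:0] might classify the selected scan chain as follows. This is a software sketch of the decoding rule described above, not a description of any particular external multiplexor; the function name and the returned labels are hypothetical.

```python
def scan_chain_source(dbgscreg):
    """Classify a DBGSCREG[4:0] value using the scan chain number table."""
    if not 0 <= dbgscreg <= 31:
        raise ValueError("DBGSCREG[4:0] is a 5-bit value")
    if dbgscreg == 1:
        return "debug"
    if dbgscreg == 2:
        return "embeddedice-rt"
    if dbgscreg == 3 or dbgscreg >= 16:
        # Scan chain 3 and the unassigned numbers 16-31 select the path
        # between DBGSDIN and DBGSDOUT.
        return "external"
    return "reserved"  # 0 and 4-15
```

With more than one external chain, the same decode would be extended so that each assigned number in the 16 to 31 range steers a different chain output to the external path.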
The scan chains allow serial access to the core logic and to the EmbeddedICE hardware
for programming purposes. Each scan chain cell is simple, and comprises a serial
register and a multiplexor. A typical cell is shown in Figure C-4.
[Figure C-4: typical scan chain cell, with CLK, test mode select, shift enable, and serial data in]
For input cells, the capture stage involves copying the value of the system input to the
core into the serial register. During shift, this value is output serially. The value applied
to the core from an input cell is either the system input or the contents of the parallel
register (loads from the shift register after UPDATE-DR state) under multiplexor
control.
For output cells, capture involves placing the value of a core output into the serial
register. During shift, this value is serially output as before. The value applied to the
system from an output cell is either the core output or the contents of the serial register.
All the control signals for the scan cells are generated internally by the TAP controller.
The action of the TAP controller is determined by the current instruction and the state of the
TAP state machine.
Scan chain 1
Purpose Scan chain 1 is used for communication between the debugger and
the ARM9E-S core. It is used to read and write data, and to scan
instructions into the instruction pipeline. The SCAN_N
instruction is used to select scan chain 1.
Length 67 bits.
Scan chain 1 provides serial access to RDATA[31:0] when the core is doing a read, and
to the WDATA[31:0] bus when the core is doing a write. It also provides serial access
to the INSTR[31:0] bus, and to the control bits, SYSSPEED and WPTANDBKPT. For
compatibility with the ARM9TDMI, there is one additional unused bit that must be zero
when writing, and is UNPREDICTABLE when reading.
There are 67 bits in this scan chain, the order being (from serial data in to out):
1. INSTR[31:0]
2. SYSSPEED
3. WPTANDBKPT
4. unused bit
5. RDATA[31:0] or WDATA[31:0].
Bit number   Signal                 Type
66           RDATA[0]/WDATA[0]      Bidir
...          ...                    ...
35           RDATA[31]/WDATA[31]    Bidir
34           Unused                 -
33           WPTANDBKPT             Input
32           SYSSPEED               Input
31           INSTR[31]              Input
...          ...                    ...
0            INSTR[0]               Input
The scan chain order is the same as for the ARM9TDMI. The unused bit is to retain
compatibility with ARM9TDMI.
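The bit allocation of scan chain 1 can be expressed as a pair of pack and unpack helpers. This is an illustrative sketch: the function names are hypothetical, the integer bit positions are the bit numbers used in the table, and the code makes no claim about which end of the chain is shifted first.

```python
def pack_scan_chain1(instr, sysspeed=0, wptandbkpt=0, data=0):
    """Build a 67-bit scan chain 1 value from its fields."""
    word = instr & 0xFFFFFFFF            # bits 31..0: INSTR[31:0]
    word |= (sysspeed & 1) << 32         # bit 32: SYSSPEED
    word |= (wptandbkpt & 1) << 33       # bit 33: WPTANDBKPT
    # Bit 34 is unused and must be written as zero.
    for i in range(32):
        # The data field is bit-reversed: bit 66 carries data bit 0,
        # bit 35 carries data bit 31.
        word |= ((data >> i) & 1) << (66 - i)
    return word


def unpack_scan_chain1(word):
    """Split a 67-bit value into (instr, sysspeed, wptandbkpt, data)."""
    instr = word & 0xFFFFFFFF
    sysspeed = (word >> 32) & 1
    wptandbkpt = (word >> 33) & 1
    data = 0
    for i in range(32):
        data |= ((word >> (66 - i)) & 1) << i
    return instr, sysspeed, wptandbkpt, data
```

The unused bit 34 is ignored on unpack because its value is UNPREDICTABLE when read.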
• For a read the data value taken from the 32 bits in the scan chain allocated for data
is used to deliver the RDATA[31:0] value to the core.
Scan chain 2
Length 38 bits.
Scan chain order From DBGTDI to DBGTDO. Read/write, register address bits 4
to 0, data values bits 31 to 0.
During SHIFT-DR, a data value is shifted into the serial register. Bits 32 to 36 specify
the address of the EmbeddedICE register to be accessed.
During UPDATE-DR, this register is either read or written depending on the value of
bit 37 (0 = read, 1 = write).
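The 38-bit scan chain 2 word can be assembled as shown in the following sketch. The function name is hypothetical, and the register address used in the usage note is arbitrary rather than the address of any particular EmbeddedICE-RT register.

```python
def pack_scan_chain2(address, data=0, write=False):
    """Build a 38-bit scan chain 2 value.

    bits 31..0  : data value
    bits 36..32 : 5-bit register address
    bit  37     : 0 = read, 1 = write
    """
    if not 0 <= address < 32:
        raise ValueError("register address is 5 bits")
    return (int(write) << 37) | (address << 32) | (data & 0xFFFFFFFF)
```

For example, pack_scan_chain2(0b01001, 0x12345678, write=True) sets bit 37, places 0b01001 in bits 36 to 32, and places the data value in bits 31 to 0.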
During normal operation, CLKEN conditions CLK to clock the core. When the
ARM9E-S is in debug state, DBGTCKEN conditions CLK to clock the core.
Before examining the core and system state, the debugger must determine whether the
processor entered debug from Thumb state or ARM state by examining bit 4 of the
EmbeddedICE-RT debug status register. When bit 4 is HIGH, the core has entered
debug from Thumb state. When bit 4 is LOW the core has entered debug from ARM
state.
When the processor has entered debug state from Thumb state, the simplest method is
for the debugger to force the core back into ARM state. The debugger can then execute
the same sequence of instructions to determine the processor state.
To force the processor into ARM state, execute the following sequence of Thumb
instructions on the core (with the SYSSPEED bit set LOW):
STR R0, [R1]    ; Save R0 before use
MOV R0, PC      ; Copy PC into R0
STR R0, [R1]    ; Now save the PC (held in R0)
BX  PC          ; Jump into ARM state
MOV R8, R8      ; NOP
MOV R8, R8      ; NOP
Note
Because all Thumb instructions are only 16 bits long, the simplest method, when
shifting scan chain 1, is to repeat the instruction. For example, the encoding for BX R0
is 0x4700, so when 0x47004700 shifts into scan chain 1, the debugger does not have
to keep track of the half of the bus on which the processor expects to read the data.
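The duplication described in this note amounts to a one-line helper. The function name is hypothetical.

```python
def thumb_to_scan_word(opcode16):
    """Repeat a 16-bit Thumb opcode in both halves of a 32-bit word."""
    if not 0 <= opcode16 <= 0xFFFF:
        raise ValueError("Thumb opcodes are 16 bits")
    return (opcode16 << 16) | opcode16
```

Applied to the BX R0 encoding 0x4700, the helper returns 0x47004700.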
You can use the sequences of ARM instructions shown in Example C-1 on page C-19
to determine the processor state.
With the processor in the ARM state, typically the first instruction to execute is:
STMIA R0, {R0-R15}
This instruction causes the contents of the registers to appear on the data bus. You can
then sample and shift out these values.
Note
The use of R0 as the base register for the STM is only for illustration, and you can use any
register.
After you have determined the values in the bank of registers available in the current
mode, you might want to access the other banked registers. To do this, you must change
mode. Normally, a mode change can occur only if the core is already in a privileged
mode. However, while in debug state, a mode change can occur from any mode into any
other mode.
The debugger must restore the original mode before exiting debug state. For example,
if the debugger has been requested to return the state of the User mode registers and FIQ
mode registers, and debug state is entered in Supervisor mode, the instruction sequence
can be as shown in Example C-1.
All these instructions execute at debug speed. Debug speed is much slower than system
speed. This is because between each core clock, 67 clocks occur in order to shift in an
instruction, or shift out data. Executing instructions this slowly is acceptable for
accessing the core state because the ARM9E-S is fully static. However, you cannot use
this method for determining the state of the rest of the system.
While in debug state, you can only scan the following ARM or Thumb instructions into
the instruction pipeline for execution:
• all data processing operations
• all load, store, load multiple, and store multiple instructions
• MSR and MRS
• B, BL, and BX.
To meet the dynamic timing requirements of the memory system, any attempt to access
system state must occur synchronously. Therefore, the ARM9E-S must be forced to
synchronize back to system speed. Bit 32 of scan chain 1, SYSSPEED, controls this.
You can place a legal debug instruction onto the instruction data bus of scan chain 1
with bit 32 (the SYSSPEED bit) LOW. This instruction is then executed at debug speed.
To execute an instruction at system speed, a NOP (such as MOV R0, R0) must be
scanned in as the next instruction with bit 32 set HIGH.
After the system speed instructions are scanned into the instruction data bus and clocked
into the pipeline, the RESTART instruction must be loaded into the TAP controller. This
causes the ARM9E-S automatically to resynchronize back to CLK conditioned with
CLKEN when the TAP controller enters RUN-TEST/IDLE state, and executes the
instruction at system speed. Debug state is reentered once the instruction completes
execution, when the processor switches itself back to CLK conditioned with
DBGTCKEN. When the instruction completes, DBGACK is HIGH. At this point
INTEST can be selected in the TAP controller, and debugging can resume.
To determine if a system speed instruction has completed, the debugger must look at
SYSCOMP (bit 3 of the debug status register). The ARM9E-S must access memory
through the data bus interface, because this access can be stalled indefinitely by
CLKEN. Therefore, the only way to determine if the memory access has completed is
to examine the SYSCOMP bit. When this bit is HIGH, the instruction has completed.
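A debugger might poll the debug status register for completion as in the following sketch. It assumes a hypothetical read_status callable that returns the current debug status register value (for example, by scanning it out through scan chain 2); the bit positions are those given in this appendix.

```python
TBIT_MASK = 1 << 4      # HIGH when debug state was entered from Thumb state
SYSCOMP_MASK = 1 << 3   # HIGH when the system speed instruction has completed


def wait_for_syscomp(read_status, max_polls=1000):
    """Poll until SYSCOMP is HIGH; return False if max_polls is exhausted.

    `read_status` is a caller-supplied function (hypothetical here) that
    returns the debug status register as an integer.
    """
    for _ in range(max_polls):
        if read_status() & SYSCOMP_MASK:
            return True
    return False
```

The poll limit is a design choice of the sketch: because CLKEN can stall the access indefinitely, a real debugger needs some way to report a memory system that never completes.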
The state of the system memory can be fed back to the debug host by using system speed
load multiples and debug speed store multiples.
There are restrictions on which instructions can have the SYSSPEED bit set. The valid
instructions on which to set this bit are:
• loads
• stores
• load multiple
• store multiple.
When the ARM9E-S returns to debug state after a system speed access, the SYSSPEED
bit is set LOW. The state of this bit gives the debugger information about why the core
entered debug state the first time this scan chain is read.
After restoring the internal state, a branch instruction must be loaded into the pipeline.
See Behavior of the program counter during debug on page C-24 for details on
calculating the branch.
The SYSSPEED bit of scan chain 1 forces the ARM9E-S to resynchronize back to CLK
conditioned with CLKEN. The penultimate instruction in the debug sequence is a
branch to the instruction at which execution is to resume. This is scanned in with bit 32
(SYSSPEED) set LOW. The final instruction to be scanned in is a NOP (such as MOV
R0, R0), with bit 32 set HIGH. The core is then clocked to load this instruction into the
pipeline.
Next, the RESTART instruction is selected in the TAP controller. When the state
machine enters the RUN-TEST/IDLE state, the scan chain reverts back to System
mode, and clock resynchronization to CLK conditioned with CLKEN occurs within
the ARM9E-S. Normal operation then resumes, with instructions being fetched from
memory.
The delay, waiting until the state machine is in RUN-TEST/IDLE state, allows
conditions to be set up in other devices in a multiprocessor system without taking
immediate effect. Then, when RUN-TEST/IDLE state is entered, all the processors
resume operation simultaneously.
The function of DBGACK is to tell the rest of the system when the ARM9E-S is in
debug state. You can use this signal to inhibit peripherals such as watchdog timers that
have real-time characteristics. Also, you can use DBGACK to mask out memory
accesses that are caused by the debugging process. For example, when the ARM9E-S
enters debug state after a breakpoint, the instruction pipeline contains the breakpointed
instruction plus two other instructions that have been prefetched. On entry to debug
state, the pipeline is flushed. So, on exit from debug state, the pipeline must be refilled
to its previous state. Therefore, because of the debugging process, more memory
accesses occur than are normally expected. It is possible, using the DBGACK signal
and a small amount of external logic, for a peripheral which is sensitive to the number
of memory accesses to return the same result with and without debugging.
Note
You can only use DBGACK in such a way using breakpoints. It does not mask the
correct number of memory accesses after a watchpoint.
For example, consider a peripheral that simply counts the number of instruction fetches.
This device must return the same answer after a program has run both with and without
debugging.
Figure C-5 shows the behavior of the ARM9E-S on exit from debug state.
[Figure C-5: debug exit sequence, showing CLK, INSTR[31:0], and DBGACK]
In Figure C-6 on page C-23, you can see that two instructions are fetched after the
instruction which breakpoints. Figure C-5 shows that DBGACK masks the first three
instruction fetches out of the debug state, corresponding to the breakpoint instruction,
and the two instructions prefetched after it.
Under some circumstances DBGACK can remain HIGH for more than three
instruction fetches. Therefore, if you require precise instruction access counting, you
must provide some external logic to generate a modified DBGACK that always falls
after three instruction fetches.
Note
When system speed accesses occur, DBGACK remains HIGH throughout. It then falls
after the system speed memory accesses are completed, and finally rises again as the
processor reenters debug state. Therefore, DBGACK masks all system speed memory
accesses.
[Figure C-6: breakpoint timing, showing CLK, IA[31:1], INSTR[31:0] (fetches 1, 2, and 3), DBGIEBKPT, and DBGACK]
C.8.1 Breakpoints
For example, if the ARM9E-S entered debug state from a breakpoint set on a given
address and two debug speed instructions were executed, a branch of seven addresses
must occur (four for debug entry, plus two for the instructions, plus one for the final
branch). The following sequence shows ARM instructions scanned into scan chain 1.
This is the Most Significant Bit (MSB) first, so the first digit represents the value to be
scanned into the SYSSPEED bit, followed by the instruction.
0 EAFFFFF9 ; B -7 addresses (two’s complement)
1 E1A00000 ; NOP (MOV R0, R0), SYSSPEED bit is set
After the ARM9E-S enters debug state, it must execute a minimum of two instructions
before the branch, although these can both be NOPs (MOV R0, R0). For small branches,
you can replace the final branch with a subtract, with the PC as the destination (SUB
PC, PC, #28 in the above example).
C.8.2 Watchpoints
To return to program execution after entry to debug state from a watchpoint, use the
same procedure described in Breakpoints.
Debug entry adds four addresses to the PC, and every instruction adds one address. The
difference from breakpoint is that the instruction that caused the watchpoint has
executed, and the program must return to the next instruction.
If a watchpointed access also has a Data Abort returned, the ARM9E-S enters debug
state in Abort mode. Entry into debug is held off until the core changes into Abort mode,
and has fetched the instruction from the abort vector.
A similar sequence follows when an interrupt, or any other exception, occurs during a
watchpointed memory access. The ARM9E-S enters debug state in the mode of the
exception. The debugger must check to see if an exception has occurred by examining
the current and previous mode (in the CPSR and SPSR), and the value of the PC. When
an exception has taken place, you must be given the choice of servicing the exception
before debugging.
For example, suppose that an abort has occurred on a watchpointed access and ten
instructions have been executed in debug state. You can use the following sequence to
return to program execution:
0 EAFFFFF1 ; B -15 addresses (two’s complement)
1 E1A00000 ; NOP (MOV R0, R0), SYSSPEED bit is set
This code forces a branch back to the abort vector, causing the instruction at that
location to be refetched and executed.
Note
After the abort service routine, the instruction that caused the abort and watchpoint is
refetched and executed. This triggers the watchpoint again, and the ARM9E-S reenters
debug state.
Entry into debug state through a debug request is similar to a breakpoint. Entry to debug
state adds four addresses to the PC, and every instruction executed in debug state adds
one address.
For example, the following sequence handles a situation in which the user has invoked
a debug request, and then decides to return to program execution immediately:
0 EAFFFFFB ; B -5 addresses (two’s complement)
1 E1A00000 ; NOP (MOV R0, R0), SYSSPEED bit is set
This code restores the PC, and restarts the program from the next instruction.
When a system speed access is performed during debug state, the value of the PC
increases by five addresses. System speed instructions access the memory system, and
so it is possible for aborts to take place. If an abort occurs during a system speed
memory access, the ARM9E-S enters Abort mode before returning to debug state.
This scenario is similar to an aborted watchpoint, but the problem is much harder to fix
because the abort is not caused by an instruction in the main program, and so the PC
does not point to the instruction that caused the abort. An abort handler usually looks at
the PC to determine the instruction that caused the abort, and the abort address. In this
case, the value of the PC is invalid, but because the debugger can determine which
location was being accessed, you can write the debugger to help the abort handler fix
the memory system.
In summary, the branch needed to return to the program is -(4 + N + 5S) addresses,
where N is the number of debug speed instructions executed (including the final
branch), and S is the number of system speed instructions executed.
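The return branch can be computed from the rules stated in this section: debug entry adds four addresses, each debug speed instruction (including the final branch) adds one, and each system speed access adds five. The following sketch combines these into a single expression and encodes an unconditional ARM B instruction; the function name is hypothetical, and the combined expression is inferred from the worked examples in this section.

```python
def return_branch(n_debug, n_system=0):
    """Encode the ARM branch that returns from debug state.

    n_debug  : debug speed instructions executed, including the final branch
    n_system : system speed instructions executed
    """
    offset = -(4 + n_debug + 5 * n_system)   # in addresses, as in the text
    # 0xEA is the B opcode with the "always" condition; the low 24 bits
    # hold the two's complement offset field.
    return 0xEA000000 | (offset & 0x00FFFFFF)
```

The three worked examples are reproduced by return_branch(3), return_branch(11), and return_branch(1), which give 0xEAFFFFF9, 0xEAFFFFF1, and 0xEAFFFFFB respectively.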
When a breakpointed instruction fetch causes a Prefetch Abort, the abort is taken and
the breakpoint is disregarded. Normally, Prefetch Aborts occur when, for example, an
access is made to a virtual address that does not physically exist, and the returned data
is therefore invalid. In such a case, the normal action of the operating system is to swap
in the page of memory, and to return to the previously invalid address. This time, when
the instruction is fetched, and providing the breakpoint is activated (it might be
data-dependent), the ARM9E-S enters debug state.
The Prefetch Abort, therefore, takes higher priority than the breakpoint.
C.9.2 Interrupts
When the ARM9E-S enters debug state, interrupts are automatically disabled.
If an interrupt is pending during the instruction prior to entering debug state, the
ARM9E-S enters debug state in the mode of the interrupt. On entry to debug state, the
debugger cannot assume that the ARM9E-S is in the mode expected by your program.
The debugger must check the PC, the CPSR, and the SPSR to determine accurately the
reason for the exception.
Debug, therefore, takes higher priority than the interrupt, but the ARM9E-S does
recognize that an interrupt has occurred.
When a Data Abort occurs on a watchpointed access, the ARM9E-S enters debug state
in Abort mode. The watchpoint, therefore, has higher priority than the abort, but the
ARM9E-S remembers that the abort happened.
Because the ARM9E-S processor core has a Harvard Architecture, you must specify
whether the watchpoint unit examines the instruction or the data interface. This is
specified by bit 3 of the control value register:
• when bit 3 is set, the data interface is examined
• when bit 3 is clear, the instruction interface is examined.
There cannot be a don’t care case for this bit because the comparators cannot compare
the values on both buses simultaneously. Therefore, bit 3 of the control mask register is
always clear and cannot be programmed HIGH. Bit 3 also determines whether the
internal IBREAKPT or DWPT signal must be driven by the result of the comparison.
Figure C-7 on page C-30 gives an overview of the operation of the EmbeddedICE-RT
logic.
a. An attempted write to the comms channel control register can be used to reset bit 0 of that
register.
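[Figure C-7: EmbeddedICE-RT block diagram. A scan chain register (R/W bit, address bits 4 to 0, data bits 31 to 0) sits between TDI and TDO and feeds an address decoder. Comparators for control (I control and D control), data (INSTR[31:0] and DD[31:0]), and address (IA[31:1] and DA[31:0]) produce the breakpoint/watchpoint and Rangeout outputs, gated by Enable.]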
For each value register there is an associated mask register in the same format. Setting
a bit to 1 in the mask register causes the corresponding bit in the value register to be
ignored in any comparison.
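The value and mask behavior described above can be sketched as a single C expression. This is an illustrative model only, and the function name masked_match is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A mask bit of 1 means the corresponding value bit is ignored.
 * A mask bit of 0 means the corresponding bits must be equal.
 */
static bool masked_match(uint32_t value, uint32_t mask, uint32_t observed)
{
    /* XOR marks the differing bits; ~mask keeps only the bits that matter. */
    return ((value ^ observed) & ~mask) == 0;
}
```

For example, a value of 0x00001000 with a mask of 0x00000FFF matches any observed address from 0x00001000 to 0x00001FFF.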
In this case, the format of the control register is as shown in Figure C-8.
Note
You cannot mask bit 8 and bit 3.
[Figure C-8: control register format, bits 8 down to 0]
Table C-5 Watchpoint control register for data comparison functions

Bit number  Name        Function
0           DnRW        Compares against the data not read/write signal from the
                        core to detect the direction of data bus activity. DnRW
                        is 0 for a read, and 1 for a write.
2:1         DMAS[1:0]   Compares against the DMAS[1:0] signal from the core to
                        detect the size of data bus activity.
4           DnTRANS     Compares against the data not translate signal from the
                        core to distinguish between a User mode (DnTRANS = 0)
                        data transfer and a privileged mode (DnTRANS = 1)
                        transfer.
Table C-5 Watchpoint control register for data comparison functions (continued)
Bit number  Name        Function
[Figure: control register format for instruction comparison, bits 8 down to 0]
Bit number  Name        Function
1           ITBIT       Compares against the Thumb state signal from the core to
                        distinguish between a Thumb (ITBIT = 1) instruction fetch
                        and an ARM (ITBIT = 0) instruction fetch.
4           InTRANS     Compares against the not translate signal from the core
                        to distinguish between a User mode (InTRANS = 0)
                        instruction fetch and a privileged mode (InTRANS = 1)
                        fetch.
The debug control register is 6 bits wide. Writing control bits occurs during a register
write access (with the read/write bit HIGH). Reading control bits occurs during a
register read access (with the read/write bit LOW).
[Figure: debug control register format, bits 5 down to 0]
These functions are described in Table C-7 and Table C-8 on page C-35.
Bit number  Name                  Function
5           Embedded-ICE disable  Controls the address and data comparison logic
                                  contained within the Embedded-ICE logic. When
                                  set to 1, the address and data comparators are
                                  disabled. When set to 0, the address and data
                                  comparators are enabled. You can use this bit
                                  to save power in a system where the
                                  Embedded-ICE functionality is not required.
                                  The reset state of this bit is 0 (comparators
                                  enabled). An extra piece of logic initialized
                                  by debug reset ensures that the Embedded-ICE
                                  logic is automatically disabled out of reset.
                                  This extra logic is set by debug reset and is
                                  automatically reset on the first access to
                                  scan chain 2.
1:0         DBGRQ, DBGACK         These bits allow the values on DBGRQ and
                                  DBGACK to be forced.
DBGACK  INTDIS  Interrupts
0       0       Permitted
1       x       Inhibited
x       1       Inhibited

Both IRQ and FIQ are disabled when the processor is in debug state (DBGACK = 1),
or when INTDIS is forced.
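The interrupt rule above reduces to a single Boolean expression. The following C sketch is illustrative only, and the function name interrupts_permitted is hypothetical.

```c
#include <stdbool.h>

/*
 * IRQ and FIQ reach the core only when the processor is not in debug
 * state (DBGACK = 0) and INTDIS is not forced (INTDIS = 0).
 */
static bool interrupts_permitted(bool dbgack, bool intdis)
{
    return !(dbgack || intdis);
}
```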
As shown in Figure C-12 on page C-37, the value stored in bit 1 of the control register
is synchronized and then ORed with the external EDBGRQ before being applied to the
processor.
In the case of DBGACK, the value of DBGACK from the core is ORed with the value
held in bit 0 to generate the external value of DBGACK seen at the periphery of the
ARM9E-S. This allows the debug system to signal to the rest of the system that the core
is still being debugged even when system-speed accesses are being performed (in which
case the internal DBGACK signal from the core is LOW).
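The two OR functions described above can be modeled as follows. This is a minimal sketch: the macro and function names are hypothetical, and the synchronization of control bit 1 is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

#define DBG_CTRL_DBGACK (1u << 0)  /* control bit 0: force DBGACK */
#define DBG_CTRL_DBGRQ  (1u << 1)  /* control bit 1: force DBGRQ  */

/* DBGRQ applied to the core: control bit 1 ORed with the external EDBGRQ. */
static bool dbgrq_to_core(uint32_t debug_ctrl, bool edbgrq)
{
    return ((debug_ctrl & DBG_CTRL_DBGRQ) != 0) || edbgrq;
}

/* DBGACK seen at the periphery: DBGACK from the core ORed with control bit 0. */
static bool dbgack_external(uint32_t debug_ctrl, bool core_dbgack)
{
    return ((debug_ctrl & DBG_CTRL_DBGACK) != 0) || core_dbgack;
}
```

With control bit 0 set, dbgack_external() stays HIGH during a system-speed access even though the DBGACK signal from the core is LOW.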
The structure of the debug control and status registers is shown in Figure C-12 on
page C-37.
The debug status register is five bits wide. If it is accessed for a read (with the read/write
bit LOW), the status bits are read. The format of the debug status register is shown in
Figure C-11.
Bit    4      3        2     1      0
Field  ITBIT  SYSCOMP  IFEN  DBGRQ  DBGACK

[Figure C-11: debug status register format]
Bit number  Name           Function
1:0         DBGRQ, DBGACK  Allow the values on the synchronized versions of
                           EDBGRQ and DBGACK to be read.
2           IFEN           Allows the state of the core interrupt enable signal
                           to be read.
3           SYSCOMP        Allows the state of the SYSCOMP bit from the core to
                           be read. This allows the debugger to determine that a
                           memory access from the debug state has completed.
4           ITBIT          Allows the status of the output ITBIT to be read.
                           This enables the debugger to determine what state the
                           processor is in, and therefore which instructions to
                           execute.
The structure of the debug control and status registers is shown in Figure C-12 on
page C-37.
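A debugger might unpack the five status bits as in the following C sketch. The structure and function names are hypothetical, and the raw value is assumed to have already been read out through the scan chain.

```c
#include <stdbool.h>
#include <stdint.h>

/* One field per bit of the debug status register. */
typedef struct {
    bool dbgack;   /* bit 0 */
    bool dbgrq;    /* bit 1 */
    bool ifen;     /* bit 2: core interrupt enable */
    bool syscomp;  /* bit 3: memory access from debug state has completed */
    bool itbit;    /* bit 4: 1 = Thumb state, 0 = ARM state */
} debug_status_t;

/* Split a raw 5-bit status value into named fields. */
static debug_status_t decode_debug_status(uint32_t raw)
{
    debug_status_t s;
    s.dbgack  = ((raw >> 0) & 1u) != 0;
    s.dbgrq   = ((raw >> 1) & 1u) != 0;
    s.ifen    = ((raw >> 2) & 1u) != 0;
    s.syscomp = ((raw >> 3) & 1u) != 0;
    s.itbit   = ((raw >> 4) & 1u) != 0;
    return s;
}
```

For example, a raw value of 0x19 decodes as DBGACK set, SYSCOMP set, and ITBIT set, which indicates a core in debug state, in Thumb state, with its last memory access complete.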
[Figure C-12: structure of the debug control and status registers. The figure
relates control bits 2 (INTDIS), 1 (DBGRQ), and 0 (DBGACK) and status bits 4
(ITBIT), 3 (SYSCOMP), 2 (IFEN), 1 (DBGRQ), and 0 (DBGACK) to the ITBIT, SYSCOMP,
and DBGACK signals from the core, the EDBGRQ input of the ARM9E-S, the interrupt
mask enable and DBGRQ signals to the core, and the DBGACK output of the ARM9E-S.]
[Figure: vector catch register format, bits 7 down to 0]
For example, if the processor executes a SWI instruction while bit 2 of the vector catch
register is set, the ARM9E-S fetches an instruction from location 0x8. The vector catch
hardware detects this access and forces the internal IBREAKPT signal HIGH into the
ARM9E-S control logic. This, in turn, forces the ARM9E-S to enter debug state.
In monitor mode debug, vector catching is disabled on Data Aborts and Prefetch Aborts
to avoid the processor being forced into an unrecoverable state as a result of the aborts
that are generated for the monitor mode debug.
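The vector catch behavior can be sketched in C as follows. This is a minimal model under stated assumptions: the text above confirms only that bit 2 corresponds to the SWI vector at 0x8, so the general rule that bit n guards the vector at address 4 x n, and the positions of the Prefetch Abort and Data Abort bits, are assumptions. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Confirmed by the text: bit 2 catches the SWI vector at 0x8. */
#define VC_SWI            (1u << 2)
/* Assumed positions, following the "bit n guards address 4*n" pattern. */
#define VC_PREFETCH_ABORT (1u << 3)
#define VC_DATA_ABORT     (1u << 4)

/*
 * Returns true when an instruction fetch from fetch_addr would force
 * IBREAKPT HIGH for the given vector catch register value.
 */
static bool vector_catch_hit(uint32_t vc_reg, uint32_t fetch_addr, bool monitor_mode)
{
    /* Only the eight word-aligned vector locations 0x00 to 0x1C qualify. */
    if (fetch_addr > 0x1Cu || (fetch_addr & 3u) != 0) {
        return false;
    }

    uint32_t bit = 1u << (fetch_addr >> 2);

    /* In monitor mode debug, catching is disabled on the two abort vectors. */
    if (monitor_mode) {
        bit &= ~(VC_PREFETCH_ABORT | VC_DATA_ABORT);
    }
    return (vc_reg & bit) != 0;
}
```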
C.12 Single-stepping
The ARM9E-S EmbeddedICE-RT logic contains logic that allows efficient
single-stepping through code. This leaves the watchpoint comparators free for general
use.
Enable this function by setting bit 3 of the debug control register. The state of this bit
must only be altered while the processor is in debug state. If the processor exits debug
state and this bit is HIGH, the processor fetches an instruction, executes it, and then
immediately reenters debug state. This happens independently of the watchpoint
comparators. If a system speed data access is performed while in debug state, the
debugger must ensure that the control bit is clear first.
Note
This bit must not be set when using monitor mode debug.
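The rules for the single-step bit can be captured in a small guard, sketched below in C. The helper names are hypothetical, and the debug control register is modeled as a plain variable rather than an access through the scan chain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 3 of the debug control register enables single-stepping. */
#define DBG_CTRL_SINGLE_STEP (1u << 3)

/*
 * Change the single-step bit only when the rules in this section allow it.
 * Returns true and updates *ctrl on success, otherwise leaves *ctrl alone.
 */
static bool set_single_step(uint32_t *ctrl, bool enable,
                            bool in_debug_state, bool monitor_mode)
{
    if (!in_debug_state) {
        return false;  /* the bit must only be altered in debug state */
    }
    if (enable && monitor_mode) {
        return false;  /* must not be set when using monitor mode debug */
    }
    if (enable) {
        *ctrl |= DBG_CTRL_SINGLE_STEP;
    } else {
        *ctrl &= ~DBG_CTRL_SINGLE_STEP;
    }
    return true;
}

/* A system speed data access requires the single-step bit to be clear. */
static bool system_speed_access_allowed(uint32_t ctrl)
{
    return (ctrl & DBG_CTRL_SINGLE_STEP) == 0;
}
```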
Let:
Av[31:0] be the value in the address value register
Am[31:0] be the value in the address mask register
A[31:0] be the IA bus from the ARM9E-S if control register bit 3 is clear, or the
DA bus from the ARM9E-S if control register bit 3 is set
Dv[31:0] be the value in the data value register
Dm[31:0] be the value in the data mask register
D[31:0] be the INSTR bus from the ARM9E-S if control register bit 3 is clear, or
the RDATA bus from the ARM9E-S if control register bit 3 is set and the
processor is doing a read, or the WDATA bus from the ARM9E-S if
control register bit 3 is set and the processor is doing a write
Cv[8:0] be the value in the control value register
Cm[7:0] be the value in the control mask register
C[9:0] be the combined control bus from the ARM9E-S, other watchpoint
registers, and the DBGEXT signal.
CHAINOUT signal
Note
There is no CHAIN input to Watchpoint 1 and no CHAIN output from Watchpoint 0.
Take, for example, a request by a debugger to breakpoint on the instruction at location
YYY when running process XXX in a multiprocess system. If the current process ID is
stored in memory, you can implement this function with a watchpoint and a
breakpoint chained together. The watchpoint address value is set to the known memory
location containing the current process ID, the watchpoint data value is set to the
required process ID, and the ENABLE bit of the watchpoint is cleared.
The address comparator output of the watchpoint is used to drive the write enable for
the CHAINOUT latch. The input to the latch is the output of the data comparator from
the same watchpoint. The output of the latch drives the CHAIN input of the breakpoint
comparator. The address YYY is stored in the breakpoint register. When the
CHAIN input is asserted and the breakpoint address matches, the breakpoint triggers
correctly.
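The chained watchpoint and breakpoint can be modeled in C as follows. This is a behavioral sketch only: PID_ADDR, PROCESS_XXX, and BREAK_ADDR_YYY are hypothetical stand-ins for the process ID location, process XXX, and location YYY, and all type and function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

#define PID_ADDR       0x20000000u  /* hypothetical location of the process ID */
#define PROCESS_XXX    42u          /* hypothetical ID of process XXX */
#define BREAK_ADDR_YYY 0x00008000u  /* hypothetical location YYY */

/* State of the CHAINOUT latch between the two comparator units. */
typedef struct {
    bool chainout;
} chain_latch_t;

/*
 * Watchpoint side. The address comparator output is the latch write
 * enable, and the data comparator output is the latch input. The
 * watchpoint ENABLE bit is clear, so it never triggers by itself.
 */
static void on_data_access(chain_latch_t *latch, uint32_t addr, uint32_t data)
{
    bool addr_match = (addr == PID_ADDR);
    bool data_match = (data == PROCESS_XXX);
    if (addr_match) {
        latch->chainout = data_match;
    }
}

/* Breakpoint side. Triggers only when CHAIN is asserted and YYY is fetched. */
static bool on_instruction_fetch(const chain_latch_t *latch, uint32_t addr)
{
    return latch->chainout && (addr == BREAK_ADDR_YYY);
}
```

The latch holds its value across unrelated data accesses, so the breakpoint stays armed until a different process ID is written to PID_ADDR.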
This RANGE input allows you to couple two breakpoints together to form range
breakpoints.
For Watchpoint 1:
For Watchpoint 0:
If Watchpoint 0 matches but Watchpoint 1 does not (that is, the RANGE input to
Watchpoint 0 is 0), the breakpoint is triggered.
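The Watchpoint 0 rule above can be sketched in C by combining two masked address comparisons. The structure and function names are hypothetical, and the example models only the address comparators, not the data or control fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address value and mask for one watchpoint unit (mask bit 1 = ignore). */
typedef struct {
    uint32_t value;
    uint32_t mask;
} wp_addr_t;

static bool wp_addr_match(wp_addr_t wp, uint32_t addr)
{
    return ((wp.value ^ addr) & ~wp.mask) == 0;
}

/*
 * Watchpoint 0 triggers when it matches and its RANGE input, driven by
 * the match result of Watchpoint 1, is 0.
 */
static bool range_breakpoint_hit(wp_addr_t wp0, wp_addr_t wp1, uint32_t addr)
{
    bool range_input = wp_addr_match(wp1, addr);
    return wp_addr_match(wp0, addr) && !range_input;
}
```

For example, with Watchpoint 0 covering 0x100 to 0x1FF (value 0x100, mask 0xFF) and Watchpoint 1 covering 0x100 to 0x11F (value 0x100, mask 0x1F), the breakpoint triggers for addresses 0x120 to 0x1FF only.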
Refer to Chapter 9 AC Parameters for details of the required setup and hold times for
these signals.