CortexM4 FPU


Cortex M4

Floating Point Unit

Overview
FPU : Floating Point Unit
Handles real number computation
Standardized by IEEE 754-2008

Number format
Arithmetic operations
Number conversion
Special values
4 rounding modes
5 exceptions and their handling

ARM Cortex-M FPU ISA


Supports
Add, subtract, multiply, divide
Multiply and accumulate
Square root operations

C language example
float function1(float number1, float number2)
{
    float temp1, temp2;
    temp1 = number1 + number2;
    temp2 = number1/temp1;
    return temp2;
}

With FPU: 1 assembly instruction per operation

# float function1(float number1, float number2)
# {
#     float temp1, temp2;
#
#     temp1 = number1 + number2;
        VADD.F32  S1,S0,S1
#     temp2 = number1/temp1;
        VDIV.F32  S0,S0,S1
#
#     return temp2;
        BX        LR
# }
Without FPU: call to the soft-FPU library

# float function1(float number1, float number2)
# {
        PUSH      {R4,LR}
        MOVS      R4,R0
        MOVS      R0,R1
#     float temp1, temp2;
#
#     temp1 = number1 + number2;
        MOVS      R1,R4
        BL        __aeabi_fadd
        MOVS      R1,R0
#     temp2 = number1/temp1;
        MOVS      R0,R4
        BL        __aeabi_fdiv
#
#     return temp2;
        POP       {R4,PC}
# }

Performance
Execution time comparison for a 29-coefficient FIR filter on 32-bit floats,
with and without the FPU (CMSIS library)
[Figure: bar chart of execution time, "No FPU" vs. "FPU"]
10x improvement
Best compromise between development time and performance
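
To make the benchmark above concrete, here is a minimal sketch of a 29-tap FIR filter in plain C. It is not the CMSIS implementation; the names `NUM_TAPS`, `fir_sample` and `fir_step` are hypothetical and chosen only for illustration. The point is that the inner loop is one float multiply-accumulate per tap, which is exactly the work the FPU executes in hardware and a soft-FPU build has to perform through library calls.

```c
#include <stddef.h>

#define NUM_TAPS 29  /* same filter length as the benchmark (assumed name) */

/*
 * Direct-form FIR: y[n] = sum over k of coeffs[k] * x[n-k].
 * 'history' holds the NUM_TAPS most recent samples, newest first.
 */
float fir_sample(const float coeffs[NUM_TAPS], const float history[NUM_TAPS])
{
    float acc = 0.0f;
    for (size_t k = 0; k < NUM_TAPS; ++k) {
        acc += coeffs[k] * history[k];   /* one multiply-accumulate per tap */
    }
    return acc;
}

/* Shift a new sample into the history buffer and return the filtered output. */
float fir_step(const float coeffs[NUM_TAPS], float history[NUM_TAPS], float input)
{
    for (size_t k = NUM_TAPS - 1; k > 0; --k) {
        history[k] = history[k - 1];     /* age every stored sample by one step */
    }
    history[0] = input;
    return fir_sample(coeffs, history);
}
```

Each output sample costs 29 multiplications and 29 additions, so the gap between one-instruction FPU operations and soft-FPU calls is multiplied by the filter length.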

Rounding issues
The precision has limits
Rounding errors can accumulate over successive operations and
may give inaccurate results (do not do financial calculations with
floating-point numbers)

A few examples
If you are working on two numbers with different exponents, the hardware
automatically denormalizes one of the two so that the calculation is
done with the same exponent; the low-order bits of the smaller number are lost
If you are subtracting two very close numbers, you lose
relative precision (also called cancellation error)
If you reorder the operations, you may not obtain the same result
because of the rounding errors
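
The three effects listed above can be reproduced with a few lines of C. This is a small sketch with hypothetical function names; the `volatile` qualifiers are only there so that the compiler really performs each single precision step instead of simplifying the expression.

```c
/* Exponent alignment: the spacing between floats near 1.0e8f is 8.0f,
 * so adding 1.0f changes nothing once the exponents are aligned. */
int small_term_is_absorbed(void)
{
    volatile float big = 1.0e8f;
    volatile float sum = big + 1.0f;
    return sum == big;
}

/* Cancellation: 1.0000001f is stored as 1 + 2^-23 (about 1.00000012).
 * That tiny input error becomes a large relative error in the difference. */
float cancellation_relative_error(void)
{
    volatile float a = 1.0000001f;
    volatile float b = 1.0f;
    float diff = a - b;                    /* 2^-23, about 1.19e-7, not 1.0e-7 */
    return (diff - 1.0e-7f) / 1.0e-7f;     /* relative error of the result */
}

/* Reordering: (a + b) + c and a + (b + c) round differently. */
int reordering_changes_result(void)
{
    volatile float a = 1.0e8f, b = -1.0e8f, c = 1.0f;
    volatile float left = (a + b) + c;     /* 0 + 1 = 1 */
    volatile float partial = b + c;        /* rounds back to -1.0e8 */
    volatile float right = a + partial;    /* 0 */
    return left != right;
}
```

In the cancellation case the inputs are accurate to about 7 significant digits, yet the computed difference is off by roughly 19%.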

ARM Cortex-M FPU

Introduction
Single precision FPU

Conversion between
Integer numbers
Single precision floating point numbers
Half precision floating point numbers

Handling floating point exceptions (Untrapped)


Dedicated registers
32 single precision registers (S0-S31) which can be viewed as 16
Doubleword registers for load/store operations (D0-D15)
FPSCR for status & configuration

Modifications vs IEEE 754


Full Compliance mode
Process all operations according to IEEE 754

Alternative Half-Precision format


$(-1)^s \times \left(1 + \sum_{i=1}^{10} N_i \, 2^{-i}\right) \times 2^{16}$ for the maximum exponent value, and no infinity or NaN support

Flush-to-zero mode
De-normalized numbers are treated as zero
Associated flags for input and output flush

Default NaN mode


Any operation with a NaN as an input, or that generates a NaN,
returns the default NaN
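
The three ARM-specific modes above can be illustrated with a small software model. This is only a sketch of the described behaviour on bit patterns, not how the hardware is programmed; all function names are hypothetical. It assumes the single precision default NaN is the quiet NaN with bit pattern 0x7FC00000, and it models the alternative half-precision format only for the exponent field 31, the case where it differs from IEEE half precision.

```c
#include <stdint.h>
#include <string.h>

#define DEFAULT_NAN_BITS 0x7FC00000u   /* sign 0, exponent all ones, top fraction bit set */
#define EXP_MASK         0x7F800000u
#define FRAC_MASK        0x007FFFFFu
#define SIGN_MASK        0x80000000u

static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Flush-to-zero model: a denormalized value (exponent field 0, fraction
 * non-zero) is replaced by a zero of the same sign. */
float model_flush_to_zero(float x)
{
    uint32_t bits = float_to_bits(x);
    int is_denormal = ((bits & EXP_MASK) == 0u) && ((bits & FRAC_MASK) != 0u);
    return is_denormal ? bits_to_float(bits & SIGN_MASK) : x;
}

/* Default NaN model: any NaN, whatever its sign or payload, becomes
 * the one default NaN. */
float model_default_nan(float x)
{
    uint32_t bits = float_to_bits(x);
    int is_nan = ((bits & EXP_MASK) == EXP_MASK) && ((bits & FRAC_MASK) != 0u);
    return is_nan ? bits_to_float(DEFAULT_NAN_BITS) : x;
}

/* Alternative half precision, exponent field 31 only (precondition:
 * bits 14..10 of h are all ones): (-1)^s * (1 + fraction/1024) * 2^16. */
float model_alt_half_exp31(uint16_t h)
{
    float sign = (h & 0x8000u) ? -1.0f : 1.0f;
    float frac = (float)(h & 0x03FFu) / 1024.0f;
    return sign * (1.0f + frac) * 65536.0f;
}
```

With this model the encoding 0x7C00, which is +infinity in IEEE half precision, decodes to 65536 in the alternative format, and 0x7FFF decodes to 131008.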

Complete implementation
Cortex-M4F does NOT support all operations of IEEE
754-2008
Full implementation is done by software
Unsupported operations

Remainder (% operator)
Round FP number to integer-value FP number
Binary to decimal conversions
Decimal to binary conversions
Direct comparison of Single Precision (SP) and Double Precision
(DP) values
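
As an illustration of what "done by software" means for two of the operations listed above, here is a hedged sketch. `round_to_integral` shows one classic software technique (not necessarily what a given runtime library does) and relies on the default round-to-nearest mode; the two conversion helpers simply delegate to the standard C library functions `strtof` and `snprintf`. All three function names are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Round to an integer-valued float in software.
 * Adding 2^23 pushes the fraction bits out of the 24-bit significand,
 * so the ordinary addition performs the rounding; subtracting 2^23
 * brings the value back. Inputs with |x| >= 2^23 are already integers. */
float round_to_integral(float x)
{
    const float two23 = 8388608.0f;            /* 2^23 */
    volatile float t;                           /* volatile: keep both steps */
    if (x >= two23 || x <= -two23) {
        return x;
    }
    if (x >= 0.0f) {
        t = x + two23;
        t = t - two23;
    } else {
        t = x - two23;
        t = t + two23;
    }
    return t;
}

/* Decimal text to binary float, performed by the C library in software. */
float decimal_to_binary(const char *text)
{
    return strtof(text, NULL);
}

/* Binary float to decimal text. 9 significant digits are enough to
 * read any single precision value back unchanged. Returns the number
 * of characters that the full text needs. */
int binary_to_decimal(float value, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%.9g", (double)value);
}
```

Ties go to the even integer, so `round_to_integral(2.5f)` gives 2.0f and `round_to_integral(3.5f)` gives 4.0f.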

Floating-Point Status & Control Register


Condition code bits
negative, zero, carry and overflow (updated by compare
operations)

ARM special operating mode configuration


half-precision, default NaN and flush-to-zero mode

The rounding mode configuration


nearest, zero, plus infinity or minus infinity

The exception flags


Inexact result flag may not be routed to the interrupt controller
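
The four groups of FPSCR fields described above can be decoded from a raw register value as sketched below. The type and function names are hypothetical; the bit positions are those of the ARMv7-M FPSCR (N, Z, C, V in bits 31 to 28, AHP/DN/FZ in bits 26 to 24, RMode in bits 23 and 22, cumulative exception flags in bits 7 and 4 to 0). On the target the raw value would typically come from a VMRS instruction; here it is just a function argument so the sketch can be tried on any machine.

```c
#include <stdint.h>

typedef enum {
    FPU_ROUND_NEAREST   = 0,   /* RMode = 0b00 */
    FPU_ROUND_PLUS_INF  = 1,   /* RMode = 0b01 */
    FPU_ROUND_MINUS_INF = 2,   /* RMode = 0b10 */
    FPU_ROUND_ZERO      = 3    /* RMode = 0b11 */
} fpu_rounding_t;

typedef struct {
    unsigned n, z, c, v;           /* condition code bits, 31..28 */
    unsigned alt_half_precision;   /* AHP, bit 26 */
    unsigned default_nan;          /* DN,  bit 25 */
    unsigned flush_to_zero;        /* FZ,  bit 24 */
    fpu_rounding_t rounding;       /* RMode, bits 23..22 */
    unsigned input_denormal;       /* IDC, bit 7 */
    unsigned inexact;              /* IXC, bit 4 */
    unsigned underflow;            /* UFC, bit 3 */
    unsigned overflow;             /* OFC, bit 2 */
    unsigned divide_by_zero;       /* DZC, bit 1 */
    unsigned invalid_operation;    /* IOC, bit 0 */
} fpscr_fields_t;

/* Extract a single bit of the register value. */
static unsigned fpscr_bit(uint32_t value, unsigned pos)
{
    return (unsigned)((value >> pos) & 1u);
}

fpscr_fields_t fpscr_decode(uint32_t fpscr)
{
    fpscr_fields_t f;
    f.n = fpscr_bit(fpscr, 31);
    f.z = fpscr_bit(fpscr, 30);
    f.c = fpscr_bit(fpscr, 29);
    f.v = fpscr_bit(fpscr, 28);
    f.alt_half_precision = fpscr_bit(fpscr, 26);
    f.default_nan        = fpscr_bit(fpscr, 25);
    f.flush_to_zero      = fpscr_bit(fpscr, 24);
    f.rounding           = (fpu_rounding_t)((fpscr >> 22) & 3u);
    f.input_denormal     = fpscr_bit(fpscr, 7);
    f.inexact            = fpscr_bit(fpscr, 4);
    f.underflow          = fpscr_bit(fpscr, 3);
    f.overflow           = fpscr_bit(fpscr, 2);
    f.divide_by_zero     = fpscr_bit(fpscr, 1);
    f.invalid_operation  = fpscr_bit(fpscr, 0);
    return f;
}
```

The five exception flags of IEEE 754 map to the invalid operation, divide-by-zero, overflow, underflow and inexact fields; the input denormal flag is the ARM addition tied to flush-to-zero mode.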

FPU instructions

FPU arithmetic instructions


Operation          Description                          Assembler    Cycles
Absolute value     of float                             VABS.F32     1
Negate             float                                VNEG.F32     1
                   and multiply float                   VNMUL.F32    1
Addition           floating point                       VADD.F32     1
Subtract           float                                VSUB.F32     1
Multiply           float                                VMUL.F32     1
                   then accumulate float                VMLA.F32     3
                   then subtract float                  VMLS.F32     3
                   then accumulate then negate float    VNMLA.F32    3
                   then subtract then negate float      VNMLS.F32    3
Multiply (fused)   then accumulate float                VFMA.F32     3
                   then subtract float                  VFMS.F32     3
                   then accumulate then negate float    VFNMA.F32    3
                   then subtract then negate float      VFNMS.F32    3
Divide             float                                VDIV.F32     14
Square-root        of float                             VSQRT.F32    14
FPU compare & convert instructions


Operation   Description                                     Assembler    Cycles
Compare     float with register or zero                     VCMP.F32     1
            float with register or zero                     VCMPE.F32    1
Convert     between integer, fixed-point, half precision    VCVT.F32     1
            and float
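
To show what a float to fixed-point conversion involves, here is a small sketch for the Q15 format (16-bit signed value with 15 fraction bits). The format choice and the function names are assumptions made for illustration. The sketch truncates toward zero, saturates out-of-range inputs and maps NaN to 0, which is the behaviour this sketch assumes for the hardware conversion to fixed-point.

```c
#include <stdint.h>

/* Q15 to float: divide by 2^15. Exact, every Q15 value fits in a float. */
float q15_to_float(int16_t q)
{
    return (float)q / 32768.0f;
}

/* Float to Q15: scale by 2^15, saturate, then truncate toward zero. */
int16_t float_to_q15(float x)
{
    float scaled = x * 32768.0f;
    if (scaled != scaled) {            /* NaN is the only value unequal to itself */
        return 0;
    }
    if (scaled >= 32767.0f) {
        return INT16_MAX;              /* saturate at the largest Q15 value */
    }
    if (scaled <= -32768.0f) {
        return INT16_MIN;              /* saturate at the smallest Q15 value */
    }
    return (int16_t)scaled;            /* C cast truncates toward zero */
}
```

For instance, 0.5 converts to 16384 and back to 0.5 exactly, while 2.0 saturates to 32767.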

FPU Load/Store Instructions


Operation   Description                                       Assembler    Cycles
Load        multiple doubles (N doubles)                      VLDM.64      1+2*N
            multiple floats (N floats)                        VLDM.32      1+N
            single double                                     VLDR.64      3
            single float                                      VLDR.32      2
Store       multiple double registers (N doubles)             VSTM.64      1+2*N
            multiple float registers (N floats)               VSTM.32      1+N
            single double register                            VSTR.64      3
            single float register                             VSTR.32      2
Move        top/bottom half of double to/from core register   VMOV         1
            immediate/float to float-register                 VMOV         1
            two floats/one double to/from core registers      VMOV         2
            one float to/from core register                   VMOV         1
            floating-point control/status to core register    VMRS         1
            core register to floating-point control/status    VMSR         1
Pop         double registers from stack                       VPOP.64      1+2*N
            float registers from stack                        VPOP.32      1+N
Push        double registers to stack                         VPUSH.64     1+2*N
            float registers to stack                          VPUSH.32     1+N
