Cortex-M4 FPU
Overview
FPU: Floating-Point Unit
Handles real-number computation
Standardized by IEEE 754-2008
Number format
Arithmetic operations
Number conversion
Special values
4 rounding modes
5 exceptions and their handling
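As an illustration of the number format item above, the sketch below splits a single-precision value into its sign, exponent and fraction fields. It assumes that `float` is the IEEE 754 binary32 format (true on the Cortex-M4F and on common desktop compilers); the type `float_fields_t` and the function `float_decode` are hypothetical names introduced only for this example.

```c
#include <stdint.h>
#include <string.h>

/* Bit fields of an IEEE 754 single-precision (binary32) value. */
typedef struct {
    uint32_t sign;      /* 1 bit: 0 = positive, 1 = negative */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 bits, without the implicit leading 1 */
} float_fields_t;

/* Hypothetical helper: split a float into its raw bit fields.
 * Assumes float is binary32 and has the same size as uint32_t. */
static float_fields_t float_decode(float value)
{
    uint32_t bits;
    float_fields_t f;

    memcpy(&bits, &value, sizeof bits);  /* well-defined way to read the bits */
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

For example, decoding `1.0f` gives a sign of 0, a biased exponent of 127 and a fraction of 0.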
C language example
float function1(float number1, float number2)
{
    float temp1, temp2;
    temp1 = number1 + number2;
    temp2 = number1 / temp1;
    return temp2;
}
With FPU: each operation compiles to 1 assembly instruction
Without FPU: each operation calls the soft-FPU library
Performance
Execution time comparison for a 29-coefficient FIR filter on 32-bit floats, with and without the FPU (CMSIS library)
[Figure: execution time, No FPU vs. FPU]
10x improvement with the FPU
Best compromise between development time and performance
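To make the benchmarked workload concrete, here is a minimal plain-C sketch of a 29-tap FIR filter on 32-bit floats. It is not the CMSIS implementation used for the measurement; the names `fir_t`, `fir_step` and `NUM_TAPS` are assumptions made for this example. Each tap costs one multiply and one add, which is the kind of operation that becomes either a single FPU instruction or a soft-FPU library call.

```c
#include <stddef.h>

#define NUM_TAPS 29u  /* same filter length as the benchmark above */

/* Hypothetical filter object: coefficients plus a delay line.
 * state[0] is the newest input sample, state[NUM_TAPS - 1] the oldest. */
typedef struct {
    float coeffs[NUM_TAPS];
    float state[NUM_TAPS];
} fir_t;

/* Process one input sample and return one output sample. */
static float fir_step(fir_t *fir, float input)
{
    float acc = 0.0f;
    size_t k;

    /* Shift the delay line by one sample and insert the new input. */
    for (k = NUM_TAPS - 1u; k > 0u; k--) {
        fir->state[k] = fir->state[k - 1u];
    }
    fir->state[0] = input;

    /* y[n] = sum over k of h[k] * x[n - k] */
    for (k = 0u; k < NUM_TAPS; k++) {
        acc += fir->coeffs[k] * fir->state[k];
    }
    return acc;
}
```

With all coefficients set to 1/29 (a moving average) and a constant input of 1.0, the output settles close to 1.0 once the delay line is full.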
Rounding issues
Precision is limited
Rounding errors can accumulate across successive operations and
may produce inaccurate results (do not use floating point for
financial calculations)
A few examples
When two operands have different exponents, the hardware
automatically denormalizes one of them so that the calculation is
done with a common exponent; low-order bits of the smaller operand may be lost
When subtracting two very close numbers, relative precision is
lost (also called cancellation error)
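The three effects above can be reproduced with a few lines of C. This is a sketch under the assumption that `float` is IEEE 754 single precision; the function names are hypothetical, and `volatile` is used only to make sure every intermediate result is really rounded to single precision.

```c
/* Accumulation: add the same step many times in single precision. */
static float sum_repeated(float step, unsigned long count)
{
    volatile float sum = 0.0f;  /* volatile: round to float at every step */
    unsigned long i;

    for (i = 0; i < count; i++) {
        sum += step;
    }
    return sum;
}

/* Exponent alignment: the small operand is shifted to match the
 * exponent of the large one and may lose some or all of its bits. */
static float add_large_small(float large, float small)
{
    volatile float result = large + small;
    return result;
}

/* Cancellation: the leading digits cancel, so the rounding error
 * already present in the operands dominates the result. */
static float subtract_close(float a, float b)
{
    volatile float result = a - b;
    return result;
}
```

Adding `0.1f` one million times does not give exactly 100000, because each addition is rounded. `add_large_small(16777216.0f, 1.0f)` returns 16777216 (2^24), since the 1 is lost during alignment. `subtract_close(1.0000001f, 1.0f)` returns about 1.19e-7 rather than 1e-7, because 1.0000001 is first rounded to the nearest representable float.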
Introduction
Single precision FPU
Conversion between
Integer numbers
Single precision floating point numbers
Half precision floating point numbers
Flush-to-zero mode
Denormalized numbers are treated as zero
Associated flags for input and output flush
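The effect of flush-to-zero on inputs can be modelled in portable C, as in the sketch below. It is a software model only, not the way the mode is enabled on the target (there it is selected by the FZ bit of the FPSCR register). The function names are hypothetical, and `float` is assumed to be IEEE 754 single precision.

```c
#include <float.h>

/* Returns 1 if x is a denormalized number: non-zero, but smaller in
 * magnitude than FLT_MIN, the smallest normalized float. */
static int is_denormal(float x)
{
    float mag = (x < 0.0f) ? -x : x;
    return (mag > 0.0f) && (mag < FLT_MIN);
}

/* Software model of input flush-to-zero: denormalized values are
 * replaced by zero, all other values pass through unchanged. */
static float flush_to_zero(float x)
{
    if (is_denormal(x)) {
        return (x < 0.0f) ? -0.0f : 0.0f;
    }
    return x;
}
```

For instance, `FLT_MIN / 2.0f` is a denormalized number, so the model returns zero for it, while `1.5f` is returned unchanged.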
Complete implementation
The Cortex-M4F does NOT support all operations of IEEE 754-2008 in hardware
Full compliance is achieved with software support
Unsupported operations
Remainder (% operator)
Round FP number to integer-value FP number
Binary to decimal conversions
Decimal to binary conversions
Direct comparison of Single Precision (SP) and Double Precision (DP) values
FPU instructions
Operation         Description                                 Assembler    Cycles
Absolute value    of float                                    VABS.F32     1
Negate            float                                       VNEG.F32     1
                  and multiply float                          VNMUL.F32    1
Addition          floating point                              VADD.F32     1
Subtract          float                                       VSUB.F32     1
Multiply          float                                       VMUL.F32     1
                  then accumulate float                       VMLA.F32     3
                  then subtract float                         VMLS.F32     3
                  then accumulate then negate float           VNMLA.F32    3
                  then subtract then negate float             VNMLS.F32    3
Multiply (fused)  then accumulate float                       VFMA.F32     3
                  then subtract float                         VFMS.F32     3
                  then accumulate then negate float           VFNMA.F32    3
                  then subtract then negate float             VFNMS.F32    3
Divide            float                                       VDIV.F32     14
Square-root       of float                                    VSQRT.F32    14
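The cycle counts of VADD.F32, VMLA.F32 and VDIV.F32 listed above give a rough lower bound on the arithmetic cost of a routine. The sketch below adds them up for `function1` from the C language example (one addition and one division) and for a chain of multiply-accumulate operations. It is a back-of-the-envelope estimate that ignores loads, stores, branches and any overlap between instructions; the names are hypothetical.

```c
/* Cycle counts of the three instructions used in this estimate. */
enum {
    CYCLES_VADD = 1,
    CYCLES_VMLA = 3,
    CYCLES_VDIV = 14
};

/* Arithmetic cycles of function1: one VADD.F32 plus one VDIV.F32. */
static unsigned function1_arith_cycles(void)
{
    return CYCLES_VADD + CYCLES_VDIV;
}

/* Arithmetic cycles of `taps` back-to-back multiply-accumulates,
 * assuming each one is issued as a single VMLA.F32. */
static unsigned mac_cycles(unsigned taps)
{
    return taps * CYCLES_VMLA;
}
```

Under these assumptions `function1` needs 15 cycles of arithmetic, and 29 multiply-accumulates need 87.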
Operation         Description                                 Assembler    Cycles
Compare           float with register or zero                 VCMP.F32     1
                  float with register or zero                 VCMPE.F32    1
Convert           between integer, fixed-point,               VCVT.F32     1
                  half precision and float
Operation         Description                                           Assembler    Cycles
Load              multiple doubles (N doubles)                          VLDM.64      1+2*N
                  multiple floats (N floats)                            VLDM.32      1+N
                  single double                                         VLDR.64      3
                  single float                                          VLDR.32      2
Store             multiple double registers (N doubles)                 VSTM.64      1+2*N
                  multiple float registers (N floats)                   VSTM.32      1+N
                  single double register                                VSTR.64      3
                  single float register                                 VSTR.32      2
Move              top/bottom half of double to/from core register       VMOV         1
                  immediate/float to float-register                     VMOV         1
                  two floats/one double to/from core registers          VMOV         2
                  one float to/from core register                       VMOV         1
                  floating-point control/status to core register        VMRS         1
                  core register to floating-point control/status        VMSR         1
Pop               double registers from stack                           VPOP.64      1+2*N
                  float registers from stack                            VPOP.32      1+N
Push              double registers to stack                             VPUSH.64     1+2*N
                  float registers to stack                              VPUSH.32     1+N