stm32: Add support for STM32N6xx MCUs and two N6 boards #17171

Open
wants to merge 36 commits into
base: master

dpgeorge
Member

@dpgeorge dpgeorge commented Apr 23, 2025

Summary

This PR adds preliminary support for ST's new STM32N6xx MCUs.

Supported features of this MCU so far are:

  • basic clock tree initialisation, running at 800MHz
  • fully working USB
  • mboot support (required, because there's no internal flash)
  • XSPI in memory-mapped mode
  • machine.Pin
  • machine.UART
  • SD card
  • filesystem
  • ROMFS
  • WiFi and BLE via cyw43-driver (SDIO and UART backend)

Supported boards:

  • NUCLEO_N657X0
  • STM32N6570_DK Edit: this board is no longer supported, can be added later if needed
  • OPENMV_N6

Note that the N6 does not have internal flash, and has some tricky boot sequence, so using a custom bootloader (mboot) is almost a necessity.

The ST CMSIS and HAL files are added verbatim here, but will eventually be moved into stm32lib. Edit: N6 CMSIS and HAL files are now in stm32lib.

OpenMV have generously sponsored the development of this port.

Testing

This PR has been tested on the two N6 boards that are added here.

github-actions bot commented Apr 23, 2025

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:   +12 +0.003% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +0 +0.000% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

codecov bot commented Apr 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.57%. Comparing base (b153484) to head (2143232).
Report is 88 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #17171      +/-   ##
==========================================
+ Coverage   98.54%   98.57%   +0.02%     
==========================================
  Files         169      169              
  Lines       21910    21968      +58     
==========================================
+ Hits        21591    21654      +63     
+ Misses        319      314       -5     

// TODO: if (HAL_PWREx_ConfigSupply(PWR_EXTERNAL_SOURCE_SUPPLY ) != HAL_OK)
//xspi_flash_init();
}

Contributor

Suggested change
#define OMV_BOOT_MAGIC_ADDR   (0x3401FFFCU)
#define OMV_BOOT_MAGIC_VALUE  (0xB00710ADU)

void board_enter_bootloader(void) {
    *((uint32_t *) OMV_BOOT_MAGIC_ADDR) = OMV_BOOT_MAGIC_VALUE;
    SCB_CleanDCache();
    NVIC_SystemReset();
}

To enter our bootloader.

Member Author

Do you want to have the OPENMV_N6 board definition included in MicroPython's stm32 port (like with the OPENMV_AE3)? If so, it should work with mboot. But probably also a good idea to work with the OpenMV bootloader, which means adding this code.

Related: how did you choose address 0x3401FFFC? That's part way through SRAM1, in the FLEXRAM area. You'd need to make sure that isn't cleared on reset like the rest of SRAM1/2, and also make sure it's not overwritten by the bootloader.

Contributor

Do you want to have the OPENMV_N6 board definition included in MicroPython's stm32 port (like with the OPENMV_AE3)? If so, it should work with mboot. But probably also a good idea to work with the OpenMV bootloader, which means adding this code.

Either way is fine with me. If mboot support is kept, I can always use MP_CONFIGFILE to override and not build mboot when building our firmware.

Related: how did you choose address 0x3401FFFC? That's part way through SRAM1, in the FLEXRAM area. You'd need to make sure that isn't cleared on reset like the rest of SRAM1/2, and also make sure it's not overwritten by the bootloader.

Good point. The bootloader uses the 128K (or 64K) DTCM for its memory (heap, stack, etc.), so this address is typically 0x2001FFFCU (the last word of the bootloader's memory), and it's the same for almost all boards. However, for some reason, this doesn't work on the N6, so I just used SRAM1. Note that SRAM1 and SRAM2 don't seem to get erased on reset (otherwise this wouldn't work). It would still be better to use DTCM for this, but that's not working.

Contributor

Oh I think I just needed a DSB, seems the write might be buffered. The following works too:

#define OMV_BOOT_MAGIC_ADDR   (0x3001FFFCU)
#define OMV_BOOT_MAGIC_VALUE  (0xB00710ADU)

void board_enter_bootloader(void) {
    *((uint32_t *) OMV_BOOT_MAGIC_ADDR) = OMV_BOOT_MAGIC_VALUE;
    __DSB();
    NVIC_SystemReset();
}
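
For readers unfamiliar with this kind of handshake, here is a minimal host-side sketch of the idea discussed above: the application stores a magic word in a RAM location that survives reset, and the bootloader checks and clears it on the next boot. The function names and the `slot` parameter are hypothetical and exist only for illustration. On real hardware the slot would be OMV_BOOT_MAGIC_ADDR, the write would be followed by a cache clean or `__DSB()`, and then by `NVIC_SystemReset()`. How the OpenMV bootloader actually consumes the value is not shown in this thread, so the "consume" side is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOT_MAGIC_VALUE (0xB00710ADu)

// Address of the last 32-bit word of a RAM region (base + size - 4).
uint32_t ram_last_word_addr(uint32_t base, uint32_t size_bytes) {
    return base + size_bytes - (uint32_t)sizeof(uint32_t);
}

// Application side (hypothetical helper): request bootloader entry.
// On hardware this is followed by a cache clean / DSB and a system reset.
void boot_magic_request(volatile uint32_t *slot) {
    *slot = BOOT_MAGIC_VALUE;
}

// Bootloader side (assumed behaviour): report whether entry was requested,
// and clear the slot so the request is honoured only once.
bool boot_magic_consume(volatile uint32_t *slot) {
    if (*slot != BOOT_MAGIC_VALUE) {
        return false;
    }
    *slot = 0;
    return true;
}
```

With a base of 0x30000000 and a size of 128K, `ram_last_word_addr()` gives 0x3001FFFC, which is the address used in the snippet above.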

Contributor

@iabdalkader iabdalkader Apr 27, 2025

@dpgeorge There seems to be some issue with DTCM: I can't access any address above ~0x1000. This causes a fault on boot when it tries to access the magic number address (oddly enough, it doesn't always happen). Perhaps it has something to do with security config, FLEXRAM, clocks or something else that gets enabled by the main firmware, but even from the main firmware I still can't access this memory from gdb. Anyway, let's please keep the original boot address (with cache clean).

@dpgeorge dpgeorge force-pushed the stm32-add-n6-support branch 2 times, most recently from 562ed7c to 6cb319e Compare April 30, 2025 15:25
MICROPY_PY_NETWORK_CYW43 = 1
MICROPY_PY_SSL = 1
MICROPY_SSL_MBEDTLS = 1
MICROPY_VFS_LFS2 = 1

Contributor

I need to be able to disable mboot, so please use USE_MBOOT ?= 1. Also, we never use LFS2, just FAT.

Member Author

I've made all these options use ?=.

But note that the way it's configured you'll probably want to enable USE_MBOOT because that puts the firmware in external flash. Otherwise MicroPython runs from RAM.

Contributor

Is there any way to build a flash-based image just without mboot? Like an option to do so, that gets forced to one if mboot is enabled, otherwise is user-defined?

Member Author

I think that's just USE_MBOOT=1. I guess it should be called USE_BOOTLOADER=1 but for consistency it's the mboot option.

That option is anyway local to the board (nothing outside the board uses this config option, except mboot itself). The option controls:

  1. which ld scripts to use
  2. location of .text

Contributor

But what if you don't want to build mboot, yet still want a flash-based image? Can USE_MBOOT=1 define something like MICROPY_FLASH_BASED=1? For example:

USE_MBOOT ?= 1
MICROPY_FLASH_BASED ?= $(USE_MBOOT)

This way I can define USE_MBOOT=0 MICROPY_FLASH_BASED=1 to get a flash-based image without mboot.

Member Author

Mboot is not built unless you explicitly do make in the ports/stm32/mboot directory.

But, I can do what you suggest, it makes sense.

#define MICROPY_HW_ENABLE_RNG (0)
#define MICROPY_HW_ENABLE_RTC (0)
#define MICROPY_HW_ENABLE_RTC (1)

Contributor

For our bootloader, we need the following:

#define MICROPY_HW_ENTER_BOOTLOADER_VIA_RESET   (0)

extern void board_early_init(void);
#define MICROPY_BOARD_EARLY_INIT    board_early_init

extern void board_enter_bootloader(void);
#define MICROPY_BOARD_ENTER_BOOTLOADER(nargs, args) board_enter_bootloader()

With the following code in board.c (which could be gated if mboot is enabled):

#include STM32_HAL_H
#include "py/mphal.h"

#define OMV_BOOT_MAGIC_ADDR   (0x3401FFFCU)
#define OMV_BOOT_MAGIC_VALUE  (0xB00710ADU)

void board_early_init(void) {

}

void board_enter_bootloader(void) {
    *((uint32_t *) OMV_BOOT_MAGIC_ADDR) = OMV_BOOT_MAGIC_VALUE;
    SCB_CleanDCache();
    NVIC_SystemReset();
}

Member Author

I've now made the board work with both the OpenMV bootloader and mboot.

LL_AHB5_GRP1_EnableClockLowPower(LL_AHB5_GRP1_PERIPH_OTGPHY1);

// Select 24MHz clock.
MODIFY_REG(USB1_HS_PHYC->USBPHYC_CR, USB_USBPHYC_CR_FSEL, 2 << USB_USBPHYC_CR_FSEL_Pos);

Contributor

There's a proper init sequence for this in the examples in the HAL, if you want to take a look. I think it's more or less the same, but there were some delays, more force_reset/release etc...

Also, could you please add this? I use it in HS mode.

diff --git a/ports/stm32/usbdev/class/inc/usbd_cdc_msc_hid.h b/ports/stm32/usbdev/class/inc/usbd_cdc_msc_hid.h
index 3a87896b4..5906f95e0 100644
--- a/ports/stm32/usbdev/class/inc/usbd_cdc_msc_hid.h
+++ b/ports/stm32/usbdev/class/inc/usbd_cdc_msc_hid.h
@@ -11,7 +11,7 @@
 
 // Work out if we should support USB high-speed device mode
 #if MICROPY_HW_USB_HS \
-    && (!MICROPY_HW_USB_HS_IN_FS || defined(STM32F723xx) || defined(STM32F733xx))
+    && (!MICROPY_HW_USB_HS_IN_FS || defined(STM32F723xx) || defined(STM32F733xx) || defined(STM32N6))
 #define USBD_SUPPORT_HS_MODE (1)
 #else
 #define USBD_SUPPORT_HS_MODE (0)

Member Author

I've added the USBD_SUPPORT_HS_MODE option as above.

But testing it with #15909 shows that there is data corruption when HS is enabled. That needs investigation.

@dpgeorge dpgeorge force-pushed the stm32-add-n6-support branch from 43eb2ba to 653f480 Compare May 16, 2025 06:34
@dpgeorge dpgeorge force-pushed the stm32-add-n6-support branch 3 times, most recently from 32f5690 to 5fb6eff Compare May 23, 2025 14:02
@iabdalkader
Contributor

The N6 is missing from the FPU filter list in micropython/ports/stm32/stm32.mk.

@dpgeorge dpgeorge force-pushed the stm32-add-n6-support branch from 75dc026 to 5522b3b Compare May 28, 2025 07:37
@dpgeorge
Member Author

The N6 is missing from the FPU filter list

OK, now fixed.

@dpgeorge dpgeorge force-pushed the stm32-add-n6-support branch 2 times, most recently from dbb5086 to 5c6c893 Compare May 30, 2025 03:53
@dpgeorge
Member Author

I finally got the filesystem working. Needed to make sure all relevant code and data structures to erase/write SPI flash is in RAM, and that interrupts are fully disabled during erase/write (obvious but tricky to do).

@iabdalkader
Contributor

iabdalkader commented May 31, 2025

I pulled this PR and tested it again. Things still seem to be working, and I noticed some improvements (like no longer calling xspi_init() from ioctl). However, I still couldn’t get the storage working. I get a hardfault when trying to initialize the XSPI PSRAM (XSPI1) after storage_init() has been called. I did move the drivers and IRQs to ITCM, following what's done here, but I think it's a different issue. It might be because it remaps XSPI back to 111 mode, when the bus is released, which isn't what we use initially. Or that it changes something in the common XSPI config.

Update1: I get a hardfault on the first function call (to any function) after storage_init(), so it's likely because of the remap to 111.

Update2: Yes it's the remap, if I change it to 888 it works but then it fails to create a filesystem (likely because it tries to read/write in 111). Also, I'm not able to move memcpy to RAM, why is this needed? Note that we link to libc, so memcpy likely has more dependencies that will also need to be moved to RAM. I'm not sure how to fix this one.

As for CYW43, I just skipped testing it altogether, as it's still using SPI. Other than that, I don’t think this is ready to coexist with older MCUs yet, for example, it comments out a large chunk of powerctrl.c, so we should probably wait for now.

Comment on lines 21 to 45
.isr_vector :
{
. = ALIGN(4);
__isr_vector_ram_start = .;

KEEP(*(.isr_vector)) /* Startup code */

/* These functions need to run from ram to enable uart
reception during flash erase/write operations.
Defining them here ensures they're copied from
flash (in main.c) along with the isr_vector above.
*/
. = ALIGN(4);
*(.text.mp_sched_keyboard_interrupt)
*(.text.pendsv_schedule_dispatch)
*(.text.storage_systick_callback)
*(.text.SysTick_Handler)
*(.text.uart_irq_handler)
*(.text.UART*_IRQHandler)
*(.text.USART*_IRQHandler)
*(.text.HAL_GetTick)
*drivers/memory/spiflash.o(.text* .rodata*)
*boards*(.rodata.spiflash_config*)

. = ALIGN(4);

Contributor

Can we use function attributes to put these in a ram_func section instead? There's an example for this in mimxrt, and then we just need to add asm code to copy this section to where it needs to go (also an example in mimxrt's startup code). This is much cleaner and will not need the code in main.c.

Member Author

There are quite a few functions that need to go in RAM, and even some HAL functions (which are inline... but the compiler still makes them non-inline). So I think it's simpler to configure that in a linker script.

There's already asm code in resethandler_m3_iram.s that does the copy. The code in main.c that copies to RAM was already there before N6 support.

Contributor

even some HAL functions (which are inline... but the compiler still makes them non-inline).

Could it be the -Os ?

There's already asm code in resethandler_m3_iram.s that does the copy. The code in main.c that copies to RAM was already there before N6 support.

So code in main is not really needed? Just for my own understanding, why do we need so much code (including HAL GPIO functions) in RAM? Is it needed when waking up from STOP mode? I assume standby doesn't really need that because it resets on wakeup.

Member Author

I've now cleaned this IRAM stuff up, and there's now only a single RAM section that needs to be copied. That copy is done in asm in resethandler_m3_iram.s.
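
To illustrate what copying a single RAM section at startup amounts to, here is a minimal C sketch of the word-by-word copy from the section's load address (in flash) to its run address (in RAM). This is not the actual code in resethandler_m3_iram.s, which is written in asm and is not shown in this thread. The function and parameter names are hypothetical, and in a real startup file the three pointers would come from linker-script symbols.

```c
#include <stdint.h>

// Copy a section word by word from its load address to its run address.
// `load` points at the section image in flash; `run_start`/`run_end`
// delimit the destination in RAM (run_end is one past the last word).
void copy_section(const uint32_t *load, uint32_t *run_start, uint32_t *run_end) {
    while (run_start < run_end) {
        *run_start++ = *load++;
    }
}
```

The copy has to run before anything in the RAM section is called, which is why it sits in the reset handler rather than in later C code.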

Just for my own understanding, why do we need so much code (including HAL GPIO functions) in RAM? Is it needed when waking up from STOP mode? I assume standby doesn't really need that because it resets on wakeup.

Code in RAM is needed for two things:

  • Writing to flash, which needs all the SPI flash logic from drivers/memory/spiflash.o and XSPI code
  • Waking from standby! The N6 has SRAM1 retention during standby and a special SYSCFG register that you can set to tell the MCU where to resume when waking from standby. This allows it to wake up much faster from standby by executing code straightaway from SRAM1, instead of going through the bootloaders. So there needs to be enough code in RAM to be able to act like a mini bootloader that can do just enough to enable XSPI memory mapped mode.

Contributor

@dpgeorge - This is for light sleep or deepsleep? Deepsleep should always result in a reboot after waking up. Awesome for light sleep though.

Contributor

@kwagyeman kwagyeman Jun 4, 2025

Contributor

@dpgeorge @kwagyeman Thanks for explaining and for the app note! The app note seems to imply that this is optional, so if not retained it should NVIC reset. Perhaps this is worth a support question? My goal here is to minimize the RAM code to just xspi.c, eventually use the gcc section attribute on all functions and data in that file, and then all we need in our firmware is a ram_func section and copy code in the startup code.

Writing to flash, which needs all the SPI flash logic from drivers/memory/spiflash.o and XSPI code

Is spiflash.o really needed in RAM or is it a speed optimization? Could we get away with just the XSPI code as it does the actual erase/write?

Member Author

Is spiflash.o really needed in RAM or is it a speed optimization? Could we get away with just the XSPI code as it does the actual erase/write?

Yes it's needed. That code has all the logic to set WREN and wait for the erase to complete, for example.

Also, as mentioned, if the SPI flash is in DTR mode then it won't work with the ST ROM bootloader if the SPI flash is not fully reset upon waking from standby/deepsleep. And I don't think it is fully reset when waking from standby. If so, that would need code in SRAM1 in retention mode to reset it, before doing an NVIC reset (if that even works, I couldn't get it to bounce into the ST ROM bootloader).

Member Author

It's only using about 4k of SRAM1 at the moment. Is that really such an issue? We could try to optimise that down a bit. Eg maybe it doesn't need the app ISR moved there (although I'd suggest doing that regardless, for IRQ performance reasons.)

Contributor

if the SPI flash is in DTR mode then it won't work with the ST ROM bootloader if the SPI flash is not fully reset upon waking from standby/deepsleep

The boards are designed such that they reset the flash on NVIC reset (same should apply for devkits, we're following the reference design here), so it should fully reset the flash if an NVIC reset occurs. This feature is crucial for them to function, otherwise machine.reset() wouldn't work. So normally you just need to ensure an NVIC reset occurs somehow. How about using the address of something like this instead? Would this simplify things a bit?

__attribute__((noreturn, section(".ram_function"))) void board_reset(void) {
    // NVIC_SystemReset doesn't get inlined here.
    SCB->AIRCR  = (uint32_t)((0x5FAUL << SCB_AIRCR_VECTKEY_Pos) |
                             (SCB->AIRCR & SCB_AIRCR_PRIGROUP_Msk) |
                             SCB_AIRCR_SYSRESETREQ_Msk);
    __DSB();
    for (;;) {
        __NOP();
    }
}

Is that really such an issue? We could try to optimise that down a bit. Eg maybe it doesn't need the app ISR moved there (although I'd suggest doing that regardless, for IRQ performance reasons.)

Probably not, it's not the memory size I'm concerned about, in fact I put all of that code in ITCM which is not used at all, it's just that I use a shared linker script so I'm trying to figure out the bare minimum customization needed for the N6.


KEEP(*(.isr_vector)) /* Startup code */

/* These functions need to run from ram to enable uart

Contributor

I assume this is because IRQs are being disabled for too long? If that’s the case, would it help to re-enable IRQs after each block erase/write operation, rather than keeping them disabled across all blocks? If I’m not wrong, this is how it's done in the mimxrt and rp2 ports, and it might help avoid the need to move all those functions and their data to RAM.

Member Author

Actually, the UART functions don't need to go in RAM anymore. I've fully disabled IRQs during XSPI flash erase/write. It didn't seem to work if I just raised the IRQ priority level.

And the code in spibdev.c does enable IRQs after each block erase.
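
The pattern described here (IRQs fully off during each erase, then back on between blocks) can be sketched on a host as follows. Everything in this block is a stand-in invented for illustration: `irq_disable()`, `irq_enable()`, `flash_erase_block()`, `erase_range()` and the 4096-byte block size are assumptions, not the functions or values used in spibdev.c or the XSPI driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SIZE (4096u)  // assumed erase-block size, for illustration only

static bool irq_enabled = true;     // stand-in for the CPU interrupt mask
static unsigned irq_windows = 0;    // how many times IRQs were re-enabled
static unsigned blocks_erased = 0;  // how many blocks were erased

static void irq_disable(void) { irq_enabled = false; }
static void irq_enable(void) { irq_enabled = true; irq_windows++; }

// Stand-in for the low-level erase. It refuses to run with IRQs enabled,
// mirroring the requirement that nothing else touches the flash meanwhile.
static bool flash_erase_block(uint32_t addr) {
    (void)addr;
    if (irq_enabled) {
        return false;
    }
    blocks_erased++;
    return true;
}

// Erase nblocks blocks, re-enabling IRQs after each one so that pending
// interrupts are serviced between blocks instead of after the whole range.
bool erase_range(uint32_t addr, unsigned nblocks) {
    for (unsigned i = 0; i < nblocks; i++) {
        irq_disable();
        bool ok = flash_erase_block(addr + i * BLOCK_SIZE);
        irq_enable();
        if (!ok) {
            return false;
        }
    }
    return true;
}
```

The interrupt latency is then bounded by the time of one block operation rather than by the size of the whole request.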

@dpgeorge
Member Author

dpgeorge commented Jun 1, 2025

Yes it's the remap, if I change it to 888 it works but then it fails to create a filesystem

What read mode do you use for 888? Do you enable the flash in single or double data rate for octal mode?

Note that all SPI flash data structures must be in RAM as well. Check that your mp_spiflash_config_t is placed in RAM.

Also, I'm not able to move memcpy to RAM, why is this needed?

memcpy and memset are sometimes used by the compiler to optimise code, and it's hard to stop the compiler doing that. I'm not sure exactly where memcpy is used but that was necessary for me to get it waking up from deepsleep.

@iabdalkader
Contributor

iabdalkader commented Jun 1, 2025

What read mode do you use for 888? Do you enable the flash in single or double data rate for octal mode?
Note that all SPI flash data structures must be in RAM as well. Check that your mp_spiflash_config_t is placed in RAM.

The bootloader maps the flash in octal DTR. Yes, I copied the isr_vector linker script section plus *xspi.o(.text* .rodata*);. I made sure they're in ITCM by checking the firmware map. I get a hardfault with 111 on the very first instruction of the first function call after the remap. If I uncomment and use 888, this doesn't happen, but fatfs fails to create a file system. I see that XSPI attempts to read (write?) in 111 later, so maybe that's why.

memcpy and memset are sometimes used by the compiler to optimise code, and it's hard to stop the compiler doing that. I'm not sure exactly where memcpy is used but that was necessary for me to get it waking up from deepsleep.

memcpy hardfaults when moved to ram, I think it's because this is libc's memcpy, so maybe it has other deps that need to be moved too. Something here might work: https://chatgpt.com/share/683bf9a7-7a94-8013-9a7e-b4b3c91705dc

@dpgeorge dpgeorge force-pushed the stm32-add-n6-support branch from 5c6c893 to 70347c8 Compare June 1, 2025 13:35
@dpgeorge
Member Author

dpgeorge commented Jun 2, 2025

I nearly have 8-8-8 octal DTR mode working with the filesystem. Just a few things to tidy up...

@iabdalkader
Contributor

I nearly have 8-8-8 octal DTR mode working with the filesystem. Just a few things to tidy up...

I have it working here, but I use HAL drivers: https://github.com/openmv/openmv/blob/2206dcb31c2a854c79e83cd62d6b55939f6c351a/boot/src/ports/stm32/stm32_xspi.c#L502

@dpgeorge
Member Author

dpgeorge commented Jun 2, 2025

I have it working here, but I use HAL drivers

Yes, thanks, I was studying that code (you use HAL_XSPI_INSTRUCTION_8_BITS when I think it should be HAL_XSPI_INSTRUCTION_16_BITS?).

@iabdalkader
Contributor

iabdalkader commented Jun 2, 2025

Yes, thanks, I was studying that code (you use HAL_XSPI_INSTRUCTION_8_BITS when I think it should be HAL_XSPI_INSTRUCTION_16_BITS?).

Yes, I think you're right. It should have been a test like the others:

(dtr) ? HAL_XSPI_INSTRUCTION_16_BITS : HAL_XSPI_INSTRUCTION_8_BITS

It's weird but it seems to be working fine either way, but I've fixed it anyway.

Note that we'll be using an 8/24MBs split for the FS/ROMFS (for both N6 and AE3). It's just more useful to have a bigger ROMFS for models. I'll send a PR later to update the AE3.

EDIT: For the N6, it's actually 4MB (bootloader + firmware), 4MB filesystem and 24MB ROMFS.
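
As a quick check of that N6 split, here is a small sketch of the layout as a partition table. The sizes come from the comment above. The ordering, the zero base offset, the binary interpretation of "MB", and all identifiers (`flash_part_t`, `n6_layout`, the helper functions) are assumptions made for illustration and are not taken from the board configuration.

```c
#include <stdint.h>

#define MiB (1024u * 1024u)

typedef struct {
    const char *name;
    uint32_t offset;  // offset from the start of the external flash
    uint32_t size;    // size in bytes
} flash_part_t;

// Assumed ordering: boot + firmware first, then filesystem, then ROMFS.
static const flash_part_t n6_layout[] = {
    {"boot+firmware", 0 * MiB, 4 * MiB},
    {"filesystem", 4 * MiB, 4 * MiB},
    {"romfs", 8 * MiB, 24 * MiB},
};

#define N6_LAYOUT_LEN (sizeof(n6_layout) / sizeof(n6_layout[0]))

// Return 1 if each partition starts exactly where the previous one ends.
int layout_is_contiguous(const flash_part_t *parts, unsigned n) {
    for (unsigned i = 1; i < n; i++) {
        if (parts[i].offset != parts[i - 1].offset + parts[i - 1].size) {
            return 0;
        }
    }
    return 1;
}

// Return the offset one past the end of the last partition.
uint32_t layout_end(const flash_part_t *parts, unsigned n) {
    return parts[n - 1].offset + parts[n - 1].size;
}
```

Under these assumptions the three regions add up to 32 MiB.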

@dpgeorge dpgeorge force-pushed the stm32-add-n6-support branch from 70347c8 to a05ace7 Compare June 3, 2025 15:32
@dpgeorge
Member Author

This gcc bug looks related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116444 . That's just a missed optimisation bug, but could have deeper implications.

@iabdalkader
Contributor

I'm not sure what to do here. We could use -O2, it seems to work and we won't lose the new M55 instructions, or we could target M33 instead, but it will be a very long time before we see a fix. Anyway, you should definitely report this somewhere, to gcc or Arm. I'm surprised that no one else has reported it, given that it's been a while since the M55 was released.

@dpgeorge
Member Author

we could use -O2, it seems to work

I wouldn't be confident using -O2 with -mcpu=cortex-m55 -mtune=cortex-m55. The bug seems to be something to do with optimising the counter used in a loop, and it could still be there with -O2.

I will file a report to gcc then we can see what to do.

@kwagyeman
Contributor

kwagyeman commented Jun 18, 2025

I can verify this issue is present on 13.2.rel1 and 13.3.rel1. O2 seems to fix the issue.

Damien, we use M55 Helium instructions heavily in our upstream software. Is there a way we could switch things to O2 globally? We have the firmware space on the AE3 and N6.

@kwagyeman
Contributor

kwagyeman commented Jun 19, 2025

From ARM:

Hi all,

Thank you for reporting this, this had already been reported on a different target in PR 116799.

The good news is the issue has already been fixed on trunk and backported to all active branches. This fix is in our latest Release Candidate for the 14.3.Rel1 release that should be coming out before the end of the month.

If you would like a way to work around this wrong-code gen for the existing releases you can disable the pass that introduces this issue by passing the option '-fdisable-rtl-loop2_doloop'.

Relevant links:
Bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116799
Patches on:
trunk: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=f87fe2579b48d57c5f97bb91674b60808722855d

gcc-14: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=26615af93cf759999c5dfae0c51a827b05968cca

gcc-13: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0eed81612ad6eac2bec60286348a103d4dc02a5a

gcc-12: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=c1e55bcc1075aa74bebd7e1bb87d3939da2e498b

@@ -10,9 +10,24 @@

#define LWIP_RAND() rng_get()

// Increase memory for lwIP to get better performance.
#if defined(STM32N6)
#define MEM_SIZE (16 * 1024)

Contributor

Make larger? We have 1Gbit/s Ethernet onboard.

Contributor

I prefer if we use the config I made for mimxrt and alif, back then I checked it against the docs and every option is carefully calculated and it performs very well:

#define MEM_SIZE                        (16 * 1024)
#define TCP_MSS                         (1460)
#define TCP_OVERSIZE                    (TCP_MSS)
#define TCP_WND                         (8 * TCP_MSS)
#define TCP_SND_BUF                     (8 * TCP_MSS)
#define TCP_SND_QUEUELEN                (2 * (TCP_SND_BUF / TCP_MSS))
#define TCP_QUEUE_OOSEQ                 (1)
#define MEMP_NUM_TCP_SEG                (2 * TCP_SND_QUEUELEN)

Could increase the MEM_SIZE if you like.
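
For reference, the derived values in that configuration work out as below. The macros are copied from the snippet above. The `_Static_assert` lines are added here only to make the arithmetic explicit and are not part of the proposed lwIP options.

```c
#define TCP_MSS           (1460)
#define TCP_WND           (8 * TCP_MSS)
#define TCP_SND_BUF       (8 * TCP_MSS)
#define TCP_SND_QUEUELEN  (2 * (TCP_SND_BUF / TCP_MSS))
#define MEMP_NUM_TCP_SEG  (2 * TCP_SND_QUEUELEN)

// Receive window and send buffer: 8 full-size segments = 11680 bytes each.
_Static_assert(TCP_WND == 11680, "8 * 1460");
_Static_assert(TCP_SND_BUF == 11680, "8 * 1460");

// Send queue length: two queue entries per full-size segment in the buffer.
_Static_assert(TCP_SND_QUEUELEN == 16, "2 * (11680 / 1460)");

// Segment pool: twice the send queue length.
_Static_assert(MEMP_NUM_TCP_SEG == 32, "2 * 16");
```

Because everything is expressed in terms of TCP_MSS, changing the window multiplier rescales the queue length and segment pool with it.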

Contributor

Probably best to wait until we can do performance testing with iperf3, once Ethernet is working.

Member Author

I prefer if we use the config I made for mimxrt and alif,

When I updated this, I checked that all the options I left out here take their defaults, and that those defaults match your configuration. So these settings do match what you have.

@dpgeorge
Member Author

I can confirm that using -fdisable-rtl-loop2_doloop fixes the issue. So I suggest using that flag on compilers known to have the bug.

@iabdalkader
Contributor

-fdisable-rtl-loop2_doloop

This option generates one warning/note per file, that seems impossible to suppress.

@dpgeorge
Member Author

This option generates one warning/note per file, that seems impossible to suppress.

Yes I saw that. I don't think there's much choice, except to update to a non-broken compiler.

I will make sure -fdisable-rtl-loop2_doloop is only enabled for gcc versions that are known broken.

@iabdalkader
Contributor

I will make sure -fdisable-rtl-loop2_doloop is only enabled for gcc versions that are known broken.

That would be anything less than the future 14.3 release. I enable it unconditionally for CM55 for now, but the following also works:

ifeq ($(CPU),cortex-m55)
# Check if GCC version is less than 14.3
GCC_VERSION := $(shell arm-none-eabi-gcc -dumpversion | cut -d. -f1-2)
GCC_MAJOR := $(shell echo $(GCC_VERSION) | cut -d. -f1)
GCC_MINOR := $(shell echo $(GCC_VERSION) | cut -d. -f2)

# Convert to comparable number (14.3 becomes 1403)
GCC_VERSION_NUM := $(shell echo $$(($(GCC_MAJOR) * 100 + $(GCC_MINOR))))

# Only add the flag if version < 14.3 (1403)
ifeq ($(shell test $(GCC_VERSION_NUM) -lt 1403 && echo yes),yes)
$(warning )
$(warning *** WARNING ***)
$(warning GCC $(GCC_VERSION) has known issues with Cortex-M55)
$(warning Recommend upgrading to GCC 14.3+ for proper CM55 support)
$(warning )
CFLAGS += -fdisable-rtl-loop2_doloop
endif
endif
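
The same version test can also be expressed in C, which may be handy for a compile-time sanity check in the sources. This is only a sketch of the comparison used in the makefile snippet above (major * 100 + minor against 1403). The helper names and the COMPILER_NEEDS_DOLOOP_WORKAROUND macro are hypothetical, while `__GNUC__` and `__GNUC_MINOR__` are GCC's standard predefined version macros. Note that this follows the "anything less than 14.3" rule from this thread and does not account for patched 12.x or 13.x builds that carry the backported fix.

```c
#include <stdbool.h>

// Encode major.minor the same way as the makefile snippet (14.3 -> 1403).
int gcc_version_num(int major, int minor) {
    return major * 100 + minor;
}

// True if the given GCC version predates 14.3 and so, per the thread,
// should be built with -fdisable-rtl-loop2_doloop for Cortex-M55.
bool needs_doloop_workaround(int major, int minor) {
    return gcc_version_num(major, minor) < 1403;
}

// Compile-time form of the same test for the compiler actually in use.
#if defined(__GNUC__) && !defined(__clang__) \
    && (__GNUC__ * 100 + __GNUC_MINOR__) < 1403
#define COMPILER_NEEDS_DOLOOP_WORKAROUND (1)
#else
#define COMPILER_NEEDS_DOLOOP_WORKAROUND (0)
#endif
```

A C-level check cannot add the flag itself; that still has to happen in the build system, so this form is only useful for detection.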

Anyway, I've already merged this PR in our fork (with the fixed flash layout). Everything seems to be working fine so far. I don't have any more comments, I think this is ready to be merged.

dpgeorge added 6 commits June 24, 2025 00:26
Signed-off-by: Damien George <damien@micropython.org>
@dpgeorge
Member Author

I've pushed a few new commits here:

  • improve ADC N6 code (although it's still not fully working... not sure why)
  • enable ADC on OPENMV_N6
  • remove lwip alignment to 8 bytes
  • disable gcc optimisation on Cortex-M55 (we can reenable this once there's a release of gcc that fixes the issue)
  • adjust OPENMV_N6 flash partitioning to match OpenMV settings

I just need to retest it with the NUCLEO and DK boards, then it should be good to merge as preliminary N6 support. Any further improvements can be made in the future.

dpgeorge added 5 commits June 25, 2025 10:52
Signed-off-by: Damien George <damien@micropython.org>
@dpgeorge dpgeorge changed the title WIP: stm32: Add support for STM32N6xx MCUs and three N6 boards stm32: Add support for STM32N6xx MCUs and three N6 boards Jun 25, 2025
@dpgeorge dpgeorge changed the title stm32: Add support for STM32N6xx MCUs and three N6 boards stm32: Add support for STM32N6xx MCUs and two N6 boards Jun 25, 2025
@dpgeorge
Member Author

I have improved a few things here (eg mboot now uses XSPI in octal mode for writing, which speeds up DFU deployment), fixed and tested NUCLEO_N657X0 and removed support for the STM32N6570_DK board (I don't have time to test, get this board working and document it, that can be done later if needed/desired).

I've taken this PR out of draft/WIP, it should be pretty much ready to merge.

@dpgeorge dpgeorge marked this pull request as ready for review June 25, 2025 01:22
Signed-off-by: Damien George <damien@micropython.org>
@dpgeorge dpgeorge added this to the release-1.26.0 milestone Jun 25, 2025