Cortex-M is Arm's microcontroller (MCU) profile, hence the M, so DDR3 and PCIe are out of scope here. The M profile also only runs Thumb code.
edit: Did you mean STM32 Nucleo, the line of MCU dev boards from ST?
> Otherwise using those platforms is a bit like programming on 8086 today. Fun. You get basic stuff done and then you hit a wall.
If these platforms are as problematic as you claim then they would not sell as well as they do. The problem is not the platform but how you are using it.
> Only option is to jump on SoM stuff or FPGA which is another can of worms in itself.
Only? For what application? SoM means system on module and is nothing more than a ready-to-run CPU board you plug into your circuit board. Do you mean an application processor running a general-purpose OS like Linux? And an FPGA is a completely different animal. Normally an FPGA is needed when you need specific custom logic to accomplish a task, or when you are interfacing hardware with glue logic. Otherwise, programming an FPGA is far more difficult than programming an MCU.
[1] https://www.st.com/content/st_com/en/campaigns/stm32v8-high-...
> or your real-time deadlines are so tight an FPGA would be a better choice
That's why I'd say Cortex-M is a good fit for very, very good soft real-time (deadlines met 99.99+% of the time). But for hard real-time (100%) you need an FPGA.
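To make the 99.99+% versus 100% distinction concrete, here is a minimal sketch of how one might summarize measured response latencies against a deadline. The names (`deadline_stats_t`, `deadline_stats`, `hit_ratio`) and the microsecond units are assumptions for illustration, not anything from a vendor SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a latency capture (units: microseconds). */
typedef struct {
    size_t   samples;   /* number of measurements */
    size_t   misses;    /* measurements that exceeded the deadline */
    uint32_t worst_us;  /* largest latency observed */
} deadline_stats_t;

/* Count deadline misses and track the worst observed latency. */
static deadline_stats_t deadline_stats(const uint32_t *latency_us, size_t n,
                                       uint32_t deadline_us)
{
    assert(latency_us != NULL || n == 0);
    deadline_stats_t s = { n, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (latency_us[i] > deadline_us) {
            s.misses++;
        }
        if (latency_us[i] > s.worst_us) {
            s.worst_us = latency_us[i];
        }
    }
    return s;
}

/* Fraction of samples that met the deadline; 0.0 for an empty capture. */
static double hit_ratio(const deadline_stats_t *s)
{
    if (s->samples == 0) {
        return 0.0;
    }
    return (double)(s->samples - s->misses) / (double)s->samples;
}
```

A soft real-time requirement is a threshold on `hit_ratio`. A hard real-time requirement is `misses == 0` for every possible input, which a measurement like this can only disprove, never prove; that needs a worst-case analysis of the hardware and code.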
They have not released anything in years, only modest incremental updates. STM32V8 is a welcome update, but it is nothing groundbreaking and it is not available yet.
> The real reason to use a Cortex-M is determinism in hard real-time systems.
Correct.
> but at that point you either aren't making a hard real-time system and would benefit greatly from an MPU
Why not?
> allowing you to run Linux and benefit from the wealth of drivers available for these interfaces
It is precisely to avoid running Linux and the "wealth of drivers".
> or your real-time deadlines are so tight an FPGA would be a better choice
When MCUs were not good enough, an FPGA was the sensible choice. That said, you won't get a Cortex-M7 core on an FPGA. Other cores might be available, but there is not going to be any substantial performance improvement unless you want to spend an unreasonable amount of money (five figures at least).
What I am trying to point out is that there is a huge market gap.
i.MX8 is not real-time, and support for running bare-metal code on it is practically nonexistent.
Why not? I understand not wanting to deal with unnecessary complexity as a hobbyist, but you'll find yourself creating far more complexity trying to implement all of this yourself (and vendors certainly don't want to support you in this). Secondly, I think the number of customers for chip vendors who are uncomfortable with setting up an embedded Linux environment, but perfectly confident in routing DDR and PCIe signals is approximately 0.
> What I am trying to point out is that there is a huge market gap.
> i.MX8 is not realtime and the support for running bare metal code is very much non existent.
This isn't quite true and is what I'm trying to get at. Many of these embedded SoCs contain both a Cortex-M and a Cortex-A (not all, but quite a lot). High-performance DRAM, external PCIe devices, and large internal caches are fantastic for compute performance, but most of the things you want to do with a PCIe device (networking, asynchronous compute) don't require cycle-accurate determinism. Generally there isn't much work that has such stringent timing requirements, so you can offload that work to a secondary Cortex-M33 core with a shared memory interface to the main core and get the best of both worlds.
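As a rough illustration of what a shared memory interface between the two cores can look like, here is a minimal single-producer, single-consumer mailbox in C11. Everything named here (`mailbox_t`, `mbox_push`, `mbox_pop`, `MBOX_SLOTS`) is hypothetical, and the sketch assumes the structure lives in memory both cores see coherently (for example a non-cacheable region). Real parts usually ship a vendor framework for this, so treat it as a sketch of the idea only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MBOX_SLOTS 8u  /* assumed size; must be a power of two */

static_assert((MBOX_SLOTS & (MBOX_SLOTS - 1u)) == 0u,
              "MBOX_SLOTS must be a power of two");

/* Hypothetical mailbox placed in memory visible to both cores. */
typedef struct {
    _Atomic uint32_t head;        /* advanced only by the producer */
    _Atomic uint32_t tail;        /* advanced only by the consumer */
    uint32_t slots[MBOX_SLOTS];   /* message payloads */
} mailbox_t;

static void mbox_init(mailbox_t *m)
{
    atomic_init(&m->head, 0u);
    atomic_init(&m->tail, 0u);
}

/* Producer side (e.g. the real-time core). Returns false if the queue is full. */
static bool mbox_push(mailbox_t *m, uint32_t msg)
{
    uint32_t head = atomic_load_explicit(&m->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&m->tail, memory_order_acquire);
    if (head - tail == MBOX_SLOTS) {
        return false;  /* never block: the caller decides what to drop */
    }
    m->slots[head & (MBOX_SLOTS - 1u)] = msg;
    /* Release: the slot write becomes visible before the new head does. */
    atomic_store_explicit(&m->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side (e.g. the application core). Returns false if the queue is empty. */
static bool mbox_pop(mailbox_t *m, uint32_t *out)
{
    uint32_t tail = atomic_load_explicit(&m->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&m->head, memory_order_acquire);
    if (head == tail) {
        return false;
    }
    *out = m->slots[tail & (MBOX_SLOTS - 1u)];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&m->tail, tail + 1u, memory_order_release);
    return true;
}
```

Neither side ever waits on the other, so the real-time core's worst-case time in `mbox_push` is a handful of loads and stores regardless of what Linux is doing on the other core.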
I see so many systems that try to take advantage of the impressive compute power of modern MCUs (which is really cool!) but often end up just re-inventing the cooperative multitasking OS, only worse.
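For anyone who hasn't seen it, the pattern in question usually starts out as something like the sketch below: a table of periodic tasks polled from a main loop. The names (`task_t`, `sched_poll`, the two demo tasks) are made up for illustration, and the tick source is assumed to be some free-running counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn)(void);

/* One entry in a hypothetical cooperative task table. */
typedef struct {
    task_fn  run;       /* must return quickly; nothing can preempt it */
    uint32_t period;    /* ticks between releases */
    uint32_t next_due;  /* tick at which the task should run next */
} task_t;

/* Run every task whose release time has passed; returns how many ran. */
static size_t sched_poll(task_t *tasks, size_t n, uint32_t now)
{
    size_t ran = 0;
    for (size_t i = 0; i < n; i++) {
        /* Signed difference keeps the comparison valid across tick wraparound. */
        if ((int32_t)(now - tasks[i].next_due) >= 0) {
            tasks[i].run();
            tasks[i].next_due += tasks[i].period;
            ran++;
        }
    }
    return ran;
}

/* Demo tasks that only count their invocations. */
static unsigned blink_count;
static unsigned poll_count;

static void blink_task(void) { blink_count++; }
static void poll_task(void)  { poll_count++; }
```

It works fine until one task runs long and delays everything behind it, at which point people start adding priorities, time slicing, and message queues by hand.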
Looking at e.g. STM32H755: 1x Cortex-M7 480 MHz, 1x Cortex-M4 240 MHz, USB2, 100 Mbit Ethernet, DAC.
Comparing to AM6421: 1x Cortex-A53 1GHz, 2x Cortex-R5F 800MHz, 1x Cortex-M4F 400MHz, 2x PRU, USB3, 1 Gbit Ethernet, PCIe Gen2.
I can hardly believe the STM32H755 microcontroller is almost as costly as the AM6421 SoC.
For example, the AMD Xilinx UltraScale+ devices, like those in the AMD Kria modules and development kits (3-digit prices), include some Cortex-R5 cores, which provide deterministic operation, like Cortex-M.
Cortex-R5 cores are somewhat slower than Cortex-M7 at the same clock frequency, but they are available at higher clock frequencies than many Cortex-M7 implementations.
If you can implement some custom peripherals in the FPGA logic array, then you can obtain much higher performance than with a microcontroller alone.
So they are similar to an older Raspberry Pi and they have far more computational power than a Cortex-M7 or Cortex-M85 CPU, even if they are very slow in comparison with modern Cortex-A7x or Cortex-A7xx cores.
I have never heard of any FPGA containing better CPU cores than Cortex-A78, but even those with Cortex-A78 are extremely expensive, so they may be worthwhile only for their FPGA part, not for a CPU that is much slower than cheaper alternatives.
The same is true even for the cheaper modules with UltraScale+ FPGAs, like AMD Kria, which cost as much as one of the cheaper mini-PCs with a much faster Intel or AMD CPU, so they are worthwhile only if you can implement in the FPGA an essential part of the functionality.
There is however another advantage of the FPGAs with ARM cores, besides implementing fast peripherals with hard real-time requirements.
With most non-microcontroller ARM CPUs the vendor keeps various things secret, including the boot loader, so you cannot be absolutely certain about what the vendor does. ARM has followed the example of Intel and has introduced a potential Trojan horse in its CPUs, i.e. an execution mode controlled by the vendor, which is more privileged than even a hypervisor. With the FPGAs with ARM cores, by contrast, you have complete documentation and absolute control over what the CPU does, so you could implement with greater confidence devices for which security is important.