Maybe the largest remaining difference is around the strength of the memory model, as the size of the architectural register file, and the complexity of addressing modes and other traditional RISC/CISC arguments are mostly pointless in the face of deep OoO superscaler machines doing register renaming/etc from mop caches, etc.
Even then, like variable length instructions (which yes exist on many RISCs in limited forms) this differentiation is more about when the ISA was designed rather than anything fundamental in the philosophy.
If it weren't for the following project... I'd agree with you.
https://github.com/saagarjha/TSOEnabler
> A kernel extension that enables total store ordering on Apple silicon, with semantics similar to x86_64's memory model. This is normally done by the kernel through modifications to a special register upon exit from the kernel for programs running under Rosetta 2; however, it is possible to enable this for arbitrary processes (on a per-thread basis, technically) as well by modifying the flag for this feature and letting the kernel enable it for us on. Setting this flag on certain processors can only be done on high-performance cores, so as a side effect of enabling TSO the kernel extension will also migrate your code off the efficiency cores permanently.
--------
Its clear that Apple has implemented total-store ordering on its chips (including the M1).