Initially the difference between RISC and CISC processors was clear. Today many say there is no real difference. This story digs into the details to explain significant differences which still exist.
This mostly misses the mark and just rehashes the old discussions. The microarchitectural concepts behind "RISC" designs and "CISC" designs are so similar across product lines that the distinction is mostly meaningless. As mentioned, you have RISC designs using micro-ops and microcode, and you have CISC designs doing 1:1 instruction-to-micro-op mapping. Both do various forms of cracking and fusing depending on the instruction. All have the same problems with branch prediction and speculative execution, and all handle OoO execution in similar ways.

Maybe the largest remaining difference is around the strength of the memory model, since the size of the architectural register file, the complexity of addressing modes, and the other traditional RISC/CISC arguments are mostly pointless in the face of deep OoO superscalar machines doing register renaming and the like from micro-op caches.

Even then, as with variable-length instructions (which, yes, exist on many RISCs in limited forms), this difference is more about when the ISA was designed than about anything fundamental in the philosophy.
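
To make the memory-model point concrete, here is a minimal message-passing sketch in C++. The names (`payload`, `ready`, `run_message_passing`) are made up for illustration. On a TSO machine like x86-64 the hardware already keeps the two stores in order, so the release store and acquire load typically compile to plain moves. On a weakly ordered ISA like AArch64 the same source typically needs ordered instructions (for example `stlr`/`ldar`) or barriers to get the same guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration only.
int payload = 0;                   // plain, non-atomic data
std::atomic<bool> ready{false};    // flag that publishes the payload

int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int observed = -1;

    std::thread producer([] {
        payload = 42;                                  // (1) write the data
        ready.store(true, std::memory_order_release);  // (2) publish; (1) may not sink below (2)
    });

    std::thread consumer([&observed] {
        // The acquire load pairs with the release store above.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;  // ordered after the flag read, so it sees (1)
    });

    producer.join();
    consumer.join();
    assert(observed == 42);
    return observed;
}
```

If both flag accesses were `memory_order_relaxed`, the program would have a data race in C++ terms. In practice that kind of bug tends to stay hidden on x86-64 and show up on weaker hardware, which is roughly why an x86 emulator on ARM wants a hardware TSO mode.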

> Maybe the largest remaining difference is around the strength of the memory model

If it weren't for the following project... I'd agree with you.

https://github.com/saagarjha/TSOEnabler

> A kernel extension that enables total store ordering on Apple silicon, with semantics similar to x86_64's memory model. This is normally done by the kernel through modifications to a special register upon exit from the kernel for programs running under Rosetta 2; however, it is possible to enable this for arbitrary processes (on a per-thread basis, technically) as well by modifying the flag for this feature and letting the kernel enable it for us on. Setting this flag on certain processors can only be done on high-performance cores, so as a side effect of enabling TSO the kernel extension will also migrate your code off the efficiency cores permanently.

--------

It's clear that Apple has implemented total store ordering on its chips (including the M1).