> RISC-V however does not work like this. The RISC-V vector registers are in a separate register file not shared with the scalar floating point registers.

Honestly... in hardware, they probably are still in the same physical register file. The architectural split just means you now have two sets of architectural registers that get renamed onto that one physical register file.

As for the rest of the article, it looks like it mostly boils down to "I'm intimidated by assembly programming" as opposed to any actual critique of the strengths and weaknesses of the vector ISAs. There are superficial complaints about the number of instructions, or about different ways to write (the same? I only know scalar ARM assembly, not any vector extensions) instructions. On a quick reread, I see a complaint that's entirely due to how ARM represents indexed load operations, which has nothing to do with the vector ISA whatsoever.

If your goal is to understand how hardware SIMD works, you're probably better off sticking to C code with intrinsics; that way you're not distracted by the extra hoops you have to jump through just to translate C into assembly.
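
To make that concrete, here is a minimal sketch of what "C code with intrinsics" looks like for an element-wise add. It assumes an x86 target with SSE2 and falls back to a plain scalar loop everywhere else; the function name `add_arrays` is made up for illustration. It compiles as C++ and, apart from the `<cstddef>` header and `std::` prefix, is plain C.

```cpp
#include <cstddef>

#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics (assumption: x86 target)
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
  std::size_t i = 0;
#if defined(__SSE2__)
  // Main loop: four floats per iteration in one 128-bit register.
  for (; i + 4 <= n; i += 4) {
    __m128 va = _mm_loadu_ps(a + i);  // unaligned load of 4 floats
    __m128 vb = _mm_loadu_ps(b + i);
    _mm_storeu_ps(out + i, _mm_add_ps(va, vb));  // lane-wise add, then store
  }
#endif
  // Scalar tail, or the whole loop when SSE2 is not available.
  for (; i < n; ++i) out[i] = a[i] + b[i];
}
```

The loads, the add and the store each map more or less onto one instruction, so you still see what the hardware does, but register allocation, addressing modes and loop bookkeeping are left to the compiler.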

> If your goal is to understand how hardware SIMD works, you're probably better off sticking to C code with intrinsics

Agreed, and we're also using intrinsics in time-critical places. I am confident we will be able to hide both SVE and RVV behind the same C++ interface (https://github.com/google/highway): it already works for RVV, and SVE support has just been started.
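
For readers wondering what "the same C++ interface" over several vector ISAs can look like, here is a rough sketch of the general idea. It is not Highway's API: the `simd` namespace, `VecF`, `kLanes` and `scale_add` are all hypothetical names, and the backends shown are SSE2, NEON and a scalar fallback rather than SVE or RVV. A compile-time `kLanes` is also a simplification, since SVE and RVV vector lengths are set by the hardware and a real interface has to query the lane count at run time.

```cpp
#include <cstddef>

// Pick one backend at compile time; all expose the same five functions.
#if defined(__SSE2__)
#include <emmintrin.h>
namespace simd {  // hypothetical namespace, x86 SSE2 backend
struct VecF { __m128 raw; };
constexpr std::size_t kLanes = 4;
inline VecF Set(float x) { return {_mm_set1_ps(x)}; }
inline VecF Load(const float* p) { return {_mm_loadu_ps(p)}; }
inline VecF Add(VecF a, VecF b) { return {_mm_add_ps(a.raw, b.raw)}; }
inline VecF Mul(VecF a, VecF b) { return {_mm_mul_ps(a.raw, b.raw)}; }
inline void Store(VecF v, float* p) { _mm_storeu_ps(p, v.raw); }
}  // namespace simd
#elif defined(__ARM_NEON)
#include <arm_neon.h>
namespace simd {  // Arm NEON backend
struct VecF { float32x4_t raw; };
constexpr std::size_t kLanes = 4;
inline VecF Set(float x) { return {vdupq_n_f32(x)}; }
inline VecF Load(const float* p) { return {vld1q_f32(p)}; }
inline VecF Add(VecF a, VecF b) { return {vaddq_f32(a.raw, b.raw)}; }
inline VecF Mul(VecF a, VecF b) { return {vmulq_f32(a.raw, b.raw)}; }
inline void Store(VecF v, float* p) { vst1q_f32(p, v.raw); }
}  // namespace simd
#else
namespace simd {  // scalar fallback: a "vector" of one lane
struct VecF { float raw; };
constexpr std::size_t kLanes = 1;
inline VecF Set(float x) { return {x}; }
inline VecF Load(const float* p) { return {*p}; }
inline VecF Add(VecF a, VecF b) { return {a.raw + b.raw}; }
inline VecF Mul(VecF a, VecF b) { return {a.raw * b.raw}; }
inline void Store(VecF v, float* p) { *p = v.raw; }
}  // namespace simd
#endif

// Target-independent kernel: y[i] += a * x[i]. No ISA names appear here.
void scale_add(float a, const float* x, float* y, std::size_t n) {
  const simd::VecF va = simd::Set(a);  // broadcast a to every lane
  std::size_t i = 0;
  for (; i + simd::kLanes <= n; i += simd::kLanes) {
    const simd::VecF prod = simd::Mul(va, simd::Load(x + i));
    simd::Store(simd::Add(prod, simd::Load(y + i)), y + i);
  }
  for (; i < n; ++i) y[i] += a * x[i];  // leftover elements
}
```

The kernel is written once against the wrapper, and adding a new target means adding one more backend block rather than touching any kernels.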