The loop unrollers in the old Intel high-performance C compiler used to emit code like this.

It was sometimes pure joy telling it to emit the assembly language and looking at the code to work out what it was doing. Intel must have had, and still probably has, some awesomely clever people in the basement writing that stuff.

I still hope they'll open source it one day. I never really got to use it in anger, and I bet that when you use PGO all the compilers end up pretty similar in the "real world", but it would be a shame for all that work to be lost.
Intel has an open-source SPMD compiler: https://github.com/ispc/ispc