This work should be read in context of prior work, STOKE: http://stoke.stanford.edu/

This work outperforms STOKE, and STOKE outperformed existing methods including GCC, LLVM, ICC, traditional superoptimization, expert human, etc. That's why comparison with existing methods are omitted.

This work is also limited to Hacker's Delight and automatically generated programs, but so was STOKE. But STOKE readily extended to computational kernels in libm, BLAS, OpenSSL, ray tracer, computational fluid dynamics, etc. You can read about them on the link above. The next step would be applying the improvements to other STOKE tasks and more.

STOKE is an open source.

That sounds pretty awesome. So, are these only instruction-level optimizations or is there a good toolkit for doing it SSA level as well? My idea being combining it with formal, equivalence checks in certified compilers so we get optimizations plus ensure correctness given original version was correct via certified transform. Certified to v1 -> superoptimize to v2 -> certified checker (v1, v2) -> rinse repeat.