> Souper is a superoptimizer for LLVM IR. It uses an SMT solver to help identify missing peephole optimizations in LLVM's midend optimizers.
Related: I work on Souper (https://github.com/google/souper).
Feel free to reach out if anyone has questions!
It's a shame it's a bit hard to actually build these days. Its target audience is people working on compilers, not end users, but there are some odd cases where I'd really like to use it myself, like trying to get a speedup in llama.cpp.
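To give a flavor of what Souper-style superoptimization finds: peephole rewrites that are equivalent for every input but cheaper to execute. This is a toy illustration (hypothetical example, not Souper's actual output or machinery): a shift-right-then-shift-left sequence is the same as masking off the low bit, checked here by brute force over 8-bit values where Souper would use an SMT solver to prove it for real bit widths.

```python
MASK = 0xFF  # model 8-bit unsigned integers

def original(x):
    # clear the low bit the roundabout way: shift right, then left
    return ((x >> 1) << 1) & MASK

def rewritten(x):
    # cheaper equivalent a superoptimizer can discover: mask directly
    return x & 0xFE

# exhaustive check stands in for the SMT proof at this tiny bit width
assert all(original(x) == rewritten(x) for x in range(256))
print("equivalent on all 8-bit inputs")
```

The exhaustive loop only works because 8 bits is small; the point of using an SMT solver is getting the same guarantee at 32 or 64 bits without enumerating inputs.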
* SAT (more specifically, SMT) has been applied with some success to automated vulnerability discovery via symbolic execution. Underconstrained symbolic execution has cropped up more recently as a strategy for mitigating state-space explosion.
* Others have mentioned register allocation, but SAT/SMT is broadly applicable to compilation problems: instruction selection and superoptimization[1] come to mind.
clang -Xclang -load -Xclang libsouperPass.so -mllvm -z3-path=/usr/bin/z3
, or increase the inlining heuristics..., or just create your own optimization strategy like rustc does [1], or...LLVM is super configurable, so you can make it do whatever you want.
Clang's defaults are tuned for the optimizations that give you the most bang for the time you put in, while still being able to compile a web browser like Chrome or Firefox, or a whole Linux distribution's packages, in a reasonable amount of time.
If you don't care how long compile times take, then you are not a "target" clang user, but you can just pass clang extra arguments like those mentioned above, or even fork it to add your own -Oeternity option that takes your project and tries to compile it for a millennium on a supercomputer for that little extra 0.00001% reduction in code size, at best.
Because code compiled with -O3 is often slower than code compiled with -O2. "More optimizations" does not necessarily mean "faster code", as you seem to be suggesting.
[0]: https://github.com/google/souper [1]: https://github.com/rust-lang/rust/blob/master/src/rustllvm/P...
https://en.wikipedia.org/wiki/Concolic_testing
Basically, the problem "find an input that causes the program to execute this sequence of statements".
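A toy sketch of that idea (hypothetical code, not a real concolic engine): run the program, note the branch conditions along the path to the target, and then find an input satisfying all of them. Real tools hand the recorded path condition to an SMT solver such as Z3; here a brute-force search stands in for the solver.

```python
def program(x):
    # instrumented toy program: records which branches it takes
    trace = []
    if x * 3 + 1 > 10:          # branch A
        trace.append("A:true")
        if x % 7 == 2:          # branch B
            trace.append("B:true")
            return trace, "target reached"
        trace.append("B:false")
    else:
        trace.append("A:false")
    return trace, "miss"

def find_input_for_target(search_space=range(-1000, 1000)):
    # stand-in for constraint solving: brute-force the path condition
    # (x * 3 + 1 > 10) AND (x % 7 == 2)
    for x in search_space:
        if x * 3 + 1 > 10 and x % 7 == 2:
            return x
    return None

x = find_input_for_target()
trace, result = program(x)
print(x, trace, result)  # an input that drives execution to the target
```

The brute-force loop obviously doesn't scale; the whole point of SMT-based symbolic execution is solving such path conditions over large input spaces without enumeration.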
---
Identification of missing optimizations in the LLVM backend.
https://github.com/google/souper
The paper mentions that Souper links against LLVM 3.9, but a recent commit on GitHub states that dependencies have been bumped to LLVM 5.0, the most recent version.
The article mentioned superoptimization. Google's Souper uses an SMT solver (like Z3) to find missing LLVM peephole optimizations.
https://github.com/google/souper
Alternatively, Alive allows users to specify rewrite rules for peephole optimizations. The rules are verified with Z3 and then compiled into an LLVM pass. If the rewrite can't be verified, then Alive won't allow it.
https://github.com/nunoplopes/alive
And then there's STOKE, which uses stochastic search to investigate possible program transformations of x86 to find non-obvious code sequences. STOKE also verifies the resulting transformation with Z3.
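The verification step is what makes these tools trustworthy: a plausible-looking rewrite can be subtly wrong. A toy stand-in (Alive and STOKE use Z3 on real IR; this is just an exhaustive check at 8 bits): "replace signed division by 2 with an arithmetic shift right" looks like a valid strength reduction, but the two round differently for negative odd values, so a verifier must reject it.

```python
def to_signed8(x):
    # reinterpret an 8-bit pattern 0..255 as a signed value -128..127
    return x - 256 if x >= 128 else x

def lhs(x):
    # C-style signed division by 2: truncates toward zero
    s = to_signed8(x)
    return int(s / 2)

def rhs(x):
    # proposed "optimization": arithmetic shift right by 1,
    # which rounds toward negative infinity instead
    s = to_signed8(x)
    return s >> 1

counterexamples = [x for x in range(256) if lhs(x) != rhs(x)]
# e.g. for the bit pattern of -1: -1 / 2 truncates to 0, but -1 >> 1 == -1
print(len(counterexamples), "inputs where the rewrite is wrong")
```

An SMT-based verifier like Alive's produces exactly such a counterexample (at the real bit width) instead of enumerating inputs, and refuses to emit the pass.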
For an example of the kind of logical problem optimizers solve you can take a look at: https://github.com/google/souper.
So I don't see how you can take the current neural nets and get them to design a more efficient CPU architecture or a better JIT.
Anyone know what the current state of super optimization compilers is?
Have a look at Souper. I believe it's got slightly different approach, but the same idea / result. (https://github.com/google/souper)
Actually I'm a bit disappointed that none of the papers references it (did a quick text search only).
If you're referring to super-optimization, it can take far more than 100x and isn't something that is being "classically automated" as humans likely can't do it in the first place. In addition, super-optimization will yield better results if you give it a better starting point. It's nothing at all like the optimization discussed in the blog post. Either way, it can be done against LLVM IR[4] and I'm sure it will creep into clang at some point.
As for using profiling data to optimize code the way a human would: MSVC supports it[1], clang supports it[2] and gcc supports it[3]. Supposedly Microsoft saw a 30% performance increase [with a very specific workload] when they tried POGO on an internal build of SQL Server. Under normal circumstances, PGO should net you around 7% free perf.
[1]: https://msdn.microsoft.com/en-us/library/e7k32f4k.aspx [2]: http://clang.llvm.org/docs/UsersManual.html#profile-guided-o... [3]: http://dom.as/2009/07/27/profile-guided-optimization-with-gc... [4]: https://github.com/google/souper
Still, it does compile some pretty big programs as long as you cache the results (in redis).