There's a part which seems strange, where clang is used with "-O2" to generate code:
$ clang -O2 -target bpf -Xclang -target-feature -Xclang +alu32 -c sub64.c -o - | llvm-objdump -S -
> Apparently the compiler decided it was better to operate on 64-bit registers and discard the upper 32 bits. The workaround was to use the `volatile` keyword.
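For anyone who hasn't seen this kind of workaround, here's a rough sketch of what a `volatile` fix could look like. This is not the article's code: `sub32` and its variable names are made up for illustration, and whether clang actually emits 32-bit ALU instructions for it depends on the clang version and flags.

```c
#include <stdint.h>

/* Hypothetical helper (not from the article): subtract the low 32 bits
 * of two 64-bit values. */
uint32_t sub32(uint64_t a, uint64_t b)
{
    /* volatile obliges the compiler to really perform these 32-bit
     * stores and loads, so it cannot fold the truncation into one
     * 64-bit subtraction followed by discarding the upper half. */
    volatile uint32_t lo_a = (uint32_t)a;
    volatile uint32_t lo_b = (uint32_t)b;

    /* Unsigned 32-bit arithmetic: wraps modulo 2^32 on underflow. */
    return lo_a - lo_b;
}
```

The cost is extra stack traffic for the two temporaries, so it's a blunt tool rather than a clean fix.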
The problem kind of sounds like one of the LLVM optimisation passes made the change.
http://releases.llvm.org/8.0.0/docs/Passes.html#transform-pa...
Wonder if disabling optimisations ("-O0") would have also worked?
Compiling to eBPF is hard! The compiler must:
- avoid loops (backwards jumps)
- unroll everything
- inline everything
- emit no function calls
Basically, it's impossible to use clang to generate eBPF without -O2. Sorry.
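To make those constraints concrete, here's a minimal sketch of how C source is usually written so the optimiser can meet them. The function names (`add_byte`, `sum4`) are hypothetical; `always_inline` and `#pragma clang loop unroll(full)` are real clang features, but they are requests to the optimiser, and whether the resulting object passes the verifier still depends on the optimisation level and kernel version.

```c
#include <stdint.h>

/* always_inline asks the compiler to inline this at every call site,
 * so no call instruction is needed for it. */
static inline __attribute__((always_inline))
uint32_t add_byte(uint32_t acc, uint8_t b)
{
    return acc + b;
}

/* Hypothetical example: sum four bytes. */
uint32_t sum4(const uint8_t *p)
{
    uint32_t acc = 0;

    /* A constant trip count lets the optimiser unroll the loop fully,
     * which removes the backwards jump. The pragma is clang-specific,
     * so it is guarded for other compilers. */
#if defined(__clang__)
#pragma clang loop unroll(full)
#endif
    for (int i = 0; i < 4; i++)
        acc = add_byte(acc, p[i]);

    return acc;
}
```

Both hints rely on optimisation passes actually running, which is one reason dropping to -O0 doesn't help here.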
I'm new to this area of work, but has anyone stepped back and thought: "this is not the right way to build eBPF programs"? Would it be better to create a new high-level language and toolset? All this voodoo hackery to try to trick a C compiler into making eBPF-compatible code feels unsustainable.