What does HackerNews think of ubpf?

Standardizing BPF | Apr 2023

uBPF (https://github.com/iovisor/ubpf/) and rbpf (https://github.com/qmonnet/rbpf) are userspace BPF interpreters.

Via https://github.com/zoidbergwill/awesome-ebpf#user-space-ebpf

Ask HN: Recommendation for general purpose JIT compiler | May 2022

The usual recommendations have been given. Now for a more touristic approach: what I would like to use, given the excuse and the time. All of these options are mostly written in C:

- QBE [1] - small compiler backend with nice IL

- DynASM [2] - IIUC LuaJIT's backend, which can be and is used by other languages

- uBPF [3] - Userspace eBPF VM. Depending on your DSL the eBPF toolchain could fit your use-case, but this would probably be the biggest excursion. There is a basic assembler in Python.

[1] https://c9x.me/compile/

[2] https://luajit.org/dynasm.html

[3] https://github.com/iovisor/ubpf

GCC eBPF for Linux port has landed | Sep 2019

The implementation that exists in Linux is perfectly capable of running unbounded loops. But it runs your program through a function that rejects your program if it can't prove that it terminates.

The point being that you could rip the eBPF implementation out of the kernel, remove the verify check and have a very usable VM.

Here's an implementation that exists because the kernel's is GPL-licensed: https://github.com/iovisor/ubpf.
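To make the "verify, then run" split concrete, here is a toy sketch in Python. It is not the kernel's verifier and not uBPF's API: the instruction format, `has_backward_jump`, and `load` are all hypothetical names invented for illustration. It applies one deliberately crude rule (reject any backward jump, since a program with no backward jumps must terminate) and shows that skipping the check lets a looping program through.

```python
from typing import List, Tuple

# Hypothetical toy instruction: (opcode, relative jump offset).
# A jump at index pc targets pc + 1 + offset, i.e. the offset is
# relative to the next instruction.
Insn = Tuple[str, int]

JUMP_OPS = ("ja", "jeq")  # assumed opcode names for this sketch


def has_backward_jump(program: List[Insn]) -> bool:
    """Return True if any jump targets itself or an earlier instruction."""
    for pc, (op, offset) in enumerate(program):
        if op in JUMP_OPS and pc + 1 + offset <= pc:
            return True
    return False


def load(program: List[Insn], verify: bool = True) -> List[Insn]:
    """Accept a program, optionally running the toy termination check."""
    if verify and has_backward_jump(program):
        raise ValueError("rejected: backward jump, termination not proven")
    return program


# Forward-only control flow: always terminates, so it passes.
straight = [("mov", 0), ("jeq", 1), ("mov", 0), ("exit", 0)]

# Jumps back to index 0: a potential unbounded loop.
looping = [("mov", 0), ("ja", -2), ("exit", 0)]

load(straight)                 # accepted
load(looping, verify=False)    # accepted only because the check is skipped
```

Calling `load(looping)` with the default `verify=True` raises `ValueError`. A real verifier does far more than this (register and memory safety, for example), so treat the rule above as an illustration of the idea only.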

Google XRay: A Function Call Tracing System [pdf] | May 2016

Thanks for the reply; some responses:

> "anything that takes a context switch per traced function will probably be dramatically slower."

Good thing uprobes don't context switch:

  # perf stat -e context-switches -e probe_libc:re_search_internal sed '/./d' /mnt/data.txt 
  
   Performance counter stats for 'sed /./d /mnt/data.txt':
  
                 6      context-switches                                            
        15,122,432      probe_libc:re_search_internal                                   
  
      19.744738204 seconds time elapsed

You mean mode switch? Cheaper, but yes, still costly. Here's runtime without the probe:

  # time sed '/./d' /mnt/data.txt 
  
  real	0m3.349s
  user	0m3.345s
  sys	0m0.004s

Which means we can calculate the cost to be ~1.1 us per probe (on my system). Anyone know what XRay is clocking in at?
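The per-probe figure follows from the two runs above: subtract the untraced runtime from the traced one and divide by the number of probe hits. A minimal Python sketch of that arithmetic, using only the numbers shown in the output (the variable names are mine):

```python
# Figures copied from the perf stat and time output above.
traced_seconds = 19.744738204   # elapsed time with the uprobe enabled
baseline_seconds = 3.349        # "real" time without the probe
probe_hits = 15_122_432         # probe_libc:re_search_internal count

# Extra wall-clock time attributable to the probe.
overhead_seconds = traced_seconds - baseline_seconds

# Average cost of one probe hit, in microseconds.
per_probe_us = overhead_seconds / probe_hits * 1e6

print(f"total overhead: {overhead_seconds:.3f} s")
print(f"cost per probe: {per_probe_us:.2f} us")
```

This treats the whole difference between the two runs as probe overhead and ignores run-to-run noise, so it is an estimate, consistent with the ~1.1 us quoted.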

AFAIK, LTTng has done work for user<->user instrumentation. I think uBPF will be doing this (https://github.com/iovisor/ubpf) - although that project is very new. Could use some help from some more good engineers (please do!).

> "In "speed while not tracing", anything much more expensive than nop-sleds will be too slow to run in production."

I'm not sure anyone is suggesting anything more than nop-sleds. Dynamic tracing costs zero when not in use, and static tracing costs only the nop-sleds.

> "probably won't be able to completely hook functions that get inlined"

Sure. Sometimes there's static tracing probes (nop-sled based), sometimes there isn't and it's dynamic probes, sometimes those dynamic probes are inlined and you walk up the stack to find one that isn't. If it is inlined, maybe you need to trace the address rather than the function entry.

In my experience it's pretty rare that something is just untraceable because the inlining is so insane. But yes, it does happen sometimes. Usually I figure out a workaround before giving up.

> "a mechanism to execute arbitrary code at function entry or exit in a way that's runtime-customizable and very low overhead when you want it to be"

BPF! In-kernel virtual machine that runs JIT'd code on events, and is part of mainline Linux. Lots of enhancements in the Linux 4.x series.