This was a cool example of a class of bugs that are both hard to find without an active example and hard to prevent in complex systems. An optimization added many years ago for performance skipped an update to some state, and in a very small number of circumstances another use case depended on that state being kept up to date.

It is an interesting thought experiment to consider what kind of tool or automated detection could have found this. Some type of dependency linking between variables might have shed some light, but I'm not sure that would have really highlighted this kind of issue.

Great description of both the bug and the path to the solution!

Probably the only way to prevent this type of issue in an automated fashion is to change your perspective from proving that a bug exists to proving that it doesn't exist. That is, you define some properties that your program must satisfy to be considered correct. Then, when you make optimizations such as the bulk receiver fast-path, you must prove (to the static analysis tool) that your optimizations do not break any of the required properties. You also need to specify the required properties in such a way that they are actually useful for what people want the code to do.
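To make "define a property, then show the optimization preserves it" concrete, here is a minimal sketch. Everything in it is hypothetical: `receive_slow`, `receive_bulk` and the (bytes, count) bookkeeping are made-up stand-ins, not the code from the article. Note that it only checks the property exhaustively over a small bounded domain, which is much weaker than the proof over all inputs that a real verification tool would demand.

```rust
/// Reference implementation (hypothetical): handle messages one at a time,
/// updating both pieces of bookkeeping on every message.
/// Each `u8` is a message length; returns (total_bytes, message_count).
fn receive_slow(msgs: &[u8]) -> (u32, u32) {
    let mut bytes = 0u32;
    let mut count = 0u32;
    for &len in msgs {
        bytes += len as u32;
        count += 1;
    }
    (bytes, count)
}

/// Hypothetical fast path: handle the whole batch in one go.
/// The class of bug discussed above would be forgetting to update
/// one of the two values here.
fn receive_bulk(msgs: &[u8]) -> (u32, u32) {
    let bytes: u32 = msgs.iter().map(|&len| len as u32).sum();
    (bytes, msgs.len() as u32)
}

/// Property: the fast path and the reference path agree on every batch of
/// at most `max_len` messages whose lengths are in 0..=max_size.
fn paths_agree(max_len: usize, max_size: u8) -> bool {
    let mut batch = Vec::new();
    check_from(&mut batch, max_len, max_size)
}

/// Depth-first enumeration of all batches that extend `batch`.
fn check_from(batch: &mut Vec<u8>, max_len: usize, max_size: u8) -> bool {
    if receive_slow(batch) != receive_bulk(batch) {
        return false; // found a counterexample
    }
    if batch.len() == max_len {
        return true;
    }
    for size in 0..=max_size {
        batch.push(size);
        let ok = check_from(batch, max_len, max_size);
        batch.pop();
        if !ok {
            return false;
        }
    }
    true
}
```

Calling `paths_agree(4, 3)` walks every batch of up to four messages with lengths 0 through 3. If the fast path dropped the count update, the very first non-empty batch would fail the check. The hard part in practice is the one mentioned above: knowing which properties to write down in the first place.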

All of this is incredibly difficult, and an open area of research. Probably the biggest example of this approach is the seL4 microkernel. To put the difficulty in perspective, I checked out some of the seL4 repositories and did a quick line count.

The repository for the microkernel itself [0] has 276,541 lines.

The test suite [1] has 26,397 lines.

The formal verification repo [2] has 1,583,410 lines, roughly 5.7 times as much as the source code.

That is not to say that formal verification takes 5x the work. You also have to write your source code in such a way that it is amenable to being formally verified, which makes it more difficult to write and limits what you can reasonably do.

Having said that, this approach can be done in a less severe way. For instance, type systems are essentially a simple form of formal verification. There are entire classes of bugs that are simply impossible in a properly typed program, and more advanced type systems can eliminate a larger class of bugs. To get the full benefit, though, you still need to go out of your way to encode some invariant into the type system. You also find that mainstream languages that try to go in this direction always contain some sort of escape hatch to let the programmer assert that a portion of code is correct without needing to convince the verifier.
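As a small illustration of encoding an invariant into the type system, here is a sketch in Rust. The `Percent` type and the `remaining` function are made up for this example. The invariant "value is at most 100" is checked once in the constructor, so code that accepts a `Percent` never has to re-check it. The `unsafe` constructor is the escape hatch: the caller promises the invariant holds and the compiler takes their word for it.

```rust
/// A percentage whose value is guaranteed to be in 0..=100.
/// The field is private, so outside this module the only ways to
/// build one are the constructors below.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Percent(u8);

impl Percent {
    /// Checked constructor: returns None if the invariant would be violated.
    pub fn new(value: u8) -> Option<Percent> {
        if value <= 100 {
            Some(Percent(value))
        } else {
            None
        }
    }

    /// Escape hatch: skips the check.
    ///
    /// # Safety
    /// The caller must guarantee that `value <= 100`.
    pub unsafe fn new_unchecked(value: u8) -> Percent {
        Percent(value)
    }

    pub fn get(self) -> u8 {
        self.0
    }
}

/// Relies on the invariant: this subtraction cannot underflow for any
/// `Percent` built through the checked constructor.
pub fn remaining(p: Percent) -> u8 {
    100 - p.get()
}
```

With this in place, `Percent::new(101)` gives `None`, so an out-of-range value never reaches `remaining`. Anyone who calls `new_unchecked` with a bad value gets the bug back, which is exactly the trade-off with escape hatches.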

[0] https://github.com/seL4/seL4

[1] https://github.com/seL4/sel4test

[2] https://github.com/seL4/l4v