Here is a 4th one:
It is more of a collection of different tools for checking source code correctness glued together, and EVA, the static analyzer portion, has limitations. However, it can be used to write proofs of the correctness of C code, so it is basically a tool to investigate after I have finished investigating the others.
Be warned that it is the first project that I have ever seen that has so much documentation that it needs documentation on how to browse the documentation.
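Since EVA is part of Frama-C, the correctness proofs are written against contracts in Frama-C's ACSL annotation language. Here is a minimal sketch of my own (a toy function, not code from any project I mentioned) of what such a contract looks like; something like `frama-c -wp max_example.c` would attempt to prove it, while `frama-c -eva max_example.c` would run the value analysis instead:

    #include <stddef.h>

    /*@ requires n > 0;
        requires \valid_read(a + (0 .. n-1));
        assigns \nothing;
        ensures \forall integer i; 0 <= i < n ==> \result >= a[i];
    */
    int max_of(const int *a, size_t n)
    {
        int m = a[0];
        size_t i;

        /*@ loop invariant 1 <= i <= n;
            loop invariant \forall integer j; 0 <= j < i ==> m >= a[j];
            loop assigns i, m;
            loop variant n - i;
        */
        for (i = 1; i < n; i++) {
            if (a[i] > m)
                m = a[i];
        }
        return m;
    }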
> Aren't analyzers today part of the build pipeline from the get-go? Especially as C is known to be full of booby traps.
No, since developers generally cannot be bothered to do that. You are lucky if you get -Wall -Werror. OpenZFS uses that, but many other projects do not even go that far. I have been working on improving this in OpenZFS. Unfortunately, Coverity’s scan service is incompatible with PRs because it does not support branches and has weekly scan limits, but CodeQL was integrated a few months ago and I have plans to integrate Clang’s static analyzer. I am also looking into SonarCloud.
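To make the -Wall -Werror point concrete, here is the kind of classic mistake it stops (a purely illustrative snippet of mine, not code from any of the projects above):

    #include <stdio.h>

    int check_fd(int fd)
    {
        /* Classic typo: '=' where '==' was intended. Plain `cc -c bug.c`
         * accepts this silently, the condition is always true, and fd is
         * clobbered. With `cc -Wall -Werror -c bug.c` the "assignment used
         * as a truth value" warning (-Wparentheses) becomes a hard error,
         * so the mistake cannot reach a release build. */
        if (fd = -1) {
            fprintf(stderr, "bad file descriptor\n");
            return 1;
        }
        return 0;
    }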
> Adding such tools later on in the development is like activating warnings post factum: You'll get drowned in issues.
So you understand my pain.
> Especially in such critical domains as file-systems I would actually expect that the developers are using "the best tools money can buy" (or at least the best OpenSource tools available).
I submitted a paper to AsiaBSDCon 2023 titled “Lessons from Static Analysis of OpenZFS” that explains the extent to which these tools are used in OpenZFS, which might interest you. Assuming it is accepted, I will be giving a talk there. I would post the paper for others to read, but I still have 2 months before the final paper is due and I want to make some additional improvements to it before then. Also, they would still need to accept the paper; I will be informed of their decision next month.
> "Still fixing bugs found by some code analyzer" doesn't sound like someone should have much trust with their data in something like ZFS, to be honest… The statement sounds actually quite scary to me.
You should look at the Coverity scan results for the Linux kernel. They are far worse.
https://scan.coverity.com/projects/linux
https://scan.coverity.com/projects/openzfs-zfs
At present, the ZFS Linux kernel module is at 0.14 unresolved defects per thousand lines of code. That is better than its in-tree competitors:
* btrfs: 0.90
* ext*: 0.87
* jfs: 1.78
* reiserfs: 1.12
* xfs: 0.79
At the time of writing, the overall Linux kernel is at 0.48 while the entire OpenZFS tree is at 0.09, but sure, be worried about ZFS. Ignore Linux, which has a number of obvious unfixed bugs being reported by Coverity (and this does not even consider other static analyzers). I would send patches for some of them if mainline Linux did not have an annoying habit of ignoring many of the patches that I send. They have ignored enough that I have stopped sending patches since it is often a waste of time.
Anyway, static analysis has the benefit of checking little-used execution paths, but it also has the downside of checking never-used execution paths. It is often hard to reliably tell which are little used and which are never used.
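As an illustration of the never-used-path problem, here is a contrived example of my own (not OpenZFS code): a defensive check creates a path that the analyzer dutifully explores even though no caller can reach it.

    #include <string.h>

    struct record {
        size_t len;
        char   data[256];
    };

    /* Every real caller passes a non-NULL buf, so the early return below is
     * a path that is never executed in practice. A static analyzer cannot
     * know that, so anything it dislikes along that path still gets
     * reported, even though no user can ever reach it. */
    int fill_record(struct record *r, const char *buf)
    {
        if (buf == NULL)        /* defensive check, unreachable from callers */
            return -1;

        size_t len = strlen(buf);
        if (len >= sizeof(r->data))
            len = sizeof(r->data) - 1;

        memcpy(r->data, buf, len);
        r->data[len] = '\0';
        r->len = len;
        return 0;
    }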
Also, not every defect report is a bug in the code, and not every bug in the code is a potential data loss bug. Many are relatively benign things that do not negatively impact the safety of data. For example, if our test suite is not using cryptographically secure random numbers (and it is not), certain static analyzers will complain, but that is not a real bug. I might even write patches switching to /dev/urandom just for the sake of making the various static analyzers shut up, so that I do not have to write an explanation of why the complaint is wrong for every single defect report. That does not mean that the code had real bugs in it.
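For reference, the kind of change I have in mind is trivial; a rough sketch (mine, not actual OpenZFS test-suite code) of swapping a rand()-style call for /dev/urandom:

    #include <stdio.h>

    /* Illustrative only: fill buf from /dev/urandom instead of rand(),
     * purely so that analyzers stop flagging "insecure randomness" in test
     * code where security was never the point. */
    int random_bytes(void *buf, size_t len)
    {
        FILE *f = fopen("/dev/urandom", "rb");

        if (f == NULL)
            return -1;
        size_t got = fread(buf, 1, len, f);
        fclose(f);
        return (got == len) ? 0 : -1;
    }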
Most of the static analysis fixes being done lately address things not known to affect end users (no one has reported issues resembling what the static analyzers are finding lately), and the fixes are being made either for good measure or because I and others think that changing the code in a way that silences the static analyzer makes it easier to read. It is wrong to glance at a summary of static analyzer reports, or even a list of fixes done based on those reports, and conclude that the code itself is bad at data integrity. It is not so simple.
That said, most of the remaining defect reports that I have yet to handle in the various static analyzers that I use are not even real issues. I just have not yet gotten around to writing an explanation of why something is a false positive, or to rewriting code in which a non-issue was reported where I felt some minor revision would make the code better while silencing the static analyzer. There might be a few remaining real bugs, which is why I am going through every single report. Look at how developers of other software projects handle static analyzer scan reports and you will be horrified. The only others I know of that are this rigorous with static analysis reports are GRUB and LibreOffice. The rest are far less rigorous (and GCC’s 1.59 defects per thousand lines reported by Coverity, which the GCC developers try to hide, is just plain frightening).
In addition, ZFS has plenty of other QA techniques that it employs, such as a test suite, stochastic testing and code review. All three are done on every pull request. I am not aware of another major filesystem that does that much QA. Let me know if you find one.
Also, you do have a point. My suggestion is that you go back to punched cards, and keep them in triplicate at different geographically distinct facilities under climate control. Ideally, those facilities will have armed security and be on different continents in places that are known to have low seismic activity. That should keep your data safer than it is on a single ZFS pool. ;)
Back before I converted the back end of the D compiler from C++ to D, I ran a couple of static checkers over it. Thousands of errors were found. I was rather excited: here were lots of bugs I could fix at very low cost!
I only found one actual bug. While I was happy to get that fixed, the overall result was disappointing.
My conclusion was that the best path forward was to design a language where a static analysis tool would add no value.
Which ones did you try?
The only static analyzers that I am allowed to say have stood out to me are:
* Clang's static analyzer
* GCC's static analyzer with -Wno-analyzer-malloc-leak -Wno-analyzer-use-of-uninitialized-value -Wno-analyzer-null-dereference, because those three checks are uniquely buggy (see the sketch after this list for the first two patterns). `-Wanalyzer-null-dereference` fails to understand assertions. `-Wanalyzer-use-of-uninitialized-value` fails to recognize initialization through a pointer passed to another function. `-Wanalyzer-malloc-leak` is so buggy that out of the 32 reports it made, only 1 was real, and that one was right before an `exit()` call in a test suite, so it was not a particularly interesting report. Unfortunately, I do not have notes describing why that check is buggy.
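To illustrate the first two failure modes, here are minimal reproductions of the general patterns I mean (my own contrived examples, not the exact code that was flagged):

    #include <assert.h>

    /* Pattern 1: -Wanalyzer-null-dereference does not credit the assertion,
     * so it can still report a possible NULL dereference on the next line. */
    int first_byte(const char *p)
    {
        assert(p != NULL);
        return p[0];
    }

    /* Pattern 2: -Wanalyzer-use-of-uninitialized-value can miss that `out`
     * is initialized through the pointer handed to get_value(). */
    void get_value(int *out);   /* defined elsewhere; always writes *out */

    int doubled(void)
    {
        int out;

        get_value(&out);
        return out * 2;         /* may be reported as uninitialized */
    }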
Note that I am restricted by Coverity's Scan User Agreement from saying anything involving Coverity that could constitute a comparison. I suggest that you look at the public information about my use of it and draw your own conclusions. Keep in mind that your conclusions are not things that I said.
I think it was PC-something, but it was several years ago.
I once told the Coverity staff that the purpose of D was to put Coverity out of business (!)
Was it PC-Lint?
That was probably a bad choice. John Carmack did not rate it very highly:
http://www.sevangelatos.com/john-carmack-on-static-code-anal...
No, not that one. Maybe PVS-xxxx
Here is a list of almost every known static analysis tool for C++:
https://analysis-tools.dev/tag/cpp
The only one I know of that is absent from that list is Microsoft’s /analyze.
That said, I am surprised that D only has 1 tool listed: