CompCert [2] is a formally verified optimizing C compiler.
Most formal verification happens in compilers or low-level mission critical systems due to the cost.
If you want to write some formally verified C, you can check out Frama-C [3].
Until recently I was heading a compiler team. We found it impossible to hire experienced compiler engineers despite paying very well, so we essentially switched to hiring promising university graduates and trained them on the job. My peers in other companies are all in the same situation.
Note that there is a mismatch between academic and industrial compiler work. A university compilers course is necessary, but not sufficient to be considered experienced, you should have good knowledge of things like ABI design, ELF, DWARF, LLVM's tablegen, C++, compilation of varargs etc. This stuff is not typically taught in compiler courses (for good reason, since it is a bit arbitrary) but it is absolutely vital to producing a working, industrial strength compiler. You should also have some idea about computer architecture, so you can read ISA manuals and understand what individual machine instructions do, and what performance characteristics they have. Otherwise you won't be able to write good instruction selection and scheduling. Given the increasing prevalence of JIT compilers (eg eBPF in the Linux kernel), it's also useful to know about that. Another dimension is security: modern compilers to need to worry about stack canaries, shadow stacks etc. This too is not taught at universities.
Regarding what tool/library/framework, skills are rather transferrable, so I would not worry too much about this. LLVM is by far the most widely used open framework and is easier to get into than other open stuff. If you are knowledgable about security and verification, doing hacking on e.g. CakeML [1] or CompCert [2] might be interesting.
A good way of being hired is to go to e.g. LLVM or GCC meetup, and make it clear that you are looking for a job. Put it on your Github that you are looking for a job. Put a link to an LLVM or GCC (or similar) compiler phase that you have written. The next LLVM dev meeting is November 8-9 in San Jose, CA, hang out there, rest assured there will be many companies seeking to hire. Bonus points if you give a presentation, even if it's whimsy like compiling Brainfuck to Intel 4004 ... Another avenue is RISCV startups which are currently 10-a-penny, they all need and hire compiler engineers. Cold-contact them ...
PS, since you are a PhD student. I suggest finishing your PhD in some form or other. A lot of companies are loath to hire PhD students before they finish, or make the job offer contingent upon successful graduation. That certainly was our policy. If you are not yet near graduation, I suggest to do a compiler related internship.
SPARK does not allow doubly linked lists:
https://blog.adacore.com/pointer-based-data-structures-in-sp...
For example, singly-linked lists and trees are supported whereas
doubly-linked lists are not.
With Frama-C you can prove doubly linked lists and all manner of complicated pointer manipulating graph algorithms. It does not impose a Rust-like pointer ownership policy as does SPARK.However, for embedded development, SPARK's restrictions are a good trade-off, as the more restrictive rules allow more proofs to be fully automated than with Frama-C and simplify diagnostic messages. A fly-by-wire avionics computer doesn't need to dynamically allocate a billion graph nodes. But SPARK is not "general purpose" like C with Frama-C is.
AdaCore's SPARK tool stack is not actually written in SPARK as far as I can see, much of it is actually OCaml and Coq/Gallina for the Why3 component also used by Frama-C. See all the .ml OCaml and .v Gallina source code for yourself:
https://github.com/AdaCore/why3
And of course the compiler backend for Ada/SPARK is GNU GCC, written in unverified C:
https://github.com/gcc-mirror/gcc/tree/master/gcc/config
Compare with CompCert, the formally verified C compiler:
https://github.com/AbsInt/CompCert
Frama-C unfortunately requires a user to be mathematician-logician logic programming expert to fully utilize. One can begin training in Coq/Gallina with the large free online Software Foundations course:
1. Frama-C, which enables writing bug-free C programs, is implemented using a combination of OCaml and the Coq proof assistant (which is itself implemented in OCaml):
2. CompCert, a formally verified C compiler, is implemented using OCaml and Coq:
https://github.com/AbsInt/CompCert
Nothing at this level yet exists in the Rust and Zig ecosystems, for example. Rust is a very complicated language with many features which makes formalization much harder than with C and OCaml, which both had mathematical ideas of simplicity, analyzability, and minimalism inspire their designs, even if they continue to grow into more complicated monsters, which, incidentally this OCaml 5.0 release will contribute towards. :-)
Unfortunately I don’t have many more suggestions that would be good pedagogically. Benjamin Pierce (who wrote Software Foundations) has another very famous intro book, Types and Programming Languages: https://www.cis.upenn.edu/~bcpierce/tapl/ which is regarded as a “Bible” of type theory for computer scientists. Note: I haven’t personally read this, but I have read the sequel, Advanced Types and Programming Languages. Both books have exercises and most exercises have solutions.
Simon Peyton-Jones wrote a very good book, the Implementation of Functional Programming Languages. It is old (1987), but free, a good read, and covers most of the “important” topics without assuming too much background: https://www.microsoft.com/en-us/research/publication/the-imp...
For formal verification - there are definitely better-qualified people than me. I found a lot of the source code to CompCert C (a verified C compiler) illuminating: https://github.com/AbsInt/CompCert However, it will be difficult to understand without doing Software Foundations first (I still find personally tactics-style proofs in Coq confusing and don’t like Coq as much as Idris or Agda).
I am very shy and can’t help you with the online communities :) The type theory subreddit is pretty active and they seem nice.
By that argument every implementation of quicksort (say) would be using formal methods. But they aren't. There are some definitions of formal methods allowing this kind of "weaker" form, but for the sake of this discussion such methods do not prevent a particular class of bugs. Explicit goal statements are, while being a very good direction, informal.
> Though, if the parents goal was to imply that the whole of SQLite should be verified... I'm not convinced its entirely possible. Quite a lot of SQLite revolves around things around where the halting problem may come into play.
This is probably correct and I don't recommend to do so. Even CompCert [1], probably the single most verified C compiler, is not absolutely fully verified and of course there had been bugs in the unverified modules (codegen AFAIK). But formal methods can be partially applied to produce provably more correct code (in some metrics).
Take a look at the ideal mathematical (verified) C compiler written in Coq. I find it neat:
Question for anyone handing out unofficial legal opinions:
This one is dual-licensed. There are a small kerfuffle of custom licenses that say "no commercial use", but everything is also variously covered by the GPL (v2 or later), LGPL v3, and BSD 3-clause.
Ref: https://github.com/AbsInt/CompCert/blob/master/LICENSE
I'm curious if I could use this in a commercial setting if I perfectly comply with the various GPL and BSD licenses.
If I could, that could be a problem for this group, because the repo readme (at https://github.com/AbsInt/CompCert) explicitly states "this is not free software."
* unison (a great file sync tool for win/posix from Benjamin Pierce, the author of Software foundations and TAPL)
https://github.com/bcpierce00/unison
* mirage (which also has a bunch of interesting subprojects, like irmin)
* coq
* flow
https://github.com/facebook/flow
* CompCert
https://github.com/AbsInt/CompCert
* frama-c
* 0install
What's the difference between GCC deleting parts of your code, and an attacker hacking into the source code repository and deleting those parts of the code?
The C standards committee addressed the problem in 2009 with memset_s. [0] The GNU developers reject patches and state they hope this feature is never implemented. [1]
The bug fix is to stop using GCC for sensitive code. Use CompCert instead. https://github.com/AbsInt/CompCert
[0] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1381.pdf
[1] https://sourceware.org/ml/libc-alpha/2014-12/threads.html#00...
https://github.com/AbsInt/CompCert
It's free for non-commercial use. I've used it for several months now to build things like Tor. I haven't noticed any disadvantage compared to GCC or Clang.
http://compcert.inria.fr/compcert-C.html
EDIT: See also https://github.com/AbsInt/CompCert