In Ninja I sorta stumbled through some of the same issues described here. I eventually realized that the interesting question is "does this output file reflect the state of all the inputs" and not anything in particular about mtimes, and that "inputs" includes not only the contents of the input files, but also the executables and command lines used to produce the output.

If you squint, mtime/inode etc. behave like a weak content signature of the input. And once you have that perspective, you say "if mtime != the mtime I recorded last time, rebuild", without caring about their relative values, and that sidesteps a lot of clock-skew issues. It does "the wrong thing" if someone intentionally pushes timestamps to a point in the past (e.g. when switching to an older branch) as an attempt to game such a system, but playing games with mtime is not the right approach for that; totally hermetic builds are.
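To make the "!= instead of <" idea concrete, here is a minimal sketch in Python. It is not Ninja's actual code; the function names and the shape of the `recorded` dict are made up for illustration, and it assumes the build tool saved each input's signature after the last successful build.

```python
import os

def stat_signature(path):
    """Return a cheap metadata signature for path, or None if it is missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    # Treated as an opaque value: only compared for equality, never ordered.
    return (st.st_mtime_ns, st.st_size, st.st_ino)

def needs_rebuild(inputs, recorded):
    """recorded maps input path -> signature saved at the last build (hypothetical)."""
    for path in inputs:
        sig = stat_signature(path)
        # A missing input, an unrecorded input, or any change at all
        # (newer OR older mtime) triggers a rebuild.
        if sig is None or recorded.get(path) != sig:
            return True
    return False
```

Because the signature is only tested for equality, a file whose mtime moves backwards (say, after checking out an older branch) counts as changed just like one whose mtime moves forwards.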

One nice trick is that you can even capture all the "inputs" with a single checksum that combines all the files/command lines/etc., and that easily transitions between truly looking at file content or just file metadata. The one downside is that when the build system decides to rebuild something, it's hard to tell the user why -- you end up just saying "some input changed somewhere".
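A rough sketch of that single-checksum trick, again in Python and again not what Ninja literally does. `build_signature` and the `use_content` flag are invented names; the point is only that the command line and every input feed one hash, and that switching between metadata and real content is a one-line change.

```python
import hashlib
import os

def _add(h, data):
    # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
    h.update(len(data).to_bytes(8, "little"))
    h.update(data)

def build_signature(command, inputs, use_content=False):
    """Fold the command line and all inputs into one hex digest."""
    h = hashlib.sha256()
    _add(h, command.encode())
    for path in sorted(inputs):  # sorted so input order does not matter
        _add(h, path.encode())
        if use_content:
            # Strong mode: hash the actual bytes of the file.
            with open(path, "rb") as f:
                _add(h, f.read())
        else:
            # Cheap mode: hash file metadata only.
            st = os.stat(path)
            _add(h, f"{st.st_mtime_ns}:{st.st_size}".encode())
    return h.hexdigest()
```

You would store the digest next to the output and rebuild whenever the freshly computed one differs. Passing the compiler binary's path in `inputs` covers the "executables" part too. The downside mentioned above is visible here: once everything is folded into one digest, you can no longer say which input changed.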

Is there a description of Ninja's algorithm anywhere? I looked at the manual [1] and didn't quite see it.

Does Ninja use a database like sqlite? It seems like it has to if it does something better than Make's use of mtimes. (e.g. the command line, which Make doesn't consider.)

I looked at redo (linked in the article) and it uses sqlite to store the extra metadata.

[1] https://ninja-build.org/manual.html

No, sorry. And I also mixed what Ninja actually does with some random observations in that comment.

Ninja does use some database-like things, but they are just in a simple text/binary format. It's actually been long enough that I have forgotten the details.

https://ninja-build.org/manual.html#ref_log (contains a hash of the commands used) / https://ninja-build.org/manual.html#_deps (database-like thing with some mtimes, see https://github.com/ninja-build/ninja/blob/master/src/deps_lo... )

This reminds me, I should study ninja's binary format and maybe borrow it :)

The sqlite3 database used in redo was just something I threw together in the first few minutes. sqlite was always massive overkill for the problem space, but because it never caused any problems, it's been hard to justify working on it. I'd like to port redo from python to C, though, and then the relative size of depending on a whole database will matter a lot more.

Someone rewrote ninja from scratch in C at one point and it's shockingly tiny. (No tests = no extra abstractions to make testing possible.)

I found samurai and it is indeed tiny! ~3400 lines was less than I was expecting for *.[ch]!

https://github.com/michaelforney/samurai

Ninja isn't too big though. It looks like about 13K lines of non-test code, which is great for a mature and popular project. Punting build logic to a higher layer seems to have been a big win :)