Diff is reverse-engineering a many-to-one function: many possible insert/delete sequences applied to string X map to the same string Y.
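To make the many-to-one point concrete, here's a rough sketch. The op format (`("insert", pos, text)` / `("delete", pos, count)`) and the `apply_ops` helper are made up for illustration and are not any real diff tool's format. Three different edit scripts turn the same X into the same Y, so looking only at X and Y can't tell you which one actually happened.

```python
def apply_ops(text, ops):
    """Apply a sequence of insert/delete operations to a string.

    Hypothetical op format, for illustration only:
      ("insert", pos, s)  inserts s at index pos
      ("delete", pos, n)  deletes n characters starting at index pos
    """
    for op in ops:
        kind = op[0]
        if kind == "insert":
            _, pos, s = op
            text = text[:pos] + s + text[pos:]
        elif kind == "delete":
            _, pos, n = op
            text = text[:pos] + text[pos + n:]
        else:
            raise ValueError(f"unknown op: {kind!r}")
    return text


x = "abc"

# Three different histories...
script_a = [("delete", 1, 1), ("insert", 1, "x")]    # abc -> ac -> axc
script_b = [("insert", 1, "x"), ("delete", 2, 1)]    # abc -> axbc -> axc
script_c = [("delete", 0, 3), ("insert", 0, "axc")]  # abc -> "" -> axc

# ...all land on the same Y.
results = {apply_ops(x, s) for s in (script_a, script_b, script_c)}
print(results)
```

A diff tool given only "abc" and "axc" has to pick one plausible script (usually a shortest one), which may or may not match what the author did.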

What would it look like to store files natively as insert/delete sequences instead? So instead of filesystems with diffs on top, we could have DIFFsystems with files on top. Kind of like a WAL. Files would be checkpoints in the WAL for efficiency, and the diff between two checkpoints would be 100% accurate, since it is just the recorded operations. Probably takes a hell of a lot more space and CPU, though.
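A toy sketch of what I mean, with everything here (the `EditLogFile` class, its method names, the op tuples, the `checkpoint_every` knob) invented for illustration rather than taken from any existing system. The log of operations is the source of truth, full-text checkpoints are just a cache so reads don't replay from the beginning, and a "diff" between two versions is simply a slice of the log.

```python
def apply_op(text, op):
    """Apply one hypothetical op: ("insert", pos, s) or ("delete", pos, n)."""
    kind = op[0]
    if kind == "insert":
        _, pos, s = op
        return text[:pos] + s + text[pos:]
    if kind == "delete":
        _, pos, n = op
        return text[:pos] + text[pos + n:]
    raise ValueError(f"unknown op: {kind!r}")


class EditLogFile:
    """A 'file' stored as an append-only log of edits plus periodic checkpoints."""

    def __init__(self, checkpoint_every=100):
        self.ops = []                 # the log: version N = first N ops applied
        self.checkpoints = {0: ""}    # version number -> materialized text
        self.checkpoint_every = checkpoint_every

    def append(self, op):
        self.ops.append(op)
        version = len(self.ops)
        if version % self.checkpoint_every == 0:
            # Materialize the current text so later reads can start here.
            self.checkpoints[version] = self.read(version)

    def read(self, version=None):
        """Rebuild the text at a version by replaying from the nearest checkpoint."""
        if version is None:
            version = len(self.ops)
        base = max(v for v in self.checkpoints if v <= version)
        text = self.checkpoints[base]
        for op in self.ops[base:version]:
            text = apply_op(text, op)
        return text

    def diff(self, old, new):
        """The exact edits between two versions: no guessing, just the log."""
        return self.ops[old:new]


f = EditLogFile(checkpoint_every=2)
f.append(("insert", 0, "hello"))
f.append(("insert", 5, " world"))   # version 2 gets checkpointed
f.append(("delete", 0, 1))

print(f.read())        # current text
print(f.read(1))       # text as of version 1
print(f.diff(1, 3))    # what actually happened between versions 1 and 3
```

The space/CPU worry shows up directly: the log only ever grows, and `checkpoint_every` trades storage (more full copies) against replay cost on read.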

This reminds me of codeq, a Clojure + Datomic project that aimed to move source control from lines of text to functions and expressions. From their introduction [0]:

  Backstory

  Programmer Sally: "So, what are you going to do today Bob?"
  Programmer Bob: "I'm not happy with the file baz.clj residing in my/ns. So I'm going to go to line 96 and change 2 to 42. I've been thinking about deleting line 124. If I have time, I'm also going to insert some text I've been working on at line 64."
  Programmer Sally: (what's wrong with Bob?)

  Short Story

  codeq ('co-deck') is a little application that imports your Git repositories into a Datomic database, then performs language-aware analysis on them, extending the Git model down from the file to the code quantum (codeq) level, and up across repos. By doing so, codeq allows you to:
  - Track change at the program unit level (e.g. function and method definitions)
  - Query your programs and libraries declaratively, with the same cognitive units and names you use while programming
  - Query across repos
I never got to use it, though, and it seems the repo [1] has had no updates in the last 6 years.

[0] https://blog.datomic.com/2012/10/codeq.html
[1] https://github.com/Datomic/codeq