See also automerge [1], discussed at the end. They are currently working on performance improvements [2]. Quoting from the repo, "automerge is a library of data structures for building collaborative applications in JavaScript:
* You can have a copy of the application state locally on several devices (which may belong to the same user, or to different users). Each user can independently update the application state on their local device, even while offline, and save the state to local disk. (Similar to git, which allows you to edit files and commit changes offline.)
* When a network connection is available, Automerge figures out which changes need to be synced from one device to another, and brings them into the same state. (Similar to git, which lets you push your own changes, and pull changes from other developers, when you are online.)
* If the state was changed concurrently on different devices, Automerge automatically merges the changes together cleanly, so that everybody ends up in the same state, and no changes are lost. (Different from git: no merge conflicts to resolve!)"
[1] https://github.com/automerge/automerge
[2] https://github.com/automerge/automerge/pull/253
So if I change "foo" to "moo" and you change "foo" to "boo", who wins?
That depends on how the edit is represented: either as a delete followed by an insert (delete one character at offset N, insert "m" at offset N), or as a replace (atomically replace the character at offset N with "m").
For an atomic replace operation, CRDT algorithms will typically solve this by having the last write win. What CRDTs give you here is a guarantee that the order is the same for every participant. So if you're building a collaborative text editor, for example, either everyone will see "moo" or everyone will see "boo".
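To make the "last write wins" case concrete, here is a minimal sketch of a last-writer-wins register in Python. The class name `LWWRegister`, the integer logical timestamps, and the tie-break on replica name are all assumptions for illustration, not how any particular library does it. The point is only that the merge rule is deterministic, so both sides pick the same winner regardless of the order in which they see the writes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    """Hypothetical last-writer-wins register holding one character."""
    value: str
    timestamp: int      # logical clock value (assumed, not wall-clock time)
    replica_id: str     # used only to break timestamp ties deterministically

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # The write with the larger (timestamp, replica_id) pair wins.
        # Every replica applies the same rule, so every replica agrees.
        return max(self, other, key=lambda r: (r.timestamp, r.replica_id))


# Both users concurrently replace the "f" in "foo".
alice_write = LWWRegister("m", timestamp=1, replica_id="alice")
bob_write = LWWRegister("b", timestamp=1, replica_id="bob")

# Each side merges in a different order but reaches the same result.
alice_view = alice_write.merge(bob_write)
bob_view = bob_write.merge(alice_write)

print(alice_view.value + "oo")  # boo
print(bob_view.value + "oo")    # boo
```

With this particular tie-break Bob happens to win. A different rule could just as well pick Alice, as long as everyone uses the same rule.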
For a delete + insert, it might not be atomic, in which case only the delete will "conflict". Since you both deleted at the same time, it's not actually a conflict (you both did the same thing), and the result will be either "mboo" or "bmoo". But again, it will be the same for everyone.
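A rough sketch of the delete + insert case, again in Python. This is a toy state-based sequence, not a real text CRDT: the `TextReplica` class, the float positions, and the `("init", i)` element IDs are made up for the example. Deletes are recorded as tombstones, so two people deleting the same character is harmless, and concurrent inserts at the same position are ordered by their unique IDs.

```python
from dataclasses import dataclass, field


@dataclass
class TextReplica:
    """Toy replicated string: a grow-only set of elements plus tombstones."""
    name: str
    elements: dict = field(default_factory=dict)   # id -> (position, char)
    tombstones: set = field(default_factory=set)   # ids of deleted elements
    counter: int = 0

    def insert(self, position: float, char: str) -> None:
        # Each insert gets a globally unique id: (replica name, local counter).
        self.counter += 1
        self.elements[(self.name, self.counter)] = (position, char)

    def delete(self, elem_id) -> None:
        # Deleting only adds a tombstone, so deleting twice is a no-op.
        self.tombstones.add(elem_id)

    def merge(self, other: "TextReplica") -> None:
        # Union of both sets: commutative, associative, and idempotent.
        self.elements.update(other.elements)
        self.tombstones |= other.tombstones

    def text(self) -> str:
        live = [(pos, eid, ch) for eid, (pos, ch) in self.elements.items()
                if eid not in self.tombstones]
        # Sort by position, then by id to break ties the same way everywhere.
        return "".join(ch for _, _, ch in sorted(live))


def seeded(name: str, initial: str) -> TextReplica:
    """Create a replica that already contains the shared starting text."""
    replica = TextReplica(name)
    for i, ch in enumerate(initial):
        replica.elements[("init", i)] = (float(i + 1), ch)
    return replica


alice, bob = seeded("alice", "foo"), seeded("bob", "foo")
f_id = ("init", 0)

alice.delete(f_id); alice.insert(1.0, "m")   # Alice: "foo" -> "moo"
bob.delete(f_id); bob.insert(1.0, "b")       # Bob:   "foo" -> "boo"

alice.merge(bob)
bob.merge(alice)
print(alice.text(), bob.text())  # mboo mboo
```

Here the tie-break sorts "alice" before "bob", which gives "mboo". Reverse the ordering rule and you get "bmoo" instead, but either way both replicas agree.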
Interesting. What about the sequence "Alice deletes f, Bob replaces f with b, Alice inserts m"? I guess it doesn't matter, as long as all implementations do the same thing.
git could easily take an approach like this too, but there are obvious reasons why it doesn't. It feels like the people designing this algorithm believe the text being worked on is less important than source code.
I don't see how it's possible. I get Alice's changes, I spend 3 hours working on them, I get Bob's changes. The algorithm might be able to resolve these three sets of changes consistently according to its rules, but I've got no faith that the meaning of the text would survive the process.
For offline sync, where someone edits a text document for an hour and then syncs, you're right: You can end up with something unintended, since each participant is editing based on ("branching off") a snapshot. For example, if I deleted a whole paragraph, and you edited it, what should the end result be? But at least the end result will be consistent in the sense that all participants end up seeing the same thing, though semantically it may be wrong.
Note that CRDTs go beyond just text. CRDTs can be used to represent arbitrary data structures and operations on them: arrays (insert, delete, append, etc.), numbers (increment, decrement, etc.), dictionaries (insert, delete), etc. A great implementation of this is Automerge [1].
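As an example of the "numbers" case, here is a small sketch of a counter CRDT (usually called a PN-counter) in Python. This is not Automerge's API; the `PNCounter` class and its method names are assumptions for illustration. Each replica only ever bumps its own slot, and merging takes the per-replica maximum, so merges can arrive in any order, or more than once, without changing the result.

```python
class PNCounter:
    """Hypothetical increment/decrement counter that merges without conflicts."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.increments = {}   # replica id -> total increments seen from it
        self.decrements = {}   # replica id -> total decrements seen from it

    def increment(self, n: int = 1) -> None:
        self.increments[self.replica_id] = self.increments.get(self.replica_id, 0) + n

    def decrement(self, n: int = 1) -> None:
        self.decrements[self.replica_id] = self.decrements.get(self.replica_id, 0) + n

    def merge(self, other: "PNCounter") -> None:
        # Per-replica totals only grow, so taking the max never loses an update.
        for rid, n in other.increments.items():
            self.increments[rid] = max(self.increments.get(rid, 0), n)
        for rid, n in other.decrements.items():
            self.decrements[rid] = max(self.decrements.get(rid, 0), n)

    def value(self) -> int:
        return sum(self.increments.values()) - sum(self.decrements.values())


alice, bob = PNCounter("alice"), PNCounter("bob")

# Both edit offline.
alice.increment(3)
bob.increment(2)
bob.decrement(1)

# Sync in both directions; merging a second time changes nothing.
alice.merge(bob)
bob.merge(alice)
alice.merge(bob)

print(alice.value(), bob.value())  # 4 4
```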