What does HackerNews think of automerge?

A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.

Language: JavaScript

#16 in JavaScript
For anyone unsure of what a CRDT is (I think everyone on HN must know by now), this is the perfect intro: https://www.inkandswitch.com/peritext/

The two most widely used CRDT implementations (combining JSON-like general-purpose types and rich text editing types) are:

- Automerge https://github.com/automerge/automerge

- Yjs https://github.com/yjs/yjs

Both have JS and Rust implementations, and have bindings to most online rich text editors.

CRDTs are addictive once you get into them.

Tooling for developing real-time collaborative remote working environments is about to take off dramatically.

CRDTs [0], while complex to work with until recently, are now much easier for developers to use thanks to toolkits such as Yjs [1] and AutoMerge [2]. SaaS and PaaS companies that provide tooling around these, enabling developers to easily build collaborative tools for specific niches and verticals, are going to explode into the market.

Every 5-ish years there is a big “new” database tech that receives massive investment for both enterprise and small business. Real-time, CRDT-based data stores are the “next big thing”, in my view.

CRDTs are often only talked about in relation to rich text editing, but “generic” CRDTs that represent “standard” data types (think JSON) and basic operations on them (insert, edit, remove) are able to represent so much more. You can use them for building so many CRUD-type business apps, and by using a CRDT as your base data representation you get conflict-free collaborative (and offline) editing for free.
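To make the "JSON-like record with insert, edit, remove" idea concrete, here is a minimal sketch of a last-writer-wins map in plain JavaScript. Everything in it (the `LwwMap` class, the `wins` helper, the logical clock) is a hypothetical illustration, not how Yjs or Automerge work internally; real libraries use considerably more sophisticated data structures.

```javascript
// Toy last-writer-wins (LWW) map. Assumption: each write is tagged with a
// logical timestamp plus the replica id, and the highest (ts, replica) pair wins.
function wins(a, b) {
  return a.ts > b.ts || (a.ts === b.ts && a.replica > b.replica);
}

class LwwMap {
  constructor(replicaId) {
    this.replicaId = replicaId;
    this.clock = 0;            // Lamport-style logical clock
    this.entries = new Map();  // key -> { value, ts, replica, deleted }
  }

  // Insert or edit: record the value with a fresh timestamp.
  set(key, value) {
    this.clock += 1;
    this.entries.set(key, { value, ts: this.clock, replica: this.replicaId, deleted: false });
  }

  // Remove: write a tombstone, so a removal competes like any other write.
  remove(key) {
    this.clock += 1;
    this.entries.set(key, { value: undefined, ts: this.clock, replica: this.replicaId, deleted: true });
  }

  get(key) {
    const entry = this.entries.get(key);
    return entry && !entry.deleted ? entry.value : undefined;
  }

  // Merge another replica's state: per key, keep whichever entry wins.
  merge(other) {
    for (const [key, theirs] of other.entries) {
      const ours = this.entries.get(key);
      if (!ours || wins(theirs, ours)) this.entries.set(key, { ...theirs });
    }
    this.clock = Math.max(this.clock, other.clock);
  }

  // Visible document, with keys sorted so replicas print identically.
  toJSON() {
    const out = {};
    for (const key of [...this.entries.keys()].sort()) {
      const entry = this.entries.get(key);
      if (!entry.deleted) out[key] = entry.value;
    }
    return out;
  }
}

const alice = new LwwMap('alice');
const bob = new LwwMap('bob');

alice.set('title', 'Invoice 42');
bob.merge(alice);               // bob now has the title too

// Both go offline and edit independently.
alice.set('status', 'paid');
bob.set('status', 'disputed');
bob.remove('title');

// Sync in both directions; the replicas end up identical.
alice.merge(bob);
bob.merge(alice);
console.log(JSON.stringify(alice.toJSON()), JSON.stringify(bob.toJSON()));
```

Neither side had to resolve a conflict by hand, which is the "for free" part. The catch is that last-writer-wins silently discards one of the two concurrent `status` edits, so whether this policy is acceptable depends on the app.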

The nice thing about both Yjs and AutoMerge is that they provide both Rich Text and JSON-like data types, covering 95% of what people would need for building business apps.

0: https://en.m.wikipedia.org/wiki/Conflict-free_replicated_dat...

1: https://github.com/yjs/yjs

2: https://github.com/automerge/automerge

Thank you! I actually kind of cheated and didn't have to implement any of that myself because Firebase took care of it. But when I was considering how to build Calenday, those were on my mind. I had read about CRDTs on Figma's blog (https://www.figma.com/blog/how-figmas-multiplayer-technology...) and was considering using something like this to ease the burden on implementation: https://github.com/automerge/automerge

Electric Tables looks quite cool and I love the thought process going into it.

It seems like it could pair really nicely with the work that Ink&Switch (https://www.inkandswitch.com/local-first) is doing around local-first app development and Automerge (https://github.com/automerge/automerge) as a good way to keep disparate private copies of work in sync.

I have no connection to Ink&Switch, other than appreciating their work.

Would you care to compare this with, for example, @localfirst/state, automerge and automerge-rs (the latter two provide the CRDTs, i.e. they are similar to Yjs)?

https://github.com/local-first-web/state

https://github.com/automerge/automerge

https://github.com/automerge/automerge-rs

By the way, although that particular repo (@localfirst/state) was last touched 6 months ago, Herb Caudill definitely still seems active in this space (I believe he's been working on other parts of this more recently, e.g. ideas about authentication). I also think automerge development itself (Martin Kleppmann) is quite active right now in the lead-up to a 1.0 release, which seems fairly imminent and for which a lot of fundamental work has been done, in coordination with automerge-rs.

Eventually we plan to integrate this algorithm into the Text type of Automerge [1], which is a more production-ready CRDT.

[1]: https://github.com/automerge/automerge

I haven't used it, only read through documentation, but IMO Fluid's problem is not so much lock-in as an embrace of old-school columnar storage and handle-based object manipulation. An experienced Windows developer or game dev might feel entirely at home with the tradeoffs/footguns implied by https://fluidframework.com/docs/build/dds/#picking-the-right... ... but show that to a junior React developer and they're likely to be fundamentally confused, or worse assume that the only code example shown is a valid code example. (People writing documentation: please do not make one of the most prominent code examples in your Getting Started an example of what not to do!). And on the handle front, https://fluidframework.com/docs/build/data-modeling/#using-h... is similarly counterintuitive, to say the least.

Comparatively, I'm much more excited about Automerge https://github.com/automerge/automerge, which promises much friendlier developer ergonomics as simple as:

    doc1 = Automerge.change(doc1, 'Mark card as done', doc => {
      doc.cards[0].done = true
    })

Contributor Martin Kleppmann (of Designing Data-Intensive Applications fame) has great overview slides here: https://martin.kleppmann.com/2021/06/04/craft-conf.html . If anything, Automerge suffers from a "there's multiple ways to have a server backend, including P2P and centralized, and no one right way" anti-lockin problem, which is refreshing and also frustrating for people who just want to try something out. This is a solvable problem though!

I haven't yet done this, but based on some research it seems to me that the core of any collaborative app today (one that wants to avoid Firebase and other hosted platforms, which Replicache seems to be) is most easily served by picking some CRDT library.

There are a couple of open-source CRDT libraries that provide both clients and servers (yjs [0] and automerge [1] are two big ones for JavaScript I'm aware of).

My basic assumption is that as long as you put all your relevant data into one of these data structures and have the CRDT library hook into a server for storing the data, you're basically done.

This may be a simplistic view of the problem, though. For example, I've heard people mention that CRDTs can be space-inefficient, so you may want (or have) to do periodic compaction.
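As a rough illustration of what compaction could mean, here is a sketch over a toy operation log for a key-value document. The op shape (`{ key, value, ts, replica }`) and the helper names are assumptions made up for this example; they are not the storage format or API of Yjs or Automerge, and the sketch ignores harder questions such as tombstones and peers that still need the dropped history.

```javascript
// Hypothetical op shape for a toy key-value document: { key, value, ts, replica }.

// Deterministic ordering: higher timestamp wins, replica id breaks ties.
function isNewer(a, b) {
  return a.ts > b.ts || (a.ts === b.ts && a.replica > b.replica);
}

// Reduce a log to the single winning op per key.
function winners(ops) {
  const byKey = new Map();
  for (const op of ops) {
    const current = byKey.get(op.key);
    if (!current || isNewer(op, current)) byKey.set(op.key, op);
  }
  return byKey;
}

// Build the visible document from a log (keys sorted for stable output).
function materialize(ops) {
  const byKey = winners(ops);
  const doc = {};
  for (const key of [...byKey.keys()].sort()) doc[key] = byKey.get(key).value;
  return doc;
}

// Compaction: drop every op that has been superseded by a newer one.
function compact(ops) {
  return [...winners(ops).values()];
}

const log = [
  { key: 'title', value: 'Draft', ts: 1, replica: 'a' },
  { key: 'title', value: 'Draft v2', ts: 2, replica: 'b' },
  { key: 'done', value: false, ts: 3, replica: 'a' },
  { key: 'title', value: 'Final', ts: 4, replica: 'a' },
  { key: 'done', value: true, ts: 5, replica: 'b' },
];

const compacted = compact(log);
console.log(log.length, '->', compacted.length);
console.log(JSON.stringify(materialize(log)) === JSON.stringify(materialize(compacted)));
```

The compacted log replays to the same document while storing fewer ops. For this toy the history is pure overhead; for a text CRDT the per-character metadata is where the space cost tends to come from, so the real trade-offs are more involved.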

[0] https://github.com/yjs/yjs

[1] https://github.com/automerge/automerge

Kleppmann, et al.'s paper on OpSets [1], a specification for building CRDTs with support for an atomic tree move operation, was the best one for me.

Automerge [2] implements a variant of this.

[1] https://arxiv.org/abs/1805.04263

[2] https://github.com/automerge/automerge

Congrats on your new job!

Since you are good at JavaScript and willing to learn a new language, why not try to help port a CRDT-based library such as Automerge to a compiled language like D:

https://github.com/automerge/automerge

It seems that the Automerge algorithm is quite stable now; it just needs a good native/compiled-language implementation, especially for enabling local-first desktop applications.

Looks like a CRDT/Operational Transform server/pubsub as a service. There are also some open source frameworks on GitHub that don't depend on Azure.

https://github.com/automerge/automerge https://github.com/share/sharedb

CRDTs work best for high-contention situations such as online text editing (think Google Docs) where the conflict resolution can be seen right away and addressed by the user.

For offline sync, where someone edits a text document for an hour and then syncs, you're right: You can end up with something unintended, since each participant is editing based on ("branching off") a snapshot. For example, if I deleted a whole paragraph, and you edited it, what should the end result be? But at least the end result will be consistent in the sense that all participants end up seeing the same thing, though semantically it may be wrong.
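The delete-versus-edit case can be sketched in a few lines. This is a toy "remove wins" policy picked purely for illustration (the `mergeParagraph` function and the record shape are hypothetical); it is not a claim about what Automerge or any other library actually does in this situation.

```javascript
// A paragraph replica: latest text (by logical timestamp, replica id as
// tie-breaker) plus a sticky delete flag.
function mergeParagraph(a, b) {
  const bIsNewer = b.ts > a.ts || (b.ts === a.ts && b.replica > a.replica);
  const newer = bIsNewer ? b : a;
  return {
    text: newer.text,
    ts: newer.ts,
    replica: newer.replica,
    deleted: a.deleted || b.deleted, // "remove wins": an edit never undoes a delete
  };
}

const base = { text: 'Original paragraph.', ts: 1, replica: 'me', deleted: false };

// Both of us branch off the same snapshot while offline.
const mine = { ...base, ts: 2, replica: 'me', deleted: true };                         // I delete it
const yours = { ...base, text: 'Original paragraph, revised.', ts: 2, replica: 'you' }; // you edit it

// Each side merges the other's change, in opposite order.
const seenByMe = mergeParagraph(mine, yours);
const seenByYou = mergeParagraph(yours, mine);

console.log(JSON.stringify(seenByMe) === JSON.stringify(seenByYou)); // same state on both sides
console.log(seenByMe.deleted); // the paragraph stays deleted; your revision is not shown
```

Both participants converge on the same state regardless of merge order, which is the consistency guarantee. Whether hiding your revision is the "right" outcome is exactly the semantic question the merge function cannot answer on its own.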

Note that CRDTs go beyond just text. CRDTs can be used to represent arbitrary data structures and operations on them: arrays (insert, delete, append, etc.), numbers (increment, decrement, etc.), dictionaries (insert, delete), etc. A great implementation of this is Automerge [1].

[1] https://github.com/automerge/automerge
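
For the "numbers (increment, decrement)" case, the textbook construction is a PN-Counter: each replica tracks its own increments and decrements, and a merge takes the per-replica maximum. Below is a minimal sketch in plain JavaScript. The `PnCounter` class and the replica names are made up for this example, and it is not Automerge's counter implementation.

```javascript
// Toy PN-Counter: per-replica totals of increments and decrements.
class PnCounter {
  constructor(replicaId) {
    this.replicaId = replicaId;
    this.incs = {}; // replicaId -> total amount incremented by that replica
    this.decs = {}; // replicaId -> total amount decremented by that replica
  }

  increment(n = 1) {
    this.incs[this.replicaId] = (this.incs[this.replicaId] || 0) + n;
  }

  decrement(n = 1) {
    this.decs[this.replicaId] = (this.decs[this.replicaId] || 0) + n;
  }

  value() {
    const sum = (totals) => Object.values(totals).reduce((acc, n) => acc + n, 0);
    return sum(this.incs) - sum(this.decs);
  }

  // Totals only ever grow, so taking the per-replica maximum is safe to
  // apply in any order and any number of times.
  merge(other) {
    for (const [id, n] of Object.entries(other.incs)) {
      this.incs[id] = Math.max(this.incs[id] || 0, n);
    }
    for (const [id, n] of Object.entries(other.decs)) {
      this.decs[id] = Math.max(this.decs[id] || 0, n);
    }
  }
}

const laptop = new PnCounter('laptop');
const phone = new PnCounter('phone');

laptop.increment(3);
phone.increment(2);
phone.decrement(1);

laptop.merge(phone);
phone.merge(laptop);
laptop.merge(phone); // merging again changes nothing

console.log(laptop.value(), phone.value()); // both replicas report 4
```

Unlike the last-writer-wins approach, no update is discarded here: concurrent increments and decrements all count toward the final value.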