What does HackerNews think of unsplit?

Resolves conflicts in Mnesia after network splits

Language: Erlang

If I recall correctly, if some node desyncs, it's hard to get it started again. You have to manually transfer the entire database from another node, which can obviously take a long time if you have a large database. Mnesia doesn't really handle netsplits by itself (there's https://github.com/uwiger/unsplit), so you have to be prepared to deal with this yourself.
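For context, wiring unsplit in is roughly a matter of pointing Mnesia's event module at unsplit and tagging each table with a merge method. This is a sketch based on my reading of the unsplit README; the `session` table, its attributes, and the choice of the `last_modified` merge strategy are illustrative assumptions, not a verified configuration:

```erlang
%% Start the node with Mnesia's event module redirected to unsplit,
%% e.g. via `erl -mnesia event_module unsplit_server` or sys.config:
%%   [{mnesia, [{event_module, unsplit_server}]}].

%% Sketch: attach an unsplit merge method to a table as a user property.
%% `session' is a hypothetical table; `last_modified' is one of the
%% merge strategies shipped with unsplit (see unsplit_lib).
mnesia:create_table(session,
    [{attributes, [key, value, modified]},
     {ram_copies, [node()]},
     {user_properties,
      [{unsplit_method, {unsplit_lib, last_modified, []}}]}]).
```

The idea is that when the partition heals, unsplit runs the configured method per table to merge the diverged copies instead of leaving you to restart nodes by hand.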
Sure, a few words on the customizations we've built into Mnesia:

- We integrated it with our Eureka-based service discovery mechanism, so that it can automatically cluster with other servers that are spun up during cluster bootstrap/resizing. We also relaxed the constraints when merging two identical schemas from separate clusters (when table cookies don't match but everything else does, we still want to merge and take a union of the existing data).
- We've added a bunch of auto-merge code (heavily inspired by Ulf's wonderful https://github.com/uwiger/unsplit library) for the case of network partitions.
- We've also added support for maintaining pools of processes per table for dirty updates (as opposed to going through mnesia_tm for every single operation, transactions as well as dirty_asyncs).

I'm 100% aware that these changes are RMS/Riot specific and won't work in many other situations (e.g. they violate certain transaction isolation properties).

I'm the Fred mentioned in the quoted text. As mentioned by FLGMwt, calling Mnesia a 90s database was an offhand remark made at a workshop, during a discussion on a break.

The reason I considered it a DB of the 90s is that back then it could have been state of the art, but by today's standards, in its current form, it mostly makes sense for fixed cluster sizes with reliable networks and a fairly stable topology.

Any fancier cases and you start needing to dive into the internals when it comes to coming back up from failures, handling partitions, doing repairs, and so on. The DB has 3 standard backends: all in RAM, all on disk (with a 2GB limit), or as a log-copy (unbounded disk size, but also bound by memory size).
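Those three backends correspond to Mnesia's standard storage types, chosen per table and per node at creation time. A sketch, with illustrative table names (note that the disk-backed types require a disc schema to have been created first with mnesia:create_schema/1):

```erlang
%% Illustrative table definitions for the three standard storage types.
%% ram_copies:       RAM only, lost on restart.
%% disc_only_copies: dets files on disk (subject to the 2GB-per-file limit).
%% disc_copies:      held in RAM *and* logged to disk -- hence "bound by
%%                   memory size" even though the disk copy can grow.
Nodes = [node()],
mnesia:create_table(cache,   [{attributes, [key, val]}, {ram_copies, Nodes}]),
mnesia:create_table(archive, [{attributes, [key, val]}, {disc_only_copies, Nodes}]),
mnesia:create_table(users,   [{attributes, [key, val]}, {disc_copies, Nodes}]).
```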

That ends up leaving you with a DB that needs its whole dataset to fit in memory, and that supports distributed transactions but can't deal with network failures well out of the box (you need something like https://github.com/uwiger/unsplit).

Mnesia gaining new backends (Klarna is currently open-sourcing code for an experimental Postgres backend, and is using a LevelDB one) would fix a lot of its issues as a single-node DB, but another overhaul would be required for the rest.

The problem I see is that it was a very cool database back then, but it lagged behind for a long while and now has to play a catch-up game. Its model and Erlang interface are still extremely nice, and I wish it made more sense to use in production without committing to learning its internals in case of trouble.

Actually, a lot of internal business-style applications fit Mnesia's model too (if you're hosting them in a single datacenter). It means you'll need to deal with netsplits manually (or write your own code to address them, or try to borrow https://github.com/uwiger/unsplit), but if you're hosting it internally on your own servers, netsplits might be rare enough, and their consequences mild enough (ignorable inconsistencies rather than problematic ones, even under major network/system thrashing), that the trade-off is worth it for certain things.

Something like, say, file processing. You have a watch directory, and you want to process everything that lands in it in a scalable manner, without re-processing the same thing (though it's okay if you do, just inefficient). Mnesia is probably fine for keeping tabs on what you've already processed; in the event of a netsplit you can just let all sides keep going until you get around to fixing the cluster. Your inconsistencies just lead to inefficiencies rather than real data loss, and you have a clear path to fixing them (just dump the data on the partitioned nodes and rejoin them to the cluster). You end up with a more resilient, scalable system than if you'd used a centralized database, without having to configure and manage a separate DB.
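A minimal sketch of that bookkeeping, assuming a hypothetical `processed` table and using Mnesia's dirty API (which is fine here precisely because duplicate processing is tolerated):

```erlang
-record(processed, {path, at}).

%% Create the tracking table (RAM-only here for simplicity; a disc-backed
%% storage type would survive restarts).
init() ->
    mnesia:create_table(processed,
        [{attributes, record_info(fields, processed)},
         {ram_copies, [node()]}]).

%% Returns true the first time a path is seen on this side of the
%% cluster. After a netsplit, both sides may return true for the same
%% path -- that's the "redundant but harmless work" trade-off.
maybe_process(Path) ->
    case mnesia:dirty_read(processed, Path) of
        [] ->
            mnesia:dirty_write(#processed{path = Path,
                                          at = erlang:system_time()}),
            true;   %% caller should process the file
        [_] ->
            false   %% already handled, skip
    end.
```

Recovery after a split is then exactly as described above: clear the `processed` table on the partitioned nodes and rejoin them, at the cost of some re-processing.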

That said, I like the idea of being able to swap Mnesia out for something a little less warty, if it's pretty seamless in operation.