Seems like overkill, and just a shiny new toy to play with for this use case. Stackoverflow.com runs on SQL Server, and you aren't exceeding Stack Overflow's scale. In a couple of years they'll migrate back, or to something else.

The whole point of CockroachDB is to enable Google-scale workloads. But all I ever see on it are random devs spinning up three nodes and being so proud of it.

Which is too bad, because I'd love to see someone pushing its limits and betting mission-critical services on it that nothing else could handle.

Sorry if I wasn't clear, but the goal of the migration wasn't for scale, it was for HA.

I find it difficult to set up PostgreSQL without having a single point of failure and without introducing downtime for things such as upgrades.

We aren't doing life-changing stuff, true, but downtime in our system can result in employees (often at the bottom of the economic ladder) not being able to get to work. That gnaws at me. I guess the better alternative is to go for a hosted approach, but that isn't without its own complications and challenges.

Out of curiosity, what do you find so hard to set up with PostgreSQL?

Having a master with a read replica, and promoting the read replica to master if the master goes down, is a very well-known approach. You can put both servers behind a virtual IP (keepalived) or use a 0-TTL DNS solution (consul). Is your case more complex than that? With master-to-master replication there are a couple of gotchas, and you need to design your data to avoid conflicts as much as you can; a solution like BDR seems to be a proven off-the-shelf option.
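For what it's worth, if you'd rather not run keepalived or consul at all, libpq can also do the failover on the client side when you list both hosts and ask for a read-write session (PostgreSQL 10+). A minimal sketch with psycopg2, assuming placeholder hostnames pg1/pg2 and a database/user called app:

    # Client-side failover: libpq tries each host in order and only accepts
    # the one that is currently the writable primary.
    # pg1/pg2, "app" and "secret" are placeholders.
    import psycopg2

    conn = psycopg2.connect(
        host="pg1,pg2",                     # try pg1 first, then pg2
        port=5432,
        dbname="app",
        user="app",
        password="secret",
        target_session_attrs="read-write",  # skip whichever node is a standby
        connect_timeout=3,
    )

    with conn, conn.cursor() as cur:
        cur.execute("SELECT inet_server_addr(), pg_is_in_recovery()")
        server, in_recovery = cur.fetchone()
        print(f"connected to {server}, standby={in_recovery}")

    conn.close()

The trade-off is that every client has to carry the host list; the VIP/DNS approach keeps that knowledge in one place.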

I haven't really upgraded Postgres in a rapid fashion; I usually just stick with whatever version comes with the OS package manager forever, or whatever Docker image I start it with. I guess the same HA approach can be used to upgrade one node at a time; I'm assuming Postgres doesn't break backwards compatibility very often.

You can either automate the failover or have a system SMS you when the master goes down. Personally I like to do it manually, since I have seen automated failover fail more often than the manual procedure.
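A minimal sketch of the alert-only variant, assuming a hypothetical SMS gateway URL and leaving the actual promotion as the manual step (e.g. running pg_ctl promote on the standby):

    # Watchdog sketch: alert a human instead of auto-promoting.
    # The DSN, gateway URL and phone number are placeholders.
    import time
    import psycopg2
    import requests

    PRIMARY_DSN = "host=pg1 dbname=app user=monitor password=secret connect_timeout=3"
    SMS_GATEWAY = "https://sms.example.com/send"   # hypothetical gateway
    FAILURES_BEFORE_ALERT = 3                      # tolerate brief blips

    def primary_is_healthy() -> bool:
        try:
            with psycopg2.connect(PRIMARY_DSN) as conn, conn.cursor() as cur:
                cur.execute("SELECT NOT pg_is_in_recovery()")
                return cur.fetchone()[0]
        except psycopg2.OperationalError:
            return False

    failures = 0
    while True:
        if primary_is_healthy():
            failures = 0
        else:
            failures += 1
            if failures == FAILURES_BEFORE_ALERT:
                requests.post(SMS_GATEWAY, timeout=5, data={
                    "to": "+15550001111",
                    "text": "primary pg1 unreachable; check it and promote the standby manually",
                })
        time.sleep(10)

The piece that is deliberately missing is an automatic pg_ctl promote; a naive version of that is exactly where split-brain tends to come from.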

I have probably managed over 30 Postgres instances across my services, with data growing up to 1/2 PB, and I've never had a single issue with it... Postgres, Redis, RabbitMQ, and Consul are technologies where you RTFM, set them up, and they just work!

I wouldn't trust a technology that isn't established and widely adopted, which is how I see CockroachDB at the moment.

No, our setup is basic. We use Barman, so I guess repmgr, by the same company (2ndQuadrant), would have been the route to go. I could have figured it out, set it up, and been done with it, but without having done it before I didn't feel very confident. There seem to be a lot of moving parts and a lot of options, it's changing quite a bit (12.0 introduced additional changes), and I'm still not sure how to do upgrades across major releases without downtime.
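For the major-release part specifically, the usual near-zero-downtime route is logical replication (PostgreSQL 10+): stand up a server on the new major version, copy the schema, replicate the data into it, then switch clients over during a short write freeze. A rough sketch with psycopg2, assuming placeholder hosts old-pg/new-pg and a database called app:

    # Major-version upgrade sketch via logical replication (PostgreSQL 10+).
    # Hosts, credentials and publication/subscription names are placeholders;
    # the schema must already exist on new-pg (pg_dump --schema-only | psql).
    import psycopg2

    OLD_DSN = "host=old-pg dbname=app user=postgres password=secret"
    NEW_DSN = "host=new-pg dbname=app user=postgres password=secret"

    # 1. Publish every table on the old-version primary
    #    (requires wal_level = logical in its postgresql.conf).
    with psycopg2.connect(OLD_DSN) as conn, conn.cursor() as cur:
        cur.execute("CREATE PUBLICATION upgrade_pub FOR ALL TABLES")

    # 2. Subscribe from the new-version server; this copies the existing rows
    #    and then streams changes. CREATE SUBSCRIPTION cannot run inside a
    #    transaction block, hence autocommit.
    new_conn = psycopg2.connect(NEW_DSN)
    new_conn.autocommit = True
    with new_conn.cursor() as cur:
        cur.execute(
            "CREATE SUBSCRIPTION upgrade_sub "
            "CONNECTION 'host=old-pg dbname=app user=postgres password=secret' "
            "PUBLICATION upgrade_pub"
        )
    new_conn.close()

    # 3. Once new-pg has caught up: briefly stop writes, repoint the
    #    application at new-pg, then drop the subscription.

Caveats: sequences and DDL changes aren't carried over by logical replication, so those still need manual handling at cutover.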

We still use PostgreSQL, and I agree, it's set-up-and-forget. But those instances aren't HA (and don't need to be).

Citus released this last year: https://github.com/citusdata/pg_auto_failover. It looks interesting, although I haven't used it; I tried doing the same thing with corosync + pacemaker.