I too am coming up on a need for no-downtime HA failover for Postgres, and I too am not allowed to use a hosted PaaS-ish solution like RDS. I was considering Citus's multi-master implementation (I don't need to spread the load, I just need HA). I had not considered Pacemaker. Has GoCardless investigated Citus, and do you have any insight to share? In my experience, HA has long been a real pain point for traditional RDBMSs.

To be honest we've not looked into Citus in any depth.

My early impression of it (can't speak for the rest of the team) was that it was mostly aimed at sharding analytics workloads, but parts of the docs (e.g. https://docs.citusdata.com/en/v7.1/admin_guide/cluster_manag...) make it sound like it handles OLTP workloads too.

Maybe I've been ignoring it for bad reasons!

EDIT: Managing Postgres clusters is something that a lot of people are working on. Thought I'd mention two projects that have me excited right now:

  - Patroni https://github.com/zalando/patroni
  - Stolon https://github.com/sorintlab/stolon

Stolon's client proxy approach in particular looks interesting. It reminds me of how people are using Envoy (https://github.com/envoyproxy/envoy), although Stolon's proxy works at the TCP level rather than understanding, and doing fun stuff with, the database's protocol. I wonder if we'll start to see more Envoy filters for different databases!
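
To make the client proxy idea concrete, here's a rough sketch of a TCP-level proxy that sends each new connection to whichever node is currently the primary. This is not Stolon's actual code: `PrimaryProxy` and its `primary` attribute are names I made up for illustration, and a real setup would presumably refresh the primary address from a consensus store rather than have it set by hand. It also does nothing about connections that were already open when the primary changed.

```python
import asyncio


class PrimaryProxy:
    """Forward each new client connection to the current primary (hypothetical sketch)."""

    def __init__(self, primary):
        # (host, port) of the current primary. Assumption: something external
        # (e.g. a watcher on a consensus store) updates this on failover.
        self.primary = primary

    async def _pipe(self, reader, writer):
        # Copy bytes in one direction until EOF, then close the destination.
        try:
            while data := await reader.read(65536):
                writer.write(data)
                await writer.drain()
        except ConnectionError:
            pass
        finally:
            writer.close()

    async def handle(self, client_reader, client_writer):
        # The primary is looked up per connection, so a failover only
        # affects clients that connect after the address changes.
        host, port = self.primary
        try:
            backend_reader, backend_writer = await asyncio.open_connection(host, port)
        except OSError:
            client_writer.close()
            return
        await asyncio.gather(
            self._pipe(client_reader, backend_writer),
            self._pipe(backend_reader, client_writer),
        )

    async def serve(self, listen_host="127.0.0.1", listen_port=0):
        # Port 0 lets the OS pick a free port; read it back from the server's sockets.
        return await asyncio.start_server(self.handle, listen_host, listen_port)
```

Clients always connect to the proxy's address, so "failover" from their point of view is just reconnecting to the same host and port. The proxy never parses the Postgres protocol, which is exactly the limitation I mean when I say it's only a TCP proxy.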