There is now a response to the support thread from Fly[1]:

> Hi Folks,

> Just wanted to provide some more details on what happened here, both with the thread and the host issue.

> The radio silence in this thread wasn’t intentional, and I’m sorry if it seemed that way. While we check the forum regularly, sometimes topics get missed. Unfortunately this one slipped by us until today, when someone saw it and flagged it internally. If we’d seen it earlier, we’d have offered more details then.

> More on what happened: We had a single host in the syd region go down, hard, with multiple issues. In short, the host required a restart, then refused to come back online cleanly. Once back online, it refused to connect with our service discovery system. Ultimately it required a significant amount of manual work to recover.

> Apps running multiple instances would have seen the instance on this host go unreachable, but other instances would have remained up and new instances could be added. Single instance apps on this host were unreachable for the duration of the outage. We strongly recommend running multiple instances to mitigate the impact of single-host failures like this.

> The main status page (status.fly.io) is used for global and regional outages. For single host issues like this one we post alerts on the status tab in the dashboard (the emergency maintenance message @south-paw posted). This was an abnormally long single-host failure and we’re reassessing how these longer-lasting single-host outages are communicated.

> It sucks to feel ignored when you’re having issues, even when it’s not intentional. Sorry we didn’t catch this thread sooner.

[1] https://community.fly.io/t/service-interruption-cant-destroy...

For what it’s worth, I left Fly because of this crap. At first my Fly machine web app had intermittent connection issues to a new production PG machine. Then my PG machine died. Hard. I lost all data. A restart didn’t work - it could not recover. I restored an older backup over at RDS and couldn’t be happier I left.

I left digitalocean for fly because some of their tooling was excellent. I was pretty excited.

I’m back on digitalocean now. I’m not unhappy about it, they’re very solid. I don’t love some things about their services, but overall I’d highly recommend them to other developers.

I gave up on fly because I’d spontaneously be unable to automate deployments due to limited resources. Or I’d have previously happy deployments go missing with no automatic recovery. I didn’t realize this was happening to a number of my services until I started monitoring with 3rd party tools, and it became evident that I really couldn’t rely on them.

It’s a shame because I do like a lot of other things about them. Even for hobby work it didn’t seem worth the trouble. With digitalocean, everything “just works”. There’s no free tier, but the lower end of pricing means I can run several Go apps off of the same droplet for less than the price of a latte. It’s worth the sanity.

I moved from DO to Hetzner (cheaper), and I am happy about it.

Does anyone know how Hetzner's pricing is half of DO's yet Hetzner is profitable, while DO is loss-making with a 6% operating margin?

They run their own data centres and have for a while. There is a pretty big industry for that sort of thing as an alternative to “the cloud” here in Europe.

We used to use nianet to house our hardware in Denmark. Basically these companies do hardware renting, and they also do hardware renting with more steps, which is where you rent rack space but own the hardware. They provide the place for the hardware, and they also have multiple locations so that you have both backup and redundancy. It doesn’t scale globally, but in 20 years I’ve literally never worked on anything that needed to, beyond having some buffer caches for clients logging in on their vacations or something like that.

What Hetzner seems to be doing with the DO-style hosting, and this is just a guess, is that they are one of the many EU companies preparing for the big EU exodus from the non-EU cloud. Which is frankly a solid bet these days, when both AWS and Azure are increasing prices and are becoming more and more unusable because of EU legislation. Part of this is privacy, which Microsoft and Amazon are great with in terms of compliance, but part of it is also national security. I work in an investment bank that builds solar plants. Since finance and energy are both critical sectors, we risk being told that half of the finance/energy companies in the world can’t use Microsoft, because the EU sees it as a single point of failure if our entire energy sector relies on Azure. Which is sort of reasonable, right? But what this means for us is that we can’t vendor lock-in, not really, because we need to have up-to-date exit strategies for how we plan on being fully operational a month after leaving Azure. Which is easy when you just containerise everything and run it in VMs or similar, and really annoying if you go all in on things like AKS. Which doesn’t help our Azure costs.

Anyway, right now we are planning on leaving Azure because of cost. Not today, not next week, but sometime in the next 5-10 years, and a lot of these EU cloud alternatives that actually operate the hardware instead of renting it are likely going to be a very realistic alternative. And that is the private sector. I spend time in the EU public sector, which is a massive amount of money, and I’m guessing it’ll leave both AWS and Azure by 2050. Some of these EU cloud initiatives are going to explode when that happens, and right now, Hetzner is one of the best bets.

To get back to your question, DO rents server space. I have no idea where they’d rent it in Germany but they could potentially be renting it from Hetzner.

Couldn't agree more, I think Hetzner is probably Europe's best bet on a hyperscaler. One of the more telling indicators IMO is their growing market share outside of the EU/DACH.

To add on to the comments about Hetzner building their own custom hardware, they also custom built their own software stack. They rejected the hype that was OpenStack and worked diligently on their own hypervisor platform (that they are incredibly secretive about) and that appears to be paying off in spades for them. Most sovereign cloud plays end up being suffocated by the complexity, and incoherence, of the OpenStack ecosystem. It just becomes impossible to ship.

For a fascinatingly different take on how to build a datacenter: https://www.youtube.com/watch?v=5eo8nz_niiM

* Edit: removed speculation about Kubernetes and Hetzner; that was based on hazy memory.

For anyone interested in Kubernetes on Hetzner, there's a really interesting CAPI provider being actively developed:

https://github.com/syself/cluster-api-provider-hetzner