OK guys, running on a single instance is REALLY a BAD IDEA for non-pet-projects....

MaKey · 2025-11-13T18:32:54 1763058774

I think you're being overly dramatic. In practice I've seen complexity (which HA setups often introduce) causing downtimes far more often than a service being hosted only on a single instance.

mads_quist · 2025-11-13T20:28:25 1763065705

You'll have planned downtime just for upgrading the MongoDB version or rebooting the instance. I don't think that this is something you'd want. Running MongoDB in a replica set is really easy and much easier than running Postgres or MySQL in an HA setup.

No need for SREs. Just add 2 more Hetzner servers.

spwa4 · 2025-11-13T22:55:35 1763074535

The sad part of that is that 3 Hetzner servers are still less than 20% of the price of equivalent AWS resources. This was already pretty bad when AWS started, but now it's reaching truly ridiculous proportions.

From the "Serverbörse": i7-7700 with 64GB RAM and 500G disk.

37.5 euros/month

On AWS, the equivalent is ~8 vCPUs + 64GB RAM + 512G disk.

585 USD/month

It gets a lot worse if you include any non-negligible internet traffic. How many machines does your company need before a team of SREs is worth it? I think it's actually dropped to 100.
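For anyone who wants to redo the comparison above with their own numbers, here is a rough sketch. The two prices are the ones quoted in the comment; the exchange rates are placeholders I made up, and comparing three Hetzner servers against a single AWS instance is only my reading of the claim, so `aws_instances` is a parameter.

```python
# Prices quoted in the comment above. Everything else is an assumption.
HETZNER_EUR_PER_MONTH = 37.5  # one auction server (i7-7700, 64GB RAM, 500G disk)
AWS_USD_PER_MONTH = 585.0     # quoted price for ~8 vCPUs + 64GB RAM + 512G disk


def hetzner_share_of_aws(servers: int, usd_per_eur: float, aws_instances: int = 1) -> float:
    """Cost of `servers` Hetzner boxes as a fraction of `aws_instances` AWS instances."""
    hetzner_usd = servers * HETZNER_EUR_PER_MONTH * usd_per_eur
    aws_usd = aws_instances * AWS_USD_PER_MONTH
    return hetzner_usd / aws_usd


if __name__ == "__main__":
    # Hypothetical USD-per-EUR rates; plug in the current one.
    for rate in (1.00, 1.16):
        share = hetzner_share_of_aws(servers=3, usd_per_eur=rate)
        print(f"rate {rate:.2f}: 3 servers cost {share:.1%} of one AWS instance")
```

Depending on the rate you plug in, three servers land at roughly a fifth of the single-instance AWS price, and traffic costs are not included at all.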

mads_quist · 2025-11-14T06:26:58 1763101618

Sure, I am not against Hetzner, it's great. I just find that running something in HA mode is important for any service that is vital to customers. I am not saying that you need HA for a website. Also, I run many applications NOT in HA mode, but those are single-customer applications where it's totally fine to do maintenance at night or on the weekend. But for SaaS this is probably not a very good idea.

lewiscollard · 2025-11-13T18:49:26 1763059766

Yes, any time someone says "I'm going to make a thing more reliable by adding more things to it" I either want to buy them a copy of Normal Accidents or hit them over the head with mine.

immibis · 2025-11-14T14:53:19 1763131999

How bad are the effects of an interruption for you? Would you lose millions of dollars a minute, or would you just have to send an email to customers saying "oops"?

Risk management is a normal part of business - every business does it. Typically the risk is not brought down all the way to zero, but to an acceptable level. The milk truck may crash and the grocery store will be out of milk that day - they don't send three trucks and use a quorum.

If you want to guarantee above-normal uptime, feel free, but it costs you. Google has servers failing every day just because they have so many, but you are not Google and you most likely won't experience a hardware failure for years. You should have a backup because data loss is permanent, but you might not need redundancy for your online systems. Depending on what your business does.

smartbit · 2025-11-14T06:21:05 1763101265

Normal Accidents https://en.wikipedia.org/wiki/Normal_Accidents

PunchyHamster · 2025-11-13T21:38:00 1763069880

HA can be hard to get right, sure, but you have to at least have a (TESTED) plan for what happens

"Run a script to deploy a new node and load the last backup" can be enough, but then you have to plan what to tell customers when the last few hours of their data are gone

badestrand · 2025-11-14T05:43:19 1763098999

I have had a website with hundreds of thousands of monthly visitors running on a single Hetzner machine for more than 10 years (switched machines inside Hetzner a few times though).

My outage averages around 20 minutes per year, so an uptime of around 99.996%.

I have no idea where you see those "huge outages" coming from.
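The uptime figure above is easy to check. A quick sketch, assuming an average year of 365.25 days (the function names are mine, just for illustration):

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year including leap days


def uptime_percent(downtime_minutes_per_year: float) -> float:
    """Convert yearly downtime in minutes into an uptime percentage."""
    return 100.0 * (1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR)


def allowed_downtime_minutes(target_uptime_percent: float) -> float:
    """Inverse: minutes of downtime per year that a given uptime target allows."""
    return MINUTES_PER_YEAR * (1.0 - target_uptime_percent / 100.0)


if __name__ == "__main__":
    print(f"20 min/year -> {uptime_percent(20):.3f}% uptime")
    print(f"99.99% target -> {allowed_downtime_minutes(99.99):.1f} min/year allowed")
```

So 20 minutes a year does come out at about 99.996%, which is better than a "four nines" target of roughly 53 minutes a year.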

freefaler · 2025-11-13T20:10:16 1763064616

We have used Hetzner for 15+ years. There were some outages, with the nastiest being the network ones. But they're usually not "dramatically bad" if you build with at least basic failover. With that in place we have seen fewer than 1 serious outage per 3 years. Most of the downtime is because of our own stupidity.

If you know what you're doing, Hetzner is a godsend: they give you hardware and several DCs, and it's up to you what you do with them. The money difference is massive.

notTooFarGone · 2025-11-13T19:11:15 1763061075

There are so many applications the world is running on that only have one instance that is maybe backed up. Not everything has to be solved by 3 reliability engineers.

antoniojtorres · 2025-11-13T18:12:22 1763057542

Agree on single instance, but as for Hetzner: I run 100+ large bare metal servers at Hetzner, have for at least 5 years, and there's only been one significant outage on their side. We do spread across all their datacenter zones and replicate, so it's all been manageable. It's worth it for us, very worth it.

raxxorraxor · 2025-11-14T12:31:54 1763123514

Tell me about a service that needs this reliability, please. I cannot think of anything aside from perhaps some financial transaction systems, which all have some fallback message queue.

Also, all large providers have had outages of this kind as well. Hell, some of them are at times so slow that you could call it an outage as well.

Easy config misstep and your load balancer goes haywire because you introduced unnecessary complexity.

I did that because I needed a static outgoing IP on AWS. Not fun at all.