Once the first SSD fails after some years, and your monitoring catches that, you can either migrate to a new box, find another intermediate solution/replica, or let them hotswap it while the other drive takes on.
Of course though, going to physical servers loses redundency of the cloud, but that's something you need to price in when looking at the savings and deciding your risk model.
And yes, running this without also at least daily snapshotting/backup to remote storage is insane - that applies to cloud aswell, albeit easier to setup there.
Have 2x servers atleast then invest in proper monitoring.
Server can fail without disk failures.
For quite a while we ran single power supplies because they were pretty high quality, but then Supermicro went through a ~6 month period where basically every power supply in machines we got during that time failed within a year, and replacements were hard to come by (because of high demand, because of failures), and we switched to redundant. This was all cost savings trade-offs. When running single power supplies, we had in-rack Auto Transfer Switches, so that the single power supplies could survive A or B side power failure.
But, and this is important, we were monitoring the systems for drive failures and replacing them within 24 hours. Ditto for power supplies. If you don't monitor your hardware for failure, redundancy doesn't mean anything.
Yeah. This blog post reads like it was written by someone who didn't think things through and just focused on hyper-agressive cost-cutting.
I bet their DigitalOcean vm did live migrations and supported snapshots.
You can get that at Hetzner but only in their cloud product.
You absolutely will not get that in Hetzner bare-metal. If your HD or other component dies, it dies. Hetzner will replace the HD, but its up to you to restore from scratch. Hetzner are very clear about this in multiple places.
They could, but they didn't and instead they wrote that blog post which, even being generous is still kinda hard to avoid describing as misleading.
I would not have written the post I did if they had presented a multi-node bare-metal cluster or whatever more realistic config.
What do you feel was misleading?
They don't.
And reading the article, they don't seem to understand that.
Erm. I already spelt it out in my original post ?
I'm not going to re-write it, the TL;DR is they are making an Apples and Oranges comparison.
Yes they "saved money" but in no way, shape or form are the two comparable.
The polite way to put is is .... they saved as much money as they did because they made very heavy handed "architectural decisions". "Decisions" that they appear to be unaware of having made.
I agree with the other poster, this is fine for a toy site or sites but low quality manual DR isn't good for production.
I don't know where to start with this comment. Do I really need to spell out the difference between cloud and bare metal ?
A few examples...
- Live migration ? Cloud only.
- Snapshots ? Cloud only.
- Want to increase disk space ? Tick box in cloud vs. replace disks (or move to different machine) and re-install/restore in bare metal....
- Want to increase RAM ? Tick box in cloud vs. shutdown, pull out of rack, install new chips (or move to different machine and re-install/restore)....
- Want to upgrade to a beefier processor ? Tick box in cloud vs move to a completely different machine and re-install/restoreAlso, with something like Hetzner you would not be going in and physically doing anything. You also just tick a box for a RAM upgrade, and then migrate over or do active/passive switch.
The cloud does have advantages, mostly in how "easy" it is to do some specific workflows, but per-compute it's at least 10x the cost. Some will argue it's less than that, but they forget to factor in just how slow virtual disks and CPU are. Cloud only makes sense for very small businesses, in which the operational cost of colocation or on-prem hosting is too expensive.
Yeah you pay for and get additional stuff with cloud. Nobody disputed that.
Well, technically its still a possibility.
I am old enough to have seen issues with RAID1 setups not being able to restore redundancy, as well as RAID controller failures and software RAID failures.
Also, frankly you are being somewhat pedantic. My broader point was regarding cloud. I gave HD Failure as one example, randomly selected by my brain ... I could have equally randomly chosen any of the other items ... but this time, my brain chose HD.
Curious what the delta to pain-in-ass would be if I want to deal with storing data. (And not just backups / migrations, but also GDPR, age verification etc.)
i already design with Auto Scale Group in mind, we run it in spot instance which tend to be much cheaper. Spot instances can be reclaimed anytime, so you need to keep this is kind.
I also have data blobs which are memory maped files, which are swapped with no downtime by pulling manifest from GCS bucket each hour, and swapping out the mmaped data.
i use replicas, with automatic voting based failover.
I've used mongo with replication and automative failover for a decade in production with no downtime, no data lost.
Recently, got into postgres, so far so good. Before that i always used RDS or other managed solution like Datastore, but they cost soo much compared to running your own stuff.
Healthchecks start new server in no time, even if my Hertzner server goes out or if whole Hertzer goes out, my system will launch digital ocean nodes which will start soaking up all requests.
Recently, I did it in PostgreSQL using pg_auto_failover. I have 1 monitor node, 1 primary, and 1 replica.
Surprisingly, once you get the hang of PostgreSQL configuration and its gotchas, it’s also very easy to replicate.
I’m guessing MySQL is even easier than PostgreSQL for this.
I also achieved zero downtime migration.
Not every app needs 24/7 availability. The vast majority of websites out there will not suffer any serious consequences from a few hours of downtime (scheduled or otherwise) every now and then. If the cost savings outweigh the risk, it can be a perfectly reasonable business decision.
A more interesting question would be what kind of backup and recovery strategy they have, and which aspects of it (if any) they had to change when they moved to Hetzner.