> But no, it's not really ok when your servers have an average lifetime of 30 days. It's very hard to offer a stable service on an unstable substrate.
That's the whole cattle mindset, because at the end of the day everything is an "unstable substrate". You're building a stable service on unstable blocks. Pets don't solve the issue that each pet is fundamentally unstable; you're just pretending it's not.
> That's the whole cattle mindset, because at the end of the day everything is an "unstable substrate". You're building a stable service on unstable blocks. Pets don't solve the issue that each pet is fundamentally unstable; you're just pretending it's not.
That's not the way the world has to be. You can have a network that is rock solid. You can have power that is rock solid. You can have hardware that is rock solid.
Sure, if you have a couple thousand machines, a few of them will have hardware problems every year. Yes, once in a while an automatic transfer switch will fail and you'll have a large data center outage. Backhoes exist. Urgent kernel fixes happen. You have to acknowledge failures happen and plan for them, but you should also work to minimize failures, which I honestly haven't seen at the 'cattle not pets' workplaces. Cattle take about two years to get to market [1] (1.5 years before these people receive them, then 180 days before sending them to market); I'd be fine with expecting my servers to run for two years before replacement (and, you know, rotating in new servers throughout, maybe swapping out 1/8th of the servers every quarter, etc.), but after running for 30 days at a 'cattle not pets' workplace, I started getting complaints that my systems were running for too long.
[1] https://cultivateconnections.org/how-do-you-determine-when-t...
I’ve had Linux servers with > 1 year of uptime. I’ve seen much, much higher. It’s entirely possible to have a stable foundation; it’s modern software that’s hot garbage, and relies on ephemerality to stay running.
...right, yes, servers. I've certainly never accidentally forgotten to reboot a laptop on cheap commodity hardware for a few months. Slightly more than a few months. Look, it got rebooted eventually, okay?