andix 1 day ago

Isn't one of the strategies also to run one or two backup clusters for each production cluster, which can take over the workloads if the primary cluster fails for some reason?

In a cloud environment the backup cluster can be scaled up quickly if it has to take over, so while it's idling it only requires a few smaller nodes.

fragmede 23 hours ago

You might run a cluster per region, but the whole point of Kubernetes is that it's highly available. What specific piece are you worried will go down in one cluster, such that you need two production clusters running all the time? Upgrades are a special case where I could see spinning up a backup cluster.

andix 17 hours ago

A lot of things can break (hardware, networking, ...). Spanning the workload over multiple clusters in different regions already satisfies the "backup cluster" recommendation.

Many workloads don't have a multi-region requirement, so they might run on just one cluster, with the option to fail over to another region in an emergency. Running a workload on one cluster at a time (even with some downtime for a manual failover) makes a lot of things much easier. Many workloads don't need 99.99% availability, and nothing awful happens if they are down for a few hours.
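
To put rough numbers on that, here is a minimal sketch of the downtime arithmetic. It assumes a 365-day year, and the function names are just illustrative, not from any library. It shows how much downtime per year a given availability target allows, and what availability you end up with after a given amount of downtime.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; assumption: leap years ignored


def downtime_budget_hours(availability_percent: float) -> float:
    """Hours of downtime per year allowed by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def achieved_availability(downtime_hours: float) -> float:
    """Availability in percent for a year with the given total downtime."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


if __name__ == "__main__":
    # Yearly downtime budget for a few common targets.
    for target in (99.0, 99.9, 99.99):
        hours = downtime_budget_hours(target)
        print(f"{target}% -> {hours:.2f} h/year ({hours * 60:.0f} min)")

    # One manual failover that takes 3 hours, with no other outages that year.
    print(f"3 h down -> {achieved_availability(3):.3f}% availability")
```

By this arithmetic, 99.99% leaves a budget of under an hour per year, which a manual failover would probably blow through, while 99.9% leaves a bit under nine hours. So "down for a few hours" is roughly a three-nines workload, and those are the ones where a single cluster plus a standby seems good enough.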