Infra person here, and this is such a wrong take.
> Do I really need a separate solution for deployment, rolling updates, rollbacks, and scaling?
Yes, it's called an ASG (Auto Scaling Group).
> Inevitably, you find a reason to expand to a second server.
ALB, target group, ASG, done.
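For a sense of scale, that whole setup is a handful of API calls. A minimal boto3 sketch, assuming the ALB, target group, launch template, and subnets already exist (every name and ARN here is a placeholder):

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Create an ASG behind an existing ALB target group. The ALB routes
# traffic; the ASG replaces instances that fail the ELB health check.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web",
    MinSize=2,
    MaxSize=10,
    LaunchTemplate={"LaunchTemplateName": "web", "Version": "$Latest"},
    TargetGroupARNs=["arn:aws:elasticloadbalancing:...:targetgroup/web/abc123"],
    VPCZoneIdentifier="subnet-aaa,subnet-bbb",
    HealthCheckType="ELB",
)
```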
> Who will know about those undocumented sysctl edits you made on the VM?
You put all your modifications and CIS benchmark tweaks in a repo and build a new AMI off it every night. Patching is switching the AMI and triggering a rolling update.
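The "switch the AMI and roll" step really is about three calls. A boto3 sketch, assuming a launch template named "web" and a nightly build pipeline (Packer, EC2 Image Builder, whatever) that produces new_ami_id; both are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")
autoscaling = boto3.client("autoscaling")

new_ami_id = "ami-0123456789abcdef0"  # output of the nightly AMI build

# Point the launch template's default version at the fresh AMI.
resp = ec2.create_launch_template_version(
    LaunchTemplateName="web",
    SourceVersion="$Latest",
    LaunchTemplateData={"ImageId": new_ami_id},
)
version = str(resp["LaunchTemplateVersion"]["VersionNumber"])
ec2.modify_launch_template(LaunchTemplateName="web", DefaultVersion=version)

# Trigger the rolling update: replace instances in batches while
# keeping at least 90% of capacity healthy throughout.
autoscaling.start_instance_refresh(
    AutoScalingGroupName="web",
    Preferences={"MinHealthyPercentage": 90},
)
```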
> The inscrutable iptables rules
These are security groups; lord have mercy on anyone who thinks k8s network policy is simple.
> One of your team members suggests connecting the servers with Tailscale: an overlay network with service discovery
Nobody does this; you're in AWS. If you use separate VPCs you can peer them, but generally it's just editing some security groups and target groups. k8s is forced into overlaying on an already-virtual network because it needs to address pods rather than VMs; when VMs are your unit, you're just doing basic networking.
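"Just editing some security groups" looks like this, a minimal boto3 sketch with placeholder group ids, letting the app tier reach the db tier by referencing the source group instead of IP ranges:

```python
import boto3

ec2 = boto3.client("ec2")

# Allow the app tier's security group to reach the db tier on 5432.
# No overlay, no CIDR bookkeeping: the rule references the source group.
ec2.authorize_security_group_ingress(
    GroupId="sg-db000000",
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 5432,
        "ToPort": 5432,
        "UserIdGroupPairs": [{"GroupId": "sg-app00000"}],
    }],
)
```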
You reach for k8s when you need control loops beyond what ASGs can provide. The magic of k8s is "continuous terraform": you'll know when you need it, and you likely never will. If your infra moves from one static config to another static config on deploy (by far the usual case), then skipping k8s is fine.
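To make "continuous terraform" concrete: k8s controllers are reconcile loops that keep converging actual state toward desired state, instead of applying one static config at deploy time. An illustrative sketch (the three callables are hypothetical, not any real API):

```python
import time

# Illustrative only: the controller pattern behind k8s.
# ASGs give you one built-in loop (instance count vs. desired capacity);
# k8s lets you run loops like this over arbitrary state.
def reconcile_forever(get_desired_state, get_actual_state, apply_diff):
    while True:
        desired = get_desired_state()
        actual = get_actual_state()
        if desired != actual:
            # Converge: act on the delta, not a full redeploy.
            apply_diff(desired, actual)
        time.sleep(30)  # real controllers also react to watch events, not just polling
```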
You’d be swapping an open-source, vendor-independent API for a cloud-specific, vendor-locked one. And paying more for the “privilege”.
I mean, that's the sales pitch, but it's really not vendor-independent in practice. We have a mountain of EKS-specific code. It would be easier for me to migrate our apps that use ASGs than to migrate our charts. AWS's API isn't actually all that special; it's just modeling the datacenter in code. Anywhere you migrate to will have all the same primitives, because the underlying infrastructure is basically the same.
EKS isn't any cheaper either, in my experience, and in hindsight of course it isn't: it's backed by the same things you would deploy without EKS, just with another layer on top. The dream of gains from "OS overhead" and efficient, tightly packed pod scheduling doesn't match the reality that our VMs are already right-sized for our workloads and aren't sitting idle. You can't squeeze that much water from the stone even in theory, and in practice k8s comes with its own overhead.
Another reason to use k8s is the original one: when you deploy on physical hardware, not VMs, or otherwise have to squeeze maximum utilization out of the gear you already have.
Especially since cloud sometimes just means hemorrhaging money compared to the alternatives, particularly with ASGs.
We found that the savings from switching from VMs in ASGs to k8s never really materialized. OS overhead wasn't actually that much, and once you're setting CPU/memory requests you can't fit as many pods per host as you'd think.
Plus, for maxing out hardware you're competing with hypervisors, which are rock-solid stable.
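Back-of-the-envelope version of that bin-packing point, with assumed numbers (not our actual workloads):

```python
# Assumed: a 4 vCPU / 16 GiB node. Naive expectation: 16 pods at 1 GiB each.
node_cpu_m, node_mem_gib = 4000, 16.0
reserved_cpu_m, reserved_mem_gib = 300, 1.5  # kubelet/system reserves + eviction threshold (assumed)
pod_cpu_m, pod_mem_gib = 500, 1.0            # per-pod requests (assumed)

fits_by_cpu = int((node_cpu_m - reserved_cpu_m) // pod_cpu_m)        # 7
fits_by_mem = int((node_mem_gib - reserved_mem_gib) // pod_mem_gib)  # 14
print(min(fits_by_cpu, fits_by_mem))  # CPU requests cap you at 7 pods, not 16
```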
My experience was quite the opposite, but it depends very much on the workload.
That is, I wasn't saying the competition is between AWS ASGs and k8s running on EC2, but rather the case where you already have a certain amount of capacity that you want to max out in flexible ways.
You don't need to use an overlay network. Calico works just fine without one (plain BGP routing, no encapsulation).
I'm sure the American Sewing Guild is fantastic, but how do they help here?