mikeocool 13 hours ago [-]
I made this decision at a startup (albeit when the eng team was ~30 people, and we had a monolith with ~10 supporting services). I wouldn’t do it again, even for the reasons stated in the article.
The uniformity is nice; we were moving from apps running directly on EC2 instances provisioned with Ansible. Each time we spun up a new service it was a process to get the EC2 instances provisioned just so.
But k8s is such a pain in the ass. One thing that I think people new to it don’t realize is that it’s not at all batteries included - to get a basic managed cluster setup, you’re still going to be installing a bunch of additional controllers (ingress, cert-manager, external dns to start). And then you’re on the hook for making sure all those processes stay up (hope the admission webhook controller for a critical resource doesn’t go down!). Then you’ve got to do a major upgrade on not only your cluster, but all of those controllers every ~3 months. And no one is shy about introducing breaking changes.
Also you’re introducing a huge amount of complexity with the k8s networking and dns layer that most startups have zero need for (if you’re on EKS, make sure to read about scaling and monitoring CoreDNS).
I think there is a real hole in the market for a simple solution that lets you deploy some containers to some instances in a declarative fashion without all of that complexity and does decent LTS versions. I imagine there’s something out there that does this, but k8s has really sucked up all the oxygen.
BobbyTables2 12 hours ago [-]
Pretty sure if there was a simple alternative, people would hate it.
Everyone initially wants thing A. But then they want to customize it to do all permutations and combinations of A, B, C. They want it to be extensible. They want redundancy. They want orchestration. They want integration.
It’s why practically every config file format eventually becomes its own scripting language. Even HTML started off simple — now ridiculously complex — all the more ironic since practically nobody writes it by hand. Instead of CSS simplifying it, it became more complex.
There is another thing that is extremely customizable and extensible. It’s called a programming language. People write programs to solve specific problems.
There seems to be a perverse trend of cobbling together a Byzantine mesh of libraries, plugins, and services with complex configuration files to make it do practically everything possible. We just used to write software for such purposes…
And for anyone who thinks HTML is simple… the A (anchor) tag has a “ping” attribute that results in POST requests to a list of URLs when a link is clicked! The list of attributes and resulting variations in behavior is quite mind boggling. It was supposed to be a damn link!
https://html.spec.whatwg.org/multipage/links.html
aranelsurion 30 minutes ago [-]
There are already simpler alternatives, and yes people hate them too. Usually for the opposite reason of k8s: something they need isn’t included, and now bringing it is difficult or impossible.
Fargate and Cloud Run first come to mind.
ajayvk 10 hours ago [-]
I don't think you can provide all the features of Kubernetes while reducing the complexity. What is possible is to support a subset of the features of Kubernetes while making it easy to use.
https://github.com/openrundev/openrun is a project I am building. It supports declarative deployments, on a single node with Docker or onto Kubernetes. The target use case is limited to standalone web apps, like internal tools. No support for stateful services; you manage stateful services yourself. With that simplification, OpenRun provides a much easier developer experience.
SOLAR_FIELDS 10 hours ago [-]
I look forward to the evolution of your project into a less standardized Kubernetes as end users request more and more features of your project.
ajayvk 8 hours ago [-]
Targeting a specific use case (internal tools) should hopefully help avoid feature creep. Also, the goal is that an OpenRun config should work on a single-node with Docker and with Kubernetes. That limits the types of features which can be implemented (for example no Docker Compose support, no Helm support).
+1 on the problem of moving complexity from programming languages to configuration.
One of the main problems here is that programming languages typically have lots of tools to help validate correctness, whereas configuration tools are typically either much less mature or woefully underused.
There is nothing more frustrating than something failing due to a misconfiguration when you've no idea what the correct value should be.
notarobot123 4 hours ago [-]
Wow, I didn't know about ping attributes.
Advertisers have really shaped the Web right down to its core specifications.
chaos_emergent 11 hours ago [-]
Totally agree with you. K8s ends up being the simplest solution for a very complex problem
bandrami 3 hours ago [-]
It was supposed to primarily be a link target as well as possibly a source. Berners-Lee guessed the ratio of in to out wrong there.
threethirtytwo 4 hours ago [-]
not true. The market OFTEN prefers simple things over complicated things.
chvid 2 hours ago [-]
But there is money in complexity.
vbezhenar 30 minutes ago [-]
Docker swarm is that simple solution you're looking for. But people don't need simple solutions. They want scalable solutions and Kubernetes fits this niche perfectly. You can deploy it on single server today and scale to 100 servers managed cluster tomorrow.
Just to provide a similar example: a Linux system is insanely complicated. The kernel alone has thousands of options. Distros have tens of thousands of packages. Wherever you look, everything is hard and complicated. Firewall, containers, init system, filesystem hierarchy, storage layers. One would think that some people desire a simpler operating system. But everybody uses Linux despite all these complexities. Try to find OpenBSD in production, for example. It's not easy.
singron 7 hours ago [-]
I really wish there was an 80% kubernetes. I think you could get there with some changes:
1. No overlay networks. 1 IP per machine. pods use dynamically allocated ports, and the kubelet enforces pods listen only on their assigned ports using seccomp.
2. No kube-proxy or equivalent Layer-4 "load-balancer". It's not good, but it's often used. You should use some kind of Layer-7 load balancing instead. Also you need to look up the port number from (1). This also greatly lessens the need for DNS.
3. A better config language. YAML and helm templates are terrible. kustomize is built into kubectl, but it's frustratingly limiting and also still very complicated. Something like nix would have been great. This can make it easier to upgrade third party configs since you can have more logic to validate and merge your settings with upstream defaults or templates.
4. Maybe an eBPF-like for the api server? If the built-in k8s objects don't have a setting for something, then you need to write an operator or control loop yourself and then run that too, which is a big lift. Over time, k8s just keeps adding more and more built-in things and then revising them, which creates a ton of churn. If you could easily script simple operations, then they wouldn't have to build in every permutation ahead of time. E.g. the HorizontalPodAutoscaler has 24 config object types with several fields each, but all it does is set replicas based on data read from the api-server, so it could be replaced by some kind of flexible script that runs in the control plane.
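To make point 4 concrete, here is a minimal Python sketch of the core scaling rule such a script would need. It follows the ratio formula described in the Kubernetes HorizontalPodAutoscaler documentation (desired = ceil(current * currentMetric / targetMetric), with a default tolerance of 0.1). The function name, parameters, and bounds are hypothetical, and it leaves out everything else the real controller does (stabilization windows, missing metrics, multiple metrics).

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the replica count suggested by the HPA-style ratio rule.

    All names and defaults here are illustrative assumptions, not a real API.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Inside the tolerance band, keep the current count to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods averaging 200m CPU against a 100m target: scale out.
    print(desired_replicas(4, 200, 100))
```

The arithmetic is the easy part. Reading metrics from the api-server and writing the replica count back are what a scriptable control plane would have to provide.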
vbezhenar 18 minutes ago [-]
1. You can't force third-party software to do that. There are programs with hardcoded ports. There are programs which require XML modifications and container rebuilds to change port number. If your platform does not support launching of unmodified containers, it is severely restricted and not suitable for general use. All my programs always use port 8080 for HTTP, I don't make it configurable because I have no reason to.
2. Does not work for all protocols. Again your solution restricts the number of protocols to HTTP protocols. Might work for many uses, but still this restriction doesn't sound very good. Universal load balancer is much simpler conceptually.
3. YAML is not terrible. YAML is awesome. Kubernetes manifests are terrible, that I agree with. Docker compose is nice, for example. Kubernetes manifests felt like they were designed to be generated from something, but everyone ended up writing them directly or with templates. Though I think that XML generally is a superior format so I'd vote for XML in the end.
Overall your suggestions look like you want to shift complexity from cluster operator to software developer. I'm not sure industry supports that, recently it seems to move in the opposite direction, but that's interesting perspective. I guess with some wrappers for some containers it could be made usable.
But honestly you just want to throw away years of progress in containers and network namespaces. I understand that kubernetes mechanisms are somewhat complicated, but the core idea is to make pods look like virtual machines and I think this is very worthy idea.
ddreier 6 hours ago [-]
Unless you hate HCL, 1, 2, and 3 pretty much describe Nomad exactly. We run over 100k production applications on Nomad. But migration to AWS from private data centers, our HashiCorp bill, and the severe lack of Nomad talent, have finally pushed us to k8s (EKS).
stock_toaster 4 hours ago [-]
I wish there was an EKS-like for Nomad!
a_c 55 minutes ago [-]
The pattern is pervasive. Big corp promotes a solution that fits their need. People read about it, think adopting big corp solution means they are doing the right thing. Few people have big corp need, let alone everyone big corps are different. And then endless hours spent fighting big corp solution to not so big corp problem.
himata4113 12 hours ago [-]
I don't know... running a startup sized kubernetes is relatively easy and pain free these days (k3s). Especially when it comes to scaling up.
CNPG is an absolute monster (in a good way). cert-manager is easier than the docker alternative, calico has never failed me (except in bgp mode which has some footguns like not being able to come back from a dead state since it has a chicken and an egg problem unless you point it to the external load balancer which I would have known if I read the documentation). Traefik is all you need. Talos OS largely mitigates the bare metal problems and comes pre-hardened and pre-optimized.
I solo most of my development projects and have used k3s for all of them. The only complaint is that cert-manager by default will fail silently and your certificates will expire. I largely mitigated this by having proper visibility setup via grafana and automated alerts (warns if certificates are about to expire) which should have been done by me anyway.
Two years ago I'd agree, today with LLMs everything I have runs talos with fully automated updates and I haven't had to be on-call for almost a year.
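The certificate-expiry alert described above can be approximated outside Grafana with a small check. This is a minimal sketch using only the Python standard library; the function names and the 14-day threshold are assumptions for illustration, not part of cert-manager or any setup mentioned in the comment.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host, port=443, timeout=5.0):
    """Return the 'notAfter' string of the certificate served by host."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now):
    """Days between `now` (timezone-aware) and a 'notAfter' string."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def needs_alert(not_after, now, warn_days=14):
    """True when the certificate expires within `warn_days` days."""
    return days_until_expiry(not_after, now) < warn_days


if __name__ == "__main__":
    # Hypothetical host; replace with a service you operate.
    not_after = fetch_not_after("example.com")
    print(needs_alert(not_after, datetime.now(timezone.utc)))
```

Note that an already expired certificate fails verification inside `fetch_not_after` and raises an `ssl` error, which a real alert would treat as the loudest signal of all.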
KronisLV 3 hours ago [-]
> cert-manager is easier than the docker alternative
MDomain blog.kronis.dev
I'm not saying that cert-manager isn't nice, but with regular Docker/Compose/Swarm setups you can just run a web server/load balancer on whatever ports you want. With mod_md the above is all I really need in a regular .conf file to provision LetsEncrypt certs for my blog (very similar with something like Caddy too). And it's the same in Docker as it is when running the web server directly, I think that's why starting with Docker is really nice, because it has fewer custom abstractions and sometimes regular software does things elegantly already.
makeitdouble 11 hours ago [-]
I think parent would wish for something close to what heroku represented (what would it be ?)
K8s is easier at smaller scales (I understand k3s as a packaged version?), but you still need one or two people on your team who properly understand all of the concepts and inner workings of k8s, and are able to go neck deep into it if/when shit hits the fan.
For a small team that's a lot of commitment for something that is usually not their bread and butter and wish they could build once and only slightly tweak every year or so.
embedding-shape 12 hours ago [-]
> I think there is a real hole in the market for a simple solution that lets you deploy some containers to some instances in a declarative fashion without all of that complexity and does decent LTS versions
Hashicorp's Nomad basically is just that, supports various way of running stuff too which is neat. Shame about the license change which basically killed all my interest in it, so seems the hole is indeed still unfilled.
nyrikki 12 hours ago [-]
For simple cases I just launch podman containers on long lived hosts with ansible.
You can still add pods if needed and the systemd integration works.
Plus you can actually improve isolation by co-hosting services under separate UIDs.
Like any container it is just co-hosting, and elasticity is a bit slower with autoscaling instances, but it removes most of the complexity of K8s, which very few orgs benefit from or have the culture to support.
alexjurkiewicz 3 hours ago [-]
AWS ECS and GCP Cloud Run are this. Run a container on abstract compute. But they aren't "without all that complexity" because it turns out all that complexity is required for even simple use-cases. Load balancing with SSL certs, cloud API keys, deployment pipelines, sidecars, etc.
embedding-shape 2 hours ago [-]
Those are hosted services? Completely different class of solution.
mikeocool 12 hours ago [-]
Yeah I’ve always meant to check out nomad and never had an opportunity.
Though as I recall, it makes heavy use of consul, which I have used in anger, and that makes me a little wary (though that experience is likely very out of date).
embedding-shape 12 hours ago [-]
It doesn't require Consul IIRC, but a bunch of features do depend on it, like service discovery and related stuff. But Nomad is totally usable without Consul for simpler setups.
xav0989 3 minutes ago [-]
They’ve now had nomad-native services without consul for a while, including health checks!
mocamoca 12 hours ago [-]
I've been using Nomad for years without Consul. Maybe if you have complex networking requirements it is worth it; otherwise you don't really need it.
ddreier 6 hours ago [-]
Nomad has gained basic service discovery and K/V store without Consul. However, health checking is extremely limited.
mocamoca 12 hours ago [-]
As CTO of a small startup and cutting costs, setting up hashicorp nomad + bare metal is a joy to work with.
Some self-reloading HAProxy in nomad to automatically assign URLs to services when needed. Could have used Consul but meh.
Tailscale for private networking.
zzyzxd 12 hours ago [-]
> One thing that I think people new to it don’t realize is that it’s not at all batteries included - to get a basic managed cluster setup, you’re still going to be installing a bunch of additional controllers (ingress, cert-manager, external dns to start).
And if you can do this again, what's your solution to reverse proxy, certificate management, DNS...etc? I guess you can docker-compose some custom stack on a single machine, maybe add one more machine then you can say it's HA enough for small scale. But you can also spend the same amount of time to install those kubernetes controllers with zero customization. In my experience, if you go with the default configuration, most of the well-maintained k8s components are boring as hell these days.
> (if you’re on EKS, make sure to read about scaling and monitoring CoreDNS)
If load to your service increases, you need to scale up/out your service. This is universally true. Do you have a proprietary solution that's easier and more reliable than bumping up the replicas count in kubernetes?
There are lots of design decisions in Kubernetes that I hate. But if you want me to choose between Kubernetes and any proprietary stack, in 2026, I would definitely choose Kubernetes.
packetlost 10 hours ago [-]
I use NixOS with nginx + acme / caddy, coredns and no docker anywhere. It's extremely homogeneous, easy to scale out (add another flake output, deploy to a new server, update DNS records). You could easily automate some of that with more nix, but I don't bother because that's already only like 50 lines of config.
I have a strong preference for renting bare metal and it has served me extremely well.
zzyzxd 9 hours ago [-]
I totally believe this works for you. But in your case, isn't NixOS just another declarative orchestration system like Kubernetes? Similarly I can just run a standalone nginx, caddy with acme, and a coredns pod in a bare minimum k8s cluster.
Personally, I think the complexity is on the same level.
packetlost 9 hours ago [-]
It really isn't comparable. Sure, nixpkgs is huge, but the surface area for what you need to understand and work with is considerably smaller. They aren't even really in the same domain anyways. I was able to get very comfortable with Nix(OS) in a single weekend, but it took me months to get to a similar level with the K8s ecosystem.
XYen0n 8 hours ago [-]
NixOS has no "runtime" or controlplane to maintain.
mikeocool 11 hours ago [-]
I don’t have an answer I’m in love with today, I basically just want less moving parts.
As for EKS, having to monitor and manually scale the built in DNS service or else my queries are just going to stop resolving is not the type of thing I expect to have to manage on a managed service. I see they have finally released autoscaling for CoreDNS, though it took them 6 years.
jaggederest 10 hours ago [-]
Accidental complexity and essential complexity. There is no working system that achieves all the stated aims with fewer parts. [1]
I've been building multi-cluster Kubernetes for some time and things like External DNS and Ingress controllers per app are just non-starters. They always felt kludgy, having K8S orchestrate things external to the cluster, and they're anti-patterns IMO.
petterroea 2 hours ago [-]
Working with k8s myself I'm somewhere in between you and the article on opinion. I think k8s is good when you can afford to hire a person dedicated to managing it (or at least find someone with experience in running it that can make it part of their MO)
That is, k8s is probably best considered when you are beginning to consider having an infrastructure department, or if one of your early hires knows Kubernetes and is opinionated in a way that is less "throw cool and complex stuff at the wall"* and more "the 5 things I want in a k8s cluster that I don't want to spend much time on and should just work"
My understanding of the 2000s and 2010s was that there was a big focus on inventing self service deployment systems for developers, and k8s is that solution(!), for the same scale that would begin considering re-inventing the wheel internally anyways
There was once a time when we could deploy software without spinning up 3 etcd databases, multiple controller processes constantly running event loops, and a virtual networking layer, before you even get off the ground.
Perhaps those days are behind us.
greenavocado 7 hours ago [-]
No way. In the future people will be able to vibe code entire businesses in optimized assembler
bellowsgulch 10 hours ago [-]
It's a shit blog article. A shell script is what 99% of businesses need.
Thaxll 11 hours ago [-]
k8s is not a pain, I would never return to something like Puppet / Ansible / etc ... to deploy bare ec2 instances, it's just re-inventing the wheel badly.
Just use ECS / Fargate with an ALB in front if you need a simpler use case.
mickael-kerjean 10 hours ago [-]
I've had the opposite experience. I used to run k8s on bare metal, troubleshooting something at least once a month (DNS going down was a recurring favorite). The breaking point came with the churn in the ecosystem: I got bitten by the deprecation of the community darling Weave Net CNI plugin, and the killing of the nginx ingress was the nail in the coffin, since I had so many annotations tied to the ingress that migrating them would take longer than going the Ansible way, with k8s imposing tight upgrade schedules. While I agree Ansible feels a lot more dirty than k8s, I spend much less time on infrastructure, sleep better at night, and handling things like databases is much simpler.
psviderski 7 hours ago [-]
>I think there is a real hole in the market for a simple solution that lets you deploy some containers to some instances in a declarative fashion without all of that complexity
That's how I see it as well but it's really tough to go against the grain. I have a small enthusiastic community of users around Uncloud (https://github.com/psviderski/uncloud) who went full circle - fed up with k8s and came back to simple, boring declarative Compose deployments across a handful of interconnected hosts.
Uncloud is essentially a cluster version of Docker Compose without a control plane and cluster management overhead.
Eridrus 10 hours ago [-]
We started our core product on ECS, which is a declarative way to run a containerized service. It has been nice and reliable, but it has limitations (slow scaling, weird AWS Quotas if you have ephemeral tasks).
We're moving our non-critical components onto EKS (pipelines, tooling, etc). We had one outage from runaway IP allocation in a subnet, but otherwise it's been pretty stable.
I do hear vague horror stories so I'm really not excited about moving our prod stack to it, but it's actually been really good for installing 3rd party software so far.
alexjurkiewicz 3 hours ago [-]
As the article mentions towards the end, AWS EKS, GCP GKE, and other competitors have made k8s setup turnkey. You can deploy a new cluster with all the controllers you mentioned in a single click / Terraform.
aurisl 6 hours ago [-]
We also had similar problems at our small scale startup. We tried k8s for a pilot project, and the observation was the same: the complexity was not worth it for us. We needed something simpler, so we adopted Nomad instead, which actually fit our use case. It had its own issues and bugs, but overall, it was much more straightforward to work with.
zug_zug 10 hours ago [-]
> I think there is a real hole in the market for a simple solution
Unless of course, all of the busywork that comes with kubernetes IS the value (to the engineer). Perhaps a bunch of engineers know at some level that locking the company into an overcomplicated cloud-within-a-cloud setup that has all sorts of weekly issues and requires constant work gives them a lot of job safety that they wouldn't get if they just used an AWS autoscaling group and you're done for the next 5 years.
Because simpler solutions DO exist (like a loadbalancer in front of an autoscale group, and not making a giant SOA for an app that orders you taxis, or books you a bnb or whatever nonsense).
httgp 4 hours ago [-]
> there is a real hole in the market for a simple solution that lets you deploy some containers to some instances in a declarative fashion without all of that complexity and does decent LTS versions
There's Nomad for this; I wish more teams would run Nomad.
jpb0104 12 hours ago [-]
Kamal is somewhere in the middle. Probably a little closer to a bunch of bash scripts. But it’ll get your container going pretty quick. Can take a bit of fiddling with SSH/docker-login. Plus it handles deployments very well.
tayo42 12 hours ago [-]
Isn't fargate or ecs that simple service?
icedchai 12 hours ago [-]
Google's Cloud Run is also pretty simple.
epgui 12 hours ago [-]
I find them just as complicated as k8s.
czhu12 12 hours ago [-]
I built canine.sh for exactly that reason, gives you a sensible deployment platform on top of k8s with one install, and you can customize it once you outgrow it.
esafak 12 hours ago [-]
Your portainer link is broken.
erpellan 3 hours ago [-]
There was: Heroku
It was glorious.
dsincl12 4 hours ago [-]
Dokku?
emodendroket 12 hours ago [-]
> I think there is a real hole in the market for a simple solution that lets you deploy some containers to some instances in a declarative fashion without all of that complexity and does decent LTS versions. I imagine there’s something out there that does this, but k8s has really sucked up all the oxygen.
I mean, it's CDK and whatever equivalents other providers have, isn't it? If you fully embrace all the stuff they give you then it's straightforward to declare everything and it all works together. The downside is the vendor lock-in but unless you actively deploy to multiple environments, which most people don't, you're probably locked in in various ways without knowing about it.
antonvs 5 hours ago [-]
> k8s has really sucked up all the oxygen.
Because anything else involves making opinionated decisions that will be wrong for many users.
People who don’t understand why k8s is so widespread don’t understand all the problems it’s solving.
te_chris 5 hours ago [-]
GCP cloud run is pretty close to this. We’re using it now and I’ve a lot of experience with gke.
They’ve announced persistent “instances” recently which solves a big problem for us - sometimes you want continual long running workloads.
stevenaenns 12 hours ago [-]
to what extent would AWS EKS auto mode solve those problems?
peterldowns 11 hours ago [-]
"completely" in my experience
busterarm 12 hours ago [-]
Nomad, Consul and Vault all running on VMs that you manage with Terraform.
The problem is that when you run this long enough you want K8s features anyway.
kilobaud 8 hours ago [-]
And your starter “production” deployment of the Nomad/Consul/Vault stack is literally 12 VMs, comprising three independent Raft clusters. There is no decent way to do zero-downtime instance replacement without building your own orchestration layer, but also they’ve had a years-long track record of shipping bad upgrades and following up with only manual remediations or workarounds instead of a fix.
As someone who has productionized and maintained truly hundreds of those clusters across several jobs, it is hard at this point for me to recommend Consul, Nomad, or Vault to anyone serious about building reliable applications. Too many broken upgrades and manual click-ops tasks just to keep them online. (…and I’ve said nothing of the actual product!)
secondcoming 3 hours ago [-]
This is a timely post. We are going to use Consul to replace the need for Internal Load Balancers. What issues do you have with it?
andrewcamel 3 minutes ago [-]
I’m not sure it will cause a reversal, but I bet the k8s takeover wouldn’t have happened with costs of today. You inherently end up using far more resources than you need - between HA, built in services, overhead per node, etc.
I personally will be using more resource efficient approaches in everything I do. Question is just what provides the closest set of benefits without the full k8s weight.
xlii 13 hours ago [-]
One year ago I might agree that Kubernetes is an overkill but today?
Ask your favorite GPT to generate manifests, get primary app into cluster with telepresence or execute straight from container and switch contexts and clusters like it's 90s again.
One reason I dislike Docker Compose and Docker is lack of isolation. Yes sure if you put your arm deep enough you can get it, but on local k8s I can spin cluster per workspace and not worry about conflicting ports between PostgreSQL instances.
Before LLMs writing consistent YAMLs was PITA but today on low/development scale it's pretty much free lunch.
hadlock 12 hours ago [-]
Strong agree, if there's one thing LLMs are excellent at, it's writing Terraform and Kubernetes deployments (and/or helm charts). What used to be half a day of research, trial and error, is now 20 seconds of AI churn and 98% of the time it nails it on the first try. And then point it at grafana and tell it to write you a dashboard for the new service/s. Easy peasy lemon squeezy. What used to require a team of 4 devops/SRE to support a medium sized company can now be collapsed down into a single part time SRE.
d675 12 hours ago [-]
as I got into SWE 4 yrs ago, this was a big part of my job as a SRE/SDET and my next job came b/c of that SRE exp which was never used, so just became an SDET.
Now am laid off, and hard to find a job...
xlii 6 hours ago [-]
I'm sorry to read that.
Unfortunately it's an industry wide problem, and it touches many areas and levels of expertise. Some believed that AI can drop costs and compressed job spaces.
It is starting to bounce back, but it's not back to what I could call a normal baseline.
bigstrat2003 9 hours ago [-]
LLMs are pretty bad at writing those things in my experience. They will invent HCL syntax that doesn't exist, generate absurdly overwrought Helm charts, put in assumptions that don't make any sense, and so on. It's faster, and better quality, to write the stuff myself.
If you're feeling extra spicy you don't even need the deploy scripts. Just a `llm` user account with the right permissions & ssh keys on all your servers.
perrygeo 5 hours ago [-]
> Ask your favorite GPT to generate manifests, ...
> Before LLMs writing consistent YAMLs was PITA but today on low/development scale it's pretty much free lunch.
Writing manifests seems like a trivial thing to focus on. Who operates the k8s cluster in production? Who runs upgrades? Who's on call to monitor the system? Of course if someone else is doing all the work for you, it feels like free lunch!
hdjrudni 4 hours ago [-]
I find it much easier to upgrade k8s than a bare naked server.
With managed k8s, your host upgrades the control plane. And then you can upgrade your PHP, Python, Node, what have you, by flipping a number in your Dockerfile.
Not like other forms of server infra don't need monitoring and upgrades anyway.
darkwater 22 minutes ago [-]
The day Kubernetes will have an LTS version supported for 5 years with no API churn, and EKS and other k8s managed systems will have an LTS version based on the k8s LTS plus a bunch of LTS addons supported natively, then I will agree. Actually we will probably live in a better world overall.
Meanwhile, the update stress of core k8s - even managed - is much higher than a good managed old fashioned (immutable) infrastructure.
aleksiy123 9 hours ago [-]
Finally just bought a piece of my own hardware and got LLM to deploy k3s cluster on it.
I think diy homelab/hosting is more accessible than ever.
Cut costs on cloud spend and invest into AI spend.
For a solo dev on a budget, I think it just makes sense.
globular-toast 5 hours ago [-]
It's not an investment, it's just a spend. If you had learnt to deploy k3s yourself, which is really easy, but still, that would be an investment. Paying for LLMs is basically renting.
embedding-shape 12 hours ago [-]
> One reason I dislike Docker Compose and Docker is lack of isolation. Yes sure if you put your arm deep enough you can get it, but on local k8s I can spin cluster per workspace and not worry about conflicting ports between PostgreSQL instances.
Using Kubernetes because you're unable to grok docker's networking enough so you can't run multiple containers using their own ports and not conflicting with other stuff sounds like a recipe for disaster, even (especially?) if you use agents for this. Particularly if you let them manage a production environment, you're bound to lose important data eventually.
> pretty much free lunch.
Aah, famous last words of the young :)
iamcreasy 13 hours ago [-]
Interesting. I have just started reading about Kubernetes. Is there any reading material that goes over this process you just described?
johnsmith1840 10 hours ago [-]
Don't. Get a chatgpt subscription and spin up a minikube cluster and launch some stuff and play around.
K8s is incredibly deep and complex but with AI it's finally easy to just hello world it.
bigstrat2003 9 hours ago [-]
This is absolutely terrible advice. You should never ever use LLMs to work on something you don't understand already, because you have no way to catch the machine when it screws up (and it will screw up). Just like with every other form of automation before LLMs, a smart person only automates things he already knows how to do himself.
xlii 6 hours ago [-]
"Only a Sith deals in absolutes" ;-)
I mostly agree it's an area that's risky to wander into mindlessly, but it is much easier to validate knowledge than to practice it.
E.g. I can't write Chinese but I can check whether a piece of Chinese is valid (by feeding it to N translators, other LLMs or asking a friend who knows Chinese).
Under assumption of "LLM output is false until proven otherwise" it's not a bad approach and worked for me in various scenarios. (E.g. I asked for implementation of algorithm in Rust and then validated it against base definition).
johnsmith1840 9 hours ago [-]
Yeah no. Getting the first hello world up is more important than anything else.
Until you physically see it running learning is slow.
I learned k8s through many months of study and pain pre AI. Once I actually got it up learning was FAR easier.
This is like using a jupyter notebook to learn python and is always the first thing I point to for someone just starting to learn. Only after should you learn venv, pip install, classes etc.
100% use AI to get started on something you don't understand. I will literally never start to learn about a technical system again without first doing a hello world with AI.
SJC_Hacker 4 hours ago [-]
Might want to start out with docker-compose first.
liampulles 11 hours ago [-]
There is a core 20% to Kubernetes which is very nice, mostly being the Deployment and Service management stuff. That along with a very basic GitOps for cluster management (an infra repo for operators using Flux, applying service level yaml from app repos in CI) above a cloud managed Kubernetes cluster, where you still keep your DB and build servers off the cluster, can be quite nice for a small team.
Beyond that, there are massive holes of despair to fall down if a novice team starts to engage with extensive operators (starving the control plane), DB operators (distributed persistence) and build operators (spikey, expensive loads). At least, I know that I've had to dig out of those holes.
I just hope people don't use k8s in the same way many use microservices: as a way to introduce complexity for complexity's sake.
zbentley 10 hours ago [-]
Spot on. I have a lot of trouble convincing cloud folks that for durable state, you probably don’t want kubernetes. It’s not that e.g. the CSI drivers and operators for clustered databases aren’t top notch—they are; the era of “avoid stateful kube services” is long behind us—it’s that the cloud provider managed services for e.g. blob stores or databases are so much more reliable. The S3s and Auroras of the world are expensive for a reason: no matter how good your kube native database operator is, it still doesn’t assume responsibility for a ton of the failure points that managed services do. And that’s true even at modest scale (e.g. upgrades are just harder when you’re running your own DB) and in cost conscious environments (sure, the Elasticache bill is steep, but the salary and velocity cost of fixing memory-leak-caused kube memcached crashes is steeper).
portly 6 hours ago [-]
Also cluster migrations are required pretty often in my company. Having state on a cluster means migrating that as well, which is a complex and time consuming operation. Having your state in S3 or external database makes migrations a breeze.
suralind 12 hours ago [-]
So I’m personally a huge fan of k8s and while I agree it may be „complicated”, it’s because deploying applications is complicated. (I want to point out that there is no requirement to set up cert manager, ArgoCD, external secrets, etc. - and many people who’d consider a VPS would happily slap a .env with an unencrypted secret then ssh to update, but when they choose Kubernetes they take the long route of doing proper GitOps and complain that there are so many things to configure :)
But I found it funny that the OP's summary was to use Kubernetes when the CTO is no longer the only dev.
solatic 33 seconds ago [-]
> many people who’d consider a VPS would happily slap a .env with an unencrypted secret then ssh to update
I just want to point out that you can totally still do this with Kubernetes. Of course it's not correct, but you can save that unencrypted secret in a .env file right into your container while you're building it - no need to use Kubernetes's support for supplying environment variables from the manifest. And of course, you don't even need a Dockerfile to build that container - you can just exec into a running container, paste it in, and then docker save.
Kubernetes doesn't save you from making stupid decisions, it just makes it easier to make better ones.
vbezhenar 9 minutes ago [-]
100% agree with you.
You can actually treat kubernetes as a glorified docker compose engine. Deploy pods, deploy nginx instead of ingress controller, deploy certbot cronjob instead of cert-manager, and believe it or not, it'll work! On a single server!
People often compare Kubernetes with thousands of additional services to a simple VPS, but that's not apples to apples comparison.
codemog 5 hours ago [-]
It's not uniformity, it's cargo culting and offloading thinking to group norms. Doesn't help that engineers are some of the most arrogant people alive and refuse to admit anything is complicated, as they consider it some kind of affront to their intelligence.
I would not advise asking the majority of CTOs these questions either. Many got to that position by saying what people want to hear, which is the "average" safe answer. They will parrot whatever is "hot" at that time because it's the least risky response. They are not your friend nor a reliable source.
clickety_clack 13 hours ago [-]
Adopting k8s when you hire your _second_ engineer (first after the CTO)? That’s a red flag that the CTO’s priorities are wrong and he’s just enjoying tinkering with his infra instead of solving the users’ problems.
Esophagus4 13 hours ago [-]
I thought that was the point of the article, right?
That the tech benefits may not be there, but they’re using it for the non-tech benefits
mijoharas 23 minutes ago [-]
> That the tech benefits may not be there, but they’re using it for the non-tech benefits
My read of the article is that this is correct, but that the benefits they're using it for are the operational, and organisational.
I think the comment you're replying to is arguing that those benefits don't really matter or outweigh the additional complexity costs when N=2 (engineers). I think I'd probably agree.
clickety_clack 11 hours ago [-]
From the article:
> My personal threshold would be the moment the CTO isn't the only engineer anymore. As soon as a second person shows up, the problems K8s solves become real.
darkwater 19 minutes ago [-]
What a world we live in with people thinking that the problems of a 2 (founder) engineers startup deserve k8s complexity...
Even with LLMs, they are going to rack up tech debt if their focus is - as it should be in a small startup - the final product and not the tech stack itself.
avhception 12 hours ago [-]
Well, I totally get the benefits that made those people choose Kubernetes. It's just that those benefits could be had w/o running a massively complicated piece of machinery that is mostly engineered to solve problems I don't have.
SamuelAdams 13 hours ago [-]
This seems to be less about K8’s and more about the infra as code movement. It doesn’t matter if you use K8, CDK, or terraform - you get the same benefits the OP stated across the board.
It is nice to be able to have a consistent deployment pattern, with traceability, rollback support, and production approval checks. It’s nice to not have some archaic something stuck in someone’s head. It’s also nice to be able to see how something works by reading the code, which is usually up to date and deployable.
sshine 12 hours ago [-]
> less about K8’s and more about the infra as code movement. It doesn’t matter if you use K8, CDK, or terraform - you get the same benefits the OP stated
I’d like to gently push back on that. ;-D
Terraform, when committed to git, provides organisational memory. But less so uniformity, since all providers are different (and you should expect different things when applying). No tracing besides git. And tfstate is hard to share between developers, unlike kube state.
Kubernetes is more the same across providers. And it manages drift after something is applied, which is not a direct argument of OP, but a strong reason over other IAC.
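For anyone unfamiliar with the drift point, a toy model of one reconcile step looks roughly like this: compare what is declared with what is observed and emit the actions needed to converge. This is a sketch under simplifying assumptions (state reduced to a name-to-replica-count mapping, a hypothetical `plan_reconcile` function), not how any real controller is written.

```python
def plan_reconcile(desired: dict, actual: dict) -> list:
    """Return the actions needed to move `actual` toward `desired`.

    Both arguments map a workload name to a replica count.
    """
    actions = []
    for name, replicas in sorted(desired.items()):
        if name not in actual:
            actions.append(f"create {name} replicas={replicas}")
        elif actual[name] != replicas:
            actions.append(f"scale {name} {actual[name]}->{replicas}")
    # Anything running that is no longer declared counts as drift too.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(f"delete {name}")
    return actions


if __name__ == "__main__":
    declared = {"api": 3, "web": 2}
    observed = {"api": 1, "cron": 1}  # someone scaled api down by hand
    for action in plan_reconcile(declared, observed):
        print(action)
```

The difference from a one-shot `apply` is that a controller keeps running this comparison in a loop, so manual changes get undone without anyone re-running a tool.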
And yes, I also enjoy how well deploying works. And how things generally fit together. Liking the networking complexity less so.
simoncion 11 hours ago [-]
> And tfstate is hard to share between developers...
Really? For years and years we put our tfstate files into private S3 buckets at $DAYJOB and it seemed to work just fine. We didn't even take pains to ensure that everyone was on the very same version of the Terraform CLI. What problems did you guys run into?
InvertedRhodium 3 hours ago [-]
One person upgrading terraform can change the state format in a way that isn't backwards compatible, breaking it for everyone on an older version. We use direnv with asdf to ensure everyone is on the same version now.
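As a rough sketch of what that pinning buys you: asdf reads a `.tool-versions` file with one `tool version` pair per line, and a small check like the one below could fail CI when someone's installed version differs. The function names and the version numbers in the demo are made up for illustration.

```python
from typing import Dict, Optional


def parse_tool_versions(text: str) -> Dict[str, str]:
    """Parse asdf-style `.tool-versions` content into {tool: version}."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split()
        if len(parts) >= 2:
            pins[parts[0]] = parts[1]
    return pins


def version_mismatch(pins: Dict[str, str], tool: str, installed: str) -> Optional[str]:
    """Return a message if `installed` differs from the pin, else None."""
    pinned = pins.get(tool)
    if pinned is None:
        return f"{tool} is not pinned"
    if pinned != installed:
        return f"{tool} {installed} installed, {pinned} pinned"
    return None


if __name__ == "__main__":
    pins = parse_tool_versions("terraform 1.5.7  # example pin\n")
    print(version_mismatch(pins, "terraform", "1.6.0"))
```

Terraform's own `required_version` setting is another place to enforce the same constraint, if you'd rather not depend on everyone's shell setup.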
hanneshdc 1 hours ago [-]
I’ve worked for a few startups now and we’ve stayed away from k8s, instead being perfectly happy with ECS services managed via terraform.
It gives most of the benefits the author mentions (traceability of changes, clearly written down infrastructure), without the complexity of k8s.
mikgp 13 hours ago [-]
Kubernetes is so ubiquitous that yeah, as long as you're not trying to run it yourself, a small Kubernetes cluster just isn't all that much to manage. I think it's been so long, people forget how annoying it is to manage servers. All said and done Kubernetes is becoming more the "Boring Technology"
JohnMakin 13 hours ago [-]
> I still don't totally get why the shift happened when it did. Five years ago all three camps were doing fine. Now the VM+systemd crowd has basically disappeared from job postings, serverless stayed niche, and K8s just won.
>
> My best guesses: managed K8s (EKS, GKE, AKS) got mature and the talent pool flipped: enough people learned it that hiring for anything else became the harder choice. And Helm made "just use someone else's chart" a real option. But I'm not certain. If you were there for the shift and have a better theory, I'd genuinely like to know.
Pretty much, almost. Have spent a bunch of time in my career working on the "VM + systemd" setups, stuff running on a rack, or in an ec2 on cloud - managed kubernetes is a lot better for me than those cobbled together messes. There's "easier" setups but usually end up costing me a lot more in time and $.
To answer simply, it became good + convenient. I could complain about plenty, and people here like to, but honestly you couldn't pay me to go back to the old way. The one legitimate gripe is the upgrade schedule is exhausting, on AWS it's about every 6 months before you go into extended support. I also hate being at the mercy of arbitrary decisions like "ok we know a huge chunk of the web going back a decade has architected off our Ingress API, but recently we decided we dont really like that way anymore and we want you to use Gateway API instead, so, um, like ya we know it just killed off one of the most used open source ingress configs (ingress-nginx) but yea trust us bro this is going to be so much better" kind of thing.
hadlock 12 hours ago [-]
The upgrade cycle is a feature, not a bug. If (when) you need to do a big lift and shift, or there's some 0 day CVE, push buttan, get security update. You CAN drift behind but there's a real $$$ cost to that now. Every three months I toss opus at my k8s stack and verify it's compliant with k8s v1.xx.y and then push the upgrade button on my staging cluster, and then a week later I push the upgrade button on my prod cluster. What used to be two days of maintenance every quarter is now more like 2-5 minutes spread across the two upgrades.
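The "verify it's compliant" step doesn't have to be an LLM, for what it's worth. A minimal sketch of the same idea: scan manifests for API versions that were removed in a past release. The table below is a small illustrative subset, and the regex parsing assumes plain unquoted `apiVersion:` and `kind:` lines at the top level; `find_removed_apis` is a made-up name.

```python
import re

# Illustrative subset of (apiVersion, kind) pairs and the release that removed them.
REMOVED_APIS = {
    ("extensions/v1beta1", "Ingress"): "1.22",
    ("networking.k8s.io/v1beta1", "Ingress"): "1.22",
    ("batch/v1beta1", "CronJob"): "1.25",
    ("policy/v1beta1", "PodDisruptionBudget"): "1.25",
}


def find_removed_apis(manifest: str) -> list:
    """Return (kind, apiVersion, removed_in) for each YAML document using a removed API."""
    findings = []
    for doc in re.split(r"^---\s*$", manifest, flags=re.MULTILINE):
        api = re.search(r"^apiVersion:\s*(\S+)", doc, flags=re.MULTILINE)
        kind = re.search(r"^kind:\s*(\S+)", doc, flags=re.MULTILINE)
        if api and kind:
            removed_in = REMOVED_APIS.get((api.group(1), kind.group(1)))
            if removed_in:
                findings.append((kind.group(1), api.group(1), removed_in))
    return findings


if __name__ == "__main__":
    sample = (
        "apiVersion: batch/v1beta1\nkind: CronJob\nmetadata:\n  name: nightly\n"
        "---\n"
        "apiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: web\n"
    )
    print(find_removed_apis(sample))
```

A real check would cover rendered Helm output and live cluster objects too, which this sketch does not attempt.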
I'll admit I'm dreading switching over to the gateway api, but by the time I get forced off ingresses it should be a stable/mature ecosystem. That's still a ways out though.
I don't know anyone still dealing with VMs anymore, except our IT guy who manages a couple of pet servers for random executives from the before times. In the last year k8s has started absorbing executive pet processes and the number of VMs our IT guy manages has dropped by about half.
While I'm here spouting stuff, yeah hiring for k8s is real easy, if our SRE gets hit by a bus, he can be replaced in a week, and we can probably struggle through using opus until that happens. K8s being the lingua franca of git ops IaC makes it real easy for the new guy to parachute in and start working. Every VM thing is going to be totally bespoke and have the personality of the guy who designed it, which is rarely a good thing.
JohnMakin 3 hours ago [-]
The gateway api people have clearly won and I can’t truly complain because I’m not a maintainer, but I have contributed in the past to a nontrivial part of the tooling built off this ecosystem. The issues with snippets/annotations are a core deficiency with k8s design and eliminating this api creates more problems than it supposedly solves. I have been working on solutions of my own preparing for this inevitability, but it’s rough. ingress annotations like it or not run the modern infra tech stack. if they are persona non grata at any point in the future, a lot of people are going to have a lot of urgent consulting problems in the near to mid distant future.
I to this date have not seen a viable drop in replacement to how I’ve seen big orgs use the ingress controller stack with the gateway api and what i understand currently is ingate is basically DOA.
mschuster91 13 hours ago [-]
I somewhat agree with you... but it's not like you don't need some actual experts who know what they're doing, especially when stuff goes bonkers and it will go bonkers.
Even on AWS EKS, you will run into bullshit with their network overlay. Egress policies are a mess (at least half a year ago, you were not able to say something like "allow pod A to egress traffic to service (!) B", despite a service resolving down to an IP address in the end).
And that's before going into the unholy mess that is getting connectivity to and from the external world to your cluster. Cloudfront, ACM certificates, ALB, ALB-EKS integration, Route53, Route53-EKS integration, EFS, EFS-EKS integration, EBS, EBS-EKS integration, RDS, RDS-EKS integration, IAM-EKS integration, SSM, SSM-EKS integration, autoscaling... and if you want more pain and don't already wince, try setting that up across regions or, as I had to do once, across account boundaries.
Kubernetes is powerful. But do not make the mistake of assuming it's easy to get started with, at least on the admin side. Even if you got prior AWS experience, getting it all integrated into EKS so you don't have to deal with Terraform and helm/k8s for a full deployment of a piece of software will take you an awful lot of time.
For users though? It's a breeze, I will admit as much. Everything down to the firewall rules can be encoded in k8s spec files.
JohnMakin 13 hours ago [-]
If you struggle with any of that (a lot of what you listed is not strictly necessary to running managed kubernetes, specifically EKS) you are also going to struggle with a lot of other things on AWS, or wrangling a VM setup at any kind of scale.
paulryanrogers 12 hours ago [-]
So they're holding it wrong?
mschuster91 12 hours ago [-]
> a lot of what you listed is not strictly necessary to running managed kubernetes, specifically EKS
Oh it's not necessary per se but if you want to host a web service with any sort of state and not having to do stuff in parallel either by hand or by terraform, I'd consider the integrations pretty vital.
It's easy enough (well, it's still addons whose versions you have to keep updated each on their own) once it is set up, but getting to the point where you have something reproducibly running for the first time is annoying as hell.
zbentley 10 hours ago [-]
I think the best supported and most mature pattern on most big cloud providers is precisely
> do stuff in parallel either by hand or by terraform
…specifically by terraform. Making k8s own the provisioning and management of external infrastructure on principle (as opposed to when that makes sense, e.g. load balancers/gateway/CSI providers) is not a good approach. Sure, it feels unified, but the cost of unification is incredibly not worth it.
shevis 12 hours ago [-]
> The CTOs I talked to aren't making a dumb choice. They're solving real problems.
Unrelated to the content of the article, this sentence structure is a dead giveaway of LLM writing.
reillyse 12 hours ago [-]
The reason this is accelerating recently is agents are really good at spinning up k8s clusters. They've made devops work super super simple. Basically all the annoying stuff you know you should do but it's way too much hassle - using let's encrypt to create unique certs for every app in your cluster to enable zero trust, configuring permissions and security profiles for everything etc etc (never mind just standing them up in the first place) - it's all simple now.
phailhaus 12 hours ago [-]
The setup may be simple, but what about the maintenance?
kakwa_ 28 minutes ago [-]
And the troubleshooting?
Given the number of moving parts, I would be terrified to have to look under the hood of what Talos deployed for me.
lawn 33 minutes ago [-]
Anecdotal of course but I've found that they're excellent at debugging and analysing Kubernetes issues.
Glyptodon 13 hours ago [-]
Even as a solo dev there's generally been a yawning gap between k8s and manual infra that nothing has ever filled that well and it's part of why things like Heroku were so popular for a while.
ghaff 13 hours ago [-]
Well, maybe especially as a solo dev. Things like Heroku and other tools that were largely called PaaS at the time were very popular with individuals and small teams but they had limitations for enterprise development--and even ran into barriers once anyone ran into those limitations.
chaos_emergent 11 hours ago [-]
In addition to all of OP’s points, another reason k8s is getting popular is that LLMs have made them easier to use! It's reasonably well represented in the dataset and there are pretty strong monitoring and observability tools and verification gates to make sure that you've specified your cluster specifications correctly.
aaronbrethorst 9 hours ago [-]
The knowledge is in the YAML
Exactly why I hate CloudFormation, K8S, GitHub Actions, etc. yaml is a terrible format for the knowledge encoded in these artifacts.
rienbdj 4 hours ago [-]
There are compile-to-yaml config languages
nitwit005 13 hours ago [-]
> First one was uniformity. Every service deploys the same way.
My current company makes this claim, but it's not true. They also have serverless apps, and also have some services running directly on EC2.
They just think of the Kubernetes deployments as the "standard" way.
> Second was shared, hireable knowledge. K8s is basically a lingua franca now.
People were demanding experience with Kubernetes, long before it was reasonable to expect it. Everyone added it to their resume, because they had to.
mianos 11 hours ago [-]
This is exactly why I call it 'resume++'. You have to use it to attract talent. People want to use it to expand their employment pool. This is not a justification for using it.
To use it is a whole different question, and not in any way related to job interviews. I have worked in places that are crazy for not using it and others where using it was even crazier.
metaltyphoon 8 hours ago [-]
The grass is not always greener. I would change the Service Fabric crap I have to deal with any day to K8s.
johnsmith1840 9 hours ago [-]
AI, managed control plane, minikube
That makes it a no brainer for me for basically any sized project.
Small project? -> minikube single node deploy it.
Tiny project? -> minimum a docker container
I cringe watching anyone build and run code on a raw machine, even locally, without at least a container. The endless hours of headaches you avoid are obvious; k8s is just the natural extension of this.
h4kunamata 12 hours ago [-]
Taught me that companies follow hype.
I once worked at a bank that was fully on Kubernetes, and the amount of problems was out of this world.
Complexities are being added for no reason at all.
SJC_Hacker 4 hours ago [-]
Yeah this was my experience as well. No one really knew what was going on and we had 5 figure cloud spend bills per month.
siliconc0w 11 hours ago [-]
I had a similar experience after a recent job search and started working on a 'kube-lite' that just uses object storage for coordination and normal cloud primitives like auto-scale groups (skiff.pages.dev).
I ended up in a different non-SRE role but if you're interested in working on it, please let me know and I'd love to walk you through it.
mattmatters 13 hours ago [-]
A pretty nothing burger of a post with a bunch of ai-isms. Is this written by a real human?
K8s is a complicated beast. CTOs adopting it at their 10 person company because it's "used everywhere" and easy to hire for are picking a major piece of technology for a bad reason. You can always graduate to it later if need be.
jbnorth 13 hours ago [-]
Honestly I felt the same way for a while but the more I'm exposed to both Fortune 500 companies and ones who have a handful of employees I see Kubernetes as just a good starting point rather than adopting it later.
It removes the overhead of a lot of what sysadmins and devs of yesteryear did by hand or had to have a career's worth of experience to do quickly.
That's not to say that people don't need to know what they're getting into when they adopt kubernetes but especially when you're using a managed offering and not on the bleeding edge of what it supports it's pretty easy in terms of overhead and maintenance.
ritcgab 9 hours ago [-]
Multi-cluster federation is still hard.
dzonga 3 hours ago [-]
for smaller b2b startups - serverless is still a win.
zug_zug 11 hours ago [-]
Here's my conspiracy theory --
There's a certain type of engineer (maybe 25% of them) who does "hype-driven-development." No matter the technology, they are huge advocates for the technology. The hype may be absolutely real, complete nonsense (e.g. mongodb), or somewhere in between (ai). The vast majority of the time it's hype for a new technology that feels 90% the same from the end-user perspective (react vs vue, docker vs colima, go vs other, whatever vs whatever).
These engineers though, only care about something when it's new and trendy enough to be a differentiator. This is because they don't give any hoots about the actual usefulness of anything, they are just trying to differentiate themselves in a market by leveraging vibes rather than raw competence. I think these types of engineer drove kubernetes for companies that don't need it, but tipped the scales enough that it has critical mass.
The irony being kubernetes is way too heavy/clumsy an abstraction for most companies. The savings of packing pods onto the same node is usually a tiny fraction of the engineers' salaries who are managing it.
The other irony is now that kubernetes isn't the new sexy thing, but a standard tool that AI or a normie can do all the hard work for, the hype driven engineers are off looking for the next thing.
zbentley 10 hours ago [-]
The linked article discusses very different reasons for preferring kube. CTOs and hiring managers like it for reasons totally different from the cargo cult/hype-driven engineers.
zug_zug 10 hours ago [-]
Yeah I read the article and I saw that.
And I do think there is a way to use kubernetes with minimal damage, but it requires making firm rules about not focusing on things that aren't needed yet (e.g. istio) and making firm hiring choices about only people who understand that such optimizations are complete wastes of time for a series A startup.
10 hours ago [-]
stego-tech 11 hours ago [-]
OP gets it.
Right now, I’m one dinosaur managing a startup’s tech portfolio. Everything lives in my head first, then in my break-glass vault for addressing the bus problem. Our public cloud footprint is a single KMS for backups. We have no VMs, everything is a cloud service.
The literal fucking second we have real infrastructure requirements for compute, it’s right to GCE. No ifs, ands, or buts. Here’s our Git Repo, here’s the managed K8s control plane, make it work.
If (or when) we need on-prem compute, we add them to the K8s control plane as worker nodes and taint accordingly.
It’s just so much more interchangeable, even if the learning curve for non-SDEs can be a little steeper than VMs.
globular-toast 5 hours ago [-]
I pushed k8s in the medium sized company I work for much the same reasons. We use flux for gitops which works really well. The problem is we now have as many clusters as we did bare metal hosts before. There's production clusters, dev clusters, ones in other regions etc. The idea was to have "one place, one way to deploy" but it's actually many places. Am I doing it wrong? Should it all be one cluster and just have different nodes for different reasons and RBAC etc?
raesene9 4 hours ago [-]
that probably depends on how much security and resource isolation you need. Multi-Tenant security in Kubernetes is not a simple thing, for a wide variety of reasons, and noisy neighbour problems are also potentially a headache.
vasco 12 hours ago [-]
It's a bit odd that the author presents no data other than their interviewing and declares that the shift happened recently. It's not true, there's been steady growth of adoption of kubernetes for years. Just reading the CNCF surveys from the last few years before posting would have told them that.
Their identified reasons are OK though.
crefiz 12 hours ago [-]
Another complementary approach is what Vasilios shared today[1] (the ex-Atlassian guy who recently got traction)
I'm going against the grain but I read: we have a cultural/policy issue and we 'fix' it with tools.
I think what you hear is never the whole story, there is much more going on.
FpUser 11 hours ago [-]
I call BS on that. I serve SMB clients and many are happy as clams with monoliths deployed using those proverbial bash scripts that also do lots of other things. Understanding scripts in the age of AI is trivial for newcomers. I for example fed my own uber script to AI as an experiment and it produced nice, all-encompassing documentation with examples and tests.
There is another thing that is extremely customizable and extensible. It’s called a programming language. People write programs to solve specific problems.
There seems to be a perverse trend of cobbling together a Byzantine mesh of libraries, plugins, and services with complex configuration files to make it do practically everything possible. We just used to write software for such purposes…
And for anyone who thinks HTML is simple… the A (anchor) tag has a “ping” attribute that results in POST requests to a list of URLs when a link is clicked! The list of attributes and resulting variations in behavior is quite mind boggling. It was supposed to be a damn link! https://html.spec.whatwg.org/multipage/links.html
Fargate and Cloud Run first come to mind.
https://github.com/openrundev/openrun is a project I am building. It supports declarative deployments, on a single-node with Docker or onto Kubernetes. The target use cases is limited to standalone web app, like internal tools. No support for stateful services, you manage stateful services yourself. With that simplification, OpenRun provides a much easier developer experience.
One of the main problems here is that programming languages typically have lots of tools to help validate correctness, whereas configuration tools are typically either much less mature or woefully underused.
There is nothing more frustrating than something failing due to a misconfiguration when you have no idea what the correct value should be.
Advertisers have really shaped the Web right down to its core specifications.
Just to provide a similar example. Linux system is insanely complicated. Kernel alone has thousands of options. Distros have tens of thousands of packages. Wherever you look, everything is hard and complicated. Firewall, containers, init system, filesystem hierarchy, storage layers. One would think that some people desire simpler operating system. But everybody uses Linux despite all these complexities. Try to find OpenBSD in production, for example. It's not easy.
1. No overlay networks. 1 IP per machine. pods use dynamically allocated ports, and the kubelet enforces pods listen only on their assigned ports using seccomp.
2. No kube-proxy or equivalent Layer-4 "load-balancer". It's not good, but it's often used. You should use some kind of Layer-7 load balancing instead. Also you need to look up the port number from (1). This also greatly lessens the need for DNS.
3. A better config language. YAML and helm templates are terrible. kustomize is built into kubectl, but it's frustratingly limiting and also still very complicated. Something like nix would have been great. This can make it easier to upgrade third party configs since you can have more logic to validate and merge your settings with upstream defaults or templates.
4. Maybe something eBPF-like for the api server? If the built-in k8s objects don't have a setting for something, then you need to write an operator or control loop yourself and then run that too, which is a big lift. Over time, k8s just keeps adding more and more built-in things and then revising them, which creates a ton of churn. If you could easily script simple operations, then they wouldn't have to build in every permutation ahead of time. E.g. the HorizontalPodAutoscaler has 24 config object types with several fields each, but all it does is set replicas based on data read from the api-server, so it could be replaced by some kind of flexible script that runs in the control plane.
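To illustrate how small the core of that last point is: the scaling rule documented for the HorizontalPodAutoscaler is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). A sketch of it as a plain function follows. The clamping bounds and the function name are assumptions for the example, and the real controller also applies a tolerance band and stabilization windows that are left out here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Core HPA-style rule: scale proportionally to metric/target, then clamp."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 pods at 200m CPU each against a 100m target asks for 8 pods.
    print(desired_replicas(4, 200, 100))
```

Whether a scriptable hook would actually reduce churn is a separate question, but the arithmetic itself fits in a few lines.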
2. Does not work for all protocols. Again your solution restricts the number of protocols to HTTP protocols. Might work for many uses, but still this restriction doesn't sound very good. Universal load balancer is much simpler conceptually.
3. YAML is not terrible. YAML is awesome. Kubernetes manifests are terrible, that I agree with. Docker compose is nice, for example. Kubernetes manifests felt like they were designed to be generated from something, but everyone ended up writing them directly or with templates. Though I think that XML is generally a superior format, so I'd vote for XML in the end.
Overall your suggestions look like you want to shift complexity from the cluster operator to the software developer. I'm not sure the industry supports that; recently it seems to be moving in the opposite direction, but that's an interesting perspective. I guess with some wrappers for some containers it could be made usable.
But honestly you just want to throw away years of progress in containers and network namespaces. I understand that kubernetes mechanisms are somewhat complicated, but the core idea is to make pods look like virtual machines and I think this is a very worthy idea.
CNPG is an absolute monster (in a good way). cert-manager is easier than the docker alternative, and calico has never failed me (except in BGP mode, which has some footguns like not being able to come back from a dead state, since it has a chicken-and-egg problem unless you point it to the external load balancer, which I would have known if I had read the documentation). Traefik is all you need. Talos OS largely mitigates the bare metal problems and comes pre-hardened and pre-optimized.
I solo most of my development projects and have used k3s for all of them. The only complaint is that cert-manager by default will fail silently and your certificates will expire. I largely mitigated this by having proper visibility setup via grafana and automated alerts (warns if certificates are about to expire) which should have been done by me anyway.
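The "warn if certificates are about to expire" check can be quite small. Below is a minimal Python sketch using only the standard library; it is not how cert-manager or any particular Grafana alert works. The function names and the 14-day threshold are assumptions for illustration.

```python
import socket
import ssl
import time


def fetch_not_after(host, port=443, timeout=5.0):
    """Return the notAfter string of the certificate served by host:port."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now_seconds=None):
    """Days left before a certificate's notAfter timestamp.

    not_after uses the format getpeercert() returns,
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    if now_seconds is None:
        now_seconds = time.time()
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now_seconds) / 86400.0


def should_warn(not_after, now_seconds=None, warn_days=14):
    """True when the certificate expires within warn_days (assumed threshold)."""
    return days_until_expiry(not_after, now_seconds) <= warn_days
```

Running something like `should_warn(fetch_not_after("example.com"))` from a cron job or a metrics exporter gives you the alert even when renewal fails silently.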
Two years ago I'd agree, today with LLMs everything I have runs talos with fully automated updates and I haven't had to be on-call for almost a year.
K8s is easier at smaller scales (I understand k3s as a packaged version?), but you still need one or two people in your team who properly understand all of the concepts and inner workings of k8s, and who are able to go neck deep into it if/when shit hits the fan.
For a small team that's a lot of commitment for something that is usually not their bread and butter, and that they wish they could build once and only slightly tweak every year or so.
Hashicorp's Nomad basically is just that, and it supports various ways of running stuff too, which is neat. Shame about the license change, which basically killed all my interest in it, so it seems the hole is indeed still unfilled.
You can still add pods if needed and the systemd integration works.
Plus you can actually improve isolation by co-hosting services under separate UIDs.
Like any container it is just co-hosting, and elasticity is a bit slower with autoscaling instances, but it removes most of the complexity of K8s, which very few orgs benefit from or have the culture to support.
Though as I recall, it makes heavy use of consul, which I have used in anger, and that makes me a little wary (though that experience is likely very out of date).
Some self-reloading HAProxy in nomad to automatically assign URLs to services when needed. Could have used Consul but meh.
Tailscale for private networking.
And if you can do this again, what's your solution to reverse proxy, certificate management, DNS...etc? I guess you can docker-compose some custom stack on a single machine, maybe add one more machine then you can say it's HA enough for small scale. But you can also spend the same amount of time to install those kubernetes controllers with zero customization. In my experience, if you go with the default configuration, most of the well-maintained k8s components are boring as hell these days.
> (if you’re on EKS, make sure to read about scaling and monitoring CoreDNS)
If load to your service increases, you need to scale up/out your service. This is universally true. Do you have a proprietary solution that's easier and more reliable than bumping up the replicas count in kubernetes?
There are lots of design decisions in Kubernetes that I hate. But if you want me to choose between Kubernetes and any proprietary stack, in 2026, I would definitely choose Kubernetes.
I have a strong preference for renting bare metal and it has served me extremely well.
Personally, I think the complexity is on the same level.
As for EKS, having to monitor and manually scale the built in DNS service or else my queries are just going to stop resolving is not the type of thing I expect to have to manage on a managed service. I see they have finally released autoscaling for CoreDNS, though it took them 6 years.
[1] https://en.wikipedia.org/wiki/No_Silver_Bullet
That is, k8s is probably best considered when you are beginning to consider having an infrastructure department, or if one of your early hires knows Kubernetes and is opinionated in a way that is less "throw cool and complex stuff at the wall"* and more "the 5 things I want in a k8s cluster that I don't want to spend much time on and should just work"
My understanding of the 2000s and 2010s was that there was a big focus on inventing self service deployment systems for developers, and k8s is that solution(!), for the same scale that would begin considering re-inventing the wheel internally anyways
https://www.macchaffee.com/blog/2024/you-have-built-a-kubern...
Perhaps those days are behind us.
Just use ECS / Fargate with an ALB in front if you need a simpler use case.
That's how I see it as well but it's really tough to go against the grain. I have a small enthusiastic community of users around Uncloud (https://github.com/psviderski/uncloud) who went full circle - fed up with k8s and came back to simple, boring declarative Compose deployments across a handful of interconnected hosts.
Uncloud is essentially a cluster version of Docker Compose without a control plane and cluster management overhead.
We're moving our non-critical components onto EKS (pipelines, tooling, etc). We had one outage from runaway IP allocation in a subnet, but otherwise it's been pretty stable.
I do hear vague horror stories so I'm really not excited about moving our prod stack to it, but it's actually been really good for installing 3rd party software so far.
Unless of course, all of the busywork that comes with kubernetes IS the value (to the engineer). Perhaps a bunch of engineers know at some level that locking the company into an overcomplicated cloud-within-a-cloud setup that has all sorts of weekly issues and requires constant work gives them a lot of job safety that they wouldn't get if they just used an AWS autoscaling group and you're done for the next 5 years.
Because simpler solutions DO exist (like a loadbalancer in front of an autoscale group, and not making a giant SOA for an app that orders you taxis, or books you a bnb or whatever nonsense).
There's Nomad for this; I wish more teams would run Nomad.
It was glorious.
I mean, it's CDK and whatever equivalents other providers have, isn't it? If you fully embrace all the stuff they give you then it's straightforward to declare everything and it all works together. The downside is the vendor lock-in but unless you actively deploy to multiple environments, which most people don't, you're probably locked in in various ways without knowing about it.
Because anything else involves making opinionated decisions that will be wrong for many users.
People who don’t understand why k8s is so widespread don’t understand all the problems it’s solving.
They’ve announced persistent “instances” recently which solves a big problem for us - sometimes you want continual long running workloads.
The problem is that when you run this long enough you want K8s features anyway.
As someone who has productionized and maintained truly hundreds of those clusters across several jobs, it is hard at this point for me to recommend Consul, Nomad, or Vault to anyone serious about building reliable applications. Too many broken upgrades and manual click-ops tasks just to keep them online. (…and I’ve said nothing of the actual product!)
I personally will be using more resource efficient approaches in everything I do. Question is just what provides the closest set of benefits without the full k8s weight.
Ask your favorite GPT to generate manifests, get the primary app into the cluster with telepresence or execute straight from a container, and switch contexts and clusters like it's the 90s again.
One reason I dislike Docker Compose and Docker is lack of isolation. Yes sure if you put your arm deep enough you can get it, but on local k8s I can spin cluster per workspace and not worry about conflicting ports between PostgreSQL instances.
Before LLMs, writing consistent YAMLs was a PITA, but today at low/development scale it's pretty much free lunch.
Now I am laid off, and it's hard to find a job...
Unfortunately it's an industry-wide problem, and it touches many areas and levels of expertise. Some believed that AI can drop costs, and that compressed the job market.
It's starting to bounce back, but it's not back to what I could call a normal baseline.
At any stage of https://www.macchaffee.com/blog/2024/you-have-built-a-kubern... a SOTA model can repackage it into Kubernetes.
If you're feeling extra spicy you don't even need the deploy scripts. Just a `llm` user account with the right permissions & ssh keys on all your servers.
Writing manifests seems like a trivial thing to focus on. Who operates the k8s cluster in production? Who runs upgrades? Who's on call to monitor the system? Of course if someone else is doing all the work for you, it feels like free lunch!
With managed k8s, your host upgrades the control plane. And then you can upgrade your PHP, Python, Node, what have you, by flipping a number in your Dockerfile.
Not like other forms of server infra don't need monitoring and upgrades anyway.
Meanwhile, the update stress of core k8s, even managed, is much higher than with a good, managed, old-fashioned (immutable) infrastructure.
I think diy homelab/hosting is more accessible than ever.
Cut costs on cloud spend and invest into AI spend.
For a solo dev on a budget, I think it just makes sense.
Using Kubernetes because you're unable to grok docker's networking enough so you can't run multiple containers using their own ports and not conflicting with other stuff sounds like a recipe for disaster, even (especially?) if you use agents for this. Particularly if you let them manage a production environment, you're bound to lose important data eventually.
> pretty much free lunch.
Aah, famous last words of the young :)
K8s is incredibly deep and complex but with AI it's finally easy to just hello world it.
I mostly agree it's an area that's risky to wander into mindlessly, but it is much easier to validate knowledge than to practice it.
E.g. I can't write Chinese but can validate whether a piece of Chinese is valid (by feeding it to N translators or other LLMs, or by asking a friend who knows Chinese).
Under assumption of "LLM output is false until proven otherwise" it's not a bad approach and worked for me in various scenarios. (E.g. I asked for implementation of algorithm in Rust and then validated it against base definition).
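The "validate against the base definition" step can be sketched as a differential test: keep a slow, obviously-correct reference, and throw random inputs at both it and the generated code. This is a Python illustration under my own assumptions (modular exponentiation as the example algorithm, all function names hypothetical), not the Rust code mentioned above.

```python
import random


def reference_pow_mod(base, exp, mod):
    """Base definition: multiply by base, exp times, reducing mod each step."""
    result = 1 % mod
    for _ in range(exp):
        result = (result * base) % mod
    return result


def candidate_pow_mod(base, exp, mod):
    """Stand-in for the generated code: exponentiation by squaring."""
    result = 1 % mod
    base %= mod
    while exp > 0:
        if exp & 1:
            result = (result * base) % mod
        base = (base * base) % mod
        exp >>= 1
    return result


def find_counterexample(candidate, reference, trials=500, seed=0):
    """Return the first random input where the two disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        args = (rng.randint(0, 50), rng.randint(0, 40), rng.randint(1, 97))
        if candidate(*args) != reference(*args):
            return args
    return None
```

A `None` result is evidence, not proof; it only says the candidate matched the definition on the inputs that were tried.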
Until you physically see it running learning is slow.
I learned k8s through many months of study and pain pre AI. Once I actually got it up learning was FAR easier.
This is like using a jupyter notebook to learn python and is always the first thing I point to for someone just starting to learn. Only after should you learn venv, pip install, classes, etc.
100% use AI to get started on something you don't understand. I will literally never start to learn about a technical system again without first doing a hello world with AI.
Beyond that, there are massive holes of despair to fall down if a novice team starts to engage with extensive operators (starving the control plane), DB operators (distributed persistence) and build operators (spikey, expensive loads). At least, I know that I've had to dig out of those holes.
I just hope people don't use k8s in the same way many use microservices: as a way to introduce complexity for complexity's sake.
But I found it funny that the OP's summary was to use Kubernetes when the CTO is no longer the only dev.
I just want to point out that you can totally still do this with Kubernetes. Of course it's not correct, but you can save that unencrypted secret in a .env file right into your container while you're building it - no need to use Kubernetes's support for supplying environment variables from the manifest. And of course, you don't even need a Dockerfile to build that container - you can just exec into a running container, paste it in, and then docker save.
Kubernetes doesn't save you from making stupid decisions, it just makes it easier to make better ones.
You can actually treat kubernetes as a glorified docker compose engine. Deploy pods, deploy nginx instead of ingress controller, deploy certbot cronjob instead of cert-manager, and believe it or not, it'll work! On a single server!
People often compare Kubernetes with thousands of additional services to a simple VPS, but that's not an apples-to-apples comparison.
I would not advise asking the majority of CTOs these questions either. Many got to that position by saying what people want to hear, which is the "average" safe answer. They will parrot whatever is "hot" at that time because it's the least risky response. They are not your friend nor a reliable source.
That the tech benefits may not be there, but they’re using it for the non-tech benefits
My read of the article is that this is correct, but that the benefits they're using it for are the operational and organisational ones.
I think the comment you're replying to is arguing that those benefits don't really matter or outweigh the additional complexity costs when N=2 (engineers). I think I'd probably agree.
> My personal threshold would be the moment the CTO isn't the only engineer anymore. As soon as a second person shows up, the problems K8s solves become real.
It is nice to be able to have a consistent deployment pattern, with traceability, rollback support, and production approval checks. It’s nice to not have some archaic something stuck in someone’s head. It’s also nice to be able to see how something works by reading the code, which is usually up to date and deployable.
I’d like to gently push back on that. ;-D
Terraform, when committed to git, provides organisational memory. But less so uniformity, since all providers are different (and you should expect different things when applying). No tracing besides git. And tfstate is hard to share between developers, unlike kube state.
Kubernetes is more uniform across providers. And it manages drift after something is applied, which is not a direct argument of OP, but a strong reason to prefer it over other IaC.
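"Manages drift" here means a control loop keeps comparing the declared state with what is actually running and corrects the difference. A toy Python sketch of that idea follows, with plain dicts standing in for the cluster; the names and the dict model are assumptions for illustration, not Kubernetes APIs.

```python
def diff(desired, observed):
    """List the actions needed to make observed match desired."""
    actions = []
    for name in sorted(desired.keys() | observed.keys()):
        if name not in observed:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("delete", name))
        elif desired[name] != observed[name]:
            actions.append(("update", name))
    return actions


def reconcile(desired, observed):
    """Apply the diff to observed in place and return it.

    A real controller would run this repeatedly, so manual changes
    made after an apply get reverted on the next pass.
    """
    for verb, name in diff(desired, observed):
        if verb == "delete":
            del observed[name]
        else:
            observed[name] = desired[name]
    return observed
```

With `desired = {"web": 3, "worker": 2}` and `observed = {"web": 5, "cron": 1}`, one pass deletes `cron`, resets `web` to 3 and creates `worker`. A one-shot `terraform apply` does the same comparison, but only when someone runs it.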
And yes, I also enjoy how well deploying works. And how things generally fit together. Liking the networking complexity less so.
Really? For years and years we put our tfstate files into private S3 buckets at $DAYJOB and it seemed to work just fine. We didn't even take pains to ensure that everyone was on the very same version of the Terraform CLI. What problems did you guys run into?
It gives most of the benefits the author mentions (traceability of changes, clearly written down infrastructure), without the complexity of k8s.
Pretty much, almost. I have spent a bunch of time in my career working on "VM + systemd" setups, stuff running on a rack or on an EC2 instance in the cloud, and managed kubernetes is a lot better for me than those cobbled-together messes. There are "easier" setups, but they usually end up costing me a lot more in time and $.
To answer simply, it became good + convenient. I could complain about plenty, and people here like to, but honestly you couldn't pay me to go back to the old way. The one legitimate gripe is that the upgrade schedule is exhausting; on AWS it's about every 6 months before you go into extended support. I also hate being at the mercy of arbitrary decisions like "ok we know a huge chunk of the web going back a decade has architected off our Ingress API, but recently we decided we don't really like that way anymore and we want you to use Gateway API instead, so, um, like ya we know it just killed off one of the most used open source ingress configs (ingress-nginx) but yea trust us bro this is going to be so much better" kind of thing.
I'll admit I'm dreading switching over to the gateway api, but by the time I get forced off ingresses it should be a stable/mature ecosystem. That's still a ways out though.
I don't know anyone still dealing with VMs anymore, except our IT guy who manages a couple of pet servers for random executives from the before times. In the last year k8s has started absorbing executive pet processes and the number of VMs our IT guy manages has dropped by about half.
While I'm here spouting stuff, yeah hiring for k8s is real easy, if our SRE gets hit by a bus, he can be replaced in a week, and we can probably struggle through using opus until that happens. K8s being the lingua franca of git ops IaC makes it real easy for the new guy to parachute in and start working. Every VM thing is going to be totally bespoke and have the personality of the guy who designed it, which is rarely a good thing.
To this date I have not seen a viable drop-in replacement for how I’ve seen big orgs use the ingress controller stack with the Gateway API, and what I understand currently is that InGate is basically DOA.
Even on AWS EKS, you will run into bullshit with their network overlay. Egress policies are a mess (at least half a year ago, you were not able to say something like "allow pod A to egress traffic to service (!) B" despite a service resolving down to an IP address in the end).
And that's before going into the unholy mess that is getting connectivity to and from the external world to your cluster. Cloudfront, ACM certificates, ALB, ALB-EKS integration, Route53, Route53-EKS integration, EFS, EFS-EKS integration, EBS, EBS-EKS integration, RDS, RDS-EKS integration, IAM-EKS integration, SSM, SSM-EKS integration, autoscaling... and if you want more pain and don't already wince, try setting that up across regions or, as I had to do once, across account boundaries.
Kubernetes is powerful. But do not make the mistake of assuming it's easy to get started with, at least on the admin side. Even if you've got prior AWS experience, getting it all integrated into EKS so you don't have to deal with Terraform and helm/k8s for a full deployment of a piece of software will take you an awful lot of time.
For users though? It's a breeze, I will admit as much. Everything down to the firewall rules can be encoded in k8s spec files.
Oh it's not necessary per se, but if you want to host a web service with any sort of state and not have to do stuff in parallel either by hand or by terraform, I'd consider the integrations pretty vital.
It's easy enough (well, it's still addons whose versions you have to keep updated each on their own) once it is set up, but getting to the point where you have something reproducibly running for the first time is annoying as hell.
> do stuff in parallel either by hand or by terraform
…specifically by terraform. Making k8s own the provisioning and management of external infrastructure on principle (as opposed to when that makes sense, e.g. load balancers/gateway/CSI providers) is not a good approach. Sure, it feels unified, but the cost of unification is incredibly not worth it.
Unrelated to the content of the article, this sentence structure is a dead giveaway of LLM writing.
Given the number of moving parts, I would be terrified to have to look under the hood of what Talos deployed for me.
Exactly why I hate CloudFormation, K8S, GitHub Actions, etc. yaml is a terrible format for the knowledge encoded in these artifacts.
My current company makes this claim, but it's not true. They also have serverless apps, and also have some services running directly on EC2.
They just think of the Kubernetes deployments as the "standard" way.
> Second was shared, hireable knowledge. K8s is basically a lingua franca now.
People were demanding experience with Kubernetes, long before it was reasonable to expect it. Everyone added it to their resume, because they had to.
To use it is a whole different question, and not in any way related to job interviews. I have worked in places that are crazy for not using it and others where using it was even crazier.
That makes it a no brainer for me for basically any sized project.
Small project? -> minikube single node deploy it.
Tiny project? -> minimum a docker container
I cringe watching anyone build and run code on a raw machine, even locally, without at least a container. The endless hours of headaches you avoid are obvious; k8s is just the natural extension of this.
I once worked at a bank that was fully on Kubernetes, and the number of problems was out of this world.
Complexities are being added for no reason at all.
I ended up in a different non-SRE role but if you're interested in working on it, please let me know and I'd love to walk you through it.
K8s is a complicated beast. For CTOs hiring for their 10 person company, "it's used everywhere" is a bad reason to adopt a major piece of technology. You can always graduate to it later if need be.
It removes the overhead of a lot of what sysadmins and devs of yesteryear did by hand or had to have a career's worth of experience to do quickly.
That's not to say that people don't need to know what they're getting into when they adopt kubernetes but especially when you're using a managed offering and not on the bleeding edge of what it supports it's pretty easy in terms of overhead and maintenance.
There's a certain type of engineer (maybe 25% of them) who does "hype-driven-development." No matter the technology, they are huge advocates for the technology. The hype may be absolutely real, complete nonsense (e.g. mongodb), or somewhere in between (ai). The vast majority of the time it's hype for a new technology that feels 90% the same from the end-user perspective (react vs vue, docker vs colima, go vs other, whatever vs whatever).
These engineers though, only care about something when it's new and trendy enough to be a differentiator. This is because they don't give any hoots about the actual usefulness of anything, they are just trying to differentiate themselves in a market by leveraging vibes rather than raw competence. I think these types of engineer drove kubernetes for companies that don't need it, but tipped the scales enough that it has critical mass.
The irony being kubernetes is way too heavy/clumsy an abstraction for most companies. The savings of packing pods onto the same node is usually a tiny fraction of the engineers' salaries who are managing it.
The other irony is now that kubernetes isn't the new sexy thing, but a standard tool that AI or a normie can do all the hard work for, the hype driven engineers are off looking for the next thing.
And I do think there is a way to use kubernetes with minimal damage, but it requires making firm rules about not focusing on things that aren't needed yet (e.g. istio) and making firm hiring choices about only people who understand that such optimizations are complete wastes of time for a series A startup.
Right now, I’m one dinosaur managing a startup’s tech portfolio. Everything lives in my head first, then in my break-glass vault for addressing the bus problem. Our public cloud footprint is a single KMS for backups. We have no VMs, everything is a cloud service.
The literal fucking second we have real infrastructure requirements for compute, it’s right to GCE. No ifs, ands, or buts. Here’s our Git Repo, here’s the managed K8s control plane, make it work.
If (or when) we need on-prem compute, we add them to the K8s control plane as worker nodes and taint accordingly.
It’s just so much more interchangeable, even if the learning curve for non-SDEs can be a little steeper than VMs.
Their identified reasons are OK though.
[1] https://youtu.be/Iv9hoYTQp_8?si=5YsUxYayFUY-RfKC
I think what you hear is never the whole story, there is much more going on.