Making my homelab Kubernetes rack-aware
By Daniel Samson · 2026-05-18
I made my homelab Kubernetes cluster rack-aware, and the headline reason isn't surviving a disaster — though I get that for free. It's so I can take an entire rackmount offline for maintenance, deliberately, any afternoon I like, without anything going down.
Maintenance is the real driver
Homelab hardware needs hands-on work far more often than it actually fails: swapping a disk, adding RAM, re-cabling, a firmware or OS update, physically shuffling a unit, blowing the dust out. If powering off one rack takes the whole cluster down with it, that work can only happen in a dreaded maintenance window — or, realistically, never. I wanted to be able to pull a rack on a Tuesday afternoon and have nobody notice.
A failure domain is also a maintenance domain
The neat part is that the property which survives a rack losing power is exactly the same one that survives me switching it off on purpose. A rack is a unit of planned maintenance just as much as it's a unit of failure. Make the cluster spread its replicas across racks and "cordon and drain a whole rack" stops being scary and becomes routine.
Label the topology
Each node carries its rack identifier on two labels — the standard topology.kubernetes.io/zone key that Longhorn and topologySpreadConstraints understand, plus a convenience rack label for eyeballing things:
kubectl label node <node> topology.kubernetes.io/zone=rm-N rack=rm-N --overwrite
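With more than a couple of nodes, the labelling is easier to drive from a small mapping than to type by hand. This is a minimal sketch, assuming bash and a working kubectl context; the function names and the "node rack" mapping format are made up for illustration, not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: label_racks and show_racks are illustrative names.

# Read "<node> <rack>" pairs from stdin and apply both labels to each node.
label_racks() {
  local node rack
  while read -r node rack; do
    # Skip blank lines and "#" comments in the mapping.
    [ -z "$node" ] && continue
    case "$node" in \#*) continue ;; esac
    # </dev/null keeps kubectl from consuming the rest of the mapping.
    kubectl label node "$node" \
      "topology.kubernetes.io/zone=$rack" "rack=$rack" --overwrite </dev/null
  done
}

# List nodes with the zone and rack labels shown as extra columns.
show_racks() {
  kubectl get nodes -L topology.kubernetes.io/zone -L rack
}

# Usage (node names are placeholders):
#   printf 'node-a rm-1\nnode-b rm-2\n' | label_racks
#   show_racks
```

Running show_racks afterwards is a quick way to spot a node that was missed or put in the wrong rack.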
Spread the replicas across racks
With zones in place, Longhorn's zone anti-affinity keeps each volume's replicas in different racks, and topologySpreadConstraints do the same for pods. That's the bit that makes maintenance safe: when I drain a whole rack's nodes, every workload has a copy living in another rack, and storage still has a healthy replica elsewhere. Nothing the rack was running is left with nowhere to go.
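For the pod side, a spread constraint might look roughly like the manifest below. It is a sketch rather than my actual workload: the name demo, the image, and the replica count are placeholders, and it is wrapped in a shell function only so it can be piped straight into kubectl.

```shell
#!/usr/bin/env bash
# Print an example Deployment whose pods are spread across zones (racks).
# "demo", the image, and replicas: 3 are placeholders for illustration.
spread_manifest() {
  cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: demo
      containers:
        - name: demo
          image: nginx:stable
EOF
}

# Usage: spread_manifest | kubectl apply -f -
```

The choice of ScheduleAnyway is deliberate for this use case. As far as I understand the defaults, a cordoned rack still counts as a zone when skew is calculated, so DoNotSchedule with maxSkew: 1 could leave evicted pods Pending instead of letting them double up on the remaining racks. The trade-off is that ScheduleAnyway is only a preference, so it is worth checking the actual placement with kubectl get pods -o wide.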
The maintenance flow
Taking a rack down is now a handful of boring commands:
1. kubectl cordon the rack's nodes so nothing new schedules onto them.
2. kubectl drain them, so workloads reschedule onto the other racks.
3. Power the rack off and do the actual work: disks, RAM, cables, updates, whatever.
4. Power back up, kubectl uncordon, and let Longhorn rebuild any replicas that fell behind while the rack was dark.
No user-facing downtime at any point, because a majority of everything was always live in the racks I left switched on.
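Because every node carries a rack label, the cordon, drain, and uncordon steps can target the whole rack with a label selector instead of naming nodes one by one. A rough sketch, assuming bash; rack_down and rack_up are hypothetical helper names, and the drain flags and 10-minute timeout are common choices rather than anything this setup requires.

```shell
#!/usr/bin/env bash
# Hypothetical helpers built on the rack=<id> node label.

# Cordon and drain every node in one rack, e.g. rack_down rm-2
rack_down() {
  local rack="${1:?usage: rack_down <rack-id>}"
  # Stop new pods from landing on the rack's nodes.
  kubectl cordon -l "rack=$rack" || return 1
  # Evict what is running there. DaemonSet pods are skipped and emptyDir
  # data is discarded; drain refuses to proceed without these flags when
  # such pods are present.
  kubectl drain -l "rack=$rack" \
    --ignore-daemonsets --delete-emptydir-data --timeout=10m
}

# Make the rack schedulable again once it is powered back on.
rack_up() {
  local rack="${1:?usage: rack_up <rack-id>}"
  kubectl uncordon -l "rack=$rack"
}
```

The power-off and the physical work happen between rack_down and rack_up. Note that --delete-emptydir-data really does throw away emptyDir contents, so it is only appropriate when nothing important lives there.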
The caveat
This only works if you have enough racks and enough spare capacity that one rack's workloads actually fit on the others; pulling a rack when the rest are already full just moves the outage. The labels also have to be in place before you need them: mid-reorganisation I had half my workers SchedulingDisabled, and labelling everything up front meant the constraints would take effect the moment nodes came back.
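One cheap way to find out whether the remaining racks really had room is to look for pods stuck in Pending after the drain and before reaching for the power switch. A small sketch, assuming bash; pending_pods is a hypothetical helper name, and it counts every Pending pod in the cluster, including any that were already stuck beforehand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many pods are Pending across all namespaces.
pending_pods() {
  kubectl get pods --all-namespaces \
    --field-selector=status.phase=Pending --no-headers 2>/dev/null \
    | wc -l | tr -d ' '
}

# Usage, after draining a rack:
#   if [ "$(pending_pods)" -gt 0 ]; then
#     echo "some workloads did not fit elsewhere; not powering off yet"
#   fi
```

A non-zero count is a hint rather than proof, since pods can be Pending for reasons unrelated to capacity, but it is a useful prompt to look before the rack goes dark.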
Rack-awareness turned maintenance from an all-or-nothing event I'd put off for months into "I'll take rm-2 down after lunch". That, far more than disaster resilience, is why it was worth doing.