Making my homelab Kubernetes rack-aware

By Daniel Samson, co-authored by Claude Opus 4.7 <noreply@anthropic.com> · 2026-05-18

#devops #homelab #Kubernetes #self-hosting

I made my homelab Kubernetes cluster rack-aware, and the headline reason isn't surviving a disaster — though I get that for free. It's so I can take an entire rackmount offline for maintenance, deliberately, any afternoon I like, without anything going down.

Maintenance is the real driver

Homelab hardware needs hands-on work far more often than it actually fails: swapping a disk, adding RAM, re-cabling, a firmware or OS update, physically shuffling a unit, blowing the dust out. If powering off one rack takes the whole cluster down with it, that work can only happen in a dreaded maintenance window — or, realistically, never. I wanted to be able to pull a rack on a Tuesday afternoon and have nobody notice.

A failure domain is also a maintenance domain

The neat part is that the property that lets the cluster survive a rack losing power is exactly the one that lets it survive me switching that rack off on purpose. A rack is a unit of planned maintenance just as much as it's a unit of failure. Make the cluster spread its replicas across racks and "cordon and drain a whole rack" stops being scary and becomes routine.

Label the topology

Each node carries its rack identifier on two labels — the standard topology.kubernetes.io/zone key that Longhorn and topologySpreadConstraints understand, plus a convenience rack label for eyeballing things:

kubectl label node <node> topology.kubernetes.io/zone=rm-N rack=rm-N --overwrite
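With several nodes per rack, a small loop saves retyping that command. The following is a minimal sketch rather than the exact script used here: label_rack and show_racks are hypothetical helper names, and the node names in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# label_rack RACK NODE... (hypothetical helper, not part of kubectl)
# Applies both labels from the command above to every node listed.
label_rack() {
  local rack="$1"
  shift
  local node
  for node in "$@"; do
    kubectl label node "$node" \
      "topology.kubernetes.io/zone=${rack}" "rack=${rack}" --overwrite
  done
}

# show_racks: list all nodes with the zone and rack labels as extra columns.
# A node that was missed shows up with empty cells.
show_racks() {
  kubectl get nodes -L topology.kubernetes.io/zone,rack
}

# Usage (node names are placeholders):
#   label_rack rm-1 node-a node-b
#   show_racks
```

Running show_racks after labelling is a cheap way to spot a node that was skipped, including ones that are currently SchedulingDisabled.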

Spread the replicas across racks

With zones in place, Longhorn's zone anti-affinity keeps each volume's replicas in different racks, and topologySpreadConstraints do the same for pods. That's the bit that makes maintenance safe: when I drain a whole rack's nodes, every workload has a copy living in another rack, and every volume still has a healthy replica elsewhere. Everything the rack was running has somewhere to go.
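For the pod side, a spread constraint keyed on the zone label might look like the sketch below. The Deployment name web, the app=web label and the nginx image are placeholders for illustration, not workloads from this cluster, and spread_manifest is a hypothetical helper that just prints the manifest.

```shell
#!/usr/bin/env bash
set -euo pipefail

# spread_manifest: print a Deployment whose pods are spread across racks.
# "web", app=web and the nginx image are placeholders.
spread_manifest() {
  cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: nginx
EOF
}

# Usage: spread_manifest | kubectl apply -f -
```

ScheduleAnyway is chosen here on the assumption that an evicted pod should still land somewhere while a rack is drained. With DoNotSchedule, the emptied rack can still count towards the skew, so a rescheduled pod may sit Pending until the rack returns.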

The maintenance flow

Taking a rack down is now a handful of boring commands:

1. kubectl cordon the rack's nodes so nothing new schedules onto them.
2. kubectl drain them, so workloads reschedule onto the other racks.
3. Power the rack off and do the actual work: disks, RAM, cables, updates, whatever.
4. Power back up, kubectl uncordon, and let Longhorn rebuild any replicas that fell behind while the rack was dark.
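The kubectl side of those steps can be sketched as two small functions that select nodes by the rack label set earlier. rack_down and rack_up are hypothetical names, and the drain flags are an assumption about what a typical cluster needs rather than the exact invocation used here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# rack_down RACK: cordon, then drain, every node labelled rack=RACK.
rack_down() {
  local rack="$1"
  kubectl cordon -l "rack=${rack}"
  # --ignore-daemonsets: DaemonSet-managed pods are left in place, not evicted.
  # --delete-emptydir-data: allow evicting pods that use emptyDir scratch space.
  kubectl drain -l "rack=${rack}" --ignore-daemonsets --delete-emptydir-data
}

# rack_up RACK: make the rack's nodes schedulable again after power-on.
rack_up() {
  local rack="$1"
  kubectl uncordon -l "rack=${rack}"
}

# Usage:
#   rack_down rm-2    # then power off and do the work
#   rack_up rm-2      # after the nodes are back and Ready
```

kubectl drain cordons nodes itself, so the separate cordon is only there to mirror the steps as listed. It also guarantees every node in the rack is closed to new pods before the first eviction starts.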

No user-facing downtime at any point, because a majority of everything was always live in the racks I left switched on.

The caveat

This only works if you have enough racks and enough spare capacity for one rack's workloads to fit on the others; pulling a rack when the rest are already full just moves the outage. The labels also have to be in place before you need them: mid-reorganisation, half my workers were SchedulingDisabled, so I labelled everything up front and the constraints took effect the moment nodes came back.
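The capacity condition is simple arithmetic: a rack is safe to pull only if what it runs fits into the headroom left on the other racks. Here is a rough sketch, assuming per-rack totals have been collected by hand (for example from kubectl describe node) into lines of "RACK ALLOCATABLE REQUESTED" in a single unit such as CPU millicores. can_pull and that input format are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# can_pull: read "RACK ALLOCATABLE REQUESTED" lines on stdin (one unit
# throughout, e.g. CPU millicores) and report, per rack, whether its
# requests fit into the spare capacity of the remaining racks.
can_pull() {
  awk '
    NF == 3 {
      alloc[$1] = $2
      req[$1] = $3
      order[++n] = $1
      spare += $2 - $3          # total headroom across all racks
    }
    END {
      for (i = 1; i <= n; i++) {
        r = order[i]
        others = spare - (alloc[r] - req[r])   # headroom outside rack r
        printf "%s %s\n", r, (req[r] <= others ? "fits" : "too-full")
      }
    }'
}

# Usage (numbers are made up):
#   printf 'rm-1 8000 2000\nrm-2 8000 2000\nrm-3 8000 7000\n' | can_pull
```

This is an aggregate check only. It ignores per-node bin-packing and storage, and it needs running again for memory, so a "fits" result is a necessary condition rather than a guarantee.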

Rack-awareness turned maintenance from an all-or-nothing event I'd put off for months into "I'll take rm-2 down after lunch". That, far more than disaster resilience, is why it was worth doing.