Three Replicas, One Node, Zero Availability: Why HA in Kubernetes Is About Placement
April 17, 2026
A team set replicas: 3 on their checkout deployment, watched all three pods come up green, and called the service highly available. Six weeks later, one node suffered a hardware failure at 2am. All three pods were on that node. Checkout was down for nine minutes while Kubernetes detected the failure and recreated the pods elsewhere.
That is the gap. Replica count tells Kubernetes how many copies you want. It says nothing about where they should live. The scheduler's first concern is resource fit, not failure isolation, and whatever spreading it does by default is a soft preference, not a guarantee. Three small pods that all fit comfortably on the same big node can all land on that node. Now your three replicas share a fate.
The controls that fix this come in three layers.
Pod anti-affinity is the first. You declare that pods with a given label should not be co-scheduled on the same node, or the same zone, or the same rack. The common pattern is a preferredDuringSchedulingIgnoredDuringExecution rule with topologyKey: kubernetes.io/hostname. The scheduler will spread replicas across nodes when it can, and fall back gracefully when it cannot. The stricter requiredDuringSchedulingIgnoredDuringExecution variant leaves a pod Pending rather than break the rule, which is what you want for stateful workloads where colocation is a real outage risk.
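As a rough illustration of the soft rule described above, here is a minimal Deployment sketch. The name checkout, the app: checkout label, and the image are placeholders chosen for this example, not details from the incident.

```yaml
# Sketch only: the name, labels, and image below are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      affinity:
        podAntiAffinity:
          # Soft rule: prefer a node that runs no other checkout pod,
          # but still schedule the pod if no such node exists.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: checkout
                topologyKey: kubernetes.io/hostname
      containers:
        - name: checkout
          image: example.com/checkout:1.0  # placeholder image
```

For the strict variant, the same labelSelector and topologyKey go directly under requiredDuringSchedulingIgnoredDuringExecution as list items, without the weight and podAffinityTerm wrapper. The trade-off is that a replica stays Pending when no compliant node is free.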
Topology spread constraints are the second and more modern tool. They let you express the rule directly: "across zones, the number of matching pods in the fullest and emptiest zone should differ by at most one." That is what maxSkew: 1 with topologyKey: topology.kubernetes.io/zone buys. It scales better than anti-affinity when you have many replicas and many zones, because it thinks in terms of distribution rather than pairwise exclusion.
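A minimal sketch of that zone rule might look like the following. The Deployment name, label, image, and replica count are hypothetical, and whenUnsatisfiable: DoNotSchedule is one possible choice made for this example, not something the rule itself dictates.

```yaml
# Sketch only: the name, labels, image, and replica count are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
spec:
  replicas: 6
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      topologySpreadConstraints:
        # Pod counts for app=checkout may differ by at most 1 between zones.
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          # Hard rule: leave the pod Pending rather than exceed the skew.
          # ScheduleAnyway would turn this into a soft preference.
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: checkout
      containers:
        - name: checkout
          image: example.com/checkout:1.0  # placeholder image
```

A second list entry with topologyKey: kubernetes.io/hostname would apply the same kind of rule per node, so one constraint list can cover both zones and nodes.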
PodDisruptionBudgets are the third layer, and they protect what the first two built. A PDB tells Kubernetes the minimum availability you require during voluntary disruptions. A node drain for an upgrade is a voluntary disruption. So is a cluster autoscaler scale-down. Without a PDB, a kubectl drain of two nodes can happily evict five of your six replicas at once, because nothing told the eviction API to slow down. With minAvailable: 2 or maxUnavailable: 1, evictions that would break the budget are refused, and the drain waits until replacement pods are ready elsewhere.
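A budget like the one described could be sketched as follows. The name checkout-pdb and the app: checkout selector are assumptions for illustration, and the selector has to match the labels on the pods you want to protect.

```yaml
# Sketch only: the name and label selector are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: checkout-pdb
spec:
  # Allow at most one matching pod to be down due to voluntary disruption.
  # Use either maxUnavailable or minAvailable in a single PDB, not both.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: checkout
```

maxUnavailable tends to age better than a fixed minAvailable, because it keeps the same meaning when the replica count changes.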
Here is the subtle production failure. PDBs only protect against voluntary disruptions. A node crash is involuntary, so the PDB does nothing. A PDB plus three replicas on one node still gets you a full outage when the node dies. The PDB preserves what the scheduler placed; it does not improve the placement. You need anti-affinity or spread constraints to do the actual spreading, and the PDB to keep the spread intact during planned operations.
Replicas give you redundancy on paper. Placement gives you redundancy in production.
High availability is a property of placement, not replica count. Anti-affinity spreads pods across nodes, topology spread constraints spread across zones, and PodDisruptionBudgets protect that spread during voluntary disruptions.
Originally posted on LinkedIn.