AKS Reporting "Insufficient pods"
When working with Azure Kubernetes Service (AKS), an "Insufficient pods" warning can block your deployments and prevent applications from scaling. It means the Kubernetes scheduler cannot place new pods because the nodes have reached their maximum pod count, a limit that is separate from CPU and memory. This article walks through the root causes, diagnostic commands, and fixes.
Understanding the "Insufficient Pods" Error
Kubernetes assigns pods to nodes through the scheduler. When the scheduler cannot find a suitable node for a pod, the pod stays in the Pending state and the scheduler emits an event describing the constraint that was violated. The "Insufficient pods" event specifically means one or more nodes have reached their maximum pod count.
Each AKS node has a maximum number of pods it can run. This limit (maxPods) is set per node pool when the pool is created, and its default depends on the cluster's networking configuration (Azure CNI vs. kubenet) rather than on the VM size. For example, a Standard_DS2_v2 node using Azure CNI defaults to a maximum of 30 pods, while kubenet defaults to 110 pods per node.
Diagnosing the Problem
Start by checking which pods are stuck in Pending state and reading their events:
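One way to do this, sketched as two small shell helpers. The function names are made up for illustration; the kubectl commands inside them are standard.

```shell
# Hypothetical helper: list every pod in the cluster whose phase is Pending.
list_pending_pods() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Pending
}

# Hypothetical helper: list scheduling-failure events, oldest first.
list_scheduling_failures() {
  kubectl get events --all-namespaces --field-selector=reason=FailedScheduling --sort-by=.lastTimestamp
}
```

Run list_pending_pods first to find the affected pods, then list_scheduling_failures to see why the scheduler rejected them.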
Then inspect a specific pending pod for scheduling failure details:
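A minimal sketch, assuming you already know the pod's namespace and name. The helper name is hypothetical.

```shell
# Hypothetical helper: print the full description of one pod, including
# the Events section at the bottom of the output.
# Usage: describe_pending_pod <namespace> <pod-name>
describe_pending_pod() {
  kubectl describe pod "$2" --namespace "$1"
}
```

For example, describe_pending_pod default web-1 would describe a pod named web-1 in the default namespace (both names are placeholders).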
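Look for events like "0/3 nodes are available: 3 Too many pods." in the output. This confirms the "Insufficient pods" condition (newer Kubernetes versions word the message as "Too many pods", while older ones say "Insufficient pods").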
Next, check how many pods each node is currently running and what the maximum is:
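The following sketch shows one possible approach: read each node's pod capacity from its status, then count the pods assigned to a given node. Helper names are illustrative only.

```shell
# Hypothetical helper: show the pod capacity each node reports.
show_node_pod_capacity() {
  kubectl get nodes -o custom-columns=NODE:.metadata.name,MAX_PODS:.status.allocatable.pods
}

# Hypothetical helper: count the pods currently assigned to one node.
# Completed pods are included, so the number can slightly overstate usage.
# Usage: count_pods_on_node <node-name>
count_pods_on_node() {
  kubectl get pods --all-namespaces --no-headers \
    --field-selector "spec.nodeName=$1" | wc -l | tr -d ' '
}
```

If the count for a node is at or near its MAX_PODS value, that node cannot accept more pods.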
You can also get a comprehensive view of node resource allocation:
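As a sketch, kubectl describe nodes prints this view. The wrapper below is a hypothetical convenience that accepts an optional node name.

```shell
# Hypothetical helper: describe one node, or all nodes if no name is given.
# The "Capacity", "Allocatable", "Non-terminated Pods", and
# "Allocated resources" sections are the ones to read.
show_node_allocation() {
  kubectl describe nodes "$@"
}
```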
Common Causes and Fixes
1. Max Pods Per Node Limit
The most frequent cause is hitting the maxPods limit on your nodes. When you create an AKS cluster or node pool without specifying this value, it defaults to a number determined by the network plugin.
To check the current max pods setting:
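A minimal sketch using the Azure CLI. The helper name and the argument order are assumptions for illustration; the resource group, cluster, and pool names are whatever your environment uses.

```shell
# Hypothetical helper: print the maxPods value configured on one node pool.
# Usage: show_max_pods <resource-group> <cluster-name> <nodepool-name>
show_max_pods() {
  az aks nodepool show \
    --resource-group "$1" \
    --cluster-name "$2" \
    --name "$3" \
    --query maxPods \
    --output tsv
}
```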
If you are using Azure CNI with the default of 30 pods per node, that limit can be reached quickly, especially with system pods (kube-proxy, CoreDNS, etc.) consuming several slots. You can increase this by creating a new node pool with a higher maxPods value:
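For illustration, a new pool could be added along these lines. The helper name is hypothetical, and the max-pods and node-count values you pass are your own choices rather than recommendations.

```shell
# Hypothetical helper: add a node pool with an explicit per-node pod limit.
# Usage: add_nodepool_with_max_pods <resource-group> <cluster-name> <new-pool-name> <max-pods> <node-count>
add_nodepool_with_max_pods() {
  az aks nodepool add \
    --resource-group "$1" \
    --cluster-name "$2" \
    --name "$3" \
    --max-pods "$4" \
    --node-count "$5"
}
```

For example, add_nodepool_with_max_pods myResourceGroup myAKSCluster bigpool 60 3 would request three nodes that each accept up to 60 pods (all names and numbers here are placeholders). With Azure CNI, a higher maxPods also means more subnet IP addresses per node.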
Note that you cannot change maxPods on an existing node pool. You must create a new pool, migrate workloads, and then delete the old pool.
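The migration can be sketched as cordon, drain, then delete. This assumes the old pool's nodes carry the agentpool=<pool-name> label that AKS normally applies, and that the new pool already has enough capacity; the helper name is made up.

```shell
# Hypothetical helper: move workloads off an old node pool and delete it.
# Usage: retire_nodepool <resource-group> <cluster-name> <old-pool-name>
retire_nodepool() {
  # Stop new pods from being scheduled onto the old pool's nodes.
  kubectl cordon --selector "agentpool=$3" || return 1
  # Evict running pods so they reschedule onto the remaining pools.
  kubectl drain --selector "agentpool=$3" --ignore-daemonsets --delete-emptydir-data || return 1
  # Remove the now-empty pool.
  az aks nodepool delete --resource-group "$1" --cluster-name "$2" --name "$3"
}
```

The function stops at the first failing step, so the pool is only deleted if the drain completed.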
2. Cluster Autoscaler Not Configured
If the cluster autoscaler is disabled or misconfigured, AKS will not add nodes when existing ones are full. Enable it on your node pool:
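A sketch of enabling the autoscaler on an existing pool with the Azure CLI. The helper name is hypothetical, and the minimum and maximum node counts are values you choose.

```shell
# Hypothetical helper: turn on the cluster autoscaler for one node pool.
# Usage: enable_autoscaler <resource-group> <cluster-name> <nodepool-name> <min-nodes> <max-nodes>
enable_autoscaler() {
  az aks nodepool update \
    --resource-group "$1" \
    --cluster-name "$2" \
    --name "$3" \
    --enable-cluster-autoscaler \
    --min-count "$4" \
    --max-count "$5"
}
```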
Verify the autoscaler is working by checking its status:
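One way to do this is to read the status ConfigMap that the autoscaler publishes in the kube-system namespace. The helper name below is illustrative.

```shell
# Hypothetical helper: print the cluster autoscaler's self-reported status.
show_autoscaler_status() {
  kubectl get configmap cluster-autoscaler-status --namespace kube-system --output yaml
}
```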
3. Resource Requests Too Large
Even if pod count limits are not reached, overly generous CPU or memory requests can prevent scheduling. Review your deployment specs and right-size resource requests based on actual usage:
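As a sketch, requests can be adjusted on a live Deployment with kubectl set resources instead of editing the manifest. The helper name is hypothetical and the values you pass should come from observed usage.

```shell
# Hypothetical helper: set CPU and memory requests on every container
# in a Deployment. This triggers a rolling update of its pods.
# Usage: set_requests <namespace> <deployment> <cpu-request> <memory-request>
set_requests() {
  kubectl set resources deployment "$2" --namespace "$1" --requests="cpu=$3,memory=$4"
}
```

For example, set_requests default web 100m 128Mi (placeholder names and values). If the Deployment is managed from a manifest or Helm chart, make the same change there so it is not overwritten on the next rollout.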
Use kubectl top pods and metrics from Azure Monitor to determine realistic request values.
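A minimal sketch of the kubectl side, assuming the metrics server is available in the cluster. The helper name is made up.

```shell
# Hypothetical helper: show current pod CPU and memory usage in a
# namespace, highest CPU consumers first.
# Usage: show_pod_usage <namespace>
show_pod_usage() {
  kubectl top pods --namespace "$1" --sort-by=cpu
}
```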
4. Node Taints and Affinity Rules
Taints on nodes or strict affinity rules in pod specs can restrict which nodes a pod can be scheduled on. Check for taints:
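One possible way to list them for every node at once. The helper name is illustrative.

```shell
# Hypothetical helper: print each node's name alongside its taints.
# Nodes without taints show <none> in the TAINTS column.
show_node_taints() {
  kubectl get nodes -o custom-columns=NODE:.metadata.name,TAINTS:.spec.taints
}
```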
If a node has a taint that your pod does not tolerate, the scheduler will skip that node entirely. Either add a matching toleration to your pod spec or remove the taint if it is no longer needed.
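Removing a taint from a single node can be sketched as follows. The helper name is hypothetical; the node name, taint key, and effect are whatever the previous check reported.

```shell
# Hypothetical helper: remove one taint from one node.
# Usage: remove_taint <node-name> <taint-key> <effect>
remove_taint() {
  # The trailing "-" tells kubectl to remove the taint rather than add it.
  kubectl taint nodes "$1" "$2:$3-"
}
```

For example, remove_taint node-1 dedicated NoSchedule (placeholder values). Taints that were defined on the AKS node pool itself may need to be changed on the pool instead, since a node-level change might not survive scaling or upgrades.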
Common Pitfalls
A frequent mistake is assuming that the "Insufficient pods" error is about CPU or memory. It is specifically about the pod count limit per node, which is a separate constraint from compute resources. You can have plenty of CPU and memory available but still hit this error.
Another pitfall is forgetting to account for system pods. DaemonSets like kube-proxy, azure-cni, and monitoring agents each consume a pod slot on every node. On a node with maxPods set to 30, you may only have 24 or 25 slots available for your application workloads.
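The arithmetic is simple enough to sketch in shell. The helper name is made up, and the system pod count of 6 is only an example; check your own nodes for the real number.

```shell
# Hypothetical helper: pod slots left for application workloads on one node.
# Usage: app_pod_slots <max-pods> <system-pods-per-node>
app_pod_slots() {
  echo $(( $1 - $2 ))
}

app_pod_slots 30 6   # prints 24
```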
When using Azure CNI, each pod gets its own IP address from the subnet. If your subnet is too small, you may run out of IP addresses before hitting the pod count limit. Plan your subnet CIDR range to accommodate the maximum number of pods across all nodes.
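A rough sizing sketch, assuming the traditional (non-overlay) Azure CNI model in which every node and every pod takes a subnet IP, plus one spare node for the surge node created during upgrades. The helper name is hypothetical.

```shell
# Hypothetical helper: estimate subnet IPs needed for a node pool.
# Assumes one extra node for upgrade surge; each node needs 1 IP for
# itself plus <max-pods> IPs reserved for its pods.
# Usage: required_subnet_ips <node-count> <max-pods>
required_subnet_ips() {
  echo $(( ($1 + 1) * ($2 + 1) ))
}

required_subnet_ips 10 30   # prints 341
```

Ten nodes at 30 pods each therefore need more than a /24 (256 addresses) can offer, before counting the addresses Azure reserves in each subnet.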
Summary
The "Insufficient pods" error in AKS is typically caused by reaching the maxPods limit on cluster nodes, not by CPU or memory exhaustion. Diagnose with kubectl describe pod and kubectl describe node to confirm the constraint. Fix the issue by creating node pools with higher maxPods values, enabling the cluster autoscaler, right-sizing resource requests, or removing restrictive taints. Always account for system pod overhead when planning your cluster capacity.

