DNS resolution problems in a Kubernetes cluster
Introduction
When DNS resolution fails inside a Kubernetes cluster, the real problem is usually not "DNS in general." It is usually one of a small set of concrete failures: CoreDNS is unhealthy, the test Pod is using the wrong resolver settings, the Service name is wrong for the namespace, or the Service has no endpoints behind it.
Start With a Known-Good Test Pod
The cleanest way to debug cluster DNS is to test from inside the cluster. Launch a utility Pod and query DNS directly:
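A minimal sketch of such a test, assuming kubectl is pointed at the affected cluster and the busybox:1.28 image can be pulled. The wrapper name dns_test and the Pod name dns-test are hypothetical, chosen only for illustration:

```shell
# dns_test is a hypothetical wrapper; the Pod name and image are assumptions.
# It starts a throwaway Pod, runs one lookup, and removes the Pod afterwards.
dns_test() {
  name="${1:-kubernetes.default}"   # default: the API server Service
  kubectl run dns-test --rm -i --restart=Never \
    --image=busybox:1.28 -- nslookup "$name"
}

# Usage:
#   dns_test                              # resolve kubernetes.default
#   dns_test api.demo.svc.cluster.local   # resolve a specific Service
```

kubernetes.default is a convenient first target because that Service exists in every cluster, so a failure there points at DNS rather than at a missing Service.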
If this fails, you know the issue is cluster DNS or Pod DNS configuration, not just one application container.
Confirm the DNS Service Exists
Check that the cluster DNS Service is present:
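One way to do that, assuming the Service is named kube-dns and lives in the kube-system namespace (a common default, though some distributions differ). The function names are hypothetical:

```shell
# Show the DNS Service (name and namespace are assumed defaults).
dns_service() {
  kubectl get service kube-dns --namespace kube-system
}

# Print only its cluster IP, to compare against the nameserver
# that Pods see in /etc/resolv.conf.
dns_cluster_ip() {
  kubectl get service kube-dns --namespace kube-system \
    -o jsonpath='{.spec.clusterIP}'
}
```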
On many clusters the service name is still kube-dns even when the actual deployment is CoreDNS. If the DNS Service is missing, broken, or has the wrong cluster IP, application Pods will not resolve service names reliably.
Check CoreDNS or kube-dns Health
Now inspect the DNS Pods:
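A sketch of that inspection, assuming the DNS Pods carry the label k8s-app=kube-dns in kube-system (true for many CoreDNS installations, but worth confirming on your cluster). The function names are hypothetical:

```shell
# List the DNS Pods with their status, restarts, and node placement.
# The label selector is an assumption; adjust it if your Pods are labeled differently.
dns_pods() {
  kubectl get pods --namespace kube-system --selector k8s-app=kube-dns -o wide
}

# Show recent log lines from the DNS Pods (default: last 50 lines per Pod).
dns_logs() {
  kubectl logs --namespace kube-system --selector k8s-app=kube-dns \
    --tail="${1:-50}"
}

# Usage:
#   dns_pods
#   dns_logs 200
```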
If the Pods are crash-looping, not ready, or logging upstream resolver errors, the problem is on the DNS service side rather than on the workload side.
You should also verify that the DNS Service has endpoints:
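A possible check, again assuming the Service is kube-dns in kube-system. The helper names are hypothetical, and the jsonpath expression reads only the ready addresses of the Endpoints object:

```shell
# Print the ready endpoint IPs behind the DNS Service (empty if there are none).
dns_endpoint_ips() {
  kubectl get endpoints kube-dns --namespace kube-system \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Report whether anything is answering behind the Service.
# Returns non-zero when the endpoint list is empty.
dns_has_endpoints() {
  ips="$(dns_endpoint_ips)"
  if [ -n "$ips" ]; then
    echo "kube-dns endpoints: $ips"
  else
    echo "kube-dns has no ready endpoints"
    return 1
  fi
}
```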
No endpoints means the service name exists, but nothing is actually answering the queries.
Inspect Pod Resolver Settings
Inside a failing Pod, inspect /etc/resolv.conf:
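A sketch of how to read it from outside the Pod, assuming the container image ships cat. The Pod name my-app-pod and namespace demo in the usage line are placeholders, and both function names are hypothetical:

```shell
# Print a Pod's resolver configuration.
# $1 = Pod name, $2 = namespace (defaults to "default").
pod_resolv_conf() {
  kubectl exec "$1" --namespace "${2:-default}" -- cat /etc/resolv.conf
}

# Extract the nameserver addresses from resolv.conf content on stdin.
nameservers() {
  awk '$1 == "nameserver" { print $2 }'
}

# Usage (placeholder names):
#   pod_resolv_conf my-app-pod demo
#   pod_resolv_conf my-app-pod demo | nameservers
```

The address printed by the second command should normally match the cluster IP of the cluster DNS Service.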
You usually expect to see:
- a cluster DNS nameserver
- a search path containing the Pod namespace and svc
- an option such as ndots:5
If the Pod points at the wrong nameserver, its queries never reach cluster DNS at all.
Confirm the Name Format
Sometimes DNS is healthy and the application is just asking for the wrong name. Remember the common patterns:
- same namespace: api
- different namespace: api.demo
- full in-cluster name: api.demo.svc.cluster.local
So from a Pod in namespace tools, this command is a better test than nslookup api:
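A minimal sketch of that test, assuming the default cluster domain cluster.local and the same throwaway busybox:1.28 Pod approach used for basic testing. The function names are hypothetical:

```shell
# Build the fully qualified in-cluster name of a Service.
# $1 = Service, $2 = namespace, $3 = cluster domain (assumed cluster.local).
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Resolve Service $1 in namespace $2 from a throwaway Pod in namespace $3.
lookup_from() {
  kubectl run dns-test --namespace "$3" --rm -i --restart=Never \
    --image=busybox:1.28 -- nslookup "$(svc_fqdn "$1" "$2")"
}

# Usage: query api.demo.svc.cluster.local from namespace tools
#   lookup_from api demo tools
```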
If the short name fails but the fully qualified name works, the issue is naming scope rather than broken DNS.
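The scoping follows from how the search path is applied. The sketch below prints the names a resolver would try for a short name. It is a simplified model, not the resolver's actual code: it assumes the name has fewer dots than the ndots value, so every search domain is tried before the bare name (glibc-style behavior; other resolver libraries may differ). The function name and the search path shown are illustrative:

```shell
# Print the candidate names for a short name, in the order they are tried.
# $1 = name as typed, remaining arguments = search domains from resolv.conf.
search_candidates() {
  name="$1"
  shift
  for domain in "$@"; do
    printf '%s.%s\n' "$name" "$domain"
  done
  printf '%s\n' "$name"   # the bare name is tried last
}

# Usage, with the search path a Pod in namespace tools would typically have:
#   search_candidates api tools.svc.cluster.local svc.cluster.local cluster.local
# prints:
#   api.tools.svc.cluster.local
#   api.svc.cluster.local
#   api.cluster.local
#   api
```

None of those candidates is api.demo.svc.cluster.local, which is why the short name fails from tools. With api.demo as the input, the second candidate is api.demo.svc.cluster.local, so the namespace-qualified name resolves.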
Service and Endpoint Problems Can Look Like DNS Problems
A common trap is resolving the name successfully but still failing to connect. That usually means:
- the Service exists
- DNS resolution is fine
- the Service has no ready endpoints
Check:
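A sketch of the checks, using a Service named api in namespace demo as a placeholder. The function names are hypothetical:

```shell
# List the endpoint addresses behind a Service.
# $1 = Service name, $2 = namespace.
service_endpoints() {
  kubectl get endpoints "$1" --namespace "$2"
}

# Print the Service's label selector, to compare with the Pod labels.
service_selector() {
  kubectl get service "$1" --namespace "$2" -o jsonpath='{.spec.selector}'
}

# Show the Pods in a namespace with their labels and readiness.
namespace_pods() {
  kubectl get pods --namespace "$1" --show-labels
}

# Usage:
#   service_endpoints api demo
#   service_selector api demo
#   namespace_pods demo
```

If the selector does not match any Pod labels, or the matching Pods are not ready, the endpoint list stays empty.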
If endpoints are empty, fix the Service selector or Pod readiness first. Do not keep debugging DNS after name resolution already works.
Cluster-Level Known Issues
A few environment problems show up repeatedly:
- forwarding loops caused by the node's resolver configuration
- too many nameservers in the node resolv.conf
- broken upstream DNS for external lookups
- older container images with resolver limitations
These issues matter because cluster DNS often forwards non-cluster queries upstream. A cluster can resolve internal names correctly while failing on external names, or the reverse.
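To see where external queries are forwarded, and whether internal and external lookups behave differently, something like the following may help. It assumes CoreDNS is configured through a ConfigMap named coredns in kube-system with a Corefile key (the layout many installers use, but not universal), and it uses example.com as an arbitrary external name. The function names are hypothetical:

```shell
# Print the CoreDNS configuration (ConfigMap name and key are assumptions).
coredns_corefile() {
  kubectl get configmap coredns --namespace kube-system \
    -o jsonpath='{.data.Corefile}'
}

# Show only the forward line(s), which name the upstream resolvers.
coredns_upstreams() {
  coredns_corefile | grep 'forward'
}

# Run one internal and one external lookup from the same throwaway Pod.
internal_vs_external() {
  kubectl run dns-test --rm -i --restart=Never --image=busybox:1.28 -- \
    sh -c 'nslookup kubernetes.default; nslookup example.com'
}
```

A forward line that points at /etc/resolv.conf means CoreDNS inherits the node's resolver settings, so problems in the node configuration surface as failures on external names.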
A Practical Debug Order
Use a narrow sequence instead of random commands:
- test from a utility Pod
- check the kube-dns Service and endpoints
- inspect CoreDNS Pod health and logs
- inspect Pod resolv.conf
- test short name and fully qualified name
- verify the target Service has endpoints
That order usually narrows the problem quickly.
Common Pitfalls
The most common mistake is debugging the application before confirming that kubernetes.default resolves from a test Pod. If basic cluster DNS is broken, application-specific debugging is wasted effort.
Another issue is confusing connection failures with DNS failures. A name can resolve correctly while the Service still has no ready endpoints.
People also test only the short service name from the wrong namespace and conclude that cluster DNS is broken. Cross-namespace callers need a namespace-qualified name or the full in-cluster name.
Summary
- Debug Kubernetes DNS from inside the cluster with a test Pod.
- Confirm the kube-dns Service, endpoints, and CoreDNS Pods are healthy.
- Inspect Pod resolv.conf to verify the cluster nameserver and search path.
- Test fully qualified service names when namespace scope is unclear.
- If resolution works but connections fail, check Service endpoints rather than DNS.

