DNS Resolution Problems in a Kubernetes Cluster

Introduction

When DNS resolution fails inside a Kubernetes cluster, the cause is rarely "DNS in general." It is usually one of a small set of concrete failures: CoreDNS is unhealthy, the Pod has the wrong resolver settings, the Service name is wrong for the caller's namespace, or the Service has no endpoints behind it.

Start With a Known-Good Test Pod

The cleanest way to debug cluster DNS is to test from inside the cluster. Launch a utility Pod and query DNS directly:

bash
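# Start a throwaway Pod with DNS tools, wait until it is ready, then resolve the API Service name
kubectl run dnsutils --image=registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3 --restart=Never -- sleep 3600
kubectl wait --for=condition=Ready pod/dnsutils --timeout=60s
kubectl exec -it dnsutils -- nslookup kubernetes.default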

If this fails, you know the issue is cluster DNS or Pod DNS configuration, not just one application container.

Confirm the DNS Service Exists

Check that the cluster DNS Service is present:

bash
kubectl get svc -n kube-system

On many clusters the Service is still named kube-dns even when the Deployment behind it is CoreDNS. If the DNS Service is missing, or its cluster IP does not match the nameserver that the kubelet writes into Pods, application Pods will not resolve Service names reliably.

Check CoreDNS or kube-dns Health

Now inspect the DNS Pods:

bash
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl logs -n kube-system -l k8s-app=kube-dns

If the Pods are crash-looping, not ready, or logging upstream resolver errors, the problem is on the DNS service side rather than on the workload side.
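
As a rough sketch of the readiness check described above, the helper below filters the default kubectl get pods table and prints only the Pods that are not fully ready. The function name unready_pods is hypothetical, and the parsing assumes the standard NAME / READY / STATUS column order of the --no-headers output.

```shell
# Print the name of every Pod that is not fully ready.
# Reads `kubectl get pods --no-headers` output on stdin:
#   NAME  READY  STATUS  RESTARTS  AGE
unready_pods() {
  awk 'NF >= 3 {
    split($2, r, "/")   # READY column looks like "1/1"
    if (r[1] != r[2] || $3 != "Running") print $1
  }'
}

# Possible usage (requires cluster access):
#   kubectl get pods -n kube-system -l k8s-app=kube-dns --no-headers | unready_pods
```

Empty output means every DNS Pod reports Running with all containers ready. Any name printed is a candidate for kubectl describe and a closer look at its logs.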

You should also verify that the DNS Service has endpoints:

bash
kubectl get endpointslice -n kube-system -l kubernetes.io/service-name=kube-dns

No endpoints means the service name exists, but nothing is actually answering the queries.

Inspect Pod Resolver Settings

Inspect /etc/resolv.conf inside the Pod that fails to resolve names (the test Pod is used here):

bash
kubectl exec -it dnsutils -- cat /etc/resolv.conf

You usually expect to see:

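  • a nameserver line pointing at cluster DNS (normally the cluster IP of the kube-dns Service)
  • a search path that starts with <namespace>.svc.cluster.local, followed by svc.cluster.local and cluster.local (on a cluster using the default cluster.local domain)
  • an option such as ndots:5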

If the nameserver does not point at cluster DNS (for example because the Pod uses dnsPolicy: Default, or hostNetwork without ClusterFirstWithHostNet), its queries never reach CoreDNS.
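
The three expectations above can be checked mechanically. The following is a minimal sketch, not a definitive validator: check_pod_resolv is a hypothetical helper name, the expected nameserver IP is supplied by the caller, and treating a missing ndots option as a finding is a choice made for this example.

```shell
# Check resolv.conf content (stdin) against the expected cluster DNS IP ($1).
# Prints one line per finding and returns non-zero if anything is missing.
check_pod_resolv() {
  awk -v want="$1" '
    $1 == "nameserver" && $2 == want { ns = 1 }
    $1 == "search" { for (i = 2; i <= NF; i++) if ($i ~ /^svc\./) svc = 1 }
    $1 == "options" && /ndots:/ { nd = 1 }
    END {
      if (!ns)  print "nameserver " want " not found"
      if (!svc) print "no svc.<cluster-domain> entry in search path"
      if (!nd)  print "no ndots option set"
      exit !(ns && svc && nd)
    }'
}

# Possible usage (assumes the DNS Service is named kube-dns):
#   dns_ip=$(kubectl get svc -n kube-system kube-dns -o jsonpath='{.spec.clusterIP}')
#   kubectl exec dnsutils -- cat /etc/resolv.conf | check_pod_resolv "$dns_ip"
```

On clusters that run a node-local DNS cache, the expected nameserver may be the cache address rather than the Service cluster IP, so pass whichever address Pods are meant to use.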

Confirm the Name Format

Sometimes DNS is healthy and the application is just asking for the wrong name. Remember the common patterns:

  • same namespace: api
  • different namespace: api.demo
  • full in-cluster name: api.demo.svc.cluster.local

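From a Pod in namespace tools, nslookup api only searches the tools namespace and will not find a Service that lives in demo. The fully qualified name is the better test (this assumes a dnsutils Pod is also running in tools):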

bash
kubectl exec -n tools dnsutils -- nslookup api.demo.svc.cluster.local

If the short name fails but the fully qualified name works, the issue is naming scope rather than broken DNS.
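
To avoid typos when assembling the full name by hand, a tiny helper can build it from its parts. This is an illustrative sketch: svc_fqdn is a hypothetical name, and cluster.local is assumed as the cluster domain unless a third argument overrides it.

```shell
# Build the fully qualified in-cluster name for a Service.
# $1 = Service name, $2 = namespace, $3 = cluster domain (assumed default: cluster.local)
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Possible usage:
#   kubectl exec -n tools dnsutils -- nslookup "$(svc_fqdn api demo)"
```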

Service and Endpoint Problems Can Look Like DNS Problems

A common trap is resolving the name successfully but still failing to connect. That usually means:

  • the Service exists
  • DNS resolution is fine
  • the Service has no ready endpoints

Check:

bash
kubectl get svc -n demo api
kubectl get endpointslice -n demo -l kubernetes.io/service-name=api
kubectl get pods -n demo -l app=api

If endpoints are empty, fix the Service selector or Pod readiness first. Do not keep debugging DNS after name resolution already works.
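
One way to turn "has ready endpoints" into a number is to print the readiness flag of each endpoint and count the true values. The sketch below assumes the EndpointSlices carry the kubernetes.io/service-name label that Kubernetes sets on them. count_ready is a hypothetical helper that counts only explicit true values, so endpoints with an unset readiness condition are not counted.

```shell
# Count ready endpoints. Reads one "true"/"false" readiness value per line on stdin.
count_ready() {
  awk '$1 == "true" { n++ } END { print n + 0 }'
}

# Possible usage (demo and api are the example namespace and Service from this article):
#   kubectl get endpointslice -n demo -l kubernetes.io/service-name=api \
#     -o jsonpath='{range .items[*].endpoints[*]}{.conditions.ready}{"\n"}{end}' | count_ready
```

A result of 0 points at the Service selector or Pod readiness rather than at DNS.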

Cluster-Level Known Issues

A few environment problems show up repeatedly:

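  • forwarding loops, where the node resolv.conf points at a local stub resolver (such as 127.0.0.53) and CoreDNS ends up forwarding queries back to itself
  • more than three nameservers in the node resolv.conf (the libc resolver only uses the first three)
  • broken upstream DNS for external lookups
  • older container images with resolver limitations (for example, older Alpine images whose musl resolver behaves differently from glibc)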

These issues matter because cluster DNS often forwards non-cluster queries upstream. A cluster can resolve internal names correctly while failing on external names, or the reverse.
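
The first two items in that list can be spotted by reading the resolv.conf that the node hands to Pods. The helper below is a rough sketch under stated assumptions: check_node_resolv is a hypothetical name, a loopback nameserver is treated as a possible loop rather than a confirmed one, and the limit of three nameservers reflects the usual libc resolver limit.

```shell
# Flag node resolv.conf content (stdin) that commonly causes cluster DNS trouble.
check_node_resolv() {
  awk '
    $1 == "nameserver" {
      count++
      if ($2 ~ /^127\./ || $2 == "::1")
        print "loopback nameserver " $2 " (possible forwarding loop)"
    }
    END {
      if (count > 3) print count " nameservers listed (resolver uses at most 3)"
    }'
}

# Possible usage, run on the node itself:
#   check_node_resolv < /etc/resolv.conf
```

Which file matters depends on the kubelet's resolvConf setting. On nodes using systemd-resolved, the kubelet is often pointed at /run/systemd/resolve/resolv.conf instead of /etc/resolv.conf.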

A Practical Debug Order

Use a narrow sequence instead of random commands:

  1. test from a utility Pod
  2. check kube-dns service and endpoints
  3. inspect CoreDNS Pod health and logs
  4. inspect Pod resolv.conf
  5. test short name and fully qualified name
  6. verify the target Service has endpoints

That order usually narrows the problem quickly.
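
The sequence above could be wrapped in a small function so the checks always run in the same order. This is only a sketch: dns_debug is a hypothetical name, the dnsutils test Pod is assumed to exist already, the DNS Service is assumed to be named kube-dns, and demo and api stand in for the real target namespace and Service.

```shell
# Run the checks in the order listed above.
# KUBECTL may be overridden, e.g. KUBECTL=echo for a dry run that only prints the arguments.
# $1 = target namespace, $2 = target Service (hypothetical defaults: demo, api)
dns_debug() (
  k=${KUBECTL:-kubectl}
  ns=${1:-demo}
  svc=${2:-api}
  $k exec dnsutils -- nslookup kubernetes.default                               # step 1
  $k get svc -n kube-system kube-dns                                            # step 2
  $k get endpointslice -n kube-system -l kubernetes.io/service-name=kube-dns    # step 2
  $k get pods -n kube-system -l k8s-app=kube-dns                                # step 3
  $k logs -n kube-system -l k8s-app=kube-dns --tail=20                          # step 3
  $k exec dnsutils -- cat /etc/resolv.conf                                      # step 4
  $k exec dnsutils -- nslookup "$svc"                                           # step 5
  $k exec dnsutils -- nslookup "$svc.$ns.svc.cluster.local"                     # step 5
  $k get endpointslice -n "$ns" -l "kubernetes.io/service-name=$svc"            # step 6
)

# Possible usage:
#   dns_debug demo api
```

The function body runs in a subshell so its variables do not leak into the calling shell. A failing step does not stop the later ones, which keeps the full picture visible in one pass.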

Common Pitfalls

The most common mistake is debugging the application before confirming that kubernetes.default resolves from a test Pod. If basic cluster DNS is broken, application-specific debugging is wasted effort.

Another issue is confusing connection failures with DNS failures. A name can resolve correctly while the Service still has no ready endpoints.

People also test only the short service name from the wrong namespace and conclude that cluster DNS is broken. Cross-namespace callers need a namespace-qualified name or the full in-cluster name.

Summary

  • Debug Kubernetes DNS from inside the cluster with a test Pod.
  • Confirm the kube-dns service, endpoints, and CoreDNS Pods are healthy.
  • Inspect Pod resolv.conf to verify the cluster nameserver and search path.
  • Test fully qualified service names when namespace scope is unclear.
  • If resolution works but connections fail, check Service endpoints rather than DNS.
