Debugging DNS resolution in Kubernetes

Introduction

When DNS breaks in Kubernetes, the symptom is usually simple: a Pod cannot resolve a service name or an external host. The fix is rarely one command, because Kubernetes DNS depends on several layers working together: the Pod configuration, the cluster DNS service, network connectivity, and the target record itself.

Start from inside a Pod

The first question is whether the failure is visible from the workload's point of view. Launch a temporary Pod, ideally in the same namespace as the affected workload (add -n my-namespace to the command), and test resolution there.

bash
kubectl run dns-debug --rm -it --image=busybox:1.36 --restart=Never -- sh

Inside the shell:

bash
nslookup kubernetes.default.svc.cluster.local
nslookup my-service.my-namespace.svc.cluster.local
nslookup google.com
cat /etc/resolv.conf

This tells you:

  • whether cluster-internal names resolve
  • whether external names resolve
  • which nameserver and search domains the Pod is using

Without this baseline, it is hard to know whether the problem is specific to one application or to the cluster DNS path more broadly.
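The same baseline can be collected without an interactive shell. The following is a minimal sketch, assuming kubectl is configured for the cluster and the busybox image can be pulled; dns_baseline and the dns-probe-N Pod names are hypothetical and only serve the illustration.

```shell
# dns_baseline NAMESPACE NAME...
# Runs one throwaway Pod per name in the given namespace and prints what
# nslookup reports. The helper name and Pod names are illustrative.
dns_baseline() {
  ns="$1"
  shift
  i=0
  for name in "$@"; do
    i=$((i + 1))
    echo "== $name"
    # --rm needs an attached Pod; --attach avoids requiring a TTY.
    kubectl run "dns-probe-$i" -n "$ns" --rm --attach --quiet \
      --restart=Never --image=busybox:1.36 -- nslookup "$name" 2>&1 ||
      echo "   (probe exited non-zero)"
  done
}
```

For example, `dns_baseline my-namespace kubernetes.default.svc.cluster.local google.com` prints one section per name, which makes it easy to compare an internal and an external lookup side by side.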

Understand what /etc/resolv.conf should look like

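A Pod using the default cluster DNS policy usually gets a resolver file containing:

  • the cluster DNS service IP as nameserver
  • search suffixes such as <namespace>.svc.cluster.local and svc.cluster.local
  • an ndots:5 option, which makes the resolver try the search suffixes first for any name with fewer than five dots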

Typical example:

text
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

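The nameserver normally matches the ClusterIP of the cluster DNS Service (commonly kube-dns in kube-system). If it is missing or points somewhere unexpected, the Pod may not be using cluster DNS the way you think it is.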
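To see what the search list and ndots mean in practice, the sketch below prints the names a resolver would try for a given input, in order. It assumes the glibc-style ordering; other resolver libraries (musl, for instance) differ in the details. search_candidates is a hypothetical helper, not a standard tool.

```shell
# search_candidates NAME [RESOLV_CONF]
# Prints the fully qualified names a glibc-style resolver would try, in order.
# Hypothetical helper for illustration only.
search_candidates() {
  name="$1"
  conf="${2:-/etc/resolv.conf}"
  # The last "search" line wins; ndots defaults to 1 when not set.
  search=$(awk '$1 == "search" { $1 = ""; s = $0 } END { print s }' "$conf")
  ndots=$(sed -n 's/^options.*ndots:\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
  ndots="${ndots:-1}"
  case "$name" in
    *.) echo "$name"; return 0 ;;   # trailing dot: already fully qualified
  esac
  dots=$(printf '%s\n' "$name" | awk -F. '{ print NF - 1 }')
  # Enough dots: the name itself is tried before the search list.
  if [ "$dots" -ge "$ndots" ]; then echo "$name."; fi
  for suffix in $search; do echo "$name.$suffix."; done
  # Too few dots: the name itself is tried only after the search list.
  if [ "$dots" -lt "$ndots" ]; then echo "$name."; fi
}
```

With the typical file shown above, `search_candidates google.com` lists three cluster-suffixed names before `google.com.` itself. That is why a short external name produces several failed lookups before the real one, and why a trailing dot skips the search list entirely.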

Check CoreDNS or kube-dns

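Most modern clusters use CoreDNS, which typically keeps the k8s-app=kube-dns label and the kube-dns Service name for backward compatibility. Verify that the DNS Pods are healthy: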

bash
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get svc -n kube-system

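Then inspect logs. The first command reads from a single Pod of the Deployment; the second selects every replica by label:

bash
kubectl logs -n kube-system deployment/coredns
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=100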

If the CoreDNS Pods are crash-looping, throttled, or logging upstream failures, the problem may have nothing to do with the application Pod itself.
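Crash loops show up as a climbing restart count. As a rough sketch, assuming the DNS Pods carry the k8s-app=kube-dns label and run a single container, the hypothetical helper below lists only the Pods that have restarted:

```shell
# coredns_restarts
# Prints DNS Pods whose first container has restarted at least once.
# Hypothetical helper; assumes the k8s-app=kube-dns label is in use.
coredns_restarts() {
  kubectl get pods -n kube-system -l k8s-app=kube-dns \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\n"}{end}' |
    awk -F '\t' '$2 > 0 { print $1 " restarted " $2 " times" }'
}
```

No output means no restarts were recorded. If a Pod is listed, `kubectl logs --previous` on that Pod shows the logs of the container instance that crashed.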

Test service resolution and endpoints together

If a service name resolves but traffic still fails, DNS may not be the real issue. Check whether the Service actually has endpoints.

bash
kubectl get svc my-service -n my-namespace
kubectl get endpoints my-service -n my-namespace
kubectl get endpointslice -n my-namespace

A DNS lookup can succeed while the Service still has no backing Pods. In that case, the resolver is fine and the application wiring is broken elsewhere.
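A quick way to tell the two cases apart is to ask for the ready addresses directly. This is a minimal sketch built on the Endpoints object queried above; has_endpoints is a hypothetical name.

```shell
# has_endpoints SERVICE NAMESPACE
# Prints the ready endpoint IPs, or reports that none exist and returns 1.
# Hypothetical helper for illustration.
has_endpoints() {
  ips=$(kubectl get endpoints "$1" -n "$2" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  if [ -n "$ips" ]; then
    echo "ready: $ips"
  else
    echo "no ready endpoints"
    return 1
  fi
}
```

When `has_endpoints my-service my-namespace` reports nothing ready, compare the Service selector with the labels on the Pods you expect to back it, and check that those Pods pass their readiness probes.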

Look for network policies and DNS port blocks

Cluster DNS queries go to port 53, normally over UDP, with TCP used for large or truncated responses. A restrictive NetworkPolicy can block those requests.

Check policies in the namespace:

bash
kubectl get networkpolicy -n my-namespace

If egress is locked down, make sure the workload can reach the cluster DNS service on port 53 over both UDP and TCP.

This is a common cause when one namespace has DNS issues and another does not.
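One possible shape for such an allowance is sketched below. It assumes the DNS Pods run in kube-system with the k8s-app=kube-dns label and that namespaces carry the automatic kubernetes.io/metadata.name label; clusters using a node-local DNS cache or a different DNS deployment need different selectors. The helper name dns_egress_policy and the policy name allow-dns-egress are hypothetical.

```shell
# dns_egress_policy NAMESPACE
# Prints a NetworkPolicy manifest that allows DNS egress for every Pod in the
# namespace. Selectors are assumptions; adjust them to the cluster.
dns_egress_policy() {
  ns="$1"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: $ns
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF
}
```

Review the output first with `dns_egress_policy my-namespace | kubectl apply --dry-run=client -f -` before applying it for real. NetworkPolicies are additive, so this rule widens egress only for DNS and leaves other restrictions in place.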

Check Pod DNS settings

Pods can override the default DNS behavior through fields such as:

  • dnsPolicy
  • dnsConfig

For example:

yaml
spec:
  dnsPolicy: ClusterFirst

If a Pod uses dnsPolicy: Default, it inherits the name-resolution configuration of the node it runs on instead of the normal cluster-first resolver configuration, so cluster-internal service names generally stop resolving.

That can be correct for special workloads, but it is also a frequent cause of confusion when copied from another manifest without understanding the tradeoff.
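To find Pods that have opted out, it can help to list every Pod whose policy is not ClusterFirst. A small sketch, with non_cluster_dns_pods as a hypothetical helper name:

```shell
# non_cluster_dns_pods NAMESPACE
# Prints Pods whose dnsPolicy differs from ClusterFirst.
# Hypothetical helper for illustration.
non_cluster_dns_pods() {
  kubectl get pods -n "$1" --no-headers \
    -o custom-columns=NAME:.metadata.name,DNS_POLICY:.spec.dnsPolicy |
    awk '$2 != "ClusterFirst" { print $1 ": " $2 }'
}
```

An empty result for `non_cluster_dns_pods my-namespace` means every Pod uses the cluster-first policy. Any Pod that does appear is worth a second look, along with its dnsConfig block if it has one.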

Verify external resolution separately

If internal names resolve but external names do not, the cluster DNS service may be healthy while its upstream forwarding path is not.

From the test Pod:

bash
nslookup kubernetes.default.svc.cluster.local
nslookup example.com

If only the second query fails, inspect CoreDNS configuration and upstream reachability rather than the service-discovery side of Kubernetes.
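In CoreDNS, upstream forwarding is usually configured by a forward directive in the Corefile. The sketch below extracts the upstream targets from a Corefile read on standard input. It assumes the configuration lives in a ConfigMap named coredns with a Corefile key, which is common but not universal, especially on managed clusters; corefile_forwarders is a hypothetical helper.

```shell
# corefile_forwarders
# Reads a Corefile on stdin and prints the upstream targets of each
# "forward FROM TO..." directive. Hypothetical helper for illustration.
corefile_forwarders() {
  awk '$1 == "forward" { for (i = 3; i <= NF; i++) if ($i != "{") print $i }'
}
```

A possible invocation is `kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}' | corefile_forwarders`. A result of /etc/resolv.conf means CoreDNS forwards to the resolvers listed in its own Pod's resolver file, which is typically inherited from the node, so the next check is whether the node itself can resolve external names.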

Debug one layer at a time

A good DNS troubleshooting flow is:

  1. confirm the failure from inside a Pod
  2. inspect /etc/resolv.conf
  3. verify CoreDNS health and logs
  4. test the Service and its endpoints
  5. check namespace network policies
  6. review Pod DNS overrides
  7. test external resolution separately to isolate upstream forwarding

That order keeps you from guessing too early.
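The first step already narrows the search considerably. As a rough triage sketch, the hypothetical helper below maps the outcome of one internal and one external lookup to the layers discussed in this article; it is a starting point under those assumptions, not a diagnosis.

```shell
# suspect_layer INTERNAL EXTERNAL
# Each argument is "ok" or "fail": the result of resolving a cluster-internal
# name and an external name from inside the affected Pod.
# Hypothetical helper for illustration.
suspect_layer() {
  case "$1/$2" in
    ok/ok)     echo "DNS resolves; check Service endpoints and application wiring" ;;
    ok/fail)   echo "check CoreDNS upstream forwarding" ;;
    fail/ok)   echo "check dnsPolicy/dnsConfig and the Service name and namespace" ;;
    fail/fail) echo "check /etc/resolv.conf, CoreDNS health, and NetworkPolicy egress on port 53" ;;
    *)         echo "usage: suspect_layer ok|fail ok|fail" >&2; return 2 ;;
  esac
}
```

For instance, `suspect_layer ok fail` points at upstream forwarding, matching the case where only external names fail.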

Common Pitfalls

The biggest mistake is assuming every "cannot reach service" issue is a DNS problem. If the service resolves but has no endpoints, DNS is not the root cause.

Another issue is testing only from your workstation or from a node rather than from inside an affected Pod. Kubernetes DNS behavior is defined at the Pod level, so the Pod perspective matters most.

Developers also forget about dnsPolicy and dnsConfig. A copied manifest can silently opt out of the normal cluster DNS behavior.

Finally, restrictive NetworkPolicy rules often block DNS traffic in ways that look like resolver failure. Always check whether the Pod can actually talk to the DNS service.

Summary

  • Start DNS debugging from inside an affected Pod.
  • Inspect /etc/resolv.conf to verify the nameserver and search domains.
  • Check CoreDNS health and logs in kube-system.
  • Validate Service endpoints so you do not confuse service wiring with DNS resolution.
  • Review network policies and Pod DNS settings when the failure is namespace-specific.
