Debugging DNS resolution in Kubernetes

Introduction

When DNS breaks in Kubernetes, the symptom is usually simple: a Pod cannot resolve a service name or an external host. The fix is rarely one command, because Kubernetes DNS depends on several layers working together: the Pod configuration, the cluster DNS service, network connectivity, and the target record itself.

Start from inside a Pod

The first question is whether the failure is visible from the workload's point of view. Launch a temporary Pod and test resolution there.

bash

kubectl run dns-debug --rm -it --image=busybox:1.36 --restart=Never -- sh

Inside the shell:

bash

nslookup kubernetes.default.svc.cluster.local
nslookup my-service.my-namespace.svc.cluster.local
nslookup google.com
cat /etc/resolv.conf

This tells you:

whether cluster-internal names resolve
whether external names resolve
which nameserver and search domains the Pod is using

Without this baseline, it is hard to know whether the problem is specific to one application or to the cluster DNS path more broadly.
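To make the baseline repeatable, the lookups above can be wrapped in a small helper. This is a minimal sketch rather than part of the original procedure: `check_names` is a hypothetical function name, and it assumes the `nslookup` in your image exits non-zero when a lookup fails, which is worth verifying for the image you use.

```shell
# Resolve each name given as an argument and print one OK/FAIL line per name.
# Assumption: nslookup returns a non-zero exit status when resolution fails.
check_names() (
  failed=0
  for name in "$@"; do
    if nslookup "$name" >/dev/null 2>&1; then
      echo "OK   $name"
    else
      echo "FAIL $name"
      failed=1
    fi
  done
  # Exit status is 1 if any lookup failed, 0 otherwise.
  exit "$failed"
)

# Example, run inside the debug Pod (names are placeholders):
# check_names kubernetes.default.svc.cluster.local \
#   my-service.my-namespace.svc.cluster.local google.com
```

A mix of OK lines for cluster names and FAIL lines for external names (or the reverse) already narrows down which part of the DNS path to inspect next.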

Understand what `/etc/resolv.conf` should look like

A normal Pod often gets a resolver file containing:

the cluster DNS service IP as nameserver
search suffixes such as svc.cluster.local

Typical example:

text

nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

If the nameserver is missing, or does not match the ClusterIP of the cluster DNS Service, the Pod may not be using cluster DNS the way you think it is.
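The search list and `ndots` option determine which names actually get queried, which explains why a short name such as `google.com` can trigger several lookups. The sketch below prints the candidate names in the order a glibc-style resolver would try them. `candidates` is a hypothetical helper, and other resolver libraries (musl, for example) may order or skip these queries differently, so treat the output as an illustration rather than a trace.

```shell
# Print, in order, the names a glibc-style resolver would try for a lookup.
# usage: candidates NAME [RESOLV_CONF]
candidates() (
  name=$1
  conf=${2:-/etc/resolv.conf}
  # Collect the search domains (the last "search" line wins).
  search=$(awk '$1 == "search" { $1 = ""; s = $0 } END { print s }' "$conf")
  # Read ndots from the options line; the resolver default is 1.
  ndots=$(sed -n 's/^options.*ndots:\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
  ndots=${ndots:-1}
  case $name in
    *.) echo "$name"; exit 0 ;;  # trailing dot: already fully qualified
  esac
  # Count the dots in the name.
  dots=$(printf '%s' "$name" | awk -F. '{ print NF - 1 }')
  # Enough dots: the name is tried as-is before the search list.
  if [ "$dots" -ge "$ndots" ]; then echo "$name."; fi
  for domain in $search; do
    echo "$name.$domain."
  done
  # Too few dots: the name as-is is only tried after the search list.
  if [ "$dots" -lt "$ndots" ]; then echo "$name."; fi
  exit 0
)
```

With the typical resolver file shown above, `candidates google.com` lists the three search-suffixed names first and `google.com.` last, because the name has fewer than five dots. Appending a trailing dot (`google.com.`) skips the search list entirely.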

Check CoreDNS or kube-dns

Most modern clusters use CoreDNS. Verify that the DNS Pods are healthy:

bash

kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get svc -n kube-system

Then inspect logs:

bash

kubectl logs -n kube-system -l k8s-app=kube-dns --tail=100

If the CoreDNS Pods are crash-looping, throttled, or logging upstream failures, the problem may have nothing to do with the application Pod itself.
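Two quick checks complement the commands above: restart counts for the DNS Pods, and whether the nameserver seen in a Pod's `/etc/resolv.conf` is really the DNS Service IP. The helpers below are a sketch with hypothetical function names. They assume the Service is called `kube-dns` in `kube-system`, which is common but not universal, and clusters running a node-local DNS cache may legitimately show a different nameserver.

```shell
# Print the ClusterIP of the cluster DNS Service.
# Assumption: the Service is named kube-dns in the kube-system namespace.
dns_service_ip() {
  kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}'
}

# Print "NAME RESTARTS" rows for the DNS Pods (first container only).
dns_pod_restarts() {
  kubectl get pods -n kube-system -l k8s-app=kube-dns --no-headers \
    -o custom-columns='NAME:.metadata.name,RESTARTS:.status.containerStatuses[0].restartCount'
}

# Succeed if the given nameserver equals the DNS Service ClusterIP.
# usage: nameserver_matches POD_NAMESERVER
nameserver_matches() {
  [ "$1" = "$(dns_service_ip)" ]
}

# Example, using the nameserver from the sample resolv.conf:
# nameserver_matches 10.96.0.10 && echo "Pod points at cluster DNS"
```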

Test service resolution and endpoints together

If a service name resolves but traffic still fails, DNS may not be the real issue. Check whether the Service actually has endpoints.

bash

kubectl get svc my-service -n my-namespace
kubectl get endpoints my-service -n my-namespace
kubectl get endpointslice -n my-namespace

A DNS lookup can succeed while the Service still has no backing Pods. In that case, the resolver is fine and the application wiring is broken elsewhere.
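The "resolves but has no backing Pods" case can be checked in one step. The function below is a minimal sketch with a hypothetical name, `has_endpoints`. It reads the ready addresses from the Endpoints object, so Pods that exist but are failing readiness checks will not be counted.

```shell
# Succeed only if the Service has at least one ready endpoint address.
# usage: has_endpoints SERVICE NAMESPACE
has_endpoints() (
  ips=$(kubectl get endpoints "$1" -n "$2" \
    -o jsonpath='{.subsets[*].addresses[*].ip}' 2>/dev/null)
  if [ -n "$ips" ]; then
    echo "$1 has endpoints: $ips"
  else
    echo "$1 has no ready endpoints; check the Service selector and Pod readiness"
    exit 1
  fi
)

# Example (placeholder names from this article):
# has_endpoints my-service my-namespace
```

If this fails while `nslookup` succeeds, compare the Service selector with the labels on the Pods you expect to back it.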

Look for network policies and DNS port blocks

Cluster DNS depends on traffic to port 53: UDP for most queries, and TCP when a response is too large for UDP. A restrictive NetworkPolicy can block those requests.

Check policies in the namespace:

bash

kubectl get networkpolicy -n my-namespace

If egress is locked down, make sure the workload can reach the cluster DNS service on the needed ports.

This is a common cause when one namespace has DNS issues and another does not.
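When egress is locked down, an explicit allow rule for DNS is the usual remedy. The sketch below prints such a policy so you can review it before applying. The function name and policy name are hypothetical, and the selectors assume that the DNS Pods carry the `k8s-app: kube-dns` label, that the `kube-system` namespace has the `kubernetes.io/metadata.name` label, and that the DNS Pods listen on port 53. Adjust them if your cluster differs.

```shell
# Print a NetworkPolicy manifest that allows DNS egress for every Pod
# in the given namespace.
# usage: dns_egress_policy NAMESPACE
dns_egress_policy() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: $1
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF
}

# Validate the manifest without changing the cluster:
# dns_egress_policy my-namespace | kubectl apply --dry-run=client -f -
```

Because NetworkPolicy rules are additive, this rule can sit alongside an existing default-deny egress policy and open only the DNS path.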

Check Pod DNS settings

Pods can override the default DNS behavior through fields such as:

dnsPolicy
dnsConfig

For example:

yaml

spec:
  dnsPolicy: ClusterFirst

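If a Pod uses dnsPolicy: Default, it inherits the name resolution configuration of the node it runs on instead of the normal Kubernetes cluster-first resolver configuration. Despite the name, Default is not the default policy; Pods that do not set dnsPolicy get ClusterFirst.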

That can be correct for special workloads, but it is also a frequent cause of confusion when copied from another manifest without understanding the tradeoff.
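To find Pods that have opted out of cluster DNS, it helps to list the DNS-related fields for a whole namespace at once. The helpers below are a sketch with hypothetical names. `pod_dns_settings` prints a table, and `non_cluster_first` filters that table for rows whose policy is anything other than ClusterFirst.

```shell
# List each Pod's DNS-related settings in a namespace.
# usage: pod_dns_settings NAMESPACE
pod_dns_settings() {
  kubectl get pods -n "$1" \
    -o custom-columns='NAME:.metadata.name,DNS_POLICY:.spec.dnsPolicy,HOST_NETWORK:.spec.hostNetwork,NAMESERVERS:.spec.dnsConfig.nameservers'
}

# Read the table on stdin, skip the header, and print "NAME: POLICY"
# for every Pod whose policy is not ClusterFirst.
non_cluster_first() {
  awk 'NR > 1 && $2 != "ClusterFirst" { print $1 ": " $2 }'
}

# Example:
# pod_dns_settings my-namespace | non_cluster_first
```

Also look at the HOST_NETWORK column in the unfiltered table: Pods with hostNetwork set to true need dnsPolicy: ClusterFirstWithHostNet to use cluster DNS, so a ClusterFirst value on such a Pod deserves a second look.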

Verify external resolution separately

If internal names resolve but external names do not, the cluster DNS service may be healthy while its upstream forwarding path is not.

From the test Pod:

bash

nslookup kubernetes.default.svc.cluster.local
nslookup example.com

If only the second query fails, inspect CoreDNS configuration and upstream reachability rather than the service-discovery side of Kubernetes.
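A quick way to see where CoreDNS sends external queries is to read the forward lines of its Corefile. The sketch below uses hypothetical function names and assumes the configuration lives in a ConfigMap named `coredns` in `kube-system`, which is the layout kubeadm uses; managed or customized clusters may name it differently.

```shell
# Print the Corefile from the CoreDNS ConfigMap.
# Assumption: the ConfigMap is named coredns in kube-system.
corefile() {
  kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}'
}

# Read a Corefile on stdin and print the upstream targets of each
# "forward FROM TO..." line, one per line.
forward_targets() {
  awk '$1 == "forward" { for (i = 3; i <= NF; i++) if ($i != "{") print $i }'
}

# Example:
# corefile | forward_targets
```

If the target is `/etc/resolv.conf`, CoreDNS forwards to the resolvers listed in the CoreDNS Pod's own resolver file, which typically comes from the node, so the next thing to test is whether those upstream servers are reachable from the nodes that run CoreDNS.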

Debug one layer at a time

A good DNS troubleshooting flow is:

confirm the failure from inside a Pod
inspect /etc/resolv.conf
verify CoreDNS health and logs
test the Service and its endpoints
check namespace network policies
review Pod DNS overrides

That order keeps you from guessing too early.

Common Pitfalls

The biggest mistake is assuming every "cannot reach service" issue is a DNS problem. If the service resolves but has no endpoints, DNS is not the root cause.

Another issue is testing only from your workstation or from a node rather than from inside an affected Pod. Kubernetes DNS behavior is defined at the Pod level, so the Pod perspective matters most.

Developers also forget about dnsPolicy and dnsConfig. A copied manifest can silently opt out of the normal cluster DNS behavior.

Finally, restrictive NetworkPolicy rules often block DNS traffic in ways that look like resolver failure. Always check whether the Pod can actually talk to the DNS service.

Summary

Start DNS debugging from inside an affected Pod.
Inspect /etc/resolv.conf to verify the nameserver and search domains.
Check CoreDNS health and logs in kube-system.
Validate Service endpoints so you do not confuse service wiring with DNS resolution.
Review network policies and Pod DNS settings when the failure is namespace-specific.
