DNS does not resolve with NGINX in Kubernetes

Introduction

When DNS fails inside NGINX running on Kubernetes, the problem may be in cluster DNS, pod DNS settings, or NGINX's own resolution behavior. The key debugging step is to separate "the pod cannot resolve the name at all" from "NGINX resolved it once at startup and is not refreshing it the way you expected."

Verify DNS From Inside the Pod First

Before touching NGINX configuration, verify that the pod can resolve the target name using normal tools.

bash
kubectl exec -it deploy/nginx -- cat /etc/resolv.conf
kubectl exec -it deploy/nginx -- getent hosts my-service.default.svc.cluster.local

If getent hosts (or nslookup, where the image includes it) fails inside the pod, the failure sits below NGINX. Suspect Kubernetes DNS itself, the pod's DNS policy, or network connectivity to CoreDNS.

If shell tools resolve the name but NGINX still fails, the issue is more likely in NGINX configuration semantics.

Know the Kubernetes Service Name Format

Inside a cluster, a service normally resolves as:

text
service-name.namespace.svc.cluster.local

Shorter forms also work depending on the caller's namespace and search domains. For example, from the same namespace, my-service may be enough.
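
To see why a short name can work in one namespace and fail in another, the sketch below lists the names a resolver would roughly try, based on the search line of a resolv.conf. The function name expand_candidates and the sample resolv.conf content are illustrative assumptions, and the sketch ignores the ndots option, which controls the order in which the bare name is tried.

```bash
#!/usr/bin/env bash
# Sketch: list the names a resolver would try for a short service name,
# based on the "search" line of a resolv.conf read from stdin.
# Simplification: the ndots option is ignored.
expand_candidates() {
  local name="$1" line domain
  while read -r line; do
    case "$line" in
      search\ *)
        # Split the search list on whitespace, one candidate per domain.
        for domain in ${line#search }; do
          printf '%s.%s\n' "$name" "$domain"
        done
        ;;
    esac
  done
  # The bare name is tried as well (before or after, depending on ndots).
  printf '%s\n' "$name"
}

# Hypothetical resolv.conf of a pod in the "default" namespace.
sample_resolv_conf='nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5'

printf '%s\n' "$sample_resolv_conf" | expand_candidates my-service
```

A pod in a different namespace has that namespace first in its search list, so the same short name expands to a different FQDN there. That is why the full name is the safer choice while debugging.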

A simple NGINX upstream config might look like this:

nginx
upstream backend {
    server my-service.default.svc.cluster.local:8080;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
    }
}

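That works when the service name is valid and resolvable at the time NGINX loads the config. If the name cannot be resolved at that point, NGINX refuses to start and reports a "host not found in upstream" error.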

NGINX Often Resolves Names at Startup Only

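This is a major source of confusion. When a hostname appears literally in an upstream server directive or in proxy_pass, NGINX resolves it when the configuration is loaded (at start or reload) and keeps the result. It does not look the name up again on each request. If the address behind the name changes later, for example when a Service is deleted and recreated or when the pod IPs behind a headless Service change, traffic may keep going to stale addresses until the next reload.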

If you need runtime DNS resolution, configure a resolver explicitly and use a variable-based proxy_pass pattern.

nginx
resolver kube-dns.kube-system.svc.cluster.local valid=10s;

server {
    listen 80;

    location / {
        set $backend http://my-service.default.svc.cluster.local:8080;
        proxy_pass $backend;
    }
}

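The exact resolver address depends on the cluster. Many clusters run CoreDNS but still expose it through a Service named kube-dns for backward compatibility, while others use a different name, so inspect the actual DNS Service first. A hostname given to the resolver directive is itself resolved only when the configuration is loaded, so the DNS Service's ClusterIP is often used there instead.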

bash
kubectl get svc -n kube-system

Check Pod DNS Policy and Namespace Assumptions

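Most pods use dnsPolicy: ClusterFirst, which is what you want for standard in-cluster service discovery. If the pod has a custom DNS policy, resolution behavior can change. The same is true with hostNetwork: true: a ClusterFirst pod on the host network falls back to the node's DNS settings unless its policy is set to ClusterFirstWithHostNet.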

ClusterFirst is the default, so a normal pod spec can leave the field out. Written explicitly, it looks like this:

yaml
spec:
  dnsPolicy: ClusterFirst

If a pod unexpectedly inherits host-level DNS settings, in-cluster service names may stop resolving the way you expect.
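
The interaction between dnsPolicy and hostNetwork can be summarized in a small helper. This is a sketch under stated assumptions: the function name effective_dns and its output strings are made up for illustration, and it only covers the four standard policy values.

```bash
# Sketch: report which resolver configuration a pod ends up with,
# given its dnsPolicy and hostNetwork values. Empty arguments fall back
# to the Kubernetes defaults (ClusterFirst, hostNetwork=false).
effective_dns() {
  local policy="${1:-ClusterFirst}" host_network="${2:-false}"
  case "$policy" in
    ClusterFirstWithHostNet) echo "cluster DNS" ;;
    ClusterFirst)
      if [ "$host_network" = "true" ]; then
        # ClusterFirst behaves like Default on hostNetwork pods.
        echo "node resolv.conf"
      else
        echo "cluster DNS"
      fi
      ;;
    Default) echo "node resolv.conf" ;;
    None)    echo "dnsConfig only" ;;
    *)       echo "unknown policy: $policy" ;;
  esac
}

# Usage against a live pod (POD is a placeholder for a real pod name):
#   effective_dns "$(kubectl get pod "$POD" -o jsonpath='{.spec.dnsPolicy}')" \
#                 "$(kubectl get pod "$POD" -o jsonpath='{.spec.hostNetwork}')"

effective_dns ClusterFirst true
effective_dns ClusterFirstWithHostNet true
```

If the helper reports "node resolv.conf" for the NGINX pod, names ending in svc.cluster.local will usually not resolve, because the node's resolver typically does not know about cluster services.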

Inspect CoreDNS Health and Reachability

If DNS tools fail inside the pod, verify the cluster DNS service itself.

bash
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl logs -n kube-system deploy/coredns
kubectl get svc -n kube-system

You should also consider network policies. If egress from the NGINX pod to the cluster DNS service is blocked, DNS lookups fail even when CoreDNS is healthy.

Debug the Right Layer

A clean debugging order is:

  1. resolve the service name from inside the pod using shell tools
  2. inspect /etc/resolv.conf and DNS policy
  3. verify CoreDNS service and pod health
  4. inspect NGINX upstream or resolver behavior
  5. reload NGINX if it resolved a stale address at startup

This order matters. Otherwise you can spend time editing NGINX configs for a cluster DNS outage, or debugging CoreDNS when the real issue is NGINX caching a startup resolution.
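
The first branch of that order can be sketched as a small function. It is only an illustration: the name diagnose and its messages are hypothetical, and it assumes kubectl is configured for the cluster and that the container image ships getent.

```bash
# Sketch: decide which layer to debug next, based on whether the pod
# itself can resolve the name. Arguments: workload (e.g. deploy/nginx)
# and the service FQDN to test.
diagnose() {
  local target="$1" fqdn="$2"
  if kubectl exec "$target" -- getent hosts "$fqdn" >/dev/null 2>&1; then
    echo "pod resolves $fqdn: inspect NGINX upstream/resolver config, then reload"
  else
    echo "pod cannot resolve $fqdn: check resolv.conf, dnsPolicy, CoreDNS, network policies"
    return 1
  fi
}

# Usage:
#   diagnose deploy/nginx my-service.default.svc.cluster.local
```

A non-zero return means the cluster-side checks come first. A zero return means DNS works from the pod, so attention moves to how NGINX resolves and caches the name.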

Common Pitfalls

  • Assuming an NGINX name-resolution error always means Kubernetes DNS is broken. NGINX may resolve upstream names only at startup, so the cluster can be healthy while NGINX holds a stale address.
  • Using short service names across namespaces can fail when the search domain assumptions are wrong. Use the full service FQDN while debugging.
  • Forgetting to configure a resolver for variable-based dynamic upstream resolution makes NGINX unable to refresh DNS the way you expect.
  • Custom DNS policies or hostNetwork settings can bypass the normal ClusterFirst behavior and break in-cluster service discovery.
  • Skipping pod-level DNS tests wastes time because nslookup or getent can quickly tell you whether the problem is Kubernetes DNS or NGINX configuration.

Summary

  • First verify whether the pod itself can resolve the service name.
  • If pod-level DNS works, inspect NGINX resolution timing and resolver configuration.
  • Use full Kubernetes service names while debugging to avoid namespace ambiguity.
  • Check CoreDNS health, pod DNS policy, and network policies when pod-level resolution fails.
  • Separate cluster DNS issues from NGINX caching or startup-resolution behavior.
