DNS does not resolve with NGINX in Kubernetes
Introduction
When DNS fails inside NGINX running on Kubernetes, the problem may be in cluster DNS, pod DNS settings, or NGINX's own resolution behavior. The key debugging step is to separate "the pod cannot resolve the name at all" from "NGINX resolved it once at startup and is not refreshing it the way you expected."
Verify DNS From Inside the Pod First
Before touching NGINX configuration, verify that the pod can resolve the target name using normal tools.
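As a rough sketch of that check, the helpers below wrap kubectl exec so the lookup runs inside the pod rather than on your workstation. The namespace, pod name, and function names are hypothetical, and the sketch assumes the container image ships getent.

```shell
# Hypothetical names: adjust the namespace and pod to match your cluster.
NS="default"
POD="nginx-pod"

# Resolve a name from inside the pod. getent goes through the system resolver,
# so it honors the search domains in the pod's /etc/resolv.conf.
pod_resolve() {
  kubectl exec -n "$NS" "$POD" -- getent hosts "$1"
}

# Print the DNS configuration the pod actually received.
pod_resolv_conf() {
  kubectl exec -n "$NS" "$POD" -- cat /etc/resolv.conf
}
```

A typical run would call pod_resolve my-service.default.svc.cluster.local and then pod_resolv_conf to compare the nameserver and search entries with what you expect.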
If getent hosts or nslookup fails inside the pod, NGINX is not the primary suspect. The failure points to Kubernetes DNS itself, the pod's DNS policy, or network connectivity to CoreDNS.
If shell tools resolve the name but NGINX still fails, the issue is more likely in NGINX configuration semantics.
Know the Kubernetes Service Name Format
Inside a cluster, a service normally resolves as <service>.<namespace>.svc.cluster.local, where cluster.local is the default cluster domain.
Shorter forms also work depending on the caller's namespace and search domains. For example, from the same namespace, my-service may be enough.
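To remove namespace ambiguity while debugging, it can help to build the full name explicitly. The helper below is only an illustration: svc_fqdn is a hypothetical function name, and cluster.local is assumed as the cluster domain unless a third argument overrides it.

```shell
# Build the fully qualified DNS name for a Service.
# Arguments: service name, namespace, optional cluster domain
# (cluster.local is the Kubernetes default, but clusters can override it).
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

svc_fqdn my-service default
# prints my-service.default.svc.cluster.local
```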
A simple NGINX upstream config might look like this:
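The following is a minimal sketch rather than a canonical configuration. It assumes a hypothetical Service named my-service in the default namespace listening on port 80, and that the fragment sits inside the http context.

```nginx
# Hypothetical upstream: my-service in the default namespace, port 80.
upstream backend {
    # This hostname is resolved when NGINX starts or reloads.
    server my-service.default.svc.cluster.local:80;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
    }
}
```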
That works when the service name is valid and resolvable at the time NGINX loads the config. If it is not, NGINX refuses to start and logs a "host not found in upstream" error.
NGINX Often Resolves Names at Startup Only
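This is a major source of confusion. NGINX resolves hostnames in upstream server entries and in literal proxy_pass targets when it loads the configuration, at start or reload, not on every request. If the address behind the name changes later, for example because a Service was deleted and recreated or because the name points at a headless Service whose pod IPs change, NGINX keeps sending traffic to the stale address until the next reload.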
If you need runtime DNS resolution, configure a resolver explicitly and use a variable-based proxy_pass pattern.
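One way this pattern can look is sketched below. The resolver name kube-dns.kube-system.svc.cluster.local and the backend my-service.default.svc.cluster.local are assumptions for illustration, and the 10 second cache time is an arbitrary choice.

```nginx
server {
    listen 80;

    # Assumed cluster DNS Service name; verify it in your cluster or use its
    # ClusterIP. A name given here is itself resolved once at startup.
    # valid=10s makes NGINX cache answers for 10 seconds instead of the TTL.
    resolver kube-dns.kube-system.svc.cluster.local valid=10s;

    location / {
        # A variable in proxy_pass makes NGINX resolve the name at request
        # time through the resolver above.
        set $backend "my-service.default.svc.cluster.local";
        proxy_pass http://$backend:80;
    }
}
```

NGINX's own resolver does not apply the search domains from /etc/resolv.conf, so the variable should hold the full service name rather than a short form.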
The exact resolver address depends on the cluster. Many clusters run CoreDNS behind a Service that is still named kube-dns for compatibility, but the name and ClusterIP vary by distribution, so inspect the actual DNS service first.
Check Pod DNS Policy and Namespace Assumptions
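Most pods use dnsPolicy: ClusterFirst, which is the default and is what you want for standard in-cluster service discovery. If the pod has a custom DNS policy, resolution behavior changes. The same applies with hostNetwork: true: a ClusterFirst pod then falls back to the node's DNS settings unless dnsPolicy is set to ClusterFirstWithHostNet.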
A normal pod spec usually leaves DNS policy alone:
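A minimal sketch of such a spec is shown here. The pod name and image tag are hypothetical, and the dnsPolicy line is spelled out only to make the default visible.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx            # hypothetical pod name
spec:
  # ClusterFirst is the default, so this line can be omitted.
  # With hostNetwork: true, use ClusterFirstWithHostNet instead.
  dnsPolicy: ClusterFirst
  containers:
    - name: nginx
      image: nginx:stable
      ports:
        - containerPort: 80
```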
If a pod unexpectedly inherits host-level DNS settings, in-cluster service names may stop resolving the way you expect.
Inspect CoreDNS Health and Reachability
If DNS tools fail inside the pod, verify the cluster DNS service itself.
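The helpers below sketch one way to do that. They assume the DNS components live in kube-system and carry the k8s-app=kube-dns label, which is common but not universal, and the function names are hypothetical.

```shell
# Assumed location and label of the cluster DNS components; adjust as needed.
DNS_NS="kube-system"
DNS_LABEL="k8s-app=kube-dns"

# Show the DNS Service and its ClusterIP (the address pods use as nameserver).
dns_service() {
  kubectl get svc -n "$DNS_NS" -l "$DNS_LABEL" -o wide
}

# Show whether the DNS pods are running and ready.
dns_pods() {
  kubectl get pods -n "$DNS_NS" -l "$DNS_LABEL" -o wide
}

# Show recent DNS pod logs to spot errors or upstream timeouts.
dns_logs() {
  kubectl logs -n "$DNS_NS" -l "$DNS_LABEL" --tail=50
}
```

The ClusterIP reported by dns_service should match the nameserver line in the pod's /etc/resolv.conf when the pod uses ClusterFirst.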
You should also consider network policies. If egress from the NGINX pod to the cluster DNS service is blocked, DNS lookups fail even when CoreDNS is healthy.
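If egress is restricted, a policy along these lines may restore DNS traffic. This is a sketch under assumptions: the app: nginx label, the default namespace, and the policy name are hypothetical, and it presumes the DNS pods run in kube-system.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress        # hypothetical policy name
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: nginx                # hypothetical label on the NGINX pods
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        # DNS uses UDP by default and falls back to TCP for large answers.
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```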
Debug the Right Layer
A clean debugging order is:
- resolve the service name from inside the pod using shell tools
- inspect /etc/resolv.conf and DNS policy
- verify CoreDNS service and pod health
- inspect NGINX upstream or resolver behavior
- reload NGINX if it resolved a stale address at startup
This order matters. Otherwise you can spend time editing NGINX configs for a cluster DNS outage, or debugging CoreDNS when the real issue is NGINX caching a startup resolution.
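For the last step in that order, a reload can be triggered from outside the pod. The sketch below uses a hypothetical namespace, pod name, and function name, and checks the configuration before reloading.

```shell
# Hypothetical names: adjust the namespace and pod to match your cluster.
NS="default"
POD="nginx-pod"

# Validate the configuration first, then reload only if the check passes.
# A reload makes NGINX resolve the hostnames in its config again.
nginx_reload() {
  kubectl exec -n "$NS" "$POD" -- nginx -t &&
    kubectl exec -n "$NS" "$POD" -- nginx -s reload
}
```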
Common Pitfalls
- Assuming an NGINX name-resolution error always means Kubernetes DNS is broken ignores the fact that NGINX may resolve upstream names only at startup.
- Using short service names across namespaces can fail when the search domain assumptions are wrong. Use the full service FQDN while debugging.
- Forgetting to configure a resolver for variable-based dynamic upstream resolution makes NGINX unable to refresh DNS the way you expect.
- Custom DNS policies or hostNetwork settings can bypass the normal ClusterFirst behavior and break in-cluster service discovery.
- Skipping pod-level DNS tests wastes time because nslookup or getent can quickly tell you whether the problem is Kubernetes DNS or NGINX configuration.
Summary
- First verify whether the pod itself can resolve the service name.
- If pod-level DNS works, inspect NGINX resolution timing and resolver configuration.
- Use full Kubernetes service names while debugging to avoid namespace ambiguity.
- Check CoreDNS health, pod DNS policy, and network policies when pod-level resolution fails.
- Separate cluster DNS issues from NGINX caching or startup-resolution behavior.

