cert-manager letsencrypt issuing invalid certs
Introduction
When cert-manager issues an invalid certificate, the cause is rarely that Let's Encrypt is random or broken. It is usually a misconfiguration in one of the pieces involved in validation: the requested hostname, the ACME challenge path, DNS, ingress routing, or the issuer configuration.
Start by identifying what "invalid" means
The word invalid can mean several different failure modes:
- the certificate is self-signed instead of from Let's Encrypt
- the certificate is for the wrong hostname
- the certificate chain is incomplete
- the certificate has already expired
- issuance failed at the challenge or issuer stage, so the ingress serves a fallback certificate to the browser
Before changing manifests, inspect the actual certificate presented by the service:
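A minimal sketch of that inspection, assuming OpenSSL 1.1.1 or later is installed (the `-ext` option needs it). The function name `served_cert` and the hostname `app.example.com` are placeholders, not part of the original article.

```shell
# Print the issuer, subject, validity window, and subject alternative names
# of the certificate actually served for a hostname.
# Assumption: OpenSSL 1.1.1+ (required for the -ext option).
served_cert() {
  local host="$1" port="${2:-443}"
  # -servername sends SNI so the ingress selects the certificate for this host.
  openssl s_client -connect "${host}:${port}" -servername "${host}" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer -subject -dates -ext subjectAltName
}

# Example (placeholder hostname):
#   served_cert app.example.com
```

The issuer line is usually the quickest signal. With ingress-nginx, for example, a fallback certificate typically identifies itself as "Kubernetes Ingress Controller Fake Certificate", and Let's Encrypt staging issuers carry a "(STAGING)" prefix.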
That tells you whether the problem is with the issued certificate itself or with which certificate your ingress is serving.
Check cert-manager resources in order
cert-manager creates several resources during issuance. Reading them in order makes debugging faster:
- Certificate
- CertificateRequest
- Order
- Challenge
- ingress or DNS objects used for validation
Useful commands:
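One way to walk the chain is sketched below as a shell function. It assumes `kubectl` is already pointed at the right cluster and that cert-manager's CRDs are installed. The function name `debug_issuance` and the namespace and certificate name in the example are placeholders.

```shell
# Walk the issuance chain from the top-level Certificate down to the Challenge.
# Assumptions: kubectl is configured for the cluster and cert-manager is installed.
debug_issuance() {
  local ns="$1" cert="$2"
  # One overview of every issuance resource in the namespace.
  kubectl get certificate,certificaterequest,order,challenge -n "$ns"
  # Then read events and status conditions in issuance order.
  kubectl describe certificate "$cert" -n "$ns"
  kubectl describe certificaterequest -n "$ns"
  kubectl describe order -n "$ns"
  kubectl describe challenge -n "$ns"
}

# Example (placeholder namespace and certificate name):
#   debug_issuance web app-example-com
```

Challenge objects are removed once validation succeeds, so an empty Challenge list is normal after a successful issuance.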
The describe output often contains the exact reason the ACME flow failed.
A correct HTTP-01 issuer example
A common configuration is an HTTP-01 challenge solved through an ingress controller:
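A hedged sketch of such an issuer, written to a file so it can be reviewed before applying. The issuer name `letsencrypt-prod`, the email address, the account key secret name, and the `nginx` class are illustrative assumptions. Newer cert-manager releases also accept `ingressClassName` in place of `class`.

```shell
# Write a production ClusterIssuer that solves HTTP-01 through an ingress controller.
# All names, the email, and the ingress class below are placeholders.
cat > cluster-issuer-prod.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # Let's Encrypt production endpoint (browser-trusted certificates).
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      # Secret where cert-manager stores the ACME account key.
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            # Must match the controller that really receives public traffic.
            class: nginx
EOF

# Apply once reviewed:
#   kubectl apply -f cluster-issuer-prod.yaml
```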
If the ingress class does not match the real controller handling traffic, the ACME challenge may never be reachable even though the YAML looks correct.
DNS and ingress are the most common root causes
For HTTP-01:
- the domain must resolve to the ingress controller handling the challenge
- the /.well-known/acme-challenge/ path must be reachable from the public internet
- no redirect or custom middleware should break that path
For DNS-01:
- the DNS provider credentials must be correct
- TXT records must be created in the correct zone
- propagation must complete before validation
Quick checks:
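The helpers below are a rough sketch of those checks, assuming `dig` and `curl` are available and that the commands are run from a machine outside the cluster. The function names, the hostname `app.example.com`, and the token `probe` are placeholders.

```shell
# Where does the hostname resolve? Compare the result with the public
# address of the ingress controller that should handle the challenge.
resolved_addresses() {
  dig +short "$1"
}

# For DNS-01: which TXT records are currently visible for the challenge name?
acme_txt_records() {
  dig +short TXT "_acme-challenge.$1"
}

# For HTTP-01: request a challenge-style path over plain HTTP and show the
# status line and headers, so unexpected redirects or middleware are visible.
# "probe" is a made-up token, so a 404 from the expected controller is fine.
probe_challenge_path() {
  curl -si "http://$1/.well-known/acme-challenge/probe"
}

# Example (placeholder hostname):
#   resolved_addresses app.example.com
#   probe_challenge_path app.example.com
```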
If DNS points somewhere else, cert-manager can complete internal steps while Let's Encrypt still validates against the wrong endpoint.
Watch for staging versus production confusion
Let's Encrypt has separate staging and production endpoints. Staging certificates are intentionally untrusted by browsers.
A staging issuer usually looks like:
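A sketch for comparison, with the same caveats as any example manifest: the issuer name `letsencrypt-staging`, the email, the secret name, and the `nginx` class are assumptions. The only line that decides browser trust is `server`.

```shell
# Write a staging ClusterIssuer. Certificates from this endpoint are
# intentionally NOT trusted by browsers. Names and email are placeholders.
cat > cluster-issuer-staging.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # Let's Encrypt staging endpoint (note "acme-staging-v02").
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
EOF

# To see which endpoint each issuer in a cluster uses (assumes kubectl access):
#   kubectl get clusterissuer -o yaml | grep 'server:'
```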
If you accidentally use the staging endpoint in production, cert-manager may issue a technically valid certificate that browsers still reject. This is one of the easiest mistakes to miss when manifests are copied between environments.
Secret and ingress mismatches
Sometimes cert-manager successfully issues the right certificate, but the ingress still serves the wrong secret. Confirm that:
- the Certificate.spec.secretName matches the secret used by the ingress
- the ingress is in the same namespace as the secret
- no older ingress object references another TLS secret
Example ingress TLS section:
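An illustrative manifest, shown as a complete Ingress so the TLS section appears in context. The names `app`, `web`, `app.example.com`, `app-example-com-tls`, and `letsencrypt-prod` are placeholders, and the `cert-manager.io/cluster-issuer` annotation assumes certificates are requested through the ingress rather than a separate Certificate resource.

```shell
# Write an Ingress whose TLS section names the secret cert-manager should populate.
# All names and the hostname are placeholders.
cat > app-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app
  namespace: web
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app.example.com
      # Must equal the secretName of the Certificate for this host.
      secretName: app-example-com-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app
                port:
                  number: 80
EOF

# To compare against what cert-manager writes (assumes kubectl access):
#   kubectl get certificate -n web -o jsonpath='{.items[*].spec.secretName}'
```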
If that secret name differs from the one cert-manager updates, clients will continue seeing an old or unrelated certificate.
Common Pitfalls
The most common mistake is debugging cert-manager first when the real problem is DNS pointing to the wrong load balancer. Another frequent issue is using the staging ACME server and then treating the resulting untrusted certificate as a signing failure. Teams also forget that ingress class names must match the actual controller, especially after cluster upgrades or chart changes. Secret name mismatches are another source of confusion because cert-manager can succeed while ingress still serves an old secret. Finally, people often test only inside the cluster, while ACME validation happens from the public internet and may see very different routing.
Summary
- Define "invalid" first by inspecting the certificate actually served to clients.
- Debug Certificate, CertificateRequest, Order, and Challenge resources in sequence.
- Verify DNS, ingress routing, and challenge reachability from the public internet.
- Make sure you are using the production ACME endpoint when you want browser-trusted certificates.
- Confirm that ingress references the same TLS secret cert-manager updates.
- Treat DNS and secret wiring issues as first-class suspects, not afterthoughts.

