What Are the Kubernetes/ELB Timeouts for HTTP Requests?
Introduction
There is no single “Kubernetes plus ELB timeout” for HTTP requests. The effective timeout is usually the combination of several layers: the AWS load balancer idle timeout, the ingress controller proxy timeouts, and the application server’s own read or response limits. If one of those layers times out earlier than the others, that layer wins.
Start With the Request Path
A typical request path looks like this:
- client
- AWS load balancer
- Kubernetes ingress controller or service
- application pod
- application server inside the container
Each hop may have its own timeout rules. That is why the right answer is rarely one number. It is a timeout chain, and the shortest relevant timeout usually determines observed behavior.
AWS Load Balancer Timeouts Are Usually Idle Timeouts
For HTTP workloads on AWS, the most important load-balancer setting is often the idle timeout, which controls how long an open connection can sit without data before the load balancer closes it. Both Classic and Application Load Balancers default to 60 seconds.
In Kubernetes-managed AWS setups, you often configure this through annotations.
For a Classic Load Balancer created from a Service, the idle timeout is set with an annotation on the Service itself.
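A minimal sketch of such a Service, assuming the Classic Load Balancer is provisioned by the AWS cloud provider integration that honors the `service.beta.kubernetes.io/*` annotations. The Service name, selector, ports, and the 120-second value are placeholders chosen for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                # hypothetical name
  annotations:
    # Idle timeout for the Classic Load Balancer, in seconds (AWS default: 60).
    # Annotation values must be strings, hence the quotes.
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "120"
spec:
  type: LoadBalancer
  selector:
    app: web               # hypothetical pod label
  ports:
    - port: 80
      targetPort: 8080     # hypothetical container port
```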
For an Application Load Balancer managed through ingress, the annotation style is different.
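As a sketch, assuming the AWS Load Balancer Controller is installed and registered under an IngressClass named `alb`, the idle timeout is passed as a load balancer attribute. The Ingress name, backend Service, and 120-second value are illustrative assumptions.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web                # hypothetical name
  annotations:
    # Comma-separated ALB attributes; idle_timeout.timeout_seconds defaults to 60.
    alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=120
spec:
  ingressClassName: alb    # assumes the controller's IngressClass is named "alb"
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web  # hypothetical backend Service
                port:
                  number: 80
```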
The important point is that these are AWS-side connection settings, not generic Kubernetes settings.
Ingress Controllers Add Their Own Proxy Timeouts
If you use an ingress controller such as NGINX, it may time out before the load balancer does. That is why you must also inspect ingress-level settings.
If the NGINX proxy-read timeout is 60 seconds while the AWS idle timeout is 120 seconds, a request whose backend sends nothing for 60 seconds will still fail at the ingress layer at about the 60-second mark, typically with a 504 Gateway Timeout.
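To line the ingress layer up with a 120-second load balancer idle timeout, the proxy timeouts can be raised per Ingress. This is a sketch that assumes the community ingress-nginx controller with an IngressClass named `nginx`; the host, Service name, and values are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web                  # hypothetical name
  annotations:
    # All values are in seconds and must be quoted strings.
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "10"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "120"
spec:
  ingressClassName: nginx    # assumes the controller's IngressClass is named "nginx"
  rules:
    - host: app.example.com  # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web    # hypothetical backend Service
                port:
                  number: 80
```

The read and send timeouts apply between successive reads or writes on the upstream connection, so they behave like idle limits on the proxy-to-pod hop rather than a cap on total request duration.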
The Application Server May Time Out Even Earlier
Your application server may also enforce request, response, or worker timeouts. For example:
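- Gunicorn kills and restarts a worker that stays silent longer than its worker timeout (30 seconds by default)
- ASP.NET Core apps are subject to Kestrel's own limits, such as its keep-alive and request-header timeouts, and to any reverse proxy in front of them
- Node.js HTTP servers have their own settings, such as server.requestTimeout and server.keepAliveTimeout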
That means the application itself may terminate or reset the request before Kubernetes or AWS ever gets involved.
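As one illustration of an application-level limit, the sketch below raises Gunicorn's worker timeout through container arguments in a Deployment. The image, module path, port, and the 120-second value are hypothetical; only the `--bind` and `--timeout` flags are Gunicorn's own.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                                    # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0  # hypothetical image
          command: ["gunicorn"]
          args:
            - "--bind=0.0.0.0:8080"
            - "--timeout=120"                  # worker timeout in seconds (default: 30)
            - "app:app"                        # hypothetical WSGI module:callable
          ports:
            - containerPort: 8080
```

If this value stays at its default while the ingress and load balancer allow 120 seconds, slow requests would still be cut off by the application first.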
Services Themselves Usually Do Not Define HTTP Timeouts
A plain Kubernetes Service is mostly a network abstraction and load-balancing layer. It does not usually define HTTP request timeout semantics by itself. Those semantics come from the ingress controller, the cloud load balancer, and the backend server.
That is an important design point: when people say “Kubernetes timed out,” they are often really describing ingress or cloud-LB behavior.
A Good Debugging Order
When debugging unexplained HTTP timeouts, inspect the layers in this order:
- application logs and server timeouts
- ingress-controller logs and annotations
- AWS load balancer attributes
- client-side timeout settings
This is better than starting with guesses about “the Kubernetes timeout,” because Kubernetes itself is usually not the component imposing the HTTP deadline.
Common Pitfalls
- Assuming Kubernetes has one built-in HTTP timeout for all request paths.
- Setting the AWS ELB idle timeout and forgetting the ingress controller has a shorter timeout.
- Blaming the load balancer when the application server timed out first.
- Looking only at Service objects when the real timeout is enforced by ingress or AWS.
- Treating idle timeout and end-to-end request timeout as though they were the same thing.
Summary
- There is no single Kubernetes or ELB timeout number for HTTP requests.
- The effective timeout is usually the shortest relevant setting across AWS, ingress, and the application server.
- In Kubernetes-managed setups, AWS load balancer idle timeouts are typically configured through annotations on the Service or Ingress.
- Ingress controllers often add proxy-read and proxy-send timeouts that must also be configured.
- Debug the request path layer by layer instead of looking for one global timeout setting.

