Controlling the number of spawned futures to create backpressure
Introduction
When spawning async tasks or futures, an unbounded number of concurrent operations can exhaust memory, file descriptors, or network connections. Backpressure is the feedback that forces producers to slow down when consumers cannot keep up; capping how many futures run concurrently is the usual way to apply it. Most async runtimes provide semaphores, buffered streams, or bounded channels to control concurrency.
The Problem: Unbounded Concurrency
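The problem can be reproduced with a minimal sketch, assuming Python's asyncio. Here fake_fetch is a hypothetical stand-in for a real network call, and the in_flight and peak counters exist only to make the concurrency visible.

```python
import asyncio

in_flight = 0   # operations currently running
peak = 0        # highest value in_flight has reached


async def fake_fetch(i):
    """Hypothetical stand-in for a network call."""
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)   # simulated I/O wait
    in_flight -= 1
    return i


async def main(n):
    # Every coroutine is scheduled at once; nothing limits concurrency.
    return await asyncio.gather(*(fake_fetch(i) for i in range(n)))


results = asyncio.run(main(1000))
print(f"peak concurrent operations: {peak}")
```

With nothing gating the calls, the peak grows with the size of the input, which is the behavior the solutions below are meant to prevent.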
Solution 1: Semaphore (Most Languages)
A semaphore limits the number of concurrent tasks by requiring each task to acquire a permit before running.
Python (asyncio)
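A minimal sketch of the semaphore approach using asyncio.Semaphore. The fetch body, the MAX_CONCURRENT value, and the stats dictionary are illustrative assumptions; a real fetch would perform the actual I/O where the sleep is.

```python
import asyncio

MAX_CONCURRENT = 10  # assumed limit; tune it for the real bottleneck


async def fetch(sem, i, stats):
    # The permit is acquired inside the coroutine, around the actual work.
    async with sem:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)   # simulated I/O wait
        stats["in_flight"] -= 1
        return i * 2


async def main(n):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    stats = {"in_flight": 0, "peak": 0}
    # gather returns results in input order, regardless of completion order.
    results = await asyncio.gather(*(fetch(sem, i, stats) for i in range(n)))
    return results, stats["peak"]


results, peak = asyncio.run(main(100))
print(f"peak concurrent operations: {peak}")
```

All 100 tasks exist from the start, but at most MAX_CONCURRENT of them are inside the guarded section at any moment.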
Rust (Tokio)
JavaScript (Manual Semaphore)
Solution 2: Buffered Streams (Rust)
The futures crate provides buffer_unordered to process a stream of futures with bounded concurrency:
buffer_unordered returns results as they complete (out of order). Use buffered to preserve input order.
Solution 3: Bounded Channels
Producer-consumer patterns with a bounded channel naturally apply backpressure: when the channel is full, the producer waits (blocking in threaded code, suspending in async code) until a consumer frees a slot.
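A minimal sketch of this pattern in Python, assuming asyncio.Queue as the bounded channel. The worker count, queue size, and the squaring "work" are illustrative choices, not recommendations.

```python
import asyncio


async def producer(queue, n, stats):
    for i in range(n):
        await queue.put(i)   # suspends here while the queue is full
        stats["max_depth"] = max(stats["max_depth"], queue.qsize())


async def consumer(queue, out):
    while True:
        item = await queue.get()
        try:
            await asyncio.sleep(0.001)   # simulated slow processing
            out.append(item * item)
        finally:
            queue.task_done()


async def main(n=50, workers=3, maxsize=5):
    queue = asyncio.Queue(maxsize=maxsize)
    out = []
    stats = {"max_depth": 0}
    tasks = [asyncio.create_task(consumer(queue, out)) for _ in range(workers)]
    await producer(queue, n, stats)
    await queue.join()           # wait until every item has been processed
    for t in tasks:
        t.cancel()               # consumers loop forever, so stop them explicitly
    await asyncio.gather(*tasks, return_exceptions=True)
    return out, stats["max_depth"]


out, max_depth = asyncio.run(main())
print(f"processed {len(out)} items, max queue depth {max_depth}")
```

The number of items waiting never exceeds maxsize, and the number being processed never exceeds the worker count, so total in-flight work is bounded by their sum.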
Solution 4: Thread/Task Pool Executors
The pool's max_workers parameter limits concurrency. Excess submissions queue until a worker is free.
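A minimal sketch of the pool behavior described above, assuming Python's concurrent.futures.ThreadPoolExecutor. The work function and MAX_WORKERS value are hypothetical, and the lock-protected counters are only there to observe the peak.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4          # assumed pool size
lock = threading.Lock()
in_flight = 0
peak = 0


def work(i):
    """Hypothetical blocking job; the sleep stands in for real work."""
    global in_flight, peak
    with lock:
        in_flight += 1
        peak = max(peak, in_flight)
    time.sleep(0.01)
    with lock:
        in_flight -= 1
    return i + 1


with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    # map submits all 20 jobs; only MAX_WORKERS of them run at a time.
    results = list(pool.map(work, range(20)))

print(f"peak concurrent jobs: {peak}")
```

One caveat with this sketch: in CPython the executor's pending-work queue is unbounded, so the pool caps how many jobs run, not how many can be submitted. If submissions themselves must be throttled, combine the pool with a semaphore or bounded queue.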
Choosing the Right Concurrency Limit
The optimal limit depends on the bottleneck:
| Bottleneck | Typical Limit | How to Determine |
|---|---|---|
| Network I/O (HTTP) | 20-100 | Server rate limits, connection pool size |
| Database connections | Pool size (e.g., 10-50) | Match connection pool max |
| CPU-bound work | Number of cores | os.cpu_count() or num_cpus::get() |
| File descriptors | ulimit -n minus headroom | Check OS limits |
| External API | API rate limit | Read API docs |
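One way to turn the table into a starting value is to compute a candidate limit per bottleneck and take the smallest. The sketch below is an assumption-laden illustration: fd_budget and pick_limit are hypothetical helper names, the headroom of 64 descriptors is arbitrary, and the resource module is only available on Unix-like systems.

```python
import os
from typing import Optional


def fd_budget(headroom: int = 64) -> Optional[int]:
    """Soft file-descriptor limit minus headroom, or None if unknown."""
    try:
        import resource   # Unix-only module
    except ImportError:
        return None
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return max(1, soft - headroom)


def pick_limit(*candidates: Optional[int]) -> int:
    """Smallest known candidate, never below 1; unknown values are skipped."""
    known = [c for c in candidates if c is not None]
    return max(1, min(known)) if known else 1


# Example: an assumed DB pool of 20 connections, the CPU count, and the fd budget.
limit = pick_limit(20, os.cpu_count(), fd_budget())
print(f"starting concurrency limit: {limit}")
```

The result is only a starting point; the table's "How to Determine" column and benchmarking should drive the final value.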
Common Pitfalls
- Creating all futures before limiting: asyncio.gather(*[fetch(url) for url in urls]) creates every coroutine object immediately. Acquire the semaphore inside the coroutine, around the actual work, rather than around the code that creates the tasks. Even then, all tasks are spawned up front and the semaphore only gates their execution, so for very large inputs prefer a fixed set of workers or a bounded stream.
- Deadlocking with nested semaphore acquisition: If a task acquires a semaphore permit and then calls a function that also acquires from the same semaphore, it can deadlock once every permit is held by a task waiting on the nested acquire. Use separate semaphores for separate resource types.
- Not handling errors inside limited tasks: When one future in a buffer_unordered stream panics or returns an unhandled error, it can take down or stall the entire pipeline. Always handle errors inside each future and return Result types instead of unwrapping.
- Setting limits too low: An overly conservative limit (e.g., 1-2 concurrent requests) serializes work and wastes throughput. Benchmark with increasing limits to find the sweet spot where throughput plateaus.
- Ignoring downstream backpressure: Limiting request concurrency but buffering all responses in memory defeats the purpose. If results are large, process them incrementally instead of collecting into a Vec or list.
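Two of the pitfalls above, unhandled errors and buffering every result, can be sketched together in Python. This is an illustration under stated assumptions: guarded is a hypothetical wrapper, the failure on every fifth item is simulated, and the ("ok", value) / ("err", exception) tuples are just one possible stand-in for a Result type.

```python
import asyncio


async def guarded(sem, i):
    """Run one unit of work under the limit and never raise."""
    async with sem:
        try:
            await asyncio.sleep(0.001)   # simulated I/O wait
            if i % 5 == 0:
                raise ValueError(f"item {i} failed")   # simulated failure
            return ("ok", i)
        except Exception as exc:
            return ("err", exc)   # the error becomes a value, not a crash


async def main(n=20, limit=4):
    sem = asyncio.Semaphore(limit)
    ok = errors = 0
    tasks = [asyncio.create_task(guarded(sem, i)) for i in range(n)]
    # Handle each result as it completes instead of collecting them all first.
    for fut in asyncio.as_completed(tasks):
        status, _payload = await fut
        if status == "ok":
            ok += 1
        else:
            errors += 1
    return ok, errors


ok, errors = asyncio.run(main())
print(f"ok={ok} errors={errors}")
```

Because each failure is caught inside the task, one bad item cannot interrupt the others, and only running counters are kept rather than every payload.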
Summary
- Unbounded future spawning exhausts system resources; always limit concurrency
- Use asyncio.Semaphore (Python), tokio::sync::Semaphore (Rust), or manual promise pools (JavaScript)
- buffer_unordered in Rust's futures crate is the idiomatic stream-based approach
- Bounded channels (asyncio.Queue, tokio::sync::mpsc) provide natural backpressure
- Thread pool executors (ThreadPoolExecutor, Rayon) limit concurrency via pool size
- Match the concurrency limit to the actual bottleneck (network, DB, CPU, OS limits)

