Redis Streams Make Backpressure Visible, and That Is the Whole Point

January 17, 2026


Redis Streams are an append-only log that lives inside Redis. You write with XADD, you read with XREADGROUP, and a consumer group tracks which entries have been delivered to each consumer and not yet acknowledged. The data structure that does that tracking is the Pending Entries List, or PEL. The PEL is the thing engineers either learn to love or learn to fear, depending on how their first incident goes.
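
To make the read and acknowledge cycle concrete, here is a minimal sketch. It assumes a redis-py style client created with decode_responses=True and the default RESP2 reply shapes. The stream, group, and consumer names and the handle callable are hypothetical placeholders, not part of any real system.

```python
def ensure_group(client, stream, group):
    """Create the consumer group (and the stream) if it does not exist yet."""
    try:
        # id="0" starts the group at the beginning of the stream.
        client.xgroup_create(stream, group, id="0", mkstream=True)
    except Exception as exc:
        # Redis answers BUSYGROUP when the group already exists.
        if "BUSYGROUP" not in str(exc):
            raise


def consume_once(client, stream, group, consumer, handle, count=10):
    """Read one batch, run handle() on each entry, XACK the ones that succeed."""
    # ">" asks for entries never delivered to any consumer in this group.
    # Every entry returned here sits in the PEL until it is acknowledged.
    batches = client.xreadgroup(group, consumer, {stream: ">"}, count=count, block=1000)
    acked = 0
    for _stream_name, entries in batches or []:
        for entry_id, fields in entries:
            try:
                handle(fields)
            except Exception:
                # No XACK: the entry stays in the PEL for a later retry or claim.
                continue
            client.xack(stream, group, entry_id)
            acked += 1
    return acked
```

With a real server this would be called as consume_once(redis.Redis(decode_responses=True), "otp:outbound", "senders", "worker-1", send_sms), where send_sms is whatever function talks to the provider.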

Here is what makes Streams different from Kafka. Kafka hides backpressure behind a long retention window: a slow consumer just falls further behind in offsets, and the broker does not really care. Redis Streams expose backpressure as a number you can query. Every entry delivered to a consumer stays in the PEL until that consumer calls XACK. A slow consumer accumulates entries in the PEL. XPENDING tells you how many there are and which consumer owns them, and its extended form adds how long each one has been idle and how many times it has been delivered. Lag is not an inferred metric. It is a field.
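
Reading those numbers takes two calls. The sketch below assumes a redis-py style client with decode_responses=True, whose xpending and xpending_range methods return the dictionary shapes shown in the comments. The function name pel_snapshot and the sample size are my own choices for illustration.

```python
def pel_snapshot(client, stream, group, sample=100):
    """Summarise the PEL: total size, per-consumer counts, worst idle time seen."""
    # Summary form: {"pending": n, "min": id, "max": id, "consumers": [{"name", "pending"}]}
    summary = client.xpending(stream, group)
    per_consumer = {row["name"]: row["pending"] for row in summary["consumers"]}

    # Extended form: one dict per entry with "message_id", "consumer",
    # "time_since_delivered" (idle time in ms) and "times_delivered".
    detail = client.xpending_range(stream, group, min="-", max="+", count=sample)

    # Only the first `sample` entries are inspected, so this is the worst
    # idle time within the sample, not necessarily across the whole PEL.
    max_idle_ms = max((row["time_since_delivered"] for row in detail), default=0)

    return {
        "pending": summary["pending"],
        "per_consumer": per_consumer,
        "max_idle_ms": max_idle_ms,
    }
```

The returned dictionary is small enough to push straight into whatever metrics system you already run.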

That visibility is what makes recovery possible. XAUTOCLAIM (Redis 6.2 and later) walks the PEL and transfers entries that have been idle for longer than a threshold to the consumer that issues the call. A crashed consumer does not lose its in-flight work. It just stops responding, and another consumer in the same group takes over after the idle window. This is the closest thing in the Redis world to Kafka's rebalance protocol, and it is much simpler to reason about.
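
A takeover pass might look like the sketch below. It assumes a redis-py style client with decode_responses=True. The 30 second idle threshold and the batch size are illustrative values, not recommendations, and reclaim_stale is a hypothetical helper name.

```python
def reclaim_stale(client, stream, group, consumer, min_idle_ms=30_000, batch=50):
    """Claim entries idle for at least min_idle_ms on behalf of `consumer`."""
    claimed = []
    cursor = "0-0"
    while True:
        reply = client.xautoclaim(
            stream, group, consumer,
            min_idle_time=min_idle_ms, start_id=cursor, count=batch,
        )
        # Redis 6.2 replies [next_cursor, entries]; Redis 7 appends a third
        # element listing IDs that no longer exist in the stream.
        cursor, entries = reply[0], reply[1]
        # Older servers can report deleted entries as empty placeholders.
        claimed.extend(entry for entry in entries if entry[0] is not None)
        # A next cursor of 0-0 means the whole PEL has been scanned.
        if cursor == "0-0":
            break
    return claimed
```

Each claimed entry now belongs to the calling consumer and its idle timer is reset, so the caller is responsible for processing it and sending XACK.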

The production failure I keep seeing comes from forgetting that the PEL lives in the same Redis instance as everything else. A team I worked with used Redis Streams for an OTP delivery queue. Their SMS provider had a 10x latency spike one evening. The consumer kept pulling from the stream, but it could not ACK fast enough, so the PEL grew. Redis hit maxmemory. The eviction policy was allkeys-lru, inherited from when this instance was a cache. Eviction works on whole keys, not on individual entries, so once the stream key was picked, the stream went with it, consumer group and PEL included. In-flight OTPs vanished. Users who had requested a code never got one, and the system had no record they ever asked.
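
A startup check would have caught that inherited policy. This is a sketch under assumptions: a redis-py style client, and a deployment where CONFIG GET is permitted (some managed services disable it). The function name check_stream_safety and the wording of the messages are hypothetical.

```python
# Policies that may evict keys with no TTL, which includes a typical stream key.
EVICTS_ANY_KEY = {"allkeys-lru", "allkeys-lfu", "allkeys-random"}


def check_stream_safety(client):
    """Return a list of human-readable problems; an empty list means none found."""
    policy = client.config_get("maxmemory-policy")["maxmemory-policy"]
    maxmemory = int(client.config_get("maxmemory")["maxmemory"])

    problems = []
    if policy in EVICTS_ANY_KEY:
        problems.append(f"maxmemory-policy={policy} can evict stream keys under memory pressure")
    elif policy != "noeviction":
        # The volatile-* policies only consider keys that carry a TTL.
        problems.append(f"maxmemory-policy={policy} can evict stream keys that carry a TTL")
    if maxmemory == 0:
        # 0 means no limit: nothing is rejected or evicted until the host runs out.
        problems.append("maxmemory=0 leaves growth bounded only by the host")
    return problems
```

Failing a deploy when this list is non-empty is cheaper than finding out during an incident.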

The fix had three parts. Streams moved to their own Redis instance with maxmemory-policy noeviction, so backpressure surfaces as write rejection rather than silent data loss. A janitor consumer was added to dead-letter any PEL entry older than 60 seconds, so a stalled downstream cannot grow the PEL forever. Alerts were wired to XPENDING summary counts, not just CPU.
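
The janitor is the part people ask about most, so here is a minimal sketch of one pass. It assumes a redis-py style client with decode_responses=True and Redis 6.2 or later for the idle filter on XPENDING. I am reading "older than 60 seconds" as idle time since last delivery, which is an assumption. The dead-letter stream name and the extra source_id and times_delivered fields are hypothetical.

```python
def dead_letter_stale(client, stream, group, janitor, dlq_stream,
                      max_idle_ms=60_000, batch=100):
    """Move PEL entries idle for at least max_idle_ms to a dead-letter stream."""
    # Extended XPENDING with an idle filter: only entries past the threshold.
    stale = client.xpending_range(
        stream, group, min="-", max="+", count=batch, idle=max_idle_ms
    )
    moved = 0
    for row in stale:
        # XCLAIM re-checks idleness, so an entry that was acknowledged or
        # retried since the XPENDING call is skipped rather than stolen.
        claimed = client.xclaim(
            stream, group, janitor,
            min_idle_time=max_idle_ms, message_ids=[row["message_id"]],
        )
        for entry_id, fields in claimed:
            if entry_id is None:
                continue  # entry no longer exists in the stream
            # Copy the payload plus some provenance to the dead-letter stream.
            client.xadd(dlq_stream, {
                **fields,
                "source_id": entry_id,
                "times_delivered": row["times_delivered"],
            })
            # Acknowledge only after the copy is written, which shrinks the PEL.
            client.xack(stream, group, entry_id)
            moved += 1
    return moved
```

The copy and the acknowledgement are separate commands, so a crash between them can leave a duplicate in the dead-letter stream. For an OTP queue that trade seemed acceptable to me: a duplicate record is better than a missing one.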

Streams give you the primitives. They do not give you a policy. The policy is yours to write.

Key takeaway

The PEL is a backpressure dashboard, not a buffer. Treat unbounded PEL growth as a memory bug, not a queueing one, and isolate streams from the rest of your cache.

Originally posted on LinkedIn.


All Rights Reserved.