Finding the Top-K Heavy Hitters in an Unbounded Stream

March 9, 2026

The Top-K problem in batch is easy. Sort by frequency, take the first K. The same problem in a stream is hard for one reason: you cannot store every distinct key. A trending search service sees billions of unique queries a day. A click stream behind an ad network has a long tail that never fits in RAM. You need a structure that gives useful answers in fixed memory.

The standard pattern is a Count-Min Sketch in front of a min-heap of size K.

Here is the flow. An event arrives. You hash its key with d different hash functions and increment one counter in each of d rows of a width-w table. To query the frequency, you read the same d counters and take the minimum. That minimum is your estimated count. The sketch never undercounts, because every real increment touched every row. It only overcounts when collisions stack other items on top of your counters. Pick w and d to bound that error, typically w = ceil(e / epsilon) and d = ceil(ln(1/delta)).

Now the heap. Keep a min-heap of size K, keyed by estimated frequency. When the sketch returns an estimate, compare it to the root. If it beats the root, push the new key and pop the smallest. If the key is already in the heap, update its count and sift down. The heap root is always your weakest current candidate, so the comparison is constant time.

The combination works because you only need accuracy where it matters. The sketch is sloppy about cold keys (they collide with each other in noise) and tight on hot keys (they dominate their own counters). Heavy hitters self-promote.

Where this breaks in production is the warmup window. Right after a deploy or a reset, the sketch has no signal, and the heap fills with whatever arrived first. A small burst of low-value traffic can park itself at the top until real hitters arrive and out-count it. Two fixes are common. Use a sliding window of multiple sketches (drop the oldest every minute) so cold keys age out. Or accept a delay: do not publish Top-K until total event count crosses a threshold.

The mental model: the sketch answers "how often, roughly?" and the heap answers "is this in the top K?" Together they give you trending searches, hot endpoints, or noisy log patterns in megabytes instead of terabytes, with bounded error you can actually reason about.

Key takeaway

Count-Min Sketch gives you cheap approximate counts in fixed memory, and a min-heap of size K turns those counts into a live Top-K. The sketch only ever overestimates, which is exactly the bias you want for heavy hitters.

Originally posted on LinkedIn. View original.