Stateful and Stateless Consumers in Kafka


Introduction

In Kafka, the difference between a stateless consumer and a stateful consumer does not come from the client API: both can run the same poll loop against the same brokers. It comes from whether the application needs memory of previous records in order to process the current one.

What a stateless consumer does

A stateless consumer handles each record independently. The logic for one message does not depend on previously seen messages.

Typical examples include:

forwarding an event to another service,
validating and logging a record,
or converting a payload from one format to another.

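```java
// Assumes consumer is a KafkaConsumer<String, String> that is already
// configured and subscribed to a topic.
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        // Each record is handled on its own; nothing is carried over.
        System.out.println(record.value());
    }
}
```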

This loop consumes messages, but it does not keep application state derived from prior messages. If the process restarts, the business behavior does not depend on restoring any accumulated in-memory data.
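To make the independence property concrete, here is a minimal sketch with no Kafka dependency. The class name StatelessHandler and the "name=value" payload format are assumptions made for illustration and are not part of any Kafka API. Because the handler is a pure function of a single record, it returns the same result for that record regardless of what was processed earlier or whether the process has restarted.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stateless handler: converts a "name=value" payload into a
// small JSON-like string. It reads nothing but its argument.
// Illustration only: no JSON escaping is performed.
final class StatelessHandler {

    static String handle(String payload) {
        int sep = payload.indexOf('=');
        if (sep < 0) {
            // Validation uses only the current record.
            return "{\"error\":\"malformed\"}";
        }
        String name = payload.substring(0, sep).trim();
        String value = payload.substring(sep + 1).trim();
        return "{\"" + name + "\":\"" + value + "\"}";
    }

    // Processing a batch is just mapping the handler over each record.
    static List<String> handleAll(List<String> payloads) {
        return payloads.stream()
                .map(StatelessHandler::handle)
                .collect(Collectors.toList());
    }
}
```

In a real consumer, handle would be called with record.value() inside the poll loop. Since no field survives between calls, any instance in the consumer group can process any record.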

What a stateful consumer does

A stateful consumer needs information carried across records. Examples include counts, aggregates, session windows, fraud rules, inventory totals, or joins against previously seen events.

```java
// In-memory state: running count per word (lost if the process restarts).
Map<String, Integer> counts = new HashMap<>();

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        String word = record.value();
        // The new count depends on what earlier records put in the map.
        counts.merge(word, 1, Integer::sum);
    }
}
```

This consumer is stateful because processing the next message depends on the current contents of counts.

Why offsets are not the same as business state

Every Kafka consumer group tracks offsets, but offset tracking alone does not make a consumer stateful in the application-design sense. Offsets tell Kafka where consumption should resume. Business state is the application data you maintain to compute results.

For example:

stored offset = "resume after record 1500"
application state = "customer A has purchased 12 items this hour"

That distinction matters because stateful processing needs recovery plans for both offsets and state.
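The gap between the two kinds of recovery can be sketched without a Kafka client. In the model below, a plain list stands in for one partition and an integer stands in for the committed offset. The class PurchaseCounter and its restore methods are hypothetical names chosen for this example. It compares a restart that restores only the offset with one that restores the offset together with a snapshot of the state.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-process model: the log list stands in for one partition
// and nextOffset for the committed offset. No Kafka client is involved.
final class PurchaseCounter {
    final Map<String, Integer> itemsByCustomer = new HashMap<>(); // business state
    int nextOffset = 0;                                           // consumption position

    // Consume records [nextOffset, upTo); each record is a customer id.
    void consume(List<String> log, int upTo) {
        for (; nextOffset < upTo; nextOffset++) {
            itemsByCustomer.merge(log.get(nextOffset), 1, Integer::sum);
        }
    }

    // Restart that restores only the offset: business state starts empty.
    static PurchaseCounter restoreOffsetOnly(PurchaseCounter old) {
        PurchaseCounter restarted = new PurchaseCounter();
        restarted.nextOffset = old.nextOffset;
        return restarted;
    }

    // Restart that restores the offset together with a snapshot of the state.
    static PurchaseCounter restoreOffsetAndState(PurchaseCounter old) {
        PurchaseCounter restarted = restoreOffsetOnly(old);
        restarted.itemsByCustomer.putAll(old.itemsByCustomer);
        return restarted;
    }
}
```

Suppose the log is A, B, A, A and the process stops after three records. A consumer restored with the offset only resumes at the right place but ends with a count of 1 for customer A. A consumer restored with both ends with the correct count of 3.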

Where state usually lives

Simple examples keep state in memory, but production systems usually need something more durable. Common choices are:

an embedded state store managed by Kafka Streams,
a database,
Redis,
or compacted Kafka topics used as a changelog.

If the state disappears on restart, the consumer may no longer be able to compute correct results.
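The changelog option can be illustrated with a small replay routine. This is a simplified model and does not use the Kafka client: the list stands in for a compacted topic read from the beginning, and the class name ChangelogRestore is an assumption made for this example. Each entry carries the latest value for a key, and a null value is treated as a delete, which mirrors how tombstone records behave in a compacted topic.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of rebuilding state from a changelog. Entries are
// applied in log order, so the last update for a key wins.
final class ChangelogRestore {

    static Map<String, Long> restore(List<Map.Entry<String, Long>> changelog) {
        Map<String, Long> state = new HashMap<>();
        for (Map.Entry<String, Long> update : changelog) {
            if (update.getValue() == null) {
                // Tombstone: the key was deleted.
                state.remove(update.getKey());
            } else {
                // Overwrite any earlier value for the same key.
                state.put(update.getKey(), update.getValue());
            }
        }
        return state;
    }
}
```

Replaying the updates (apple, 1), (pear, 1), (apple, 2), (pear, null) yields a state that contains only apple with the value 2. Because only the latest value per key matters, compaction can discard older entries without changing the restored result.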

Kafka Streams makes stateful processing easier

With the low-level KafkaConsumer API, you manage state yourself. Kafka Streams adds built-in support for aggregations, windows, repartitioning, and local state stores.

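```java
// Assumes builder is a StreamsBuilder and default String serdes are configured.
KStream<String, String> input = builder.stream("words");
KTable<String, Long> counts = input
    .groupBy((key, value) -> value)  // re-key by word; triggers a repartition
    .count();                        // counts are kept in a local state store
```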

This is still Kafka-based consumption, but the framework handles much of the state-management complexity for you.

Common Pitfalls

The biggest mistake is calling a consumer "stateless" just because it does not write to a database. If it keeps meaningful in-memory aggregates that affect future processing, it is still stateful.

Another common issue is storing state only in memory and forgetting recovery. That may work in tests, but it breaks correctness after restarts or rebalances.

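Be careful with partitioning as well. Stateful logic often assumes that all records for the same entity arrive at the same partition. If producers key related records inconsistently, or do not key them at all, those records are spread across partitions and each consumer instance sees only part of the history, so its local state is incomplete.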
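The effect of key choice on state placement can be sketched as follows. This is a simplified model: each partition is represented by its own local count map, and the hash is a stand-in, since Kafka's default partitioner applies a murmur2 hash to the serialized key rather than String.hashCode. The class name KeyedRouting and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: one local count map per partition, as if each
// partition were owned by a separate consumer instance.
final class KeyedRouting {
    final List<Map<String, Integer>> countsByPartition = new ArrayList<>();

    KeyedRouting(int partitions) {
        for (int i = 0; i < partitions; i++) {
            countsByPartition.add(new HashMap<>());
        }
    }

    // Stand-in hash; not the algorithm Kafka's default partitioner uses.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), countsByPartition.size());
    }

    // Route one event about entity, using key as the record key.
    void send(String key, String entity) {
        countsByPartition.get(partitionFor(key)).merge(entity, 1, Integer::sum);
    }

    // Number of partitions (and so instances) holding state for entity.
    long partitionsHolding(String entity) {
        return countsByPartition.stream()
                .filter(counts -> counts.containsKey(entity))
                .count();
    }
}
```

When every event for customer-A is sent with the key customer-A, partitionsHolding returns 1 and the full count lives in one place. When the same events are sent with a different key each time, such as a per-event id, the count is split across several partitions and no single instance can answer how many events the customer produced.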

Finally, remember that scaling a stateful consumer is harder than scaling a stateless one. You have to think about partition ownership, local state placement, and how that state is restored after failure.

Summary

Stateless consumers process each record independently.
Stateful consumers need previously accumulated information to process new records.
Kafka offsets are not the same thing as business state.
Durable state management matters for correctness and recovery.
Kafka Streams is often the simplest way to build stateful Kafka applications.