Kafka: How to Read from the __consumer_offsets Topic
Introduction
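The fastest way to read from Kafka's __consumer_offsets topic is the built-in kafka-console-consumer, run with --formatter set to the OffsetsMessageFormatter that ships with Kafka's internal GroupMetadataManager (the fully qualified class name varies by Kafka version).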
This decodes the internal binary format into human-readable output showing consumer group, topic, partition, and committed offset. The rest of this article covers why you would read this topic, what the records contain, how to decode them programmatically, and how to use modern alternatives like kafka-consumer-groups for common operational tasks.
What Is __consumer_offsets?
__consumer_offsets is an internal Kafka topic that stores consumer group offset commits. When a consumer calls commitSync() or commitAsync(), Kafka writes a record to this topic recording where that consumer group has read up to for each partition.
Before Kafka 0.9, consumers stored offsets in ZooKeeper by default (Kafka-based offset storage first appeared as an opt-in in 0.8.2). The move to an internal topic improved performance and scalability because Kafka's own log-based storage handles high-throughput writes more efficiently than ZooKeeper.
Key Properties
| Property | Value |
| --- | --- |
| Topic name | __consumer_offsets |
| Default partitions | 50 |
| Cleanup policy | Compact (retains latest offset per key) |
| Replication factor | Matches offsets.topic.replication.factor (default 3) |
| Created automatically | Yes, the first time a consumer group needs a coordinator |
| Internal topic | Yes (flagged as internal; AdminClient topic listings omit it unless internal topics are requested) |
The topic has 50 partitions by default (offsets.topic.num.partitions). Each consumer group maps to one partition based on a hash of the group ID, and the broker that leads that partition acts as the group coordinator for the group.
Reading with kafka-console-consumer
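Basic Command
Point kafka-console-consumer at the __consumer_offsets topic with --bootstrap-server, --topic, --from-beginning, and --formatter set to the OffsetsMessageFormatter class.
Each output line shows [group, topic, partition] :: offset details.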
Reading Group Metadata (Not Just Offsets)
The __consumer_offsets topic stores two types of records: offset commits and group metadata (member assignments, protocol, leader info). To read group metadata, pass the GroupMetadataMessageFormatter class to --formatter instead of the offsets formatter.
This shows consumer group membership, rebalance information, and assignment details.
Filtering for a Specific Consumer Group
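The console consumer does not support filtering by key, but you can pipe its output through grep with the group name as the pattern. Because grep matches substrings, a group whose name contains another group's name will also match.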
For more precise filtering, read the topic programmatically (covered below).
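As a stricter alternative to grep, the formatter output can be filtered on an exact group match. The sketch below is only an illustration: it assumes each line starts with a key of the form [group,topic,partition]:: (the value part differs between Kafka versions and is left untouched), and parse_key and filter_group are hypothetical helper names, not part of Kafka.

```python
def parse_key(line):
    """Return (group, topic, partition) from a '[group,topic,partition]::...' line.

    Returns None for lines that do not match the assumed format.
    """
    if not line.startswith("["):
        return None
    end = line.find("]::")
    if end == -1:
        return None
    # Topic names cannot contain commas, so split from the right
    # in case the group ID itself contains one.
    parts = line[1:end].rsplit(",", 2)
    if len(parts) != 3:
        return None
    group, topic, partition = (p.strip() for p in parts)
    if not partition.isdigit():
        return None
    return group, topic, int(partition)


def filter_group(lines, group):
    """Yield only the lines whose key belongs exactly to `group`."""
    for line in lines:
        key = parse_key(line)
        if key is not None and key[0] == group:
            yield line
```

To use it in a pipeline, iterate over filter_group(sys.stdin, "my-group") and write each line to standard output. Unlike a substring match, a group named my-group-v2 is not returned when filtering for my-group.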
Reading Programmatically with Java
To read __consumer_offsets in a Java application, consume the topic with ByteArrayDeserializer for both key and value, then decode Kafka's internal binary serialization format yourself.
Decoding the Binary Format
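The key and value use Kafka's internal serialization schemas. Each begins with a 2-byte version number that selects the layout of the remaining fields, so decoding means either calling Kafka's internal classes or parsing those fields by hand.
In practice, using GroupMetadataManager.readMessageKey() and GroupMetadataManager.readOffsetMessageValue() from Kafka's server module is the reliable way to decode these records. However, this creates a dependency on Kafka's server-side artifacts, which is heavy, and these classes are internal rather than public API, so they can change between releases.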
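For readers who prefer to avoid the server dependency, manual parsing can be sketched with the standard struct module. This is an illustration under stated assumptions, not Kafka's own decoder: it assumes the classic layouts (key versions 0 and 1 for offset commits, key version 2 for group metadata, offset value versions 0 to 3) and rejects anything else. Newer Kafka versions may add record types, so check the schemas for your release. The function names are hypothetical.

```python
import struct


def _read_string(buf, pos):
    """Read a Kafka STRING: big-endian int16 length, then UTF-8 bytes."""
    (length,) = struct.unpack_from(">h", buf, pos)
    pos += 2
    if length < 0:  # a length of -1 encodes a null string
        return None, pos
    return buf[pos:pos + length].decode("utf-8"), pos + length


def decode_key(key):
    """Decode a record key into a dict (assumed layouts, versions 0-2 only)."""
    (version,) = struct.unpack_from(">h", key, 0)
    if version in (0, 1):  # offset commit key: group, topic, partition
        group, pos = _read_string(key, 2)
        topic, pos = _read_string(key, pos)
        (partition,) = struct.unpack_from(">i", key, pos)
        return {"type": "offset", "group": group,
                "topic": topic, "partition": partition}
    if version == 2:  # group metadata key: group only
        group, _ = _read_string(key, 2)
        return {"type": "group_metadata", "group": group}
    raise ValueError(f"unsupported key version {version}")


def decode_offset_value(value):
    """Decode an offset commit value (assumed layouts, versions 0-3 only)."""
    if value is None:  # tombstone: the offset was deleted or expired
        return None
    (version,) = struct.unpack_from(">h", value, 0)
    if not 0 <= version <= 3:
        raise ValueError(f"unsupported value version {version}")
    pos = 2
    (offset,) = struct.unpack_from(">q", value, pos)
    pos += 8
    leader_epoch = None
    if version >= 3:  # leader epoch was added in version 3
        (leader_epoch,) = struct.unpack_from(">i", value, pos)
        pos += 4
    metadata, pos = _read_string(value, pos)
    (commit_timestamp,) = struct.unpack_from(">q", value, pos)
    return {"version": version, "offset": offset,
            "leader_epoch": leader_epoch, "metadata": metadata,
            "commit_timestamp": commit_timestamp}
```

Only offset-type keys should be paired with decode_offset_value; group metadata values use a different, larger schema that this sketch does not cover.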
Python Alternative with kafka-python
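A minimal sketch of reading the raw records with kafka-python follows. It assumes the kafka-python package is installed and a broker is reachable at the address you pass in (localhost:9092 is a placeholder). The function names are hypothetical, and the records come back as undecoded bytes.

```python
TOPIC = "__consumer_offsets"


def offsets_reader_config(bootstrap_servers):
    """Consumer settings for a read-only pass over the offsets topic."""
    return {
        "bootstrap_servers": bootstrap_servers,
        "group_id": None,                 # no group, so nothing is committed back
        "enable_auto_commit": False,
        "auto_offset_reset": "earliest",  # start from the oldest retained record
        "exclude_internal_topics": False,
        "consumer_timeout_ms": 10000,     # stop after 10 s without new records
    }


def read_raw_records(bootstrap_servers="localhost:9092"):
    """Yield (partition, offset, key_bytes, value_bytes) for every record."""
    # Imported here so the config helper works without the package installed.
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(**offsets_reader_config(bootstrap_servers))
    try:
        partitions = consumer.partitions_for_topic(TOPIC) or set()
        if not partitions:
            return
        # Assign every partition explicitly instead of joining a group.
        consumer.assign([TopicPartition(TOPIC, p) for p in sorted(partitions)])
        for record in consumer:
            # No deserializers are configured, so key and value are raw bytes;
            # value is None for tombstones.
            yield record.partition, record.offset, record.key, record.value
    finally:
        consumer.close()
```

Leaving group_id as None keeps the reader from writing its own commits into the topic it is inspecting.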
The Easier Alternative: kafka-consumer-groups
For most operational tasks, you do not need to read __consumer_offsets directly. The kafka-consumer-groups CLI tool provides a structured view:
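List All Consumer Groups
Run kafka-consumer-groups with --bootstrap-server and --list to print the ID of every consumer group the cluster knows about.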
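Describe a Consumer Group
Run kafka-consumer-groups with --describe and --group followed by the group ID. In recent versions the output has one row per partition with the columns GROUP, TOPIC, PARTITION, CURRENT-OFFSET, LOG-END-OFFSET, LAG, CONSUMER-ID, HOST, and CLIENT-ID.
This shows current offsets, end offsets, and lag per partition without needing to decode binary records.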
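Reset Offsets
Run kafka-consumer-groups with --reset-offsets, a scope (--topic or --all-topics), and a target such as --to-earliest, --to-latest, or --to-offset. The change is applied only when --execute is given, and the group must have no active members.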
Comparison: Direct Read vs CLI Tool
| Task | Direct __consumer_offsets | kafka-consumer-groups CLI |
| --- | --- | --- |
| View current offsets | Requires binary decoding | --describe |
| View lag | Must compute manually | Shown automatically |
| Historical offset changes | Partial (read from beginning; compaction removes superseded commits) | No (current state only) |
| Reset offsets | Not directly | --reset-offsets |
| Custom monitoring | Yes (programmatic access) | Limited |
| Audit trail | Commits retained until compaction | Current snapshot only |
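The "must compute manually" entry in the lag row comes down to subtracting each committed offset from the partition's log end offset. A minimal sketch, assuming you have already collected both sets of offsets as dictionaries keyed by (topic, partition); compute_lag is a hypothetical helper name:

```python
def compute_lag(committed, end_offsets):
    """Return lag per (topic, partition).

    committed:   {(topic, partition): committed offset}
    end_offsets: {(topic, partition): log end offset}
    A partition with no committed offset gets None instead of a number.
    """
    lag = {}
    for tp, end in end_offsets.items():
        offset = committed.get(tp)
        # Clamp at zero: the two values are sampled at slightly different times.
        lag[tp] = None if offset is None else max(end - offset, 0)
    return lag
```

For example, a committed offset of 90 against an end offset of 120 gives a lag of 30.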
Partition Assignment
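Each consumer group's offsets are stored in a specific partition of __consumer_offsets, determined by taking the absolute value of the group ID's Java hashCode() modulo the partition count (offsets.topic.num.partitions).
With the default 50 partitions, you can predict which partition holds a group's data by applying the same formula to the group ID.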
This is useful for debugging: if a specific __consumer_offsets partition is experiencing high latency, you can identify which consumer groups are affected.
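The lookup can be reproduced outside the JVM. The sketch below reimplements Java's String.hashCode() in Python and applies the modulo. It assumes the broker maps Integer.MIN_VALUE to 0 before taking the modulo, which is my understanding of Kafka's abs helper. The function names are hypothetical.

```python
def java_string_hash(s):
    """Reproduce Java's String.hashCode() as a signed 32-bit integer."""
    h = 0
    data = s.encode("utf-16-be")  # Java hashes UTF-16 code units
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as signed
    return h - 0x100000000 if h >= 0x80000000 else h


def kafka_abs(n):
    """Absolute value that maps Integer.MIN_VALUE to 0 (assumed broker behavior)."""
    return 0 if n == -2**31 else abs(n)


def offsets_partition_for(group_id, num_partitions=50):
    """Partition of __consumer_offsets expected to hold this group's records."""
    return kafka_abs(java_string_hash(group_id)) % num_partitions
```

Pass the cluster's actual offsets.topic.num.partitions value as num_partitions if it differs from the default of 50.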
Security and Access Control
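Reading __consumer_offsets in a secured Kafka cluster requires an ACL granting the reader's principal Read on the topic __consumer_offsets. A reader that joins a consumer group also needs Read on that group.
In most production environments, direct access to internal topics is restricted to operators and monitoring tools. Application code should use the kafka-consumer-groups tool or the Kafka AdminClient API (for example listConsumerGroupOffsets) instead.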
Common Pitfalls
Using string deserializers. The __consumer_offsets topic stores keys and values in Kafka's internal binary format. Using StringDeserializer produces garbled output. Always use ByteArrayDeserializer and decode manually, or use the built-in formatters.
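Forgetting exclude.internal.topics=false. This consumer setting defaults to true, which drops internal topics from pattern (regex) subscriptions. Subscribing or assigning to __consumer_offsets by its exact name works regardless, but you must set the config to false if your reader subscribes by pattern.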
Reading directly in production for monitoring. Reading the entire __consumer_offsets topic from the beginning on a busy cluster generates significant I/O. For monitoring, use kafka-consumer-groups --describe or JMX metrics (kafka.consumer:type=consumer-fetch-manager-metrics,*) instead.
Committing offsets from the reader consumer group. If your offset-reading consumer commits its own offsets, it writes records back to __consumer_offsets, creating noise. Set enable.auto.commit=false and do not call commitSync().
Assuming compaction removes all old records immediately. Log compaction keeps the latest record per key, but compaction runs asynchronously. You may see multiple records for the same group/topic/partition when reading from the beginning. Use the latest record for each key.
Not accounting for tombstones. When a consumer group is deleted or its offsets expire, Kafka writes a tombstone (null value) to __consumer_offsets. Your consumer code must handle null values gracefully.
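The last two pitfalls can be handled together by replaying records in log order and keeping only the most recent value per key. This is a minimal sketch, assuming the records have already been read as (key, value) pairs; fold_latest is a hypothetical helper name.

```python
def fold_latest(records):
    """Collapse a stream of (key, value) pairs into the current state.

    Later records overwrite earlier ones for the same key, and a None
    value (tombstone) removes the key entirely.
    """
    state = {}
    for key, value in records:
        if value is None:
            state.pop(key, None)  # tombstone: offsets deleted or expired
        else:
            state[key] = value
    return state
```

Because every key lives in exactly one partition, preserving the order within each partition is enough. Records from different partitions can be interleaved freely.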
Summary
- Use kafka-console-consumer with the OffsetsMessageFormatter for quick inspection of __consumer_offsets.
- The topic stores two record types: offset commits (group, topic, partition, offset) and group metadata (membership, assignments).
- For most operational tasks, kafka-consumer-groups --describe is simpler and provides lag calculations automatically.
- Reading programmatically requires ByteArrayDeserializer, plus exclude.internal.topics=false when subscribing by pattern.
- Decoding the binary format requires Kafka's internal classes or manual struct parsing.
- Consumer group data is spread across the topic's partitions (50 by default) using a hash of the group ID.
- Direct reads are best suited for auditing, custom monitoring, and debugging. For day-to-day operations, prefer the CLI tools or AdminClient API.

