Kafka Broker vs Topic

Introduction

In Kafka, a broker and a topic are not rival names for the same thing. A broker is a Kafka server in the cluster, while a topic is a logical stream of records. One is infrastructure; the other is a data contract.

Broker and Topic in Plain Terms

A broker is a running Kafka node. It has machine-level concerns such as:

CPU
memory
disk
network
partition leadership

A topic is a named stream of messages. It has stream-level concerns such as:

retention
partition count
schemas
permissions
producer and consumer ownership

A useful mental model is:

broker answers “where does the data live”
topic answers “what stream of data is this”

Keeping those two ideas separate makes Kafka much easier to reason about.

The Bridge Between Them: Partitions

Topics do not live on brokers as one single monolithic unit. Topics are divided into partitions, and those partitions are hosted on brokers.

For example:

topic orders has 6 partitions
cluster has 3 brokers
each partition has one leader and one or more follower replicas

So the actual relationship is:

topics are logical
partitions are the sharding unit
brokers host partition leaders and replicas
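
The relationship above can be sketched with a toy placement function. This is a simplified round-robin model, not Kafka's actual assignment logic (which, among other things, can take racks into account). The class and method names are hypothetical, and a replication factor of 2 is assumed for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: spreads the replicas of each partition over brokers round-robin.
// This is an illustration only, not Kafka's real assignment algorithm.
class ReplicaPlacementSketch {

    // Returns, for each partition, the broker ids that hold its replicas.
    // The first id in each list is treated as the preferred leader.
    static List<List<Integer>> assign(int partitions, int brokers, int replicationFactor) {
        if (replicationFactor > brokers) {
            throw new IllegalArgumentException("replication factor cannot exceed broker count");
        }
        List<List<Integer>> assignment = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            List<Integer> replicas = new ArrayList<>();
            for (int r = 0; r < replicationFactor; r++) {
                // Shift the starting broker by one for each partition.
                replicas.add((p + r) % brokers);
            }
            assignment.add(replicas);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Topic "orders": 6 partitions on 3 brokers, 2 replicas each (assumed).
        List<List<Integer>> assignment = assign(6, 3, 2);
        for (int p = 0; p < assignment.size(); p++) {
            System.out.println("orders-" + p
                    + " leader=" + assignment.get(p).get(0)
                    + " replicas=" + assignment.get(p));
        }
    }
}
```

In this model each of the 3 brokers leads 2 of the 6 partitions, which is the kind of even spread operators aim for when they talk about partition placement.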

This is why Kafka architecture discussions often sound like they are switching levels. Application developers talk about topics, while operators often talk about brokers and partition placement.

A Concrete Example

Create and inspect a topic:

bash

kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --create \
  --topic orders \
  --partitions 6 \
  --replication-factor 3

kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --describe \
  --topic orders

For each partition, the description shows the leader broker, the full replica list, and the in-sync replicas (ISR). That output makes the distinction clear:

the topic is the named stream orders
the brokers are the servers responsible for specific partitions of that stream

Producer and Consumer Perspective

Application code usually works with topics, not brokers directly.

Producer example:

java

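// Assumes producer is an already-configured KafkaProducer<String, String>.
// Arguments: topic "orders", key "customer-42", value "order-created".
ProducerRecord<String, String> record =
    new ProducerRecord<>("orders", "customer-42", "order-created");
producer.send(record);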

Consumer example:

java

// Assumes consumer is an already-configured KafkaConsumer<String, String>.
consumer.subscribe(List.of("orders"));
while (true) {
    ConsumerRecords<String, String> records =
        consumer.poll(Duration.ofMillis(500));

    // Each record reports which partition of the topic it came from.
    for (ConsumerRecord<String, String> record : records) {
        System.out.println(record.partition() + " " + record.value());
    }
}

The producer says “send to topic orders.” The producer client then chooses a partition (by hashing the record key when one is set), and that partition’s leader broker handles the write.

So client code reasons in terms of topics, while the cluster executes through brokers and partitions.
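
The step from key to partition can be illustrated with a minimal sketch. Kafka's default partitioner hashes the serialized key with murmur2. The version below substitutes `Arrays.hashCode` purely to stay short and dependency-free, so the partition numbers it yields will not match a real cluster. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch of key-based partition selection.
// Assumption: Arrays.hashCode stands in for the murmur2 hash Kafka really uses.
class KeyPartitionSketch {

    // Maps a record key to a partition index in the range [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // floorMod keeps the result non-negative even for negative hashes.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        int first = partitionFor("customer-42", 6);
        int second = partitionFor("customer-42", 6);
        // The same key always maps to the same partition, which is what
        // preserves per-key ordering.
        System.out.println("customer-42 -> partition " + first
                + " (stable: " + (first == second) + ")");
    }
}
```

The property that matters is stability: as long as the partition count does not change, every record for `customer-42` lands on the same partition and is therefore served by the same leader broker.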

Capacity Planning Needs Both Concepts

When throughput or latency is poor, the bottleneck can live at either layer.

Broker-level problems include:

disk saturation
network saturation
uneven partition leadership
JVM pressure

Topic-level problems include:

too few partitions
skewed partition keys
excessive retention
hot partitions from uneven producer behavior

Creating more topics does not automatically improve throughput. Often the real lever is partition count and better distribution across brokers.
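
One commonly cited rule of thumb for sizing partition count is to take the target throughput and divide it by the measured per-partition throughput on the producer side and on the consumer side, then use the larger result. The sketch below encodes that heuristic. The throughput figures are hypothetical placeholders, and real values have to be measured on the actual cluster.

```java
// Rule-of-thumb partition sizing: a starting estimate, not a guarantee.
// All throughput numbers passed in are assumed to come from measurement.
class PartitionCountEstimate {

    // Returns the smallest partition count that covers the target throughput
    // on both the producer side and the consumer side.
    static int estimate(double targetMBps,
                        double producerMBpsPerPartition,
                        double consumerMBpsPerPartition) {
        if (targetMBps <= 0 || producerMBpsPerPartition <= 0 || consumerMBpsPerPartition <= 0) {
            throw new IllegalArgumentException("throughput values must be positive");
        }
        double neededForProducers = targetMBps / producerMBpsPerPartition;
        double neededForConsumers = targetMBps / consumerMBpsPerPartition;
        return (int) Math.ceil(Math.max(neededForProducers, neededForConsumers));
    }

    public static void main(String[] args) {
        // Hypothetical figures: 100 MB/s target, 10 MB/s per partition when
        // producing, 20 MB/s per partition when consuming.
        System.out.println(estimate(100, 10, 20)); // prints 10
    }
}
```

The estimate says nothing about broker capacity. The resulting partitions still have to be spread across brokers with enough disk and network headroom.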

Reliability Also Spans Both Layers

Kafka durability happens at the partition-replication level.

That means:

the topic defines partition count and replication expectations
brokers store leaders and followers for those partitions
broker failure causes leader election for affected partitions

So topic configuration expresses the desired policy, while brokers provide the physical storage and failover behavior that make the policy real.
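
The failover step can be sketched as a small selection function: when a leader's broker goes down, a new leader is picked from the replicas that are both in sync and on a live broker. This is a simplified model of the idea, with hypothetical names, and it leaves out details such as controller coordination and unclean leader election.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Simplified model of leader election for a single partition.
class LeaderFailoverSketch {

    // Picks the first replica, in assignment order, that is in the in-sync
    // set and hosted on a live broker. Empty means the partition is offline.
    static Optional<Integer> electLeader(List<Integer> replicas,
                                         Set<Integer> inSyncReplicas,
                                         Set<Integer> liveBrokers) {
        return replicas.stream()
                .filter(inSyncReplicas::contains)
                .filter(liveBrokers::contains)
                .findFirst();
    }

    public static void main(String[] args) {
        // Partition with replicas on brokers 0, 1, 2, all in sync; broker 0 fails.
        Optional<Integer> leader =
                electLeader(List.of(0, 1, 2), Set.of(0, 1, 2), Set.of(1, 2));
        System.out.println("new leader: " + leader); // prints new leader: Optional[1]
    }
}
```

If no in-sync replica is on a live broker, the function returns an empty result, which corresponds to a partition that cannot accept writes until a broker comes back.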

Common Pitfalls

The biggest mistake is treating a topic like a physical server or treating a broker like a logical data stream. They belong to different layers of the system.

Another issue is thinking topic count alone controls consumer parallelism. Within a consumer group, each partition is read by at most one consumer at a time, so parallelism is capped by partition count, not by how many topic names exist.
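
A minimal sketch of that limit, assuming a single consumer group subscribed to a single topic (the class and method names are hypothetical):

```java
// Models how many consumers in one group can actually receive partitions.
// Assumption: one group, one topic, each partition owned by one consumer.
class ConsumerParallelismSketch {

    // Consumers that end up with at least one partition.
    static int activeConsumers(int consumersInGroup, int partitions) {
        return Math.min(consumersInGroup, partitions);
    }

    // Consumers left without any partition to read.
    static int idleConsumers(int consumersInGroup, int partitions) {
        return Math.max(0, consumersInGroup - partitions);
    }

    public static void main(String[] args) {
        // 8 consumers on a 6-partition topic: 6 work, 2 sit idle.
        System.out.println("active=" + activeConsumers(8, 6)
                + " idle=" + idleConsumers(8, 6)); // prints active=6 idle=2
    }
}
```

Adding a second topic name does not change this arithmetic for the first topic. Only more partitions raise the ceiling.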

People also often ignore bad partition-key design, which can overload one broker even when the cluster looks healthy overall.
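
A small simulation shows how a skewed key produces a hot partition. As a labeled assumption, the hash here is `Arrays.hashCode` over the key bytes rather than the murmur2 hash Kafka uses, and the key names and record counts are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Counts how many records each partition receives for a given list of keys.
// Assumption: a stand-in hash replaces Kafka's murmur2-based partitioner.
class HotPartitionSketch {

    static int[] recordsPerPartition(List<String> keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
            counts[Math.floorMod(hash, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        // One dominant key accounts for 90 of 100 records.
        for (int i = 0; i < 90; i++) {
            keys.add("big-customer");
        }
        for (int i = 0; i < 10; i++) {
            keys.add("customer-" + i);
        }
        // One partition receives at least 90 records; the rest share the remainder.
        System.out.println(Arrays.toString(recordsPerPartition(keys, 6)));
    }
}
```

Because every record with the dominant key hashes to the same partition, the leader broker for that partition carries most of the load regardless of how many other brokers are idle.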

Finally, troubleshooting becomes confusing when stream design problems and infrastructure problems are mixed together. Keep topic questions and broker questions separate whenever possible.

Summary

A broker is a Kafka server, while a topic is a logical stream of records.
Topics are implemented through partitions, and partitions are hosted on brokers.
Producers and consumers mostly work with topics, not brokers.
Capacity and reliability depend on both topic design and broker health.
Keeping the concepts separate leads to clearer Kafka architecture and troubleshooting.