Kafka Broker vs Topic
Introduction
In Kafka, a broker and a topic are not rival names for the same thing. A broker is a Kafka server in the cluster, while a topic is a logical stream of records. One is infrastructure, the other is a data contract.
Broker and Topic in Plain Terms
A broker is a running Kafka node. It has machine-level concerns such as:
- CPU
- memory
- disk
- network
- partition leadership
A topic is a named stream of messages. It has stream-level concerns such as:
- retention
- partition count
- schemas
- permissions
- producer and consumer ownership
A useful mental model is:
- broker answers “where does the data live”
- topic answers “what stream of data is this”
Keeping those two ideas separate makes Kafka much easier to reason about.
The Bridge Between Them: Partitions
Topics do not live on brokers as one single monolithic unit. Topics are divided into partitions, and those partitions are hosted on brokers.
For example:
- topic orders has 6 partitions
- cluster has 3 brokers
- each partition has one leader replica and, depending on the replication factor, zero or more follower replicas
So the actual relationship is:
- topics are logical
- partitions are the sharding unit
- brokers host partition leaders and replicas
This is why Kafka architecture discussions often sound like they are switching levels. Application developers talk about topics, while operators often talk about brokers and partition placement.
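The topic-to-partition-to-broker relationship can be sketched in plain Python. This is a toy model, not Kafka's actual placement logic: the function name `assign_replicas`, the broker ids 1 to 3, and the replication factor of 2 are assumptions made for illustration, and the simple round-robin rule stands in for whatever the cluster really does.

```python
from collections import Counter


def assign_replicas(num_partitions, broker_ids, replication_factor):
    """Toy round-robin placement: partition p starts at broker index p % n."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed broker count")
    n = len(broker_ids)
    layout = {}
    for p in range(num_partitions):
        # The first replica in the list is treated as the leader in this model.
        replicas = [broker_ids[(p + i) % n] for i in range(replication_factor)]
        layout[p] = {"leader": replicas[0], "replicas": replicas}
    return layout


# Topic "orders": 6 partitions on a 3-broker cluster (replication factor assumed).
orders_layout = assign_replicas(num_partitions=6, broker_ids=[1, 2, 3], replication_factor=2)
leaders_per_broker = Counter(entry["leader"] for entry in orders_layout.values())

for partition, entry in orders_layout.items():
    print(f"orders-{partition}: leader={entry['leader']} replicas={entry['replicas']}")
```

In this model the topic never appears as a physical object. Only its partitions are placed, and each broker ends up leading two of the six.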
A Concrete Example
Create a topic and then describe it, for example with the kafka-topics command-line tool that ships with Kafka. The description lists each partition with its leader and replicas on specific brokers. That output makes the distinction clear:
- the topic is the named stream orders
- the brokers are the servers responsible for specific partitions of that stream
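To make the two viewpoints concrete, the sketch below takes a description organized by partition (the topic view) and regroups it by broker (the operator view). The rows are hypothetical sample data in a made-up dictionary shape, not real tool output, and `broker_view` is an illustrative helper name.

```python
from collections import defaultdict

# Hypothetical description rows for topic "orders"; broker ids are illustrative.
description = [
    {"partition": 0, "leader": 1, "replicas": [1, 2]},
    {"partition": 1, "leader": 2, "replicas": [2, 3]},
    {"partition": 2, "leader": 3, "replicas": [3, 1]},
]


def broker_view(rows):
    """Group partition rows by broker: which partitions it leads, which it follows."""
    view = defaultdict(lambda: {"leads": [], "follows": []})
    for row in rows:
        for broker in row["replicas"]:
            role = "leads" if broker == row["leader"] else "follows"
            view[broker][role].append(row["partition"])
    return dict(view)


view = broker_view(description)
for broker in sorted(view):
    print(f"broker {broker}: leads {view[broker]['leads']}, follows {view[broker]['follows']}")
```

The same data answers both questions: read by partition it describes the stream, and read by broker it describes what each server is responsible for.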
Producer and Consumer Perspective
Application code usually works with topics, not brokers directly.
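A producer names a topic when it sends a record, and a consumer names a topic when it subscribes. When the producer says “send to topic orders,” the client library chooses a partition, typically from the record key, and that partition’s leader broker handles the write.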
So client code reasons in terms of topics, while the cluster executes through brokers and partitions.
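The routing path from topic to partition to leader broker can be sketched without a running cluster. This is a minimal model under stated assumptions: the leader map is hypothetical, and CRC32 is a stand-in hash chosen because it is in the standard library (Kafka's Java client hashes keys differently, so real partition numbers will not match).

```python
import zlib

# Hypothetical leader map for topic "orders": partition -> leader broker id.
ORDERS_LEADERS = {0: 1, 1: 2, 2: 3, 3: 1, 4: 2, 5: 3}


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Stand-in partitioner: CRC32 of the key modulo the partition count."""
    return zlib.crc32(key) % num_partitions


def route(topic_leaders, key: bytes):
    """Return (partition, leader broker) that would receive a record with this key."""
    partition = choose_partition(key, len(topic_leaders))
    return partition, topic_leaders[partition]


partition, broker = route(ORDERS_LEADERS, b"customer-42")
print(f"key customer-42 -> orders-{partition} on broker {broker}")
```

The caller only supplies a topic-level input (the key). The partition and broker fall out of the lookup, and the same key always lands on the same partition as long as the partition count is unchanged.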
Capacity Planning Needs Both Concepts
When throughput or latency is poor, the bottleneck can live at either layer.
Broker-level problems include:
- disk saturation
- network saturation
- uneven partition leadership
- JVM pressure
Topic-level problems include:
- too few partitions
- skewed partition keys
- excessive retention
- hot partitions from uneven producer behavior
Creating more topics does not automatically improve throughput. Often the real lever is partition count and better distribution across brokers.
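A skewed key distribution is easy to see in a small simulation. The workload below is invented for illustration (one hypothetical tenant producing 90% of the records), and CRC32 again stands in for the real partitioner, so only the shape of the result carries over to Kafka.

```python
import zlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash (CRC32), not Kafka's actual partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def load_per_partition(keys, num_partitions):
    """Count how many records each partition would receive."""
    counts = Counter({p: 0 for p in range(num_partitions)})
    counts.update(partition_for(k, num_partitions) for k in keys)
    return counts


# Hypothetical workload: one tenant produces 900 of 1000 records.
skewed_keys = ["tenant-big"] * 900 + [f"tenant-{i}" for i in range(100)]
skewed_load = load_per_partition(skewed_keys, num_partitions=6)

hot_partition, hot_count = skewed_load.most_common(1)[0]
print(f"hottest partition: orders-{hot_partition} with {hot_count} of {len(skewed_keys)} records")
```

Raising the partition count in this model does not relieve the hot spot, because every record with the dominant key still hashes to a single partition. Adding partitions helps only when the keys themselves spread the load.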
Reliability Also Spans Both Layers
Kafka provides durability by replicating each partition across brokers.
That means:
- the topic defines the partition count and replication settings such as the replication factor
- brokers store the leader and follower replicas for those partitions
- a broker failure triggers leader election for the partitions that broker was leading
So topic configuration expresses the desired policy, while brokers provide the physical storage and failover behavior that make the policy real.
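The failover behavior can be illustrated with a small model. The layout is hypothetical, `fail_broker` is an illustrative name, and the election rule (first surviving replica becomes leader) is a simplification that assumes every replica was in sync when the broker failed.

```python
def fail_broker(layout, failed_broker):
    """Return a new layout after one broker fails.

    Toy rule: if the leader is lost, the first surviving replica takes over.
    Assumes all replicas were in sync; a partition with no survivors has no leader.
    """
    new_layout = {}
    for partition, entry in layout.items():
        in_sync = [b for b in entry["replicas"] if b != failed_broker]
        if entry["leader"] != failed_broker:
            leader = entry["leader"]
        else:
            leader = in_sync[0] if in_sync else None
        new_layout[partition] = {
            "leader": leader,
            "replicas": entry["replicas"],  # the assignment itself is unchanged
            "isr": in_sync,                 # only the in-sync set shrinks
        }
    return new_layout


# Hypothetical layout for "orders" with replication factor 2 on brokers 1-3.
layout = {
    0: {"leader": 1, "replicas": [1, 2]},
    1: {"leader": 2, "replicas": [2, 3]},
    2: {"leader": 3, "replicas": [3, 1]},
}
after = fail_broker(layout, failed_broker=1)
for partition, entry in after.items():
    print(f"orders-{partition}: leader={entry['leader']} isr={entry['isr']}")
```

With a replication factor of 2, every partition keeps a leader after one broker is lost. With a replication factor of 1, the same function returns no leader for any partition that lived on the failed broker, which is the policy-versus-storage split in miniature.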
Common Pitfalls
The biggest mistake is treating a topic like a physical server or treating a broker like a logical data stream. They belong to different layers of the system.
Another issue is thinking topic count alone controls consumer parallelism. Within a consumer group, each partition is read by at most one consumer, so parallelism is bounded by partition count, not by how many topic names exist.
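The partition bound on parallelism can be shown with a toy assignment. The sketch assumes a single classic consumer group reading one topic, uses made-up consumer names, and replaces Kafka's real assignment strategies with plain round-robin.

```python
def assign_partitions(num_partitions, consumers):
    """Toy round-robin assignment of one topic's partitions within a consumer group."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        # Each partition goes to exactly one consumer in the group.
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


consumers = [f"consumer-{i}" for i in range(8)]  # hypothetical group of 8
assignment = assign_partitions(6, consumers)

idle = [c for c, parts in assignment.items() if not parts]
active = len(consumers) - len(idle)
print(f"{active} active consumers, idle: {idle}")
```

With 6 partitions and 8 consumers, two consumers receive nothing. Adding more consumers, or more topic names, does not change that ceiling. Only more partitions do.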
Partition-key design is also easy to overlook: a skewed key can overload one partition, and the broker leading it, even when the cluster looks healthy overall.
Finally, troubleshooting becomes confusing when stream design problems and infrastructure problems are mixed together. Keep topic questions and broker questions separate whenever possible.
Summary
- A broker is a Kafka server, while a topic is a logical stream of records.
- Topics are implemented through partitions, and partitions are hosted on brokers.
- Producers and consumers mostly work with topics, not brokers.
- Capacity and reliability depend on both topic design and broker health.
- Keeping the concepts separate leads to clearer Kafka architecture and troubleshooting.

