Kafka Broker vs Topic

Introduction

In Kafka, a broker and a topic are not rival names for the same thing. A broker is a Kafka server in the cluster, while a topic is a logical stream of records. One is infrastructure, the other is a data contract.

Broker and Topic in Plain Terms

A broker is a running Kafka node. It has machine-level concerns such as:

  • CPU
  • memory
  • disk
  • network
  • partition leadership

A topic is a named stream of messages. It has stream-level concerns such as:

  • retention
  • partition count
  • schemas
  • permissions
  • producer and consumer ownership

A useful mental model is:

  • broker answers “where does the data live”
  • topic answers “what stream of data is this”

Keeping those two ideas separate makes Kafka much easier to reason about.

The Bridge Between Them: Partitions

Topics do not live on brokers as one single monolithic unit. Topics are divided into partitions, and those partitions are hosted on brokers.

For example:

  • topic orders has 6 partitions
  • cluster has 3 brokers
  • each partition has one leader replica plus zero or more follower replicas, depending on the replication factor

So the actual relationship is:

  • topics are logical
  • partitions are the sharding unit
  • brokers host partition leaders and replicas
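To make the placement concrete, here is a minimal sketch in plain Java (no Kafka client involved) that spreads the replicas of 6 partitions across 3 brokers in round-robin order. The class `PlacementSketch` and its `assign` method are hypothetical names, and the round-robin rule is a simplification: Kafka's real assignment logic is more involved and can, for example, take racks into account.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of any Kafka API.
final class PlacementSketch {

    // Returns, for each partition, the broker ids that hold its replicas.
    // The first broker in each list is treated as the preferred leader.
    static List<List<Integer>> assign(int partitions, int brokers, int replicationFactor) {
        if (replicationFactor > brokers) {
            throw new IllegalArgumentException("replication factor cannot exceed broker count");
        }
        List<List<Integer>> placement = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            List<Integer> replicas = new ArrayList<>();
            for (int r = 0; r < replicationFactor; r++) {
                // Simplified round-robin: shift the starting broker per partition.
                replicas.add((p + r) % brokers);
            }
            placement.add(replicas);
        }
        return placement;
    }

    public static void main(String[] args) {
        // 6 partitions, 3 brokers, replication factor 3 (illustrative values).
        List<List<Integer>> placement = assign(6, 3, 3);
        for (int p = 0; p < placement.size(); p++) {
            System.out.println("orders-" + p + " replicas on brokers " + placement.get(p));
        }
    }
}
```

With these numbers every broker stores a copy of every partition, but leadership is spread out so that each broker leads two of the six partitions.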

This is why Kafka architecture discussions often sound like they are switching levels. Application developers talk about topics, while operators often talk about brokers and partition placement.

A Concrete Example

Create and inspect a topic:

bash
kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --create \
  --topic orders \
  --partitions 6 \
  --replication-factor 3

kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --describe \
  --topic orders

For each partition, the describe output lists the leader broker, the full replica set, and the in-sync replicas (ISR). That output makes the distinction clear:

  • the topic is the named stream orders
  • the brokers are the servers responsible for specific partitions of that stream

Producer and Consumer Perspective

Application code usually works with topics, not brokers directly.

Producer example:

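java
// Assumes `producer` is an already configured KafkaProducer<String, String>.
// Arguments: topic name, record key, record value.
ProducerRecord<String, String> record =
    new ProducerRecord<>("orders", "customer-42", "order-created");
producer.send(record);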

Consumer example:

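java
// Assumes `consumer` is an already configured KafkaConsumer<String, String>.
consumer.subscribe(List.of("orders"));
while (true) {
    // Wait up to 500 ms for records from the partitions assigned to this consumer.
    ConsumerRecords<String, String> records =
        consumer.poll(Duration.ofMillis(500));

    for (ConsumerRecord<String, String> record : records) {
        System.out.println(record.partition() + " " + record.value());
    }
}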

The producer says “send to topic orders.” The producer client's partitioner then chooses a partition (by default from a hash of the record key, when a key is set), and that partition’s leader broker handles the write.

So client code reasons in terms of topics, while the cluster executes through brokers and partitions.
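The step from topic to partition can be sketched in a few lines of plain Java. This is only an illustration of the idea "hash the key, then take it modulo the partition count". The class `KeyPartitionSketch` and the method `partitionFor` are hypothetical, and `Arrays.hashCode` is a stand-in: Kafka's Java client uses a murmur2 hash for keyed records by default, so the partition numbers here will not match a real cluster.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a producer partitioner; illustration only.
final class KeyPartitionSketch {

    // Maps a record key to a partition index in [0, numPartitions).
    // Assumption: Arrays.hashCode replaces the hash the real client uses.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 6; // same partition count as the orders example
        for (String key : new String[] {"customer-42", "customer-7", "customer-42"}) {
            System.out.println(key + " -> partition " + partitionFor(key, partitions));
        }
    }
}
```

The property that matters is determinism: the same key always lands on the same partition, which is what gives Kafka per-key ordering as long as the partition count stays the same.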

Capacity Planning Needs Both Concepts

When throughput or latency is poor, the bottleneck can live at either layer.

Broker-level problems include:

  • disk saturation
  • network saturation
  • uneven partition leadership
  • JVM pressure

Topic-level problems include:

  • too few partitions
  • skewed partition keys
  • excessive retention
  • hot partitions from uneven producer behavior

Creating more topics does not automatically improve throughput. Often the real lever is partition count and better distribution across brokers.
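A skewed key distribution is easy to reproduce without a cluster. The sketch below is plain Java with hypothetical names (`HotPartitionSketch`, `skewedKeys`, `recordsPerPartition`) and uses `String.hashCode` as a stand-in for the client's real key hash. It assumes a workload where 9 out of every 10 records share a single key.

```java
import java.util.Arrays;

// Hypothetical simulation of key skew; not part of any Kafka API.
final class HotPartitionSketch {

    // Stand-in for the producer's key hashing (assumption, not Kafka's hash).
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Builds a workload where 9 of every 10 records use the key "big-customer".
    static String[] skewedKeys(int total) {
        String[] keys = new String[total];
        for (int i = 0; i < total; i++) {
            keys[i] = (i % 10 == 0) ? "customer-" + i : "big-customer";
        }
        return keys;
    }

    // Counts how many records land on each partition.
    static int[] recordsPerPartition(String[] keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = recordsPerPartition(skewedKeys(1000), 6);
        System.out.println("records per partition: " + Arrays.toString(counts));
    }
}
```

However many partitions the topic has, at least 900 of the 1000 records end up on one partition here, so adding partitions or topics would not relieve the broker leading it. Only a better key would.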

Reliability Also Spans Both Layers

Kafka durability happens at the partition-replication level.

That means:

  • the topic defines partition count and replication expectations
  • brokers store leaders and followers for those partitions
  • broker failure causes leader election for affected partitions

So topic configuration expresses the desired policy, while brokers provide the physical storage and failover behavior that make the policy real.
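The failover step can be pictured with a small, simplified model in plain Java. It assumes that the new leader is the first replica in the partition's replica list that is both in sync and on a live broker. The names `FailoverSketch` and `electLeader` are hypothetical, and real Kafka leader election has more cases than this sketch covers.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical, simplified model of leader failover; illustration only.
final class FailoverSketch {

    // Picks the first replica, in assignment order, that is in sync and alive.
    // Returns an empty Optional when no eligible replica remains.
    static Optional<Integer> electLeader(List<Integer> replicas,
                                         Set<Integer> inSync,
                                         Set<Integer> liveBrokers) {
        return replicas.stream()
            .filter(inSync::contains)
            .filter(liveBrokers::contains)
            .findFirst();
    }

    public static void main(String[] args) {
        List<Integer> replicas = List.of(0, 1, 2); // broker 0 was the leader
        Set<Integer> inSync = Set.of(0, 1, 2);
        Set<Integer> liveBrokers = Set.of(1, 2);   // broker 0 has failed

        System.out.println("new leader: " + electLeader(replicas, inSync, liveBrokers));
    }
}
```

In this model a replication factor of 1 leaves no candidate after a failure, which is why the replication setting on the topic only becomes real durability when enough healthy brokers hold in-sync copies.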

Common Pitfalls

The biggest mistake is treating a topic like a physical server or treating a broker like a logical data stream. They belong to different layers of the system.

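Another issue is thinking topic count alone controls consumer parallelism. Within a consumer group, each partition is assigned to at most one consumer at a time, so the partition count, not the number of topic names, caps how many consumers can do useful work.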
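A quick way to see the limit is to hand out partitions to the consumers of one group and count who ends up with work. The sketch below is plain Java with hypothetical names (`GroupParallelismSketch`, `assign`, `activeConsumers`) and a simple round-robin rule. Kafka's actual assignors differ in detail, but the ceiling is the same.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of partition assignment within one consumer group.
final class GroupParallelismSketch {

    // Hands out partitions round-robin; entry c lists the partitions of consumer c.
    static List<List<Integer>> assign(int partitions, int consumers) {
        List<List<Integer>> owned = new ArrayList<>();
        for (int c = 0; c < consumers; c++) {
            owned.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            owned.get(p % consumers).add(p);
        }
        return owned;
    }

    // Counts the consumers that received at least one partition.
    static long activeConsumers(int partitions, int consumers) {
        return assign(partitions, consumers).stream()
            .filter(list -> !list.isEmpty())
            .count();
    }

    public static void main(String[] args) {
        // 6 partitions shared by 8 consumers: two consumers stay idle.
        System.out.println("active consumers: " + activeConsumers(6, 8));
        System.out.println("assignment: " + assign(6, 8));
    }
}
```

With 6 partitions, going from 3 to 6 consumers doubles the parallelism, while going from 6 to 8 adds nothing.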

A poorly chosen partition key is also easy to overlook. It can concentrate traffic on one partition, and therefore on the broker leading it, even when the cluster looks healthy overall.

Finally, troubleshooting becomes confusing when stream design problems and infrastructure problems are mixed together. Keep topic questions and broker questions separate whenever possible.

Summary

  • A broker is a Kafka server, while a topic is a logical stream of records.
  • Topics are implemented through partitions, and partitions are hosted on brokers.
  • Producers and consumers mostly work with topics, not brokers.
  • Capacity and reliability depend on both topic design and broker health.
  • Keeping the concepts separate leads to clearer Kafka architecture and troubleshooting.
