How to connect Kafka with Elasticsearch?
Introduction
The most common way to connect Kafka to Elasticsearch is to use Kafka Connect with an Elasticsearch sink connector. That setup lets Kafka remain the durable event log while Elasticsearch becomes the searchable projection layer for indexing and analytics.
Use Kafka Connect as the Integration Layer
Trying to write directly from every producer into Elasticsearch usually creates duplicate delivery logic, retry complexity, and operational sprawl. Kafka Connect centralizes that work in one managed path.
A basic connector configuration looks like this:
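As a minimal sketch, the configuration can be assembled in Python and printed as the JSON body that Kafka Connect accepts. It assumes the Confluent Elasticsearch sink connector, schemaless JSON values, and a local cluster at http://localhost:9200; the connector name orders-es-sink and the helper build_sink_config are illustrative, not part of any library.

```python
import json


def build_sink_config(topic="orders", es_url="http://localhost:9200"):
    """Return a config dict for the Elasticsearch sink connector (illustrative values)."""
    return {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "tasks.max": "1",
        "topics": topic,
        "connection.url": es_url,  # assumed local Elasticsearch address
        # true = ignore the record key when forming the document id
        "key.ignore": "true",
        # true = do not derive the index mapping from a record schema
        "schema.ignore": "true",
        # Assumes values are plain JSON without an embedded schema
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter.schemas.enable": "false",
    }


# "orders-es-sink" is a hypothetical connector name
connector = {"name": "orders-es-sink", "config": build_sink_config()}
print(json.dumps(connector, indent=2))
```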
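The connector reads messages from the orders topic and writes them into Elasticsearch. The key.ignore setting deserves a deliberate choice because it determines document identity: when it is false, the record key becomes the Elasticsearch document id, and when it is true, the connector generates an id from the record's topic, partition, and offset, so every event is indexed as a separate document.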
Stand Up the Pieces in a Predictable Order
At minimum, you need:
- a running Kafka broker
- a running Elasticsearch cluster
- a Kafka Connect worker with the Elasticsearch sink plugin installed
A common local workflow is:
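Once the broker, the cluster, and the Connect worker are running, the remaining steps usually go through the Kafka Connect REST API: confirm the plugin is installed, register the connector, then check its status. The sketch below only builds those three requests. It assumes the worker listens on http://localhost:8083, and the function names are hypothetical helpers.

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083"  # assumed Connect REST address


def workflow_requests(name, config, connect_url=CONNECT_URL):
    """Build, without sending, the REST calls for a typical setup sequence."""
    body = json.dumps(config).encode("utf-8")
    return [
        # 1. List installed plugins to confirm the sink connector is available
        urllib.request.Request(f"{connect_url}/connector-plugins", method="GET"),
        # 2. Create the connector, or update it if it already exists
        urllib.request.Request(
            f"{connect_url}/connectors/{name}/config",
            data=body,
            headers={"Content-Type": "application/json"},
            method="PUT",
        ),
        # 3. Check that the connector and its tasks report a healthy state
        urllib.request.Request(f"{connect_url}/connectors/{name}/status", method="GET"),
    ]


def run(reqs):
    """Send the requests in order; needs a running Connect worker."""
    for req in reqs:
        with urllib.request.urlopen(req) as resp:
            print(req.get_method(), req.full_url, resp.status)
```

Calling run(workflow_requests("orders-es-sink", config)) against a live worker executes the sequence. Using PUT on the config endpoint keeps the registration step repeatable.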
After that, produce a test record:
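One way to do this from Python is sketched below. It assumes the third-party kafka-python package, a broker at localhost:9092, and an invented order payload; none of these come from the original setup. The import is kept inside the function so the record-building helper works without the package installed.

```python
import json


def build_test_record(order_id="order-1001", status="CREATED"):
    """Return (key, value) as bytes; the field names are illustrative."""
    key = order_id.encode("utf-8")
    value = json.dumps({"order_id": order_id, "status": status}).encode("utf-8")
    return key, value


def produce_test_record(bootstrap_servers="localhost:9092", topic="orders"):
    """Send one test record; requires kafka-python and a running broker."""
    from kafka import KafkaProducer  # third-party dependency, assumed installed

    key, value = build_test_record()
    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    producer.send(topic, key=key, value=value)
    producer.flush()  # block until the record has actually been sent
    producer.close()
```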
Then verify the document landed in Elasticsearch:
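A small check against the Elasticsearch search API might look like the following sketch. It assumes the index is named after the topic (orders), that the test document carries an order_id field, and that Elasticsearch is reachable at http://localhost:9200 without authentication.

```python
import json
import urllib.parse
import urllib.request


def search_url(index="orders", field="order_id", value="order-1001",
               es_url="http://localhost:9200"):
    """Build a URI search request for one field value."""
    query = urllib.parse.urlencode({"q": f"{field}:{value}"})
    return f"{es_url}/{index}/_search?{query}"


def count_hits(response_body):
    """Read the hit count from a parsed _search response."""
    total = response_body["hits"]["total"]
    # Recent Elasticsearch versions return an object; older ones a bare integer
    return total["value"] if isinstance(total, dict) else total


def document_landed():
    """Return True if the test document is searchable; needs a running cluster."""
    with urllib.request.urlopen(search_url()) as resp:
        return count_hits(json.load(resp)) > 0
```

Newly indexed documents only become searchable after the next index refresh, so a check run immediately after producing may need a short retry.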
Think About Document Keys Early
If Kafka messages represent updates to the same logical entity, stable document ids matter. Otherwise, every event can become a new Elasticsearch document even when you intended an upsert.
For example, if the record key is the business id, keep it:
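In connector terms, keeping the key means setting key.ignore to false. The sketch below applies that override to an existing config dict. The string key converter is an assumption that fits keys produced as plain UTF-8 strings, and with_record_keys is a hypothetical helper.

```python
def with_record_keys(config):
    """Return a copy of the config that uses the record key as the document id."""
    updated = dict(config)
    updated["key.ignore"] = "false"
    # Assumes keys are plain strings such as "order-1001"
    updated["key.converter"] = "org.apache.kafka.connect.storage.StringConverter"
    return updated


base = {"topics": "orders", "key.ignore": "true"}
keyed = with_record_keys(base)
print(keyed["key.ignore"])  # false
```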
That lets later events update the same Elasticsearch document instead of creating duplicates. If you ignore keys, search results often look inflated because every change event is indexed as a separate document.
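The effect can be imitated in plain Python with a dict standing in for the index. This is a toy model, not connector code: it assumes that ignoring keys produces ids built from topic, partition, and offset, and the exact id string format here is illustrative.

```python
def document_id(topic, partition, offset, key, key_ignore):
    """Pick a document id the way the two key.ignore modes are described."""
    if key_ignore:
        return f"{topic}+{partition}+{offset}"  # unique per event
    return key  # stable per business entity


def index_events(events, key_ignore):
    """Index (key, document) pairs into a dict that stands in for Elasticsearch."""
    index = {}
    for offset, (key, doc) in enumerate(events):
        index[document_id("orders", 0, offset, key, key_ignore)] = doc
    return index


events = [
    ("order-1001", {"status": "CREATED"}),
    ("order-1001", {"status": "SHIPPED"}),
]
print(len(index_events(events, key_ignore=True)))   # 2
print(len(index_events(events, key_ignore=False)))  # 1
```

With keys kept, the second event overwrites the first and the index holds only the latest state of order-1001.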
Map and Transform the Data Carefully
Elasticsearch cares about field types, and Kafka events are not always shaped for search as-is. Single Message Transforms can help normalize the event before indexing.
Example transform section:
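A sketch of such a section, written as the extra keys merged into the connector config, uses the built-in ExtractField transform. The alias unwrap and the field name payload are assumptions about the event shape; extract_field is a plain-Python imitation of what the transform does to a record value.

```python
TRANSFORMS = {
    "transforms": "unwrap",  # hypothetical alias for the transform chain
    "transforms.unwrap.type": "org.apache.kafka.connect.transforms.ExtractField$Value",
    "transforms.unwrap.field": "payload",  # assumed name of the nested field
}


def extract_field(value, field="payload"):
    """Imitate ExtractField$Value: replace the record value with one of its fields."""
    return value[field]


event = {
    "metadata": {"source": "checkout", "version": 3},
    "payload": {"order_id": "order-1001", "status": "CREATED"},
}
print(extract_field(event))
```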
This is useful when the message envelope contains metadata plus a nested payload and you only want the payload indexed.
You should also decide whether the index mapping is controlled manually or inferred dynamically. Dynamic mapping is convenient at first, but it can create messy field types if event shapes drift.
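For the manual route, one option is a composable index template created before the connector starts. The sketch below only builds the request. The template name, index pattern, and field list are invented for illustration, and "dynamic": "strict" is a deliberate choice that makes Elasticsearch reject documents with unmapped fields rather than guess their types.

```python
import json
import urllib.request


def index_template_request(es_url="http://localhost:9200"):
    """Build a PUT request for an index template covering the orders index."""
    body = {
        "index_patterns": ["orders*"],
        "template": {
            "mappings": {
                "dynamic": "strict",  # unmapped fields are rejected, not inferred
                "properties": {
                    "order_id": {"type": "keyword"},
                    "status": {"type": "keyword"},
                    "total": {"type": "double"},
                    "created_at": {"type": "date"},
                },
            }
        },
    }
    return urllib.request.Request(
        f"{es_url}/_index_template/orders-template",  # hypothetical template name
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = index_template_request()
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen sends it to a running cluster. A strict mapping turns schema drift into visible indexing errors, which pairs well with dead-letter handling.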
Add Operational Guardrails
A working connector is not the same as a production-ready connector. Add guardrails early:
- dead-letter queue for malformed records
- connector metrics and lag monitoring
- explicit index templates or mappings
- retry policy compatible with your failure model
For example, malformed JSON should not silently disappear. It should go to a dead-letter topic or fail loudly enough that operators can see it.
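Kafka Connect's framework-level error handling settings can express that policy for sink connectors. The sketch below merges them into a connector config. The DLQ topic name, replication factor, and retry values are assumptions for a local setup; these settings cover failures during conversion and transformation, so errors raised while writing to Elasticsearch may need connector-specific handling as well.

```python
DLQ_SETTINGS = {
    # Keep the task running when a record fails conversion or a transform
    "errors.tolerance": "all",
    # Route failed records to a dedicated topic (hypothetical name)
    "errors.deadletterqueue.topic.name": "orders-es-sink-dlq",
    "errors.deadletterqueue.topic.replication.factor": "1",  # single-broker assumption
    # Attach failure context (exception, original topic, offset) as headers
    "errors.deadletterqueue.context.headers.enable": "true",
    # Also surface failures in the worker log
    "errors.log.enable": "true",
    "errors.log.include.messages": "true",
    # Retry transient failures for up to 60 s, with at most 5 s between attempts
    "errors.retry.timeout": "60000",
    "errors.retry.delay.max.ms": "5000",
}


def with_dlq(config):
    """Return a copy of the connector config with error handling added."""
    return {**config, **DLQ_SETTINGS}


guarded = with_dlq({"topics": "orders"})
print(guarded["errors.deadletterqueue.topic.name"])  # orders-es-sink-dlq
```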
Common Pitfalls
- Writing directly from producers to Elasticsearch instead of using Kafka Connect creates duplicated integration logic.
- Ignoring Kafka keys often causes duplicate documents instead of deterministic updates.
- Letting Elasticsearch dynamic mapping decide everything can produce unstable index schemas.
- Skipping dead-letter handling makes bad records much harder to diagnose.
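- Assuming Kafka ordering guarantees carry over to Elasticsearch is usually wrong once multiple partitions are involved: Kafka orders records only within a partition, so events on different partitions can be indexed in a different order than they were produced.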
Summary
- Kafka Connect is the standard way to connect Kafka topics to Elasticsearch.
- Configure the Elasticsearch sink connector with clear decisions about keys and schemas.
- Verify end-to-end flow with a test event and an Elasticsearch query.
- Treat document identity and index mapping as first-class design choices.
- Add operational controls such as DLQ handling and monitoring before calling the pipeline done.

