How to Compress Data in Producers When Using Spring Kafka

Introduction

Producer-side compression in Spring Kafka is configured through normal Kafka producer properties. The most important setting is compression.type, which tells the producer to compress message batches before sending them to the broker, reducing network usage and often improving throughput for compressible payloads.

The Core Setting

Spring Kafka does not invent its own compression API here. It passes Kafka producer settings through to the underlying client, so you configure compression by setting ProducerConfig.COMPRESSION_TYPE_CONFIG.

java

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;

import java.util.HashMap;
import java.util.Map;

@Configuration
public class KafkaProducerConfig {

    @Bean
    public ProducerFactory<String, String> producerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");

        return new DefaultKafkaProducerFactory<>(props);
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}

Once this is configured, the underlying Kafka producer compresses batches automatically. Code that calls KafkaTemplate.send(...) does not need to change.

Supported Compression Types

Kafka producers accept these values for compression.type:

none (the producer default)
gzip
snappy
lz4
zstd (requires Kafka 2.1 or later)

Each choice trades CPU cost against compression ratio and decompression speed. There is no universally best answer.
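When the codec name comes from an environment variable or other external configuration, a small guard can reject typos before the producer is created. The following is a minimal sketch using only the JDK. The class name CompressionSettings and the method withCompression are hypothetical helpers, not part of Kafka or Spring Kafka. The key "compression.type" is the string value behind ProducerConfig.COMPRESSION_TYPE_CONFIG.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: validates a codec name before it reaches the producer config.
class CompressionSettings {

    // The values listed in this article for compression.type.
    static final Set<String> SUPPORTED = Set.of("none", "gzip", "snappy", "lz4", "zstd");

    // Returns a copy of the base properties with a validated compression.type added.
    static Map<String, Object> withCompression(Map<String, Object> base, String codec) {
        String normalized = codec == null ? "" : codec.trim().toLowerCase(Locale.ROOT);
        if (!SUPPORTED.contains(normalized)) {
            throw new IllegalArgumentException(
                    "Unsupported compression type '" + codec + "', expected one of " + SUPPORTED);
        }
        Map<String, Object> props = new HashMap<>(base);
        // Same key as ProducerConfig.COMPRESSION_TYPE_CONFIG.
        props.put("compression.type", normalized);
        return props;
    }
}
```

The returned map can be passed to DefaultKafkaProducerFactory in place of the hand-built one. The Kafka client also validates this setting itself, so the guard mainly moves the failure closer to where the value is read.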

Why Compression Usually Works on Batches

Kafka producers compress record batches, not just isolated single records. That means compression becomes more effective when the producer can accumulate multiple records before sending them.

That is why batching-related settings such as linger.ms and batch.size often matter alongside compression.

java

props.put(ProducerConfig.LINGER_MS_CONFIG, 20);
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32_768);
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");

If records are sent one at a time in tiny batches, the compression benefit is usually smaller, because each batch gives the codec little repeated data to work with.
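The effect of batching on compression can be illustrated without a broker. The sketch below is not Kafka's code path. It uses the JDK's GZIPOutputStream as a stand-in codec and made-up JSON records to compare compressing each record separately with compressing all of them together. The class name BatchCompressionDemo and the record shape are assumptions chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Illustration only: plain JDK gzip standing in for Kafka's batch compression.
class BatchCompressionDemo {

    // Gzips a byte array and returns the compressed length in bytes.
    static int gzipSize(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    // Hypothetical sample payloads: small JSON records that share field names.
    static List<byte[]> sampleRecords(int count) {
        List<byte[]> records = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String json = "{\"orderId\":" + i + ",\"status\":\"CREATED\",\"currency\":\"USD\"}";
            records.add(json.getBytes(StandardCharsets.UTF_8));
        }
        return records;
    }

    // Total size when every record is compressed on its own.
    static int compressedIndividually(List<byte[]> records) {
        int total = 0;
        for (byte[] record : records) {
            total += gzipSize(record);
        }
        return total;
    }

    // Size when all records are concatenated and compressed once, like a batch.
    static int compressedAsBatch(List<byte[]> records) {
        ByteArrayOutputStream batch = new ByteArrayOutputStream();
        for (byte[] record : records) {
            batch.write(record, 0, record.length);
        }
        return gzipSize(batch.toByteArray());
    }
}
```

Calling compressedIndividually and compressedAsBatch on sampleRecords(200) should show the batched result to be noticeably smaller, since the codec can reuse repeated field names across records and the gzip framing is paid only once. Exact numbers depend on the payload.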

Spring Boot Property Style

If you are configuring Spring Kafka through application properties instead of Java config, the same producer option can be set there.

properties

spring.kafka.producer.properties.compression.type=snappy
spring.kafka.producer.properties.linger.ms=20
spring.kafka.producer.properties.batch.size=32768

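This is often the simplest setup in Spring Boot applications. Spring Boot also exposes a dedicated spring.kafka.producer.compression-type property for the codec, which has the same effect as the compression.type entry above.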

Choosing a Compression Type

A practical rule of thumb is:

snappy for a low-overhead, widely used default
gzip when compression ratio matters more than CPU
lz4 or zstd for strong modern performance, provided you benchmark them in your environment

The only reliable answer for production is measurement. Payload shape, message size, CPU budget, and network constraints all matter.
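A first, rough measurement is simply how compressible a representative payload is. The sketch below is an assumption-laden proxy rather than a real benchmark. It uses the JDK's gzip, since snappy, lz4, and zstd are not in the standard library, and the class name CompressibilityCheck is hypothetical. It does not measure CPU time or end-to-end throughput, which still need testing against a real producer and broker.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: estimates how well a payload compresses using JDK gzip.
class CompressibilityCheck {

    // Returns compressedSize / originalSize. Values near or above 1.0 mean
    // gzip saved nothing on this payload.
    static double gzipRatio(byte[] payload) {
        if (payload.length == 0) {
            throw new IllegalArgumentException("payload must not be empty");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return (double) out.size() / payload.length;
    }
}
```

Repetitive text such as JSON typically yields a low ratio, while random or already-compressed bytes yield a ratio close to 1.0, which suggests compression would mostly add CPU cost for that workload.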

What Compression Does Not Change

Compression does not remove the need to size your Kafka deployment correctly. It helps with transport and storage efficiency, but it does not fix:

oversized messages
poor partitioning
slow serializers
overloaded brokers

It is one tuning lever, not a cure-all.

Consumers Read Compressed Data Transparently

You do not usually need special consumer logic just because the producer compressed the records. Each batch records which codec was used, and the consumer client decompresses it automatically when fetching, so no compression setting is required on the consumer side.

Common Pitfalls

The biggest mistake is enabling compression and assuming every workload will improve automatically. If payloads are already small or poorly compressible, the CPU overhead may outweigh the savings.

Another issue is forgetting that compression benefits are tied to batching. Tiny batches reduce the value of the codec.

Developers also sometimes set the property on the wrong side. Producer compression belongs in producer configuration, not in consumer configuration.

Finally, do not choose a codec based only on blog posts. Benchmark with your own payload sizes, throughput targets, and infrastructure constraints.

Summary

In Spring Kafka, producer compression is configured through Kafka producer properties.
Set ProducerConfig.COMPRESSION_TYPE_CONFIG or the equivalent Spring Boot property.
Compression is applied to record batches, so batching settings influence the payoff.
Common codecs include gzip, snappy, lz4, and zstd.
Benchmark the chosen codec in your real environment instead of assuming one option is always best.