Improving performance in Apache Kafka

Kafka latency

Kafka latency is the time it takes a message to be produced by a producer and consumed by a consumer. It’s an essential aspect of Apache Kafka® because low latency enables near-real-time data processing and analysis.

Because of the various Kafka configurations and causes of latency that can impact an environment, performance optimization to reduce latency can be complex. For example, in some cases, engineers drastically improve Kafka latency through configuration modifications. Other optimizations may require hardware changes.

This article explores Kafka latency in depth, including possible causes of high latency and ways to reduce it.

Summary of key Kafka latency causes

The table below summarizes common causes of Kafka latency. The following sections will explore techniques and configurations that help address latency.

Kafka latency cause


Distributed architecture

Kafka stores and processes data using a distributed architecture, allowing it to scale horizontally and handle high volumes of data with low latency. Kafka can also provide fault tolerance and high availability thanks to this architecture.

Memory-based caching

Kafka relies on the operating system's page cache to keep recently written data in memory, enabling quick read and write operations with low latency. This also helps to reduce disk I/O overhead, which can slow down performance.

Batch processing

Kafka employs batch processing to improve data ingestion and processing throughput. This method enables Kafka to process large amounts of data simultaneously, reducing the overhead of handling individual messages.

Asynchronous communication

Kafka employs asynchronous communication between producers and consumers, allowing it to handle many concurrent requests while maintaining low latency. Kafka can also provide scalable and fault-tolerant communication using this approach.

Understanding Kafka latency

In the context of Kafka, "latency" is the time it takes for a message to be published by a producer and then delivered to a consumer. In other words, it is the delay between the time a message is produced and the time a consumer consumes it.

Kafka provides various mechanisms to reduce latency, such as batching messages, compression, and network configurations. Additionally, consumers can use techniques like increasing the number of consumer instances or partitioning to improve their ability to process messages more quickly.

Kafka latency vs. throughput

Latency and throughput are two related but distinct key performance metrics in Kafka. The time it takes for a message to be delivered from a producer to a consumer is called latency. Messages with low latency are delivered quickly and efficiently, whereas messages with high latency are delivered slowly. In real-time systems where data must be rapidly processed, such as financial trading systems or real-time monitoring applications, latency is critical.

Throughput, on the other hand, is the rate at which the system can process messages. High throughput indicates the system can handle a large volume of messages, whereas low throughput indicates that the system is congested and messages are being processed slowly.
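
To make the distinction concrete, the sketch below computes both metrics from a small set of made-up timestamps. The class name LatencyVsThroughput and the sample values are illustrative assumptions, not part of any Kafka API.

```java
import java.util.Arrays;

class LatencyVsThroughput {
    // Per-message latency: consume time minus produce time (both in ms).
    static long[] latenciesMs(long[] producedAtMs, long[] consumedAtMs) {
        long[] out = new long[producedAtMs.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = consumedAtMs[i] - producedAtMs[i];
        }
        return out;
    }

    // Average latency across all messages.
    static double averageMs(long[] latencies) {
        return Arrays.stream(latencies).average().orElse(0.0);
    }

    // Throughput: messages consumed per second over the observed window.
    static double messagesPerSecond(long[] producedAtMs, long[] consumedAtMs) {
        long windowMs = consumedAtMs[consumedAtMs.length - 1] - producedAtMs[0];
        return consumedAtMs.length * 1000.0 / windowMs;
    }

    public static void main(String[] args) {
        long[] produced = {0, 10, 20, 30};
        long[] consumed = {5, 18, 26, 40};
        System.out.println("avg latency ms = " + averageMs(latenciesMs(produced, consumed)));
        System.out.println("throughput msg/s = " + messagesPerSecond(produced, consumed));
    }
}
```

With these sample values, the average latency is 7.25 ms and the throughput is 100 messages per second. The two numbers move independently: the same throughput could be reached with much higher per-message latency if messages were delivered in large, delayed batches.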

Why is low latency important?

Low latency is critical for real-time applications like financial trading, online gaming, and media streaming due to the following reasons:

  • User experience: Low-latency apps respond to user actions quickly, which is critical for a smooth user experience.

  • Real-time data processing: Real-time data processing applications, such as fraud detection or predictive maintenance, require low latency to ensure the processing is as close to real-time as possible.

  • Competitive advantage: Low latency can provide a competitive advantage in industries where speed is critical, such as high-frequency trading.

How to reduce Kafka latency

Optimizing for low latency in Kafka entails adjusting various configurations, such as batch or buffer size. On the other hand, optimizing for high throughput entails increasing the number of partitions or changing the replication factor.

Balancing latency and throughput is critical, as increasing one often means sacrificing the other. Kafka offers several tools and features, such as delivery semantics and partitioning, to assist in balancing these metrics and ensuring that messages are delivered reliably and efficiently.

Increase the partition count

Increasing the number of partitions in a Kafka topic can reduce latency by allowing more consumers to process messages in parallel. Administrators can do this with the AdminClient API: create a NewPartitions object with the desired partition count, add it to a Map keyed by topic name, and pass the map to createPartitions(). Below is an example. The broker address is an assumption; on Kafka 3.1 and later clients, allTopicNames() replaces the deprecated all() on the describeTopics result.

import java.util.*;
import org.apache.kafka.clients.admin.*;

Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
int numPartitions = 8;
String topicName = "my-topic";

try (AdminClient adminClient = AdminClient.create(props)) {
    // Look up the topic's current partition count.
    int currentPartitionCount = adminClient.describeTopics(Collections.singletonList(topicName))
            .all().get().get(topicName).partitions().size();
    // Partitions can only be added, never removed, so only act on an increase.
    if (numPartitions > currentPartitionCount) {
        Map<String, NewPartitions> newPartitionsMap = new HashMap<>();
        newPartitionsMap.put(topicName, NewPartitions.increaseTo(numPartitions));
        adminClient.createPartitions(newPartitionsMap).all().get(); // blocks until applied
    }
}
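
One side effect worth checking before adding partitions is that keyed messages may land on a different partition afterward, which affects per-key ordering. The sketch below illustrates this with a simplified hash-modulo mapping. It is only a stand-in, since Kafka's default partitioner hashes the serialized key with murmur2, and the class and method names are hypothetical.

```java
class PartitionMappingSketch {
    // Simplified stand-in for key-based partitioning:
    // hash the key, then take it modulo the partition count.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        String[] keys = {"order-1", "order-2", "order-3", "order-4"};
        for (String key : keys) {
            // Compare where each key lands with 4 partitions versus 8.
            System.out.println(key + ": " + partitionFor(key, 4) + " -> " + partitionFor(key, 8));
        }
    }
}
```

Keys that previously shared a partition can be split across two partitions after the increase, so consumers that rely on per-key ordering should be reviewed before repartitioning.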

Reduce the batch size

In the code snippet below, the ProducerConfig.BATCH_SIZE_CONFIG property is set to 16 KB and the ProducerConfig.LINGER_MS_CONFIG property is set to 100 ms.

These values determine the maximum size of a message batch and the maximum wait time for more messages to be added to a batch. Smaller values send batches sooner, which lowers latency at the cost of more requests and lower throughput.

ProducerConfig.BATCH_SIZE_CONFIG: This property configures the batch size in bytes that the producer will use when sending messages. It determines how much data is accumulated in a batch before being sent to the Kafka broker. In this case, the value 16384 indicates a batch size of 16 kilobytes.

ProducerConfig.LINGER_MS_CONFIG: This property configures the amount of time in milliseconds that the producer will wait before sending a batch of messages. If a batch is not full and the specified linger time has not elapsed, the producer will hold the batch to accumulate more messages. After the specified time elapses or the batch is full, the producer will send the batch. In this case, the value 100 indicates a linger time of 100 milliseconds.

Properties props = new Properties();
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
props.put(ProducerConfig.LINGER_MS_CONFIG, 100); // 100ms
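
As a rough way to reason about these two settings, the sketch below estimates how long the first record in a batch might wait before the batch is sent: the batch goes out when it fills up or when the linger time expires, whichever comes first. The class name, method name, and traffic rates are illustrative assumptions, and the model ignores compression, multiple partitions, and in-flight request limits.

```java
class BatchingDelaySketch {
    // Estimated worst-case wait (ms) for the first record in a batch,
    // given the per-partition produce rate in bytes per millisecond.
    static double maxBatchingDelayMs(int batchSizeBytes, long lingerMs, double bytesPerMs) {
        double fillTimeMs = batchSizeBytes / bytesPerMs;
        return Math.min(fillTimeMs, lingerMs);
    }

    public static void main(String[] args) {
        // Low traffic: the batch never fills, so linger.ms dominates.
        System.out.println(maxBatchingDelayMs(16384, 100, 50.0));
        // High traffic: the batch fills before linger.ms expires.
        System.out.println(maxBatchingDelayMs(16384, 100, 1024.0));
    }
}
```

With the settings from the snippet above, a partition receiving 50 bytes per millisecond would wait the full 100 ms, while one receiving 1,024 bytes per millisecond would send after about 16 ms. For low-traffic topics, lowering linger.ms is therefore the more direct way to cut producer-side latency.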

Increase the Kafka producer’s buffer size

In the code snippet below, the ProducerConfig.BUFFER_MEMORY_CONFIG property is set to 64 MB. This determines the total amount of memory that the producer can use to buffer unsent records before blocking.

Properties props = new Properties();
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 67108864); // 64MB

Increasing the buffer size can improve producer performance by giving the producer more room to accumulate batches and by reducing the likelihood that it blocks because the buffer is full. However, it also comes at the cost of increased memory usage.

Note that the optimal buffer size may vary depending on the specific use case and workload, so it may require some experimentation and tuning to find the optimal value.
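
A back-of-the-envelope calculation can help choose a starting value: dividing the buffer size by the rate at which the application produces data estimates how long the producer can keep accepting records while the broker is slow or unreachable. The class name and the 8 MB/s rate below are assumptions for illustration.

```java
class BufferHeadroomSketch {
    // Seconds of produce traffic that fit in buffer.memory before the producer starts to block.
    static double headroomSeconds(long bufferMemoryBytes, long produceBytesPerSecond) {
        return (double) bufferMemoryBytes / produceBytesPerSecond;
    }

    public static void main(String[] args) {
        long bufferMemory = 64L * 1024 * 1024; // 67108864 bytes, the value used earlier in this section
        long produceRate = 8L * 1024 * 1024;   // assumed application rate of 8 MB/s
        System.out.println(headroomSeconds(bufferMemory, produceRate) + " s");
    }
}
```

In this example the buffer covers roughly eight seconds of traffic. Once the buffer is full, send() blocks for up to max.block.ms before failing, so that setting is worth reviewing alongside the buffer size.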

Use compression in Kafka producers

Compression is a technique used to reduce the size of data that is transmitted over the network or stored on disk. Kafka supports data compression in its producer API, allowing you to configure the compression algorithm applied to the batches of messages sent to a Kafka topic.

To enable compression in a Kafka producer, you can set the compression.type configuration property to the desired compression algorithm. The supported compression types are:

  • none: No compression

  • gzip: Gzip compression

  • snappy: Snappy compression

  • lz4: LZ4 compression

  • zstd: Zstandard compression (Kafka 2.1 and later)

Properties props = new Properties();
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");
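
Because the producer compresses whole batches rather than individual records, compression tends to be more effective when batches contain several similar messages. The sketch below uses the JDK's GZIPOutputStream on made-up JSON payloads to compare the two cases. It does not use the Kafka client, and the class name and payloads are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class CompressionSketch {
    // Returns the gzip-compressed size of the given bytes.
    static int gzipSize(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        StringBuilder batch = new StringBuilder();
        int individuallyCompressed = 0;
        for (int i = 0; i < 100; i++) {
            String message = "{\"sensor\":\"s-" + i + "\",\"status\":\"ok\",\"temperature\":21.5}";
            batch.append(message);
            individuallyCompressed += gzipSize(message.getBytes(StandardCharsets.UTF_8));
        }
        byte[] raw = batch.toString().getBytes(StandardCharsets.UTF_8);
        System.out.println("raw bytes:              " + raw.length);
        System.out.println("gzip, one message each: " + individuallyCompressed);
        System.out.println("gzip, whole batch:      " + gzipSize(raw));
    }
}
```

The batch-level result is much smaller because the compressor can exploit repetition across messages. Compression also costs CPU time on the producer, so the net effect on latency depends on whether the network or the CPU is the bottleneck.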

Kafka performance configurations

Several configurations can be used to improve Kafka's performance. The sections below explain some of the most common configuration changes for reducing Kafka latency.

Broker configurations

Because the Kafka broker handles client requests, optimizing its configuration can significantly impact performance. Consider the following key configuration options:

  • num.network.threads: This parameter specifies the number of threads that Kafka will use for handling network communication tasks. These tasks involve processing incoming requests from clients, sending responses, and handling network I/O. It is important to have a sufficient number of network threads to handle concurrent client connections effectively.

  • num.io.threads: This specifies the number of threads the broker uses to handle disk I/O requests. When there are a large number of producers or consumers writing to or reading from Kafka, increasing this value can help improve performance.

Producer setup

Producers are in charge of writing data to Kafka. Consider the following key configuration options:

  • acks: This specifies how many acknowledgments a producer must receive from Kafka before a message is considered successfully sent. If a broker fails, a value of all can help ensure that data is not lost.

The acknowledgment settings control how long the producer waits: with acks=0 it does not wait for any acknowledgment, with acks=1 it waits for the partition leader only, and with acks=all it waits for all in-sync replicas. Stronger settings improve durability but add latency.

  • batch.size and linger.ms: These options control how much data a producer sends at once and how long it waits before sending a batch. Changing these values can aid in optimizing network usage and reducing the number of requests that must be sent.

Delivery semantics

In Kafka, delivery semantics are message reliability guarantees for messages sent from producers to consumers. Kafka supports the following delivery semantics:

At most once

With at-most-once delivery, Kafka does not guarantee message delivery. Producers send each message once and do not wait for acknowledgment from the broker before sending the next message. This mode is typically used when it is acceptable to lose messages, such as in systems where the same messages are frequently produced.

At least once

Kafka's at-least-once delivery ensures that messages are delivered to consumers at least once. In this mode, producers wait for the broker's acknowledgment before sending the next message and retry if it does not arrive, so a message may occasionally be delivered more than once.

Exactly once

Exactly-once delivery in Kafka is the guarantee that a message will be processed and delivered just once to a consumer, even if there are failures or retries.

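On the producer side, these guarantees map roughly onto a handful of settings. The sketch below builds one Properties object per semantic using Kafka's documented producer configuration keys. The class name, method names, and the transactional.id value are hypothetical, and end-to-end guarantees also depend on how consumers commit offsets, which is not shown here.

```java
import java.util.Properties;

class DeliverySemanticsSketch {
    // At most once: fire and forget, with no acknowledgment and no retries.
    static Properties atMostOnce() {
        Properties props = new Properties();
        props.put("acks", "0");
        props.put("retries", "0");
        props.put("enable.idempotence", "false");
        return props;
    }

    // At least once: wait for all in-sync replicas and retry on failure.
    // Idempotence is off here, so a retry can produce a duplicate.
    static Properties atLeastOnce() {
        Properties props = new Properties();
        props.put("acks", "all");
        props.put("retries", String.valueOf(Integer.MAX_VALUE));
        props.put("enable.idempotence", "false");
        return props;
    }

    // Exactly once: idempotent, transactional producer.
    static Properties exactlyOnce() {
        Properties props = new Properties();
        props.put("acks", "all");
        props.put("enable.idempotence", "true");
        props.put("transactional.id", "orders-producer-1"); // hypothetical ID
        return props;
    }

    public static void main(String[] args) {
        System.out.println("at most once:  " + atMostOnce());
        System.out.println("at least once: " + atLeastOnce());
        System.out.println("exactly once:  " + exactlyOnce());
    }
}
```

Each step up adds waiting or coordination, so stronger guarantees generally come with higher latency. A transactional producer also has to call initTransactions(), beginTransaction(), and commitTransaction() around its sends, and consumers need isolation.level=read_committed to see only committed messages.
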
Consumer setup

Consumers are in charge of reading data from Kafka. Consider the following key configuration options:

  • fetch.min.bytes and fetch.max.bytes: These options govern how much data a consumer can obtain in a single request. Setting these values correctly can aid in optimizing network usage and reducing the number of requests that must be sent.

  • max.poll.records: This specifies the maximum number of records returned to a consumer by a single call to poll(). When there are a large number of messages to process, increasing this value can help improve performance.
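
For a latency-sensitive consumer, a starting point might look like the sketch below, which uses plain string keys so it needs only the JDK. The specific values are assumptions to tune against your workload. fetch.max.wait.ms is an additional consumer setting, not covered above, that caps how long the broker waits for fetch.min.bytes of data to accumulate.

```java
import java.util.Properties;

class LowLatencyConsumerConfigSketch {
    static Properties lowLatencyConsumerProps() {
        Properties props = new Properties();
        props.put("fetch.min.bytes", "1");     // respond as soon as any data is available
        props.put("fetch.max.wait.ms", "100"); // assumed cap on broker-side waiting
        props.put("max.poll.records", "500");  // upper bound on records per poll() call
        return props;
    }

    public static void main(String[] args) {
        lowLatencyConsumerProps().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Raising fetch.min.bytes trades latency for fewer, larger fetch requests, so throughput-oriented consumers typically move these values in the opposite direction.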

Cluster configuration

Because Kafka clusters can have multiple brokers, optimizing the cluster's overall configuration can help improve performance. Consider the following key configuration options:

  • replication.factor: This determines the number of replicas of each partition stored in the cluster. Increasing this value improves fault tolerance but may increase network traffic and disk usage.

  • min.insync.replicas: This specifies the minimum number of in-sync replicas that must acknowledge a write before a producer using acks=all considers the message successfully sent. Increasing this value can improve durability but also increase latency.
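
The interaction between these two settings can be reasoned about with simple arithmetic. The sketch below assumes a producer using acks=all, and the class and method names are hypothetical.

```java
class ReplicationSketch {
    // With acks=all, a write is accepted only while enough replicas are in sync.
    static boolean writeAccepted(int inSyncReplicas, int minInSyncReplicas) {
        return inSyncReplicas >= minInSyncReplicas;
    }

    // Number of replicas that can be lost while acks=all writes still succeed.
    static int toleratedFailures(int replicationFactor, int minInSyncReplicas) {
        return replicationFactor - minInSyncReplicas;
    }

    public static void main(String[] args) {
        int replicationFactor = 3;
        int minInSyncReplicas = 2;
        System.out.println("tolerated failures: " + toleratedFailures(replicationFactor, minInSyncReplicas));
        System.out.println("write with 2 in sync: " + writeAccepted(2, minInSyncReplicas));
        System.out.println("write with 1 in sync: " + writeAccepted(1, minInSyncReplicas));
    }
}
```

With a replication factor of 3 and min.insync.replicas of 2, one replica can be unavailable without blocking writes. Losing a second replica causes acks=all writes to be rejected until a replica catches up.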


Latency is an important factor when designing and optimizing a Kafka cluster. Network configuration, broker and consumer settings, and hardware resources all affect Kafka latency, and carefully tuning them helps keep data pipelines responsive.

A few specific strategies to reduce Kafka latency include: optimizing network settings, increasing hardware resources, and configuring Kafka producers and consumers to operate more efficiently.

Overall, effective latency management requires careful monitoring, experimentation, and ongoing optimization to ensure that Kafka continues to provide fast, reliable, and efficient data processing capabilities.

