Kafka latency is the time it takes a message to be produced by a producer and consumed by a consumer. It’s an essential aspect of Apache Kafka® because low latency enables near-real-time data processing and analysis.
Because of the various Kafka configurations and causes of latency that can impact an environment, performance optimization to reduce latency can be complex. For example, in some cases, engineers drastically improve Kafka latency through configuration modifications. Other optimizations may require hardware changes.
This article explores Kafka latency in depth, including possible causes of high latency and how to reduce it.
Summary of key Kafka latency causes
The table below summarizes common causes of Kafka latency. The following sections will explore techniques and configurations that help address latency.
Kafka latency cause
Kafka stores and processes data using a distributed architecture, allowing it to scale horizontally and handle high volumes of data with low latency. Kafka can also provide fault tolerance and high availability thanks to this architecture.
Kafka persists data to disk but relies heavily on the operating system's page cache, so most reads and writes are served from memory with low latency. This approach also reduces disk I/O overhead, which can slow down performance.
Kafka employs batch processing to improve data ingestion and processing throughput. This method enables Kafka to process large amounts of data simultaneously, reducing the overhead of handling individual messages.
Kafka employs asynchronous communication between producers and consumers, allowing it to handle many concurrent requests while maintaining low latency. Kafka can also provide scalable and fault-tolerant communication using this approach.
Understanding Kafka latency
In the context of Kafka, "latency" is the time it takes for a message to be published by a producer and then delivered to a consumer. In other words, it is the delay between the time a message is produced and the time a consumer consumes it.
Kafka provides various mechanisms to reduce latency, such as batching messages, compression, and network configurations. Additionally, consumers can use techniques like increasing the number of consumer instances or partitioning to improve their ability to process messages more quickly.
Kafka latency vs. throughput
Latency and throughput are two related but distinct key performance metrics in Kafka. The time it takes for a message to be delivered from a producer to a consumer is called latency. Messages with low latency are delivered quickly and efficiently, whereas messages with high latency are delivered slowly. In real-time systems where data must be rapidly processed, such as financial trading systems or real-time monitoring applications, latency is critical.
Throughput, on the other hand, is the rate at which the system can process messages. High throughput indicates the system can handle a large volume of messages, whereas low throughput indicates that the system is congested and messages are being processed slowly.
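To make the distinction concrete, the following minimal sketch computes both metrics from raw timestamps. The class and method names are hypothetical and not part of any Kafka API; the sketch assumes you have already recorded when each message was produced and consumed.

```java
final class LatencyVsThroughput {
    // Average end-to-end latency: mean of (consume time - produce time) per message.
    static double averageLatencyMs(long[] producedAtMs, long[] consumedAtMs) {
        if (producedAtMs.length == 0) {
            return 0.0;
        }
        long total = 0;
        for (int i = 0; i < producedAtMs.length; i++) {
            total += consumedAtMs[i] - producedAtMs[i];
        }
        return (double) total / producedAtMs.length;
    }

    // Throughput: messages processed per second over an observation window.
    static double throughputPerSecond(int messageCount, long windowMs) {
        return messageCount * 1000.0 / Math.max(1, windowMs);
    }
}
```

The two numbers are independent: a system can move many messages per second while each one still takes a long time to arrive, and the reverse.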
Why is low latency important?
User experience: Low-latency apps respond to user actions quickly, which is critical for a smooth user experience.
Real-time data processing: Real-time data processing applications, such as fraud detection or predictive maintenance, require low latency to ensure the processing is as close to real-time as possible.
Competitive advantage: Low latency can provide a competitive advantage in industries where speed is critical, such as high-frequency trading.
How to reduce Kafka latency
Optimizing for low latency in Kafka entails adjusting various configurations, such as batch or buffer size. On the other hand, optimizing for high throughput entails increasing the number of partitions or changing the replication factor.
Balancing latency and throughput is critical, as increasing one often means sacrificing the other. Kafka offers several tools and features, such as delivery semantics and partitioning, to assist in balancing these metrics and ensuring that messages are delivered reliably and efficiently.
Increase the number of partitions
Increasing the number of partitions in a Kafka topic can lower message delivery latency by increasing the parallelism of message processing. Administrators use the NewPartitions class to create a NewPartitions object with the desired partition count and add it to a Map that maps the topic name to that object. Below is an example.
import java.util.*;
import org.apache.kafka.clients.admin.*;

Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
int numPartitions = 8;
String topicName = "my-topic";
try (AdminClient adminClient = AdminClient.create(props)) {
    // Look up the current partition count (allTopicNames() needs Kafka 3.1+; older clients use all()).
    int currentPartitionCount = adminClient.describeTopics(Collections.singletonList(topicName))
        .allTopicNames().get().get(topicName).partitions().size();
    // Partitions can only be increased, so skip the call if the topic is already large enough.
    if (numPartitions > currentPartitionCount) {
        Map<String, NewPartitions> newPartitionsMap = new HashMap<>();
        newPartitionsMap.put(topicName, NewPartitions.increaseTo(numPartitions));
        adminClient.createPartitions(newPartitionsMap).all().get();
    }
}
Reduce the batch size
In the code snippet below, the ProducerConfig.BATCH_SIZE_CONFIG property is set to 16KB and the ProducerConfig.LINGER_MS_CONFIG property is set to 100ms. These values determine the maximum size of a message batch and the maximum wait time for more messages to be added to a batch. Smaller values for either setting send messages sooner and so reduce latency, at the cost of throughput.
ProducerConfig.BATCH_SIZE_CONFIG: This property configures the batch size in bytes that the producer will use when sending messages. It determines how much data is accumulated in a batch before being sent to the Kafka broker. In this case, the value 16384 indicates a batch size of 16 kilobytes.
ProducerConfig.LINGER_MS_CONFIG: This property configures the amount of time in milliseconds that the producer will wait before sending a batch of messages. If a batch is not full and the specified linger time has not elapsed, the producer will hold the batch to accumulate more messages. After the specified time elapses or the batch is full, the producer will send the batch. In this case, the value 100 indicates a linger time of 100 milliseconds.
Properties props = new Properties();
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384); // 16KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 100); // 100ms
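For comparison, a latency-oriented variant might look like the sketch below. It uses the plain string keys batch.size and linger.ms so that it compiles without the Kafka client library. The class name is hypothetical, the broker address is an assumption, and the values are starting points for experimentation rather than recommendations.

```java
import java.util.Properties;

final class LowLatencyProducerSettings {
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("batch.size", "16384"); // upper bound on batch size, in bytes
        props.setProperty("linger.ms", "0");      // do not wait for more records before sending
        return props;
    }
}
```

The resulting Properties object can be passed to a KafkaProducer constructor together with the serializer settings your application needs.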
Increase the Kafka producer’s buffer size
In the code snippet below, the ProducerConfig.BUFFER_MEMORY_CONFIG property is set to 64MB. This determines the total amount of memory that the producer can use to buffer unsent records before blocking.
Properties props = new Properties();
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 67108864); // 64MB
Increasing the buffer size reduces the likelihood that the producer blocks in send() when records are produced faster than they can be delivered to the broker. However, it comes at the cost of increased memory usage.
The optimal buffer size depends on the use case and workload, so finding it may require some experimentation and tuning.
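As a rough way to reason about buffer sizing, the sketch below estimates how long a buffer lasts when the application produces data faster than the producer can send it. This is a simplified model with a hypothetical class name: it ignores compression and per-batch overhead and assumes constant rates.

```java
final class BufferHeadroom {
    // Seconds until the buffer is full, given constant produce and send rates in bytes per second.
    static double secondsUntilFull(long bufferBytes, long producedBytesPerSec, long sentBytesPerSec) {
        long surplus = producedBytesPerSec - sentBytesPerSec;
        if (surplus <= 0) {
            return Double.POSITIVE_INFINITY; // the sender keeps up, so the buffer never fills
        }
        return (double) bufferBytes / surplus;
    }
}
```

For example, a 64MB buffer (67108864 bytes) with records arriving at 16MB/s and leaving at 8MB/s fills in 8 seconds. After that, send() blocks for up to max.block.ms before failing.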
Use compression in Kafka producers
Compression reduces the size of data transmitted over the network or stored on disk. Kafka supports compression in its producer API: the producer compresses each batch of records with the configured algorithm before sending it to a Kafka topic, so compression works best together with batching.
To enable compression in a Kafka producer, you can set the compression.type configuration property to the desired compression algorithm. The supported compression types are:
none: No compression
gzip: Gzip compression
snappy: Snappy compression
lz4: LZ4 compression
zstd: Zstandard compression (Kafka 2.1 and later)
Properties props = new Properties();
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");
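The sketch below illustrates why compression and batching interact. It uses the JDK's java.util.zip gzip implementation rather than Kafka's own code path, and the class name and JSON-like payload are hypothetical. It compares compressing messages one at a time with compressing them together as a single batch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class BatchCompressionSketch {
    // Size in bytes of the gzip-compressed form of the input.
    static int gzipSize(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    // Hypothetical payload used only for illustration.
    static byte[] message(int i) {
        return ("{\"sensor\":\"s-" + i + "\",\"status\":\"OK\",\"unit\":\"celsius\"}")
                .getBytes(StandardCharsets.UTF_8);
    }

    // Total size when every message is compressed on its own.
    static int sizeCompressedIndividually(int count) {
        int total = 0;
        for (int i = 0; i < count; i++) {
            total += gzipSize(message(i));
        }
        return total;
    }

    // Size when all messages are concatenated and compressed once.
    static int sizeCompressedAsBatch(int count) {
        ByteArrayOutputStream batch = new ByteArrayOutputStream();
        for (int i = 0; i < count; i++) {
            byte[] bytes = message(i);
            batch.write(bytes, 0, bytes.length);
        }
        return gzipSize(batch.toByteArray());
    }
}
```

With similar messages, the batch compresses to far fewer bytes than the individually compressed messages combined, because the compressor can reuse patterns across messages. The trade-off is that larger batches take longer to fill, and gzip in particular costs more CPU time than lz4 or snappy.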
Kafka performance configurations
Several configurations can be used to improve Kafka's performance. The sections below explain some of the most common configuration changes for reducing Kafka latency.
Because the Kafka broker handles client requests, optimizing its configuration can significantly impact performance. Consider the following key configuration options:
num.network.threads: This parameter specifies the number of threads that Kafka will use for handling network communication tasks. These tasks involve processing incoming requests from clients, sending responses, and handling network I/O. It is important to have a sufficient number of network threads to handle concurrent client connections effectively.
num.io.threads: This specifies the number of threads the broker uses to process requests, including disk I/O. When there are a large number of producers or consumers writing to or reading from Kafka, increasing this value can help improve performance.
Producers are in charge of writing data to Kafka. Consider the following key configuration options:
acks: This specifies how many acknowledgments a producer must receive from Kafka before a message is considered successfully sent. With acks=0, the producer does not wait for any acknowledgment; with acks=1, it waits for the partition leader only; with acks=all, it waits until all in-sync replicas have the message. If a broker fails, acks=all helps ensure that data is not lost, at the cost of higher latency.
batch.size and linger.ms: These options control how much data a producer sends in one batch and how long it waits before sending a batch that is not yet full. Tuning these values helps optimize network usage and reduce the number of requests, but larger values add latency.
In Kafka, delivery semantics are message reliability guarantees for messages sent from producers to consumers. Kafka supports the following delivery semantics:
At most once
With at-most-once delivery, Kafka does not guarantee message delivery. Producers send each message once, do not wait for acknowledgment from the broker, and do not retry. This mode is typically used when losing messages is acceptable, such as in systems where the same messages are produced frequently.
At least once
Kafka's at-least-once delivery ensures that messages are delivered to consumers at least once. In this mode, producers wait for the broker's acknowledgment and resend the message if it does not arrive, so a message may be delivered more than once.
Exactly once
Exactly-once delivery semantics in Kafka are the promise that a message will be processed and delivered just once to a consumer, even if there are failures or retries.
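One common way to map these semantics onto producer settings is sketched below. The class, enum, and transactional ID are hypothetical, and the mapping is an assumption that covers the producer side only; how consumers commit offsets also affects the end-to-end guarantee. Plain string keys are used so the sketch compiles without the Kafka client library.

```java
import java.util.Properties;

final class DeliverySemanticsSettings {
    enum Semantics { AT_MOST_ONCE, AT_LEAST_ONCE, EXACTLY_ONCE }

    static Properties producerSettings(Semantics semantics) {
        Properties props = new Properties();
        switch (semantics) {
            case AT_MOST_ONCE:
                // Fire and forget: no acknowledgment, no retries.
                props.setProperty("acks", "0");
                props.setProperty("retries", "0");
                props.setProperty("enable.idempotence", "false");
                break;
            case AT_LEAST_ONCE:
                // Wait for all in-sync replicas and retry on failure; duplicates are possible.
                props.setProperty("acks", "all");
                props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
                break;
            case EXACTLY_ONCE:
                // Idempotent, transactional producer; the ID below is a placeholder.
                props.setProperty("acks", "all");
                props.setProperty("enable.idempotence", "true");
                props.setProperty("transactional.id", "example-transactional-id");
                break;
        }
        return props;
    }
}
```

Stronger guarantees generally cost latency: acks=0 returns immediately, while acks=all waits for replication, and transactions add commit overhead. Exactly-once delivery also requires the producer to use the transactional API (initTransactions, beginTransaction, commitTransaction) and consumers to read with isolation.level=read_committed.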
Consumers are in charge of reading data from Kafka. Consider the following key configuration options:
fetch.min.bytes and fetch.max.bytes: These options govern how much data a consumer obtains in a single request. fetch.min.bytes is the minimum amount of data the broker gathers before answering a fetch request, so a larger value reduces the number of requests but can delay delivery; fetch.max.bytes caps the size of a response.
max.poll.records: This specifies the maximum number of records that a single call to poll() returns. When there are a large number of messages to process, increasing this value can help improve throughput.
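The latency effect of fetch.min.bytes can be estimated with the simplified model below. It assumes a constant incoming data rate and that the broker answers a fetch either when fetch.min.bytes of data is available or when fetch.max.wait.ms has elapsed, whichever comes first. The class and method names are hypothetical.

```java
final class FetchDelayEstimate {
    // Estimated time in milliseconds that a fetch request waits on the broker.
    static double estimatedWaitMs(long fetchMinBytes, long fetchMaxWaitMs, double incomingBytesPerSec) {
        if (incomingBytesPerSec <= 0) {
            return fetchMaxWaitMs; // no data arriving: the request waits until the timeout
        }
        double timeToFillMs = fetchMinBytes * 1000.0 / incomingBytesPerSec;
        return Math.min(timeToFillMs, fetchMaxWaitMs);
    }
}
```

With fetch.min.bytes set to 100,000 and data arriving at 1,000,000 bytes per second, the estimate is 100 ms. At one tenth of that rate, the wait is capped by a fetch.max.wait.ms of 500. Keeping fetch.min.bytes small therefore favors latency over request efficiency.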
Because Kafka clusters can have multiple brokers, optimizing the cluster's overall configuration can help improve performance. Consider the following key configuration options:
replication.factor: This determines the number of replicas of each partition stored in the cluster. Increasing this value improves fault tolerance but increases network traffic and disk usage.
min.insync.replicas: This specifies the minimum number of in-sync replicas that must acknowledge a write before a producer using acks=all considers a message successfully sent. Increasing this value can improve durability but also increase latency.
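The relationship between these two settings can be expressed as simple arithmetic. The sketch below, with a hypothetical class name, computes how many replicas of a partition can be unavailable while producers using acks=all can still write to it.

```java
final class ReplicationHeadroom {
    // Number of replicas that can fail before acks=all writes are rejected.
    static int tolerableReplicaFailures(int replicationFactor, int minInsyncReplicas) {
        if (minInsyncReplicas < 1 || minInsyncReplicas > replicationFactor) {
            throw new IllegalArgumentException(
                    "min.insync.replicas must be between 1 and the replication factor");
        }
        return replicationFactor - minInsyncReplicas;
    }
}
```

A replication factor of 3 with min.insync.replicas set to 2 tolerates one unavailable replica. Setting min.insync.replicas equal to the replication factor tolerates none, and every write waits for every replica.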
Latency is an important consideration when designing and optimizing a Kafka cluster. Network configuration, broker and consumer settings, and hardware resources all affect Kafka latency. By carefully tuning them, Kafka users can reduce latency and keep their data pipelines performing well.
A few specific strategies to reduce Kafka latency include: optimizing network settings, increasing hardware resources, and configuring Kafka producers and consumers to operate more efficiently.
Overall, effective latency management requires careful monitoring, experimentation, and ongoing optimization to ensure that Kafka continues to provide fast, reliable, and efficient data processing capabilities.