Improving performance in Apache Kafka

Kafka optimization

High throughput is a key objective of a distributed real-time data processing platform such as Apache Kafka®. The right approach to optimizing performance depends on where the bottleneck is.

In this article, we discuss ways to optimize the performance of the Kafka cluster and the clients that connect to it. We emphasize monitoring, provide hints for identifying specific problems, and give an overview of the parameters you can adjust for Kafka optimization.

Summary of key Kafka optimization concepts

Considering and optimizing the four core aspects below helps you meet the performance requirements of any Apache Kafka use case.

Partitions and consumer groups: High throughput via a higher degree of parallelism, leveraging the foundational concept of scaling out Apache Kafka.
Cluster optimization: Appropriate cluster sizes are the foundation for meeting your performance objectives.
Producer optimization: On the client side, several configuration parameters, e.g., 'linger.ms', let you fine-tune producer performance.
Consumer optimization: Configuration parameters such as 'fetch.min.bytes' and 'max.poll.records' let you tune consumer performance.

Partitions and consumer groups

In Apache Kafka, the consumer group protocol enables scalable consumption of data from topics by multiple consumers. Partitions allow parallel data processing, and you can combine them with consumer groups to scale out consumption. If reading the data is the bottleneck, scaling your partitions and consumer groups is a straightforward Kafka optimization technique.

How partitions work

Kafka topics have only one partition by default. However, you can divide each topic in Kafka into multiple partitions for parallel data processing. A starting point for increasing performance is to create topics with more partitions. You can have 3, 6, or even 20 partitions in your production setup, depending on your requirements. The sketch below shows one way to create such a topic programmatically.
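
As a minimal sketch, assuming a broker is reachable on localhost:9092, you can create such a topic with Kafka's AdminClient (the topic name and partition count below are illustrative):

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {

    public static void main(String[] args) throws Exception {
        Properties adminConfig = new Properties();
        adminConfig.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(adminConfig)) {
            // Create "test-topic" with 6 partitions; a replication factor of 1
            // keeps the example runnable on a single local broker
            NewTopic topic = new NewTopic("test-topic", 6, (short) 1);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}

Alternatively, the kafka-topics.sh command-line tool accepts a --partitions flag for the same purpose.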

How consumers work

A consumer group is a collection of consumers that work together to consume data from a topic. Kafka assigns partitions to each consumer in the group for reading data. The assignment is done in a way that balances the load among all consumers in the group.

The following code snippet shows a simple Java consumer. A unique group ID identifies the consumer group. Note that we set the value "test-group" for the configuration "group.id".

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerExample {
    public static void main(String[] args) {
        Properties consumerConfig = new Properties();
        consumerConfig.put("bootstrap.servers", "localhost:9092");
        // The group ID that identifies the consumer group
        consumerConfig.put("group.id", "test-group");
        // Start from the earliest offset when no committed offset exists
        consumerConfig.put("auto.offset.reset", "earliest");
        consumerConfig.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerConfig.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerConfig);
        consumer.subscribe(Arrays.asList("test-topic"));

        while (true) {
            // Poll for new records, waiting up to 100 ms
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            for (ConsumerRecord<String, String> record : records) {
                System.out.println("Message: " + record.value());
            }
        }
    }
}

Here is how to add more consumers to the consumer group "test-group".

  • Run new consumer instances using the same group.id ("test-group") as the existing consumer in the group. You can run new consumers in a different thread or on a different server.
  • The Kafka cluster automatically detects the new consumers and balances the partitions among all consumers in the group. This is called partition rebalancing.

Cluster optimization

If you can always add more partitions and consumers, can you scale almost indefinitely? In theory, yes, but in practice, the hardware resources of the Kafka cluster limit this. Therefore, monitoring the cluster and identifying potential performance bottlenecks is very important to ensure its reliability and availability.

Monitor cluster performance

Kafka comes with several built-in tools that you can use to monitor the cluster, such as kafka-cluster.sh or kafka-broker-api-versions.sh. However, they do not generate in-depth insights.

Fortunately, Kafka also exposes several JMX MBeans, which you can scrape, e.g., with the Prometheus JMX exporter, and analyze with third-party tools like Grafana for Kafka optimization. These metrics provide a more comprehensive view, and you can observe the following aspects; a minimal JMX polling sketch follows the list.

  • Broker metrics include the number of messages in, the number of messages out, and the number of active connections.
  • Consumer metrics include the number of messages consumed and the lag behind the head of the partition.
  • Producer metrics include the number of messages produced and the time taken to produce messages.
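
As a minimal sketch of reading one of these metrics directly, assuming the broker was started with remote JMX enabled on port 9999 (e.g., via the JMX_PORT environment variable), the following snippet polls the broker's MessagesInPerSec meter:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BrokerMetricsProbe {

    public static void main(String[] args) throws Exception {
        // Assumes remote JMX is enabled on the broker, e.g., started with JMX_PORT=9999
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");

        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            ObjectName messagesIn = new ObjectName(
                    "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec");
            // The meter also exposes Count, FiveMinuteRate, and FifteenMinuteRate
            Object rate = connection.getAttribute(messagesIn, "OneMinuteRate");
            System.out.println("Messages in per second (1-minute rate): " + rate);
        }
    }
}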

In particular, you should also monitor the cluster hardware components. The memory, CPU usage, and network bandwidth can become a bottleneck as the load increases.

Horizontal and vertical cluster scaling

If there are no means to reduce the hardware requirements on the client side, you may need to scale the cluster horizontally or vertically. Horizontal scaling involves adding more brokers to the cluster and rebalancing the partitions among them, as sketched below. It increases the capacity and improves the performance of the cluster. Vertical scaling involves upgrading the hardware of the existing nodes in the cluster, such as increasing the amount of memory or CPU resources.
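
As a minimal sketch of that rebalancing step, assuming a broker with ID 2 has just been added and that "test-topic" from the earlier examples exists, the AdminClient can move a partition's replicas onto the new broker (the broker IDs are illustrative; the kafka-reassign-partitions.sh tool provides the same capability on the command line):

import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.common.TopicPartition;

public class ReassignPartitionExample {

    public static void main(String[] args) throws Exception {
        Properties adminConfig = new Properties();
        adminConfig.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(adminConfig)) {
            // Move partition 0 of "test-topic" so its replicas live on
            // brokers 1 and 2, e.g., after broker 2 joined the cluster
            Map<TopicPartition, Optional<NewPartitionReassignment>> reassignment =
                    Collections.singletonMap(
                            new TopicPartition("test-topic", 0),
                            Optional.of(new NewPartitionReassignment(Arrays.asList(1, 2))));
            admin.alterPartitionReassignments(reassignment).all().get();
        }
    }
}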

Of course, when performance bottlenecks occur, we should not simply keep adding more hardware, because that increases costs. In the next section, we give an overview of optimizing the performance of Kafka clients like producers, consumers, or Kafka Streams applications.

Producer optimization

There are several ways to optimize Kafka producers to reduce the load on the Kafka cluster.

Batch messages

To optimize a Kafka producer, one option is to batch multiple small messages into a single request. Batching reduces the overhead of individual message production, network bandwidth consumption for outgoing messages, and network round trips. In this context, you must configure the following parameters:

  • The batch size setting (batch.size) determines the maximum size, in bytes, of a batch of messages the producer sends in a single request.
  • The linger time setting (linger.ms) determines how long the producer waits before sending a batch of messages, even if the batch size has not been reached.

The default for batch.size is 16 KB (16,384 bytes), and for linger.ms it is 0, so the producer transmits each message as soon as possible. When you increase these settings, the producer waits longer and sends bigger batches. You thus increase producer efficiency by reducing the overhead of multiple requests.

Compression

Another option for optimizing producers is compression. You can activate message compression by setting the 'compression.type' configuration property, for example, to 'snappy' or 'gzip'. Compressing messages before sending reduces message size and network overheads. However, compression comes at the cost of increased CPU utilization, so you must weigh this trade-off before enabling it.

Acks configuration settings

Finally, you can adjust the 'acks' configuration property to control the number of acknowledgments the producer requires the broker to receive before considering a request complete. Often, 'acks' is set to 'all' in a production setup, so the producer waits for acknowledgments from all in-sync replicas before considering the request complete.

The 'all' setting provides the highest reliability but also the highest latency. You can set this value to '1' to increase performance. In this case, the producer waits for a single acknowledgment from the leader broker and considers the request complete once it receives the acknowledgment. Again, the Kafka optimization tradeoff is between reliability and throughput, and you decide for your use case.

Optimized producer code example

The following code snippet shows a producer with the Kafka optimization configuration we described above.

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KafkaProducerExample {

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Wait for the leader's acknowledgment only (trades reliability for throughput)
        properties.setProperty(ProducerConfig.ACKS_CONFIG, "1");
        // Wait up to 20 ms for more messages to fill a batch
        properties.setProperty(ProducerConfig.LINGER_MS_CONFIG, "20");
        // Maximum batch size in bytes (32 KB)
        properties.setProperty(ProducerConfig.BATCH_SIZE_CONFIG, "32768");
        // Compress batches to reduce network overhead
        properties.setProperty(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
        ProducerRecord<String, String> record = new ProducerRecord<>("test-topic", "key", "value");

        producer.send(record);
        producer.flush();
        producer.close();
    }
}


Consumer optimization

Regarding consumer applications, you can leverage consumer groups to take advantage of the parallelism provided by partitioning. Consumer optimization often has the greatest effect and is central to scaling seamlessly. This section looks at a few more aspects of optimizing consumer applications.

Batch messages

Just as with producers, processing messages in batches improves the performance of a Kafka consumer by reducing the overhead of individual message processing. The max.poll.records configuration property controls the maximum number of records returned in a single poll of a Kafka consumer. The default value for max.poll.records is 500. Depending on your requirements, you can configure a higher value for better throughput, such as 1000 or 2000.

Fetch minimum bytes

Another option with a similar effect is to adjust fetch.min.bytes, which defaults to 1 byte. The default setting means that the Kafka server answers fetch requests as soon as a single byte of data is available, or the fetch request times out waiting for data to arrive (controlled by fetch.max.wait.ms). Setting this to something greater than one causes the server to wait for larger amounts of data to accumulate. Waiting improves server throughput at the cost of some additional latency. A reasonable value for the setting would be 16 KB.

Optimized consumer code example

To apply the two configurations described in this section to the consumer example from the first section, add the following lines to its configuration.

consumerConfig.put("fetch.min.bytes", "16384");   // wait for at least 16 KB per fetch
consumerConfig.put("max.poll.records", "1000");   // return up to 1000 records per poll

In addition to these Kafka-side configurations, it is essential to set up appropriate monitoring for your Kafka clients, just as with the Kafka cluster itself. Consumer applications in particular, which typically run on separate servers, can quickly escalate memory requirements or CPU load.

You should monitor and control how much data the application consumes and what it does with it, for example, whether it holds data in memory or performs expensive data transformations. Consumer lag is the key metric here; the sketch below shows how to measure it.
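
As a minimal sketch of tracking consumer lag, assuming the group "test-group" and topic from the earlier examples and a broker on localhost:9092, you can compare the group's committed offsets against each partition's end offset:

import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerLagCheck {

    public static void main(String[] args) throws Exception {
        Properties adminConfig = new Properties();
        adminConfig.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        Properties consumerConfig = new Properties();
        consumerConfig.put("bootstrap.servers", "localhost:9092");
        consumerConfig.put("key.deserializer", StringDeserializer.class.getName());
        consumerConfig.put("value.deserializer", StringDeserializer.class.getName());

        try (AdminClient admin = AdminClient.create(adminConfig);
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerConfig)) {
            // Committed offsets for every partition the group reads
            Map<TopicPartition, OffsetAndMetadata> committed = admin
                    .listConsumerGroupOffsets("test-group")
                    .partitionsToOffsetAndMetadata()
                    .get();
            // Latest offsets (the "head") of those partitions
            Map<TopicPartition, Long> endOffsets = consumer.endOffsets(committed.keySet());

            for (Map.Entry<TopicPartition, OffsetAndMetadata> entry : committed.entrySet()) {
                if (entry.getValue() == null) continue; // no committed offset yet
                long lag = endOffsets.get(entry.getKey()) - entry.getValue().offset();
                System.out.println(entry.getKey() + " lag: " + lag);
            }
        }
    }
}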

Conclusion

The fundamental way to scale out Apache Kafka is via partitions and consumer groups. Specific causes of poor performance include data ingestion bottlenecks, producer and consumer settings, and cluster configurations. Kafka optimization involves assessing the problem area and choosing the best solution for your use case.

You can increase partitions, scale your clusters, or optimize your Kafka clients. However, most throughput optimizations involve a trade-off, such as reduced reliability or increased latency. Monitoring your system, establishing benchmarks, and optimizing performance based on data is essential.
