Comparing Apache Kafka alternatives

Kafka benchmark

Data streaming platforms have become a critical component in today's application architecture, especially in cases requiring continuous data analysis and transformation in near real-time. That is why solutions like Apache Kafka® are used across various applications, from e-commerce websites to social networking platforms. A component this critical will naturally be scrutinized for performance, and that performance must be sustained over long periods of time as the application continues to be used and scaled up.

This is where benchmarking comes in—it provides a reliable and repeatable way to measure and compare the performance of a tool with later versions of itself and with other tools that perform the same function. 

In this article, we look at some nuances of benchmarking the performance of a data streaming platform like Apache Kafka® and use common benchmarks to measure it. We also demonstrate how to use the benchmarks to compare Kafka with Redpanda. Redpanda is a simple, powerful, and cost-efficient streaming data platform that's compatible with Kafka® APIs while eliminating Kafka complexity.

Summary of key Kafka benchmark concepts

| Concept | Description |
| --- | --- |
| General software benchmarking | Software benchmarking enables the creation of standardized tests that analyze the performance of a given tool and compare it with others. |
| Creating benchmarks for Kafka | Benchmarking a data streaming solution like Kafka entails measuring cluster latencies while varying factors like message throughput, cluster configurations, and hardware. |
| Well-known Kafka benchmarks | Several benchmarking frameworks have been created for Kafka, like the OpenMessaging Benchmark Framework. |
| Comparing Kafka to Redpanda | Using benchmarks to compare Redpanda with Kafka shows that Redpanda is consistently faster. |
| Redpanda's architectural differences | Redpanda is aligned with modern systems and uses its proximity to the hardware to extract its faster performance. |

What is benchmarking?

Benchmarking software is the process of creating a standard, repeatable set of tests that you can use to measure the performance of the software. You can then use the same tests to compare other versions of the software or other software applications that perform the same function.

Though simple in principle, benchmarking software is challenging because of the number of possible variables that affect the performance of a software application. Everything from the actual task given to the software to the underlying processor, memory, and other configurations on which the tests run can impact performance. Hence, each variable factor must be carefully measured and standardized before a test can be considered a true benchmark. This usually involves patiently running the same test repeatedly to verify that the performance of the software is not changing and no variables have been missed. 
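As a rough illustration of the repeatability check described above, the sketch below compares the same latency metric across repeated, identical runs. The function name `is_stable` and the 5% tolerance are assumptions made for this example, not a standard.

```python
import statistics

def is_stable(run_results_ms, max_cv=0.05):
    """Return True when repeated runs agree closely enough to trust the test.

    run_results_ms: the same latency metric (for example, P99 in ms) taken
    from repeated, identical runs. max_cv is an assumed tolerance.
    """
    if len(run_results_ms) < 2:
        raise ValueError("need at least two runs to judge repeatability")
    mean = statistics.mean(run_results_ms)
    # Coefficient of variation: spread of the runs relative to their mean
    cv = statistics.stdev(run_results_ms) / mean
    return cv <= max_cv

# Illustrative numbers only, not measured results
print(is_stable([10.1, 9.9, 10.0, 10.2]))   # True: runs agree closely
print(is_stable([10.0, 14.0, 8.0, 19.0]))   # False: some variable is uncontrolled
```

If the check fails, the usual next step is to look for a variable that was not held constant between runs rather than to loosen the tolerance.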

If done properly, though, software benchmarking can be helpful in several ways:

Demonstrating performance

You can use a benchmark to determine how well a particular tool performs. Because of its repeatable nature, claims made using a benchmark are verifiable independently of the party making the claims and hence are more trustworthy. Creating a benchmark can also help the software makers better understand their own software and its performance, uncovering previously overlooked behavioral traits and helping to fix them.

Quality-testing software updates

Every person involved in operating an IT environment knows the challenges of upgrading software in a running environment. You can use a well-defined benchmark to test any new version of an existing tool before upgrading to a production environment. Benchmarking is the best way to determine if performance is unchanged or improved after an upgrade.
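One way to put this into practice is a simple regression check that compares the latencies of a candidate version with a recorded baseline. This is a minimal sketch: the function name `check_upgrade`, the 10% tolerance, and all numbers are hypothetical.

```python
def check_upgrade(baseline_ms, candidate_ms, tolerance=0.10):
    """Return the percentiles where the candidate is slower than the
    baseline by more than `tolerance` (an assumed threshold)."""
    regressions = []
    for pct, base in baseline_ms.items():
        if candidate_ms[pct] > base * (1 + tolerance):
            regressions.append(pct)
    return regressions

# Illustrative numbers only, not measured results
baseline = {"p50": 2.0, "p99": 10.0, "p99.9": 60.0}
candidate = {"p50": 2.1, "p99": 10.5, "p99.9": 75.0}
print(check_upgrade(baseline, candidate))  # ['p99.9']
```

Both sets of numbers must come from the same benchmark on comparable hardware for the comparison to mean anything.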

Comparing tools

Multiple software solutions exist for the same purpose, and the existence of some features may lead to a push to change to a different tool. Before doing this, it is always useful to test both tools using a similar benchmark to ensure that the new tool does not degrade performance in your environment. 

Benchmarking Kafka

Data streaming applications like Kafka are regularly used to process and transport vast amounts of data. For instance, it is common for a Kafka cluster in a production deployment of large SaaS products to be loaded with a throughput of 1 GB/second. Enabling this scale requires us to be more specific and precise while defining Kafka performance expectations under a given load.

Measuring performance - throughput and latency

The performance of a data streaming solution like Kafka boils down to how fast messages written to the cluster become available to consumers. This can be measured in terms of the latency, or delay, introduced by the cluster. Latency is measured by comparing the time a producer attempts to write a message to the cluster with when it first becomes available to a consumer. Latencies are generally measured in terms of a percentile score. 

For example, a P99 latency of 10ms indicates that 99% of all messages processed by the cluster encounter a latency of less than 10ms. This may seem like a pretty stringent measure in itself, but it is not nearly enough to measure Kafka's performance. In a setup where a million messages per second are being written to the cluster, ten thousand messages per second would encounter latencies higher than the P99 latency! This can have a large impact on the overall performance of our application. So we must consider even more demanding latency guarantees (called "tail latencies"), such as P99.99.
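The arithmetic above can be sketched in a few lines. The example below uses the nearest-rank definition of a percentile (one of several common definitions) and synthetic samples; the function names are made up for illustration.

```python
import math

def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct% of all samples at or below it."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # round() guards against floating-point noise before taking the ceiling
    rank = math.ceil(round(pct * len(ordered) / 100, 9))
    return ordered[max(rank, 1) - 1]

def messages_above(pct, rate_per_second):
    """Messages per second that may exceed the pct-th percentile latency."""
    return rate_per_second * (100 - pct) / 100

# Synthetic samples for illustration only: 1 ms, 2 ms, ..., 1000 ms
samples = list(range(1, 1001))
print(percentile(samples, 99))                  # 990
print(round(messages_above(99, 1_000_000)))     # 10000
print(round(messages_above(99.99, 1_000_000)))  # 100
```

At a million messages per second, moving from P99 to P99.99 shrinks the unaccounted-for tail from ten thousand messages per second to one hundred.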

Benchmark parameters

Now that we know what to measure, it's time to define the benchmarks. Several factors can affect the performance of a Kafka cluster, and we'll have to define each to create a benchmark. We will then measure the latencies of our Kafka cluster against this benchmark. We can create several benchmarks like this, each having different values for the variables affecting performance. Some of the variables we should consider are:

  • The underlying (virtualized) hardware that we run our clusters on. These could be certain flavors of virtual machines in the cloud or standard CPU and memory resources.

  • Throughput of data written to the cluster

  • Number of producers writing messages in parallel

  • Number of consumers

  • Number of topics

  • The number of nodes in the cluster

  • The partition count per node in the cluster

Each combination of these variables constitutes a different test and results in different message latencies. These latencies can be measured and used to make general claims about Kafka's performance.
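To make the idea of "each combination is a different test" concrete, the sketch below enumerates a small test matrix. All parameter names and values here are hypothetical placeholders; substitute the ones that match your own environment.

```python
from itertools import product

# Hypothetical values for illustration only
parameters = {
    "instance_type": ["i3en.large", "i3en.3xlarge"],
    "throughput_mb_s": [50, 500],
    "producers": [4],
    "consumers": [4],
    "topics": [1],
    "nodes": [3, 9],
    "partitions_per_node": [16],
}

def build_test_matrix(params):
    """Return one dict per combination of parameter values."""
    keys = list(params)
    return [dict(zip(keys, combo)) for combo in product(*params.values())]

tests = build_test_matrix(parameters)
print(len(tests))  # 8 combinations: 2 instance types x 2 throughputs x 2 cluster sizes
print(tests[0])
```

The number of tests grows multiplicatively with each variable, which is one reason published benchmarks fix most parameters and vary only a few.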


Example benchmarks

Several benchmarks have been created for data streaming solutions like Kafka and made publicly available so that different solutions can be tested against them and their performance compared. The OpenMessaging Benchmark Framework defines a set of benchmark tests and provides scripts that you can use to run them on multiple well-known data streaming solutions, such as Apache Pulsar, Amazon Kinesis, and others.

Some of the tests defined by this framework, with set values of different parameters, are as follows:

| Topics | Partitions per topic | Message size | Subscriptions per topic | Producers per topic | Producer rate (per second) |
| --- | --- | --- | --- | --- | --- |
| 1 | 10 | 1 kB | 1 | 1 | 10000 |
| 1 | 16 | 1 kB | 1 | 1 | 50000 |
| 1 | 3 | 100 bytes | 1 | 3 | 100000 |

As you can see, these tests vary the number of partitions, producers, rate of data being produced, and so on. You can run a Kafka cluster under these loads for a fixed amount of time and measure throughput and latencies obtained from it. You also have to ensure that you run tests on the same or comparable underlying hardware so that results are not skewed because of that.
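To relate these workloads to the write throughput figures used later, the sketch below converts each row of the table into MB/s. It assumes that 1 kB means 1,000 bytes and that the producer rate is the total across all producers; the `Workload` class is a hypothetical helper, not part of the framework.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Workload:
    topics: int
    partitions_per_topic: int
    message_size_bytes: int
    subscriptions_per_topic: int
    producers_per_topic: int
    producer_rate: int  # messages per second, assumed total across producers

    def write_throughput_mb_s(self):
        # messages/s x bytes/message, expressed in megabytes per second
        return self.producer_rate * self.message_size_bytes / 1_000_000

# The three workloads from the table above
workloads = [
    Workload(1, 10, 1000, 1, 1, 10_000),
    Workload(1, 16, 1000, 1, 1, 50_000),
    Workload(1, 3, 100, 1, 3, 100_000),
]
for w in workloads:
    print(w.partitions_per_topic, "partitions:", w.write_throughput_mb_s(), "MB/s")
```

Under these assumptions the three workloads write 10, 50, and 10 MB/s respectively, so the third test stresses message rate rather than byte volume.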

We used this framework and ran the following tests on Kafka with varying hardware, write throughput, and cluster configurations. Each test was run with a single topic, four producers, and four consumers. We got the following results:

| Instance type | Write throughput (MB/s) | Partitions | Number of nodes in cluster | Result (p99.9 latency) |
| --- | --- | --- | --- | --- |
| i3en.large (16 GiB RAM, 2 CPU) | 50 | 48 | 3 | 60 ms |
| i3en.3xlarge (96 GiB RAM, 12 CPU) | 500 | 144 | 4 | 220 ms |
| i3en.3xlarge (96 GiB RAM, 12 CPU) | 500 | 144 | 9 | 80 ms |
| i3en.6xlarge (192 GiB RAM, 24 CPU) | 1000 | 288 | 6 | 4000 ms |
| i3en.6xlarge (192 GiB RAM, 24 CPU) | 1000 | 288 | 9 | 500 ms |

Increasing write throughput causes latencies to increase. This is expected, since a Kafka cluster under more load has more work to do routing messages and making them available from the right node and partition. Using more powerful machines and larger clusters helps: at 500 MB/s, growing the cluster from four to nine nodes lowered the p99.9 latency from 220 ms to 80 ms, and at 1000 MB/s, growing it from six to nine nodes lowered it from 4000 ms to 500 ms. This is how analyzing benchmark test results can help you understand the performance of a tool and set expectations in an actual production environment.

Redpanda - a simple yet powerful alternative

Redpanda is a streaming data platform that’s fully compatible with the Kafka API, but built with a focus on making powerful streaming data simple. Any existing Kafka producers and consumers can work out of the box with a Redpanda cluster with no code changes. We ran the same tests mentioned above on Redpanda on the same AWS instance types and got significantly better performance.

Comparing Redpanda and Kafka

| Instance type | Write throughput (MB/s) | Partitions | Number of nodes in cluster | Result (p99.9 latency) |
| --- | --- | --- | --- | --- |
| i3en.large (16 GiB RAM, 2 CPU) | 50 | 48 | 3 | 10 ms |
| is4gen.medium (6 GiB RAM, 1 CPU) | 50 | 48 | 3 | 12 ms |
| i3en.3xlarge (96 GiB RAM, 12 CPU) | 500 | 144 | 3 | 10 ms |
| i3en.6xlarge (192 GiB RAM, 24 CPU) | 1000 | 288 | 3 | 100 ms |

A detailed analysis and comparison of the test results can be found at Redpanda vs. Kafka - Performance benchmark.

In short, Redpanda is much more performant than Kafka in all our tests, even with small cluster configurations (three nodes in each test). More importantly, the difference in performance increases with increasing throughput—the higher the load on the cluster, the more reason to use Redpanda!
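The comparison can be reproduced from the two results tables. The sketch below takes Kafka's best p99.9 result for each instance type and throughput (which, in these tests, came from the larger Kafka cluster) and sets it against the three-node Redpanda result. The function name and data layout are choices made for this example.

```python
# (instance type, write throughput MB/s, nodes, p99.9 latency ms), from the Kafka table
kafka = [
    ("i3en.large", 50, 3, 60),
    ("i3en.3xlarge", 500, 4, 220),
    ("i3en.3xlarge", 500, 9, 80),
    ("i3en.6xlarge", 1000, 6, 4000),
    ("i3en.6xlarge", 1000, 9, 500),
]
# (instance type, write throughput MB/s) -> p99.9 latency ms, from the Redpanda table
redpanda = {
    ("i3en.large", 50): 10,
    ("i3en.3xlarge", 500): 10,
    ("i3en.6xlarge", 1000): 100,
}

def compare(kafka_rows, redpanda_rows):
    """Return {workload: (latency gap in ms, Kafka/Redpanda ratio)}
    using Kafka's lowest p99.9 latency for each workload."""
    best = {}
    for instance, mb_s, _nodes, p999 in kafka_rows:
        key = (instance, mb_s)
        best[key] = min(best.get(key, p999), p999)
    return {
        key: (best[key] - rp, best[key] / rp)
        for key, rp in redpanda_rows.items()
        if key in best
    }

for (instance, mb_s), (gap_ms, ratio) in compare(kafka, redpanda).items():
    print(f"{instance} at {mb_s} MB/s: gap {gap_ms} ms, ratio {ratio:.1f}x")
```

Even against Kafka's best configuration, the latency gap grows from 50 ms at 50 MB/s to 70 ms at 500 MB/s and 400 ms at 1000 MB/s.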

Why Redpanda is more efficient (in every way)

Redpanda has been architected from scratch to make the Kafka API more efficient in every way—from performance to resource usage. Several features help it to achieve this:

  • Redpanda is implemented in C++, enabling much tighter coupling with the underlying operating system and leveraging lower-level mechanisms to extract performance out of modern hardware.

  • Redpanda provides an autotune feature that scans details of the hardware it is running on and optimizes Linux kernel settings for maximum performance.

  • Redpanda is optimized to push the potential of modern systems and avoids traditional CPU-heavy overheads like context switching and thread swapping, so individual tasks complete faster.

Conclusion

Benchmarking is a powerful way to analyze the performance of software applications. The huge load that data streaming solutions like Kafka are regularly expected to sustain brings about additional challenges while testing and comparing the performance of these tools. This, in turn, raises the need for even more thorough benchmarking practices. 

Using benchmarks to compare Kafka and Redpanda, a new and faster Kafka API-compatible data streaming solution, shows that Redpanda is consistently faster than Kafka. This is especially true under large loads. Redpanda's architectural features and close communication with the underlying systems make this performance possible.
