Improve Apache Kafka application testing by using Redpanda and Testcontainers.

By Oleg Šelajev on October 4, 2022
Simplify development of Kafka applications with Redpanda and Testcontainers

Redpanda is a developer-first streaming data platform compatible with the Apache Kafka API. It also has a number of advantages, such as being ZooKeeper-free and shipping as a single native binary (which helps with various Kubernetes-based deployment scenarios). Redpanda delivers up to 10x lower average latencies and up to 6x faster transactions than Kafka.

Testing is an essential part of the DevOps cycle, but without proper tools, it can quickly become a chore and slow down your development process. In this article, we look at how you can test your Kafka applications using Testcontainers and Redpanda for the best developer experience and efficiency.

Redpanda offers a fast-starting single binary that is friendly to containerized workloads, and a powerful, easy-to-use CLI for integrating with other technologies.

Testcontainers by AtomicJar reinvents the developer experience of creating reliable functional tests. It provides a programmatic API to create, configure and manage the lifecycle of lightweight, throwaway instances of common technologies applications rely on: databases, message brokers, or anything else that can run in a Docker container. This is a versatile approach to creating integration tests with the real technologies used in production.

For example, you can use the Testcontainers-java library to ensure your Spring application works well against an instance of Redpanda running in a Docker container. You can ensure the drivers work, the API your code is using returns the expected responses, the data marshalling mechanisms are compatible, and so on.

Using Redpanda for the tests helps you avoid issues that might be hard to reproduce with mocks, embedded Kafka, or other means. And Testcontainers removes the complexity from that setup and makes tests reproducible on any team-member workstation and in CI.

Let's take a look at how you can run integration tests against an instance of Redpanda using Testcontainers. There are three main things Testcontainers excels at:

  • Container lifecycle & cleanup
  • Container & service configuration in the container
  • Integration with application or test frameworks

The following line is all you need to create an object representing the Redpanda container:

RedpandaContainer kafka = new RedpandaContainer("docker.redpanda.com/vectorized/redpanda:v22.2.1");

The API exposes the lifecycle methods, so you can start and stop the container. It also allows you to programmatically configure both the container and the service running in it. For example, you can publish the required ports, set environment variables in the container, copy configuration files into the container, or create the database schema in a freshly created database.

Testcontainers-java has different ways to manage the lifecycle, for example tying it to the lifecycle of the JUnit tests.

After starting the container, the last thing in the test setup is to make sure your application knows where Redpanda is running. For Kafka-compatible technologies, this is the location of the bootstrap servers that clients connect to. You can query this information directly from the container object you created:

kafka.getBootstrapServers();
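Outside of Spring, the same address can be handed to a plain Kafka client. The sketch below is only an illustration: `KafkaTestConfig` and `producerProps` are hypothetical names that are not part of the sample application, and the address in `main` is a placeholder for whatever `kafka.getBootstrapServers()` returns in your test. The property keys and serializer class names are the standard Kafka client ones.

```java
import java.util.Properties;

// Hypothetical helper (not from the sample application): builds Kafka
// producer settings from the address reported by the test container.
final class KafkaTestConfig {

    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        // Standard Kafka client configuration keys.
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address; in a real test, pass kafka.getBootstrapServers().
        Properties props = producerProps("localhost:9092");
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```

The resulting `Properties` object is what you would pass to a `KafkaProducer` constructor, so the test code never hard-codes a host or port.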

For a Spring Boot app, you can use the @DynamicPropertySource mechanism to propagate this data to the Spring Boot context in a very idiomatic way:

@DynamicPropertySource
public static void setupthings(DynamicPropertyRegistry registry) {
    registry.add("spring.kafka.bootstrap-servers", kafka::getBootstrapServers);
}

After that, you can run the tests normally and Testcontainers will pull the Docker image, create the container, configure it, and run it. It also cleans everything up afterward so the environment is ready for the next test run.

Running the tests

Let's look at a sample run of the tests, which include using a Redpanda container via Testcontainers.

First, we add a dependency to the org.testcontainers:redpanda artifact.

<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>redpanda</artifactId>
    <version>1.17.5</version>
    <scope>test</scope>
</dependency>

Then, we define the container instance exactly how we saw it above and call .start() on it.

RedpandaContainer kafka = new RedpandaContainer("docker.redpanda.com/vectorized/redpanda:v22.2.1");
kafka.start();

Let's look at the logs of the test run and check what's happening with the Redpanda container:

2022-09-10 21:00:47.926 INFO 80778 --- [ers-lifecycle-1] ?.r.com/vectorized/redpanda:v22.2.1] : Pulling docker image: docker.redpanda.com/vectorized/redpanda:v22.2.1. Please be patient; this may take some time but only needs to be done once.
2022-09-10 21:00:48.234 INFO 80778 --- [ers-lifecycle-1] o.t.utility.RegistryAuthLocator : Credential helper/store (docker-credential-desktop) does not have credentials for docker.redpanda.com
2022-09-10 21:00:50.990 INFO 80778 --- [tream--85575341] ?.r.com/vectorized/redpanda:v22.2.1] : Starting to pull image
2022-09-10 21:00:50.999 INFO 80778 --- [tream--85575341] ?.r.com/vectorized/redpanda:v22.2.1] : Pulling image layers: 0 pending, 0 downloaded, 0 extracted, (0 bytes/0 bytes)
2022-09-10 21:01:03.636 INFO 80778 --- [tream--85575341] ?.r.com/vectorized/redpanda:v22.2.1] : Pulling image layers: 0 pending, 7 downloaded, 7 extracted, (128 MB/128 MB)
2022-09-10 21:01:03.666 INFO 80778 --- [tream--85575341] ?.r.com/vectorized/redpanda:v22.2.1] : Pull complete. 7 layers, pulled in 12s (downloaded 128 MB at 10 MB/s)
2022-09-10 21:01:03.692 INFO 80778 --- [ers-lifecycle-1] ?.r.com/vectorized/redpanda:v22.2.1] : Creating container for image: docker.redpanda.com/vectorized/redpanda:v22.2.1
2022-09-10 21:01:04.223 INFO 80778 --- [ers-lifecycle-1] ?.r.com/vectorized/redpanda:v22.2.1] : Container docker.redpanda.com/vectorized/redpanda:v22.2.1 is starting: cd842c5676f3a87c4ce749b04d725bb57a7efb0b78a4e4ca0edcc5dc46e8652b
2022-09-10 21:01:05.801 INFO 80778 --- [ers-lifecycle-1] ?.r.com/vectorized/redpanda:v22.2.1] : Container docker.redpanda.com/vectorized/redpanda:v22.2.1 started in PT18.080461S

Testcontainers downloads the Docker image for the container. Luckily, it's a fairly modest 128 MB, which can be pulled pretty quickly (pulled in 12s (downloaded 128 MB at 10 MB/s)).

Subsequent runs reuse the local image cache and don't spend time pulling the image again, which considerably improves the startup time:

Container docker.redpanda.com/vectorized/redpanda:v22.2.1 started in PT3.244128S
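The "started in" values in these logs are ISO-8601 durations, so they can be parsed with `java.time.Duration` to see how much time the cached image saves. This is a small sketch using the two values from the log lines above; the class and method names are made up for illustration.

```java
import java.time.Duration;

// Illustrative helper: compares the first (image pull) run with a cached run.
final class ImageCacheSavings {

    // Both arguments are ISO-8601 durations as printed by Testcontainers.
    static Duration saved(String firstRun, String cachedRun) {
        return Duration.parse(firstRun).minus(Duration.parse(cachedRun));
    }

    public static void main(String[] args) {
        Duration saved = saved("PT18.080461S", "PT3.244128S");
        // Prints the difference in whole milliseconds.
        System.out.println("Saved " + saved.toMillis() + " ms");
    }
}
```

For the values above, the difference is a little under 15 seconds, most of which is the 12-second image pull.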

The tests pass normally. The application is using Redpanda via its Kafka-compatible API, sending and receiving the messages.

Another curious observation is that, given the compatibility, you can easily replace the container definition with another Kafka implementation and rerun the tests. Substitute the Redpanda container instantiation with a more traditional one:

KafkaContainer kafka = new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:5.4.6"));

We don't have to change another line in the tests and they run against a Kafka instance running in an ephemeral Docker container.

The tests pass, but there is a curious thing to note. From the logs, we see that the Redpanda container starts approximately twice as fast as Kafka (about 3.2 seconds versus about 6.6 seconds):

2022-09-10 21:46:03.783 INFO 81186 --- [ers-lifecycle-0] 🐳 [confluentinc/cp-kafka:5.4.6] : Container confluentinc/cp-kafka:5.4.6 started in PT6.561209S
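To put a number on "approximately twice as fast", you can divide the two startup durations reported in the logs. The sketch below uses the cached Redpanda start (PT3.244128S) and the Kafka start (PT6.561209S); the class and method names are illustrative only, and a single pair of measurements is of course not a benchmark.

```java
import java.time.Duration;
import java.util.Locale;

// Illustrative helper: ratio between two ISO-8601 startup durations.
final class StartupComparison {

    static double ratio(String slower, String faster) {
        return (double) Duration.parse(slower).toNanos()
                / Duration.parse(faster).toNanos();
    }

    public static void main(String[] args) {
        double r = ratio("PT6.561209S", "PT3.244128S");
        // Locale.ROOT keeps the decimal separator a dot on every machine.
        System.out.println(String.format(Locale.ROOT, "Kafka / Redpanda startup: %.2fx", r));
    }
}
```

With these two log values the ratio comes out at roughly 2.0.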

A curious detail is that having a dedicated Testcontainers module contributes quite a bit to this faster startup. The RedpandaContainer class encapsulates best practices, both from the testing point of view and for using Redpanda efficiently. For example, the module starts Redpanda in the dev-container mode:

command = command + "/usr/bin/rpk redpanda start --mode dev-container ";

The dev-container flag is an umbrella switch for configuring Redpanda with the most sensible config for the tests. You can check this issue for more details but, in a nutshell, the container for the tests can make different default tradeoff decisions compared to running in production. For example, it can accept the risk of data being corrupted if everything crashes, run faster by limiting replication, map some files into an in-memory filesystem, and so on.

In the tests, you can override any of this config, so if you want particular tests to run against a Redpanda config without the dev-container mode, that's possible.

This is a great way to further improve the experience of developers testing their apps with Testcontainers and Redpanda. What's also great is that your tests will benefit from optimizations added in the future without any code changes.

Note that other Testcontainers modules also use similar optimizations because it's really the most convenient integration point to specify that test-specific configuration without making default production config more confusing!

Conclusion

We've looked at how simple it is to use Testcontainers for enabling integration tests against Redpanda running in Docker. We looked at a sample Spring Boot application running the tests, and compared the startup times for Redpanda and Apache Kafka containers.

In our tests, Redpanda starts about twice as fast as Apache Kafka, which is definitely an improvement for integration tests, especially if you want good levels of isolation and start containers more frequently.

One of the reasons for the better startup time is that Testcontainers-provided abstractions are ideal to enable test-specific config, like the dev-container flag in the Redpanda case.

You can look at the sample application, run the tests, and do your own measurements in the GitHub repo.

Take Redpanda for a test drive here. Check out the documentation to understand the nuts and bolts of how the platform works, or read more blogs to see the plethora of ways to integrate with Redpanda. To ask Solution Architects and Core Engineers questions and interact with other Redpanda users, join the Redpanda Community on Slack.

Happy testing!
