This blog post explores MQTT, one of the most popular IoT protocols, and how to combine it with Redpanda to build a robust, scalable solution for real-time data streaming in IoT use cases. Waterstream is a novel approach in this landscape that turns Redpanda into a fully fledged MQTT broker. This article describes how Waterstream works with Redpanda and presents an edge setup to make things more practical.
Applications of the Internet of Things
The number of Internet of Things (IoT) devices is expected to grow to tens of billions in a few years, accelerating the adoption of IoT use cases in different contexts. IoT technologies are adaptable to nearly any process that can provide pertinent information about its operation and environmental conditions. Several IoT use cases aim to make operations smart, thus improving companies' production processes, enhancing or predicting maintenance operations, or optimizing resource use.
A non-exhaustive list of interesting and relevant cases includes:
Fleet management: fleet vehicles with sensors installed make it easier to create an efficient connection between the vehicles, their management, and drivers. The software in charge of gathering, analyzing, and arranging the data may be accessed by both the driver and manager/owner to provide a wealth of information about the vehicle's state, operation, and requirements.
Smart agriculture: a large amount of information on the soil's condition and phases may be gathered using IoT sensors. Farmers can control irrigation and use water more effectively. Machine learning can determine the best times to start sowing and even identify the presence of diseases in plants by using information about soil moisture, acidity level, specific nutrients, temperature, and many other chemical characteristics.
Energy grid: intelligent energy meters (meters equipped with sensors), together with sensors placed at important locations along the routes from production facilities to the distribution points, enable better monitoring and control of the electrical network. A bidirectional communication channel between the service provider and the end user can supply valuable information for defect detection, decision-making, and fault repair. It also gives users helpful information about their consumption habits and the most effective strategies to reduce or modify their energy usage.
Maintenance management: real-time monitoring of physical assets enables condition-based maintenance (CBM) to be performed when a measurement is out of range. The application of artificial intelligence (AI) methods like Machine Learning or Deep Learning can predict failures before they occur.
Health: doctors can stay updated on a patient's condition outside the hospital by using wearables or sensors attached to the patient. The Internet of Things assists in improving quality care and preventing fatal occurrences in high-risk patients by continually monitoring specific metrics and sending automatic alerts on their vital signs.
All these use cases can benefit from large-scale event processing across tens of thousands or millions of users, machines, and device interfaces. Real-time data streaming enables data analytics, smart alerting, complex event processing, and AI.
An introduction to MQTT and Redpanda
MQTT is a lightweight messaging protocol that is frequently used in IoT applications. It's designed to work over slow and unstable networks, so it tolerates faults well and lets you trade speed against delivery guarantees through three Quality of Service levels: "at most once", "at least once", and "exactly once". The protocol is standardized by the International Organization for Standardization (ISO), and client libraries are widely available in all major programming languages. This makes MQTT the most common way for IoT devices to communicate with servers.
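To make the three delivery guarantees more concrete, here is a minimal simulation in plain Python. It is only a sketch: the function name and the "lost" / "ack_lost" / "ok" outcomes are made up for illustration, and the real QoS 2 flow is a four-step handshake that is reduced here to receiver-side duplicate suppression.

```python
from typing import List


def simulate_delivery(qos: int, outcomes: List[str]) -> int:
    """Return how many copies of one message the receiving application sees.

    Each entry in `outcomes` describes one transmission attempt:
    "lost" (never arrives), "ack_lost" (arrives, but the acknowledgement
    is lost), or "ok" (arrives and is acknowledged).
    """
    delivered = 0
    already_seen = False  # receiver-side duplicate detection, used by QoS 2 only
    for outcome in outcomes:
        if outcome != "lost":
            if qos < 2 or not already_seen:
                delivered += 1
            already_seen = True
        # QoS 0 never retries; QoS 1 and QoS 2 retry until acknowledged.
        if qos == 0 or outcome == "ok":
            break
    return delivered


# One unlucky sequence: first attempt lost, second arrives but its ack is lost.
attempts = ["lost", "ack_lost", "ok"]
for qos in (0, 1, 2):
    print(f"QoS {qos}: delivered {simulate_delivery(qos, attempts)} time(s)")
```

With this sequence of attempts, QoS 0 loses the message, QoS 1 delivers it twice, and QoS 2 delivers it once, which is the speed versus guarantee trade-off described above.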
On the other hand, MQTT isn't a good choice for building the internals of the enterprise infrastructure, which is intended for handling IoT data or controlling IoT devices. MQTT does not come with a streaming data platform or streaming processing tools.
In the past, Apache Kafka® has been the go-to streaming data platform. Kafka can scale for high-throughput applications, supports reprocessing of events, and has a rich set of readily available clients and integrations. However, Kafka is now over a decade old, and the fundamental principles of its design are less suited to running low-latency applications on modern hardware platforms. Enter Redpanda, the modern streaming data platform built from the ground up in C++. It uses advanced software libraries and principles to deliver greater performance and data safety from modern hardware, providing operational simplicity and a great developer experience. Redpanda is Kafka API-compatible, so it is easy for developers to adopt and use it as a drop-in replacement for Kafka.
There are some expectations held by Kafka and Redpanda that make them less suitable for connecting IoT devices directly. They expect stable client connections for reliable data transfer, and there are no IoT-specific features such as Keep Alives or Last Will. There is a need for an integrated solution to improve how MQTT devices and Redpanda communicate with each other.
Combining Redpanda and MQTT
Several options allow combining Redpanda with MQTT. Before examining them, it is essential to know which features an MQTT offering should provide. Some services, such as those provided by cloud vendors, are often not true MQTT brokers but integration layers built upon existing services.
Some key features to consider are:
conformity to the MQTT protocol
support for protocol version 3 and version 5 (released in 2019)
unidirectional or bidirectional communication with the clients
Quality of Service (QoS) levels
implementation of retained messages, Last Will and Testament, Persistent Sessions, Keep Alive, Client Take-Over
Moreover, some non-functional characteristics, like scalability, security, and resilience, are crucial for correctly implementing the use case.
Let us consider now a list of possible solutions for MQTT implementations and integrations with Redpanda:
MQTT broker with custom integration: some MQTT providers like EMQ have already implemented extensions for their MQTT brokers that allow them to transfer data to and from Redpanda.
MQTT Proxies are lightweight services that allow MQTT clients to produce messages using the Kafka protocol. Sometimes, these solutions are not bidirectional and do not fully support the MQTT protocol.
Kafka Connect is an extension framework providing different source connectors for data ingestion into Redpanda or sink connectors to extract data to a target destination.
Kafka-native MQTT broker: an MQTT broker is built on top of the Kafka protocol and runs as a native Redpanda application using Kafka consumers and producers. The native broker leverages Redpanda to store MQTT client messages, states, or both. One key advantage of this solution is that Redpanda provides low latency, high throughput, and scalability to the MQTT broker that runs on top of it.
In the following sections, this blog post introduces Waterstream: a novel approach that falls in the last of the categories mentioned.
Waterstream enters the game
Waterstream brings MQTT as close as possible to Redpanda. In a nutshell, it's an MQTT broker that uses Kafka or Redpanda as its only persistence layer. All MQTT messages sent to Waterstream go immediately to Redpanda without local persistence. The reverse also holds: Waterstream reads messages published to Redpanda directly and sends them to any subscribed clients. MQTT session state (subscriptions, in-flight message status, retained messages, etc.) is also persisted to Redpanda. Thus, Waterstream nodes remain stateless, and if a node fails, clients can reconnect to any other Waterstream node and keep working.
Additional features offered by Waterstream beyond the basic MQTT specification requirements are the following:
MQTT over WebSockets
Flexible authorization rules
Bridge mode - connect to the existing MQTT broker and replicate the messages with it bidirectionally
A typical deployment of Waterstream with Redpanda looks like this:
A TCP load balancer spreads connections from the MQTT clients across the multiple Waterstream nodes. Each Waterstream node connects to Redpanda, which stores both the messages and MQTT clients’ states. To accommodate workload changes, Waterstream nodes may be started and stopped at any time.
Any program that implements a Kafka consumer or producer can consume or publish messages in Redpanda and thereby exchange them with the MQTT devices as well. In particular, stream processing engines, such as Materialize, may be used to process messages received from MQTT devices and send the processing result back without custom code. For example, such a streaming query may join the events from vehicles that are searching for a parking spot with parking metrics to suggest the least busy place.
Zooming in on the Waterstream node internals shows how the parts are connected:
Waterstream maintains an in-memory session for each connected MQTT client and backs it up to a compacted topic in Redpanda. These changes are picked up by the Kafka Streams component, which is compatible with Redpanda, so Waterstream can restore the session state when the client reconnects. This enables features such as QoS 1 (at least once) and QoS 2 (exactly once) delivery of MQTT messages even if a node crashes while a message is being transferred.
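The idea of restoring session state from a compacted topic can be sketched in a few lines of Python. This is not Waterstream's actual storage format: the client IDs, the session fields, and the `compact` helper are hypothetical, and the list stands in for a Redpanda topic whose compaction keeps only the latest record per key.

```python
from typing import Dict, List, Optional, Tuple

# Each record is (key, value); a value of None is a tombstone that deletes the key.
Record = Tuple[str, Optional[dict]]


def compact(log: List[Record]) -> Dict[str, dict]:
    """Replay the log in order, keeping only the latest value per key."""
    state: Dict[str, dict] = {}
    for key, value in log:
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


# Hypothetical session updates, keyed by MQTT client ID.
session_log: List[Record] = [
    ("taxi-1", {"subscriptions": ["cmd/taxi-1"]}),
    ("taxi-2", {"subscriptions": ["cmd/taxi-2"]}),
    ("taxi-1", {"subscriptions": ["cmd/taxi-1", "alerts/#"]}),
    ("taxi-2", None),  # session removed
]

# Any node can rebuild the current sessions by replaying the topic.
restored = compact(session_log)
print(restored)
```

Because the full state can be rebuilt from the topic, a node that takes over a reconnecting client does not need anything from the node that failed.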
MQTT messages are published/consumed to/from one or more Redpanda topics. There must always be one default topic, while additional topics for a specific MQTT topic may be configured as needed, either with a pattern or with a prefix (see this documentation for the details).
As Kafka and Redpanda have much stricter rules for the topic name than the MQTT protocol, an MQTT topic can't go into the Kafka topic name. Therefore, the MQTT topic name becomes the key of the Redpanda message. This can be customized with the mapping rules as needed.
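A minimal sketch of this mapping, assuming prefix-based rules where the first matching rule wins. The topic names, the rule list, and the `to_kafka_record` helper are all hypothetical and do not reflect Waterstream's real configuration syntax.

```python
from typing import List, Tuple

DEFAULT_TOPIC = "mqtt_messages"  # hypothetical default Redpanda topic

# Hypothetical rules: (MQTT topic prefix, Redpanda topic).
PREFIX_RULES: List[Tuple[str, str]] = [
    ("taxi/", "taxi_positions"),
    ("sensors/temp/", "temperature"),
]


def to_kafka_record(mqtt_topic: str, payload: bytes) -> Tuple[str, str, bytes]:
    """Return (redpanda_topic, record_key, record_value) for an MQTT message."""
    kafka_topic = DEFAULT_TOPIC
    for prefix, topic in PREFIX_RULES:
        if mqtt_topic.startswith(prefix):
            kafka_topic = topic
            break  # assumption: the first matching rule wins
    # The MQTT topic name travels as the record key, not as the topic name.
    return kafka_topic, mqtt_topic, payload


print(to_kafka_record("taxi/ABC1234", b'{"speed": 30}'))
print(to_kafka_record("building/door/7", b"open"))
```

Keeping the MQTT topic in the key means characters such as `/` that are valid in MQTT topics never have to appear in a Redpanda topic name.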
Waterstream is the best fit if you want your devices to be closely integrated with Redpanda, with minimal overhead. Being a relatively thin layer, Waterstream ensures low latency between MQTT and Kafka. Another reason for Waterstream adoption is the ease of deployment and operation. Waterstream nodes are stateless. You can quickly scale out or in as needed, from hundreds to hundreds of thousands to millions of devices.
Of course, every technology has its weak spots too. As all the messages in Waterstream have to go through Redpanda for consistency reasons, it has higher latency than classic MQTT brokers in an MQTT-to-MQTT-only scenario.
Waterstream and Redpanda: A live demo
Integrating Waterstream with Redpanda is straightforward; this blog post is accompanied by a live demo that simulates some taxis driving in New York. Each car belongs to a company and continuously sends an MQTT message with its position and the count of current passengers.
For the benefit of a more realistic simulation, starting from the current position, each taxi is assigned an arbitrary point of interest to reach and a random number of passengers. OpenStreetMap provides a path to the simulated vehicle’s destination. The taxi follows the route, sending metrics during the trip. Upon arriving at the destination, the cab stops for a random amount of time to simulate the wait for a new passenger.
The final result is similar to a real-life use case; you can follow some cars while moving across the city on the map.
An MQTT client simulates each device and publishes data to its own MQTT topic, formed by concatenating a fixed prefix with the taxi's number plate. All those topics are mapped into a single Kafka topic to make it easier to analyze the incoming data stream.
The payload of each MQTT message is in JSON format and includes:
The number plate, as the identifier of the taxi
The current position and the next waypoint
The current speed
The number of carried passengers
The company name and the placeholder features for the map
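As an illustration, a message for one taxi could be assembled as in the sketch below. The topic prefix, the JSON field names, and the sample values are assumptions made for this example, not the demo's actual schema, and the placeholder map features are left out.

```python
import json
from typing import Tuple

TOPIC_PREFIX = "demo/taxi/"  # hypothetical fixed prefix


def build_message(
    plate: str,
    company: str,
    position: Tuple[float, float],
    next_waypoint: Tuple[float, float],
    speed: float,
    passengers: int,
) -> Tuple[str, str]:
    """Return the (MQTT topic, JSON payload) pair for one taxi update."""
    topic = TOPIC_PREFIX + plate  # fixed prefix + number plate
    payload = json.dumps({
        "plate": plate,
        "company": company,
        "position": {"lat": position[0], "lng": position[1]},
        "nextWaypoint": {"lat": next_waypoint[0], "lng": next_waypoint[1]},
        "speed": speed,
        "passengers": passengers,
    })
    return topic, payload


topic, payload = build_message(
    "ABC1234", "Yellow Cab", (40.7580, -73.9855), (40.7614, -73.9776), 32.5, 2
)
print(topic)
print(payload)
```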
In the user interface, the demo presents the markers showing moving cars, while the sidebar shows an aggregation of the taxi companies summing the number of passengers.
The following diagram shows the data flow:
The taxi simulator process sends data to Waterstream via MQTT, and Waterstream in turn saves it into Kafka topics. The UI accesses the taxi data using MQTT over WebSocket, connecting to Waterstream without intermediaries. Authorization rules in Waterstream are configured so that the UI has read-only access while the taxi simulator also has write access.
The markers are placed on the map reading a given MQTT topic that includes only the visible taxis. Creating the live aggregation graph required a streaming engine capable of real-time analytics. This demo uses Materialize, which allows you to write easy SQL queries. Using Materialize, passenger count by the taxi company is saved into another Kafka topic. This topic is translated by Waterstream into an MQTT topic that the UI subscribes to before showing the diagram with aggregated data. More details are provided in the source code on GitHub.
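The demo expresses this aggregation as a SQL query in Materialize. The logic can be sketched in plain Python, assuming the total for each company is the sum of the most recent passenger count reported by each of its taxis. The field names are hypothetical and the query in the demo may be defined differently.

```python
from collections import defaultdict
from typing import Dict, Iterable


def passengers_by_company(messages: Iterable[dict]) -> Dict[str, int]:
    """Sum the latest passenger count of every taxi, grouped by company."""
    latest: Dict[str, dict] = {}
    for msg in messages:
        latest[msg["plate"]] = msg  # later messages overwrite earlier ones
    totals: Dict[str, int] = defaultdict(int)
    for msg in latest.values():
        totals[msg["company"]] += msg["passengers"]
    return dict(totals)


stream = [
    {"plate": "A1", "company": "Yellow Cab", "passengers": 2},
    {"plate": "B2", "company": "Yellow Cab", "passengers": 1},
    {"plate": "C3", "company": "Green Taxi", "passengers": 4},
    {"plate": "A1", "company": "Yellow Cab", "passengers": 3},  # A1 updates
]
print(passengers_by_company(stream))
```

A streaming engine maintains the same result incrementally as new messages arrive instead of recomputing it from the whole stream.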
Finally, the demo is completed with Waterstream metrics ingested into Prometheus. An embedded Grafana dashboard shows the message rate, the message count, and the number of connected clients.
Running on the edge
The second part of this post explores using Redpanda at the edge in combination with Waterstream. For the sake of clarity, in this scenario, a tiny Redpanda cluster runs on-site together with Waterstream to provide MQTT connectivity to clients. The MQTT clients can also run on-site or close to the edge location.
In some cases, such a deployment could consist of only one instance of Redpanda and Waterstream on the same machine because high availability is not always a requirement in edge computing. If there are hundreds or thousands of edge sites, high availability can be sacrificed for a more manageable and cheaper solution where only one hardware appliance is used at each site. Of course, there are some drawbacks to consider: there is no replication, and a failure of the single machine causes downtime.
An edge setup with Waterstream and Redpanda offers many benefits:
Devices are closer to Waterstream than in a configuration where Waterstream runs in the cloud, which reduces latency and the reaction time to events.
Data is saved and eventually processed locally at scale. Only the information required elsewhere for further analysis is transferred to a cloud location, thus reducing data storage and network traffic. One exciting option is the Redpanda Edge Agent, a lightweight IoT agent that runs alongside Redpanda at the edge and forwards data to a central Redpanda cluster.
The edge site can keep working normally even without a connection to the cloud, providing backpressure for data that cannot yet be forwarded. Such disconnections happen not only in case of failures but also as a normal condition in some use cases, such as transportation.
This post shows the performance tests of two deployments that simulate a realistic edge setup. The first has a bigger machine with Redpanda and Waterstream co-located. The second scenario has Redpanda and Waterstream deployed on separate smaller machines. Both tests use Redpanda version 22.1.4 and Waterstream version 1.4.0.
The testing uses the Simplematter MQTT Test Suite, which supports testing interaction between MQTT and Kafka. The test suite runs on an AWS EKS cluster made of m5.large (2 CPU, 8 GB RAM) machines: one machine per test node plus one additional machine for the aggregation of the test data. Each test node hosts 5000 MQTT clients for this test. To increase the overall number of clients, more test nodes are created. Each test client sends one QoS 0 (at most once) message per second, so the number of MQTT clients equals the intended number of messages per second. The average size of a message is 1 KB.
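The load implied by these parameters is easy to work out. The sketch below derives the number of test nodes, the message rate, and the approximate payload throughput for a given client count. It assumes 1 KB means 1024 bytes and ignores protocol overhead; the function name is made up for this example.

```python
import math
from typing import Tuple

CLIENTS_PER_NODE = 5000      # MQTT clients hosted by each test node
MSG_PER_CLIENT_PER_SEC = 1   # each client publishes one message per second
AVG_MSG_BYTES = 1024         # assumption: 1 KB = 1024 bytes


def load_profile(clients: int) -> Tuple[int, int, float]:
    """Return (test nodes, messages per second, payload MiB per second)."""
    nodes = math.ceil(clients / CLIENTS_PER_NODE)
    msgs_per_sec = clients * MSG_PER_CLIENT_PER_SEC
    mib_per_sec = msgs_per_sec * AVG_MSG_BYTES / (1024 * 1024)
    return nodes, msgs_per_sec, mib_per_sec


for clients in (25_000, 45_000, 100_000):
    nodes, rate, mib = load_profile(clients)
    # One extra machine aggregates the test data.
    print(f"{clients} clients: {nodes} test nodes (+1 aggregator), "
          f"{rate} msg/s, {mib:.1f} MiB/s")
```

At 100k clients this amounts to 20 test nodes and close to 100 MiB of payload per second.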
The first deployment uses a 4gen.2xlarge (8 vCPU, 48 GB RAM, ARM architecture) AWS EC2 machine type. In this example, Redpanda is configured with a memory limit of 32 GB and 4 CPU cores. Waterstream has its heap size limited to 25% of the machine RAM with -XX:MaxRAMPercentage=25.0. The tests use client counts ranging from 25k to 100k, and the system handles the load well. The table and diagram below show the results.
[Results table and chart: P50 and P99 latency in ms for each client count]
The second deployment uses two AWS EC2 machines of lm4gen.large type (2 vCPU, 8 GB RAM): one for Redpanda, another for Waterstream. Redpanda runs with the default settings, meaning it consumes all available computing resources. Waterstream has its heap size set to 75% of the available RAM, and MQTT_MAX_QUEUED_INCOMING_MESSAGES is set to 25 (down from the default 1000) so that the incoming message queues don't grow too large, given the many high-intensity MQTT publishers. The tests run with 25k, 40k, and 45k clients, and the system performs as expected.
[Results table and chart: P50 and P99 latency in ms for each client count]
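For reference, the heap limits that the two MaxRAMPercentage settings translate to can be estimated with simple arithmetic. This is a rough sketch that uses the nominal machine RAM from the text; the JVM applies the percentage to the memory it actually detects, so real values will be slightly lower.

```python
def max_heap_gb(machine_ram_gb: float, max_ram_percentage: float) -> float:
    """Approximate JVM max heap for a given -XX:MaxRAMPercentage value."""
    return machine_ram_gb * max_ram_percentage / 100.0


# First deployment: Redpanda and Waterstream share a 48 GB machine.
co_located_heap = max_heap_gb(48, 25.0)
redpanda_limit_gb = 32
headroom = 48 - redpanda_limit_gb - co_located_heap

# Second deployment: Waterstream has an 8 GB machine to itself.
dedicated_heap = max_heap_gb(8, 75.0)

print(f"Co-located: heap {co_located_heap} GB, "
      f"{headroom} GB left for the OS and non-heap memory")
print(f"Dedicated:  heap {dedicated_heap} GB")
```

In the co-located case, the 32 GB Redpanda limit plus the roughly 12 GB heap leaves a few gigabytes for the operating system and the JVM's non-heap memory.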
As you can see, in the single-machine case, larger total available memory allows the system to handle more clients. In the second case, having independent servers helps to improve the latency. Both setups can handle the workloads you may expect from the edge deployment and convert IoT data into Kafka-compatible format for further processing with widely available tools.
As the importance of IoT and sensor data grows, so does the adoption of the MQTT protocol. Waterstream is the simplest way to provide Redpanda users with a modern, solid, and production-ready solution that supports MQTT at high message rates.
From a source code perspective, nothing had to change for Waterstream to run on top of Redpanda rather than another Kafka platform or installation while getting the same performance.