This second post in our multi-part series covers pros/cons, infrastructure, Redpanda configuration, and resilience testing in High Availability environments.
Welcome back to another blog post on high availability (HA) and disaster recovery (DR) with Redpanda. This is our second blog post in this series. As a reminder, part one in this series focuses on failure scenarios, deployment patterns, and other considerations such as partition leadership/rebalancing and rack awareness (to name a few).
This post will focus on considerations when deploying a cluster in a single availability zone (AZ) where HA is important—specifically, ensuring that within an AZ, there are no single points of failure and that Redpanda is configured to spread its replicas across our failure domains. We will go over pros and cons of this deployment, discuss this deployment’s infrastructure, present deployment automation scripts, and then conclude with a summary and details on the next HA blog’s focus.
The basis for this architecture is Redpanda’s rack awareness feature, which means deploying n Redpanda brokers across three or more racks, or failure domains within a single availability zone.
Single AZ deployments (Figure 1) in the cloud do not attract cross-AZ network charges in the way that multi-AZ deployments, yet we can leverage resilient power supplies and networking infrastructure within an AZ to mitigate against all but total-AZ failure scenarios. This is in contrast to a cross-AZ deployment (Figure 2) which is designed to have fewer common-cause failures, yet at the expense of increased networking costs.
Pros and cons of high-availability deployment
Whenever you move from a simple software deployment towards a fully geo-resilient and redundant architecture, there is an increase in cost, complexity, latency, and performance overhead. All of these need to be balanced against the benefits of increased availability and fault tolerance. That said, Redpanda’s use of the Raft consensus algorithm is designed to ensure that much of the cost and complexity of the system isn't put back onto the user or operations engineer and a single slow node or link doesn’t impact that performance.
Let’s evaluate how a simple high-availability solution for Redpanda affects the running of the system:
Cost: Redpanda operates the same Raft algorithm irrespective of using HA mode or not, so introducing fault tolerance does not increase Redpanda costs. There may be infrastructure costs when deploying across multiple racks, but these would normally be amortized across a wider data-center operations program.
Performance: Spreading Redpanda replicas across racks and switches will increase the number of network hops between Redpanda brokers, however, normal intra-datacenter network latency should be measured in microseconds rather than milliseconds. Care may be required to ensure that there is sufficient bandwidth between brokers to handle replication traffic.
Complexity: Redpanda’s deployment automation tooling makes HA deployments less labor intensive than they might be if you were deploying by hand, and Redpanda does not require external tooling for partition management (for example as Cruise Control) as it comes complete with Continuous Data Balancing and Automatic Leadership Rebalancing.
When deploying Redpanda for high availability within a data center, we want to ensure that nodes are spread across at least three failure domains. This would normally mean separate racks, under separate switches, ideally powered by separate electrical feeds or circuits. It’s also important to ensure there's sufficient network bandwidth between nodes, particularly considering shared uplinks, which may now be subject to high throughput intra-cluster replication traffic.
In an on-premises network, this HA configuration refers to separate racks or data halls within a DC (Figure 1).
There are many HA configurations available from various cloud providers (Figure 3). We’ll explore some of these options in the following sections.
All of the concepts below can be automated using Terraform or a similar infrastructure-as-code (IaC) tool. We recently updated our Terraform scripts to take advantage of these availability deployment configurations within AWS, Azure, and Google Cloud.
AWS provides Partition placement groups, which allow spreading hosts across multiple partitions (or failure domains) within an AZ. The default number of partitions is three (with a max of seven). This can be combined with Redpanda’s replication factor setting so that each topic partition replica is guaranteed to be isolated from the impact of hardware failure.
Microsoft Azure provides Flexible scale sets, which allow you to assign VMs to specific fault domains. Each scale set can have up to five fault domains (depending on your region). Keep in mind that not all VM types support Flexible orchestration. For example, the Lsv2-series only supports Uniform scale sets.
Google Cloud has Instance Placement Policies. When using the Spread Instance Placement Policy you can specify how many availability domains you can have (up to a maximum of eight).
Note: Google Cloud doesn’t give you transparency as to which availability domain an instance has been placed into, which means that we need to have an availability domain per Redpanda broker. Essentially, this isn't running with rack awareness enabled, but is the only option available for clusters larger than three nodes.
Redpanda topology configuration
To make Redpanda aware of the topology it's running on, we’ll need to configure the cluster to enable rack awareness and then configure each node with the identifier of the rack.
Firstly the cluster-property enable_rack_awareness, which can either be set in the
/etc/redpanda/.bootstrap.yaml or can be set using the following command:
rpk cluster config set enable_rack_awareness true
Secondly, on a per-node basis, the rack id needs to be set in the
/etc/redpanda/redpanda.yaml file, or via:
rpk redpanda config set redpanda.rack <rackid>
As with the Terraform scripts, we have modified our Ansible playbooks to take a per-instance rack variable from the Terraform output and use that to set the relevant cluster and node configuration. Our deployment automation can now provision public cloud infrastructure with discrete failure domains (
-var=ha=true) and use the resulting inventory to provision rack-aware clusters using Ansible.
Once Redpanda has rack awareness enabled, it will ensure that no more than a minority of replicas appear on a single rack. Even when performing rebalancing operations, Redpanda will ensure that the leader assignment and replica assignment sticks to this rule.
In this example, we will deploy a high-availability cluster into AWS, Azure, or GCP using the provided Terraform and Ansible capabilities.
Let's keep in touch
Subscribe and never miss another blog post, announcement, or community event. We hate spam and will never sell your contact information.