
Get the data privacy and sovereignty of self-hosting with the ease and scalability of fully-managed services—all in your own cloud
As you modernize a legacy system for greater autonomy and faster service delivery, the pressure is on to choose the right people and places to manage the heart of your data pipelines: your streaming data platform, which keeps data flowing seamlessly throughout your entire system.
Currently, the two most common ways to run these platforms are self-hosting, or having the vendor manage and host the platform in its own cloud (PaaS). Both have their own pros and cons.
With self-hosted, there’s a hefty upfront investment and you’re on the hook for planning the necessary infrastructure and human resources to keep everything up and running. It’s also less flexible to scale since you’d need to do everything in-house. The upside is that you’d have full control over your data, which makes self-hosting an attractive option for companies focused on data privacy, data security, and data sovereignty.
On the other hand, platform as a service (PaaS) is where everything is managed for you in the cloud. It’s like a one-stop shop where setup, monitoring, maintenance, and scaling are all taken care of. However, not every company trusts third parties with their data, and the lack of transparency, access control, and control over data residency can be a major deal-breaker.
With today’s companies feeling the heat when it comes to guaranteeing data privacy and security, it’s no surprise they want to provision their clusters within their own virtual private cloud (VPC) and keep their data contained in their own environment. However, they also want to offload operations and maintenance to free up their teams. Basically, they want the best parts of both self-hosted and PaaS.
That’s why we developed Bring Your Own Cloud (BYOC)—a fully managed Redpanda cluster hosted on your cloud while Redpanda takes care of operations, monitoring, and maintenance. That means complete control over your data in the cloud along with the relief of fully-managed services. Best of both worlds, indeed!
Furthermore, BYOC’s privacy-first architecture drives compliance for streaming data, and allows you to scale on your own infrastructure while maintaining data sovereignty requirements. If you’d like a more detailed explanation of how BYOC works, read our post on why we think data sovereignty is the future of cloud.
Now, let’s go over how you can set up your Redpanda BYOC cluster, and a few different ways you can connect to your cluster to stream data.
Installing a BYOC cluster is pretty simple. We can break it down into three steps.
1. Provision appropriate access credentials in your cloud
2. Choose your preferences
3. Initiate deployment
Install rpk (Redpanda’s command line interface tool) on your local laptop. Then log into your cloud to start bootstrapping with a simple command. The agent will first be deployed in your cloud and kickstart the installation. During this installation, the agent sets up the network across multiple availability zones, but installs on a single AZ (if you choose to do so). It’ll also provision an Amazon S3 or GCS service for Tiered Storage, as well as a Kubernetes cluster, Redpanda cluster, and Redpanda Console.
If you’re more of a visual learner, watch this short video explaining what BYOC is and how to install a BYOC cluster:
Depending on your preference, you can access the cluster either through the internet or through VPC peering, and then start streaming data into the cluster. We created a quick demo showcasing different ways to connect to your BYOC cluster. Below is a diagram of the setup.

Basically, a simulator microservice (Python) is deployed in Kubernetes (K8s) and continuously publishes signal events. The Kubernetes cluster sits in its own VPC and connects to the BYOC cluster via VPC peering.
Another consumer client (Quarkus) consumes the events externally. We set up the BYOC cluster in a public subnet so the client can connect via the internet gateway. The signal events also trigger a Lambda serverless application, with Redpanda taking the place of MSK or SNS as the event source. The Lambda service sits in its own VPC, and, just as with AWS MSK and AWS Kinesis, establishing a VPC peering connection between that VPC and the cluster will do the trick.
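To make the simulator concrete, here is a minimal sketch of a Python service that generates signal events and hands them to a producer. It is not the code from the demo repo: the event fields (`device_id`, `sequence`, `strength`, `timestamp_ms`), the topic name, and the `ProducerLike` interface are all assumptions made for illustration. The interface only requires a `send(topic, value=...)` method, which is the shape exposed by Kafka clients such as kafka-python's `KafkaProducer`.

```python
import itertools
import json
import random
import time
from typing import Any, Dict, Iterator, Protocol


class ProducerLike(Protocol):
    """Anything with a Kafka-style send(topic, value=...) method (assumed interface)."""

    def send(self, topic: str, value: bytes) -> Any: ...


def signal_events(device_id: str, seed: int = 0) -> Iterator[Dict[str, Any]]:
    """Yield an endless stream of made-up signal readings (hypothetical schema)."""
    rng = random.Random(seed)
    for sequence in itertools.count():
        yield {
            "device_id": device_id,
            "sequence": sequence,
            "strength": round(rng.uniform(0.0, 1.0), 3),
            "timestamp_ms": int(time.time() * 1000),
        }


def publish(producer: ProducerLike, topic: str,
            events: Iterator[Dict[str, Any]], count: int) -> int:
    """Serialize up to `count` events as JSON bytes and send each one to `topic`."""
    sent = 0
    for event in itertools.islice(events, count):
        producer.send(topic, value=json.dumps(event).encode("utf-8"))
        sent += 1
    return sent
```

In a real deployment, the producer would be a Kafka client pointed at the BYOC cluster's bootstrap address over the peered VPC, and the loop would run continuously instead of stopping after `count` events.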
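On the Lambda side, a handler might look roughly like the sketch below. It assumes the function is invoked through Lambda's Kafka event source mapping, which delivers records grouped under a `records` key with base64-encoded values, and that each value is a JSON signal event. The demo's actual handler and payload schema may differ, so treat the field handling here as illustrative.

```python
import base64
import json
from typing import Any, Dict, List


def decode_records(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a Kafka-triggered Lambda event into decoded signal records."""
    decoded = []
    # Records arrive grouped by "topic-partition" keys; values are base64 strings.
    for batch in event.get("records", {}).values():
        for record in batch:
            payload = base64.b64decode(record["value"]).decode("utf-8")
            decoded.append({
                "topic": record["topic"],
                "partition": record["partition"],
                "offset": record["offset"],
                "signal": json.loads(payload),  # assumes JSON-encoded values
            })
    return decoded


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Lambda entry point: log each signal and report how many were processed."""
    signals = decode_records(event)
    for item in signals:
        print(f"{item['topic']}[{item['partition']}]@{item['offset']}: {item['signal']}")
    return {"processed": len(signals)}
```

Keeping the decoding in its own function makes it easy to unit test the handler locally with a hand-built event before wiring up the trigger.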
In the demo, we enabled SASL for authentication purposes and used the secret manager to store credentials for Lambda triggers. In this case, make sure you update the access policy for your Lambda role so it has permission to access the stored credentials.
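As a rough illustration of both pieces, the sketch below builds an IAM policy document that lets a role read one secret, and turns the secret's contents into SASL client settings. Several details are assumptions rather than facts from the demo: the secret is taken to be a JSON string with `username` and `password` keys, the mechanism defaults to SCRAM-SHA-256, TLS is assumed (`SASL_SSL`), and the setting names follow kafka-python's keyword arguments. Other clients use different names.

```python
import json
from typing import Any, Dict


def secret_read_policy(secret_arn: str) -> Dict[str, Any]:
    """IAM policy document granting read access to a single Secrets Manager secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            }
        ],
    }


def sasl_client_config(bootstrap_servers: str, secret_string: str,
                       mechanism: str = "SCRAM-SHA-256") -> Dict[str, Any]:
    """Build SASL settings from a secret shaped like {"username": ..., "password": ...}."""
    secret = json.loads(secret_string)  # assumed secret layout
    return {
        "bootstrap_servers": bootstrap_servers,
        "security_protocol": "SASL_SSL",  # assumption: SASL over TLS
        "sasl_mechanism": mechanism,      # assumption: SCRAM; check your cluster settings
        "sasl_plain_username": secret["username"],
        "sasl_plain_password": secret["password"],
    }
```

Scoping the policy to a single secret ARN, instead of a wildcard, keeps the Lambda role's permissions as narrow as the trigger needs.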
Here’s a video demo on how to connect to your BYOC cluster. To get your hands on the code used in this demo, visit this GitHub repo.
BYOC is a fully managed service, but that doesn’t mean your clusters completely rely on it. The separation of the control plane (in Redpanda Cloud) and data plane (in your own cloud) allows the system to run as usual when the control plane is down.

The installed agent not only takes care of bootstrapping, but also configures and maintains the cloud infrastructure, K8s resources, and software artifacts. Rather than receiving commands pushed from the control plane, the agent pulls a document that specifies the cluster’s shape. This ensures the Redpanda control plane doesn’t hold credentials or excessive permissions in your cloud. Lastly, the agent doesn’t collect or distribute any metrics, since the metrics are all collected via the cloud provider’s API endpoint.
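The pull-based model can be pictured as a small reconciliation loop. The sketch below is a generic illustration of that pattern, not the Redpanda agent's actual implementation: the spec format, the `fetch_spec` callable, and the choice to treat a `ConnectionError` as "control plane unreachable" are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Spec = Dict[str, Any]


def plan_changes(desired: Spec, actual: Spec) -> List[Tuple[str, str, Any]]:
    """Return (action, key, value) steps that move `actual` toward `desired`."""
    steps: List[Tuple[str, str, Any]] = []
    for key, value in sorted(desired.items()):
        if key not in actual:
            steps.append(("create", key, value))
        elif actual[key] != value:
            steps.append(("update", key, value))
    for key in sorted(actual.keys() - desired.keys()):
        steps.append(("delete", key, None))
    return steps


def reconcile_once(fetch_spec: Callable[[], Spec], actual: Spec) -> Spec:
    """Pull the desired spec and apply the difference to `actual` in place."""
    try:
        desired = fetch_spec()
    except ConnectionError:
        # Control plane unreachable: leave the running state untouched.
        return actual
    for action, key, value in plan_changes(desired, actual):
        if action == "delete":
            actual.pop(key)
        else:
            actual[key] = value
    return actual
```

Because the agent initiates every request, the data plane keeps its current state whenever a pull fails, which matches the behavior described above when the control plane is down.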
Redpanda updates with rolling upgrades. You can always do a blue/green deployment or canary release with a cluster running different versions, introducing the upgrades at your own pace to ensure zero application downtime.
With the basics and infrastructure all taken care of, you can focus on important cluster administration tasks, like:
These configurations and credentials all reside in your own cloud. To see what’s happening with all your clusters, Redpanda Console is a developer-friendly tool that’s also hosted within your cloud. From the dashboard, you can check in on your topic to see the payload at a glance—a useful ability for developers debugging or generally designing their applications.

Now that you’re up to speed on everything BYOC can do, let’s end with a few considerations to keep in mind before you start spinning up a BYOC cluster.
If you’re still on the fence about whether the BYOC deployment model is right for you, we’ll give you a quick cheat sheet. BYOC is the best choice if you’re looking for the following:
To give you a little extra reassurance, here’s a quote from one of our happy customers.
"Redpanda BYOC gives us a fully managed Kafka service running on our own cloud servers, balancing our internal compliance requirements with ease of use, and without compromising performance or compatibility."
— Kannan D.R., Enterprise Data Architect, LiveRamp
To experience a fully managed Redpanda cluster where your data always stays in your environment—get started with BYOC here. If you get stuck, have a question, or want to chat with our engineers and fellow Redpanda users, join our Redpanda Community on Slack.