A tour of Redpanda Streamfest 2024

Iceberg, inspiration, and plenty of swag — read the highlights from our biggest streaming data event

December 18, 2024

Cameras were ready, the virtual stages were set, and the Redpanda team was buzzing with excitement for our very first Streamfest: a half-day virtual event packed with data streaming expertise for real-time applications, analytics, and AI.

But Redpanda Streamfest wasn’t just another tech event—it was a learning experience that cut through the swarms of information and focused on what developers really want to know from experts who have walked in their shoes.  

With over 2,000 registrants, 14 topics, and a hands-on workshop, Streamfest was brimming with fresh perspectives, thoughtful insights, and a vibrant chat that showcased the passion and power of the dev community. If you couldn’t attend the live event, you can watch all the Streamfest sessions on-demand. If you’d rather skim to get the gist of what happened, read on.

Technical AMA lounge

To kick things off, two of our finest engineers, Patrick Angeles and TJ Hyde, were tasked with answering questions on all things Redpanda. The Technical AMA ran throughout the event so attendees could drop in and try to stump the experts to win some swag. 

AMA cover with Patrick Angeles (left) and TJ Hyde (right)

This session went to many places—predictable wasn’t one of them.

Questions flew into the chat from folks worldwide on everything from Flink and Kubernetes to the internals of Redpanda’s architecture. Occasionally, you’d hear Patrick take attendees for a spin with answers like, “What’s WASM? It stands for WASM-atta-baby!”

By the end, our smart engineers remained undefeated and answered everything. But don’t worry, they still awarded a handful of attendees with swag just for asking great questions.

A warm welcome

Over on the main virtual stage, the official Streamfest welcome began with a quick introduction from the host and our Head of Customer Success, Tristan Stevens. He aptly described the upcoming sessions as short and to the point, “like espresso shots of technical content.”

After running through the different Streamfest stages, Tristan handed it off to (a very enthusiastic) Alexander Gallego, our Founder and CEO, for a whirlwind summary of what was ahead: expert talks, unannounced features, head-turning demos, and his vision for Redpanda as the foundation for all data-intensive streaming applications.

Alex during his session on the future of data-intensive applications

“I want you to walk away with an idea of how to build the future.” - Alex Gallego.

Alex is a master of bringing people along for the journey, and that’s what his session was all about. He teleported us back to 2019 when he wrote the first lines of code for the fastest and simplest streaming data engine, paused in 2022 when we launched BYOC, and then fast-forwarded to 2024 when we released a feature-packed Redpanda 24.3.

“The Achilles heel of data streaming has always been cost and complexity,” he explained. “This conference will show you a new way of thinking about building data-intensive applications.”   

As Alex’s session came to an end, the Streamfest chat picked up and occasional polls popped in from the corner to get to know our attendees. Then, Alex introduced the first keynote of the event.

Keynotes

Iceberg, streaming, and the democratization of tables

Tyler Akidau, Distinguished Software Engineer, Snowflake

Apache Iceberg™ — the keystone that unifies streaming and batch. Tyler has pioneered some of the world’s best advancements in data streaming, and this session picked his brain for a glimpse into how the world will be built differently based on the “duality of streams and tables.”

For the unfamiliar, there are two main kinds of data: data in motion (streaming) and data at rest. The idea is to use tables to accumulate streaming data over time so you can see the big picture and analyze it properly. Iceberg is democratizing the idea of storing your data in a tabular format and accessing it with any tool you want.
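
To make the duality concrete, here is a minimal Python sketch. The event shape and the `materialize` helper are hypothetical and not part of Iceberg or Redpanda. It treats a stream as an ordered log of changes and a table as what you get by folding that log into the latest state per key.

```python
from typing import Iterable

# A stream: an ordered sequence of (key, value) change events.
events = [
    ("sensor-1", 20.5),
    ("sensor-2", 18.0),
    ("sensor-1", 21.0),  # later update for the same key
]

def materialize(stream: Iterable[tuple[str, float]]) -> dict[str, float]:
    """Fold a stream of changes into a table holding the latest value per key."""
    table: dict[str, float] = {}
    for key, value in stream:
        table[key] = value  # newer events overwrite older ones
    return table

table = materialize(events)
print(table)  # {'sensor-1': 21.0, 'sensor-2': 18.0}
```

Replaying the same stream always rebuilds the same table, which is why the two views can be treated as interchangeable.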

This is where Redpanda’s support comes in: it makes it easy to create Iceberg tables with all the reliability, ease of use, and performance developers expect.

“My take on the table|stream duality is the democratization of tables and the chance to bridge the two worlds in a universal way.” - Tyler Akidau, Snowflake.

The future of data streaming: ubiquitous, unified, efficient

Yaroslav Tkachenko, Software Engineer & Evangelist

Yaroslav has a proven track record as a distinguished engineer at companies like Goldsky, Shopify, and Activision. Safe to say, he knows a thing or two about streaming data. So this session tapped into his “predictions for the future.”

“In the future, you won't need to choose between batch and stream processing. We’ll just have a single unified data processing pipeline, with streaming or incremental semantics by default and batch for anything specific.” - Yaroslav Tkachenko.

Change Data Capture is key to making this a reality. However, the reason the industry is still lagging behind isn’t only technological; it’s also organizational. Data products need to be prioritized, and building them needs to be infinitely easier. Fortunately, streamlined platforms like Redpanda are laying the foundations on which this better future is already being built.

How streaming data advances Agentic AI applications

Eiso Kant, CTO & Co-founder, Poolside

This session was a meeting of the minds with both Alex and Eiso filling the screen with animated chatter on what a streaming-first architecture should look like. Eiso had a lot to say, considering his company Poolside is building the world’s most capable AI-driven software development tools to enhance developers' efficiency and capabilities. 

To build these foundational models better and faster than anyone else, Poolside uses Redpanda to stream data from Iceberg into the training process, taking model training time down from days or weeks to minutes. Eiso also pointed out that the true value of streaming data is being able to send data end to end as quickly as pushing a button.

“Redpanda’s throughput is amazing. We can materialize data sets and start training models five minutes later. It’s a huge upgrade for developers.” - Eiso Kant, Poolside.

To watch all the keynotes, head over to the Streamfest sessions on-demand.

Throwback to the Redpanda Hackathon

Back in October, we ran a Redpanda AI Integration and Innovation Hackathon that challenged developers to flex their creativity with Redpanda Data Transforms, Redpanda Connect, and other cutting-edge technologies.

The focus wasn’t just on technical brilliance but also on the simplicity of how each solution could solve real-world problems. The winner was founder Dan Goodman and his project: Sovereign Structure, which proposed a simple solution for enterprises to tap into AI without everyone tapping into their data. 

Diagram of the Redpanda Hackathon’s winning project

If you’re curious about the project and how he put it together using Redpanda, read his blog for the full story. 

Lightning talks

Postgres Change Data Capture in Redpanda Connect

James Kinley, Principal Solutions Architect, Redpanda


The new Postgres CDC connector supports multiple replication modes and marks the beginning of a larger CDC effort tailored to Redpanda Connect's native Go (rather than Debezium's Java). Currently, the postgres_cdc input is an Enterprise-tier connector, available in beta for self-managed Redpanda Connect, and in Redpanda Cloud via Redpanda Connect. 

“You can skip Debezium, Kafka Connect, and all that Java. Just use our Postgres CDC connector for Redpanda Connect.” - James Kinley, Redpanda.

To introduce this handy new connector, James Kinley guided everyone through a live demo of the following setup: 

Architecture showing Redpanda Connect’s new Postgres CDC connector

In this example, James enlisted the new CDC connector to read data from a Postgres database, then used Redpanda Connect to schematize that data and load it into a Redpanda topic. This topic was configured as an Iceberg topic, so the converted data could be queried by any Iceberg-compatible tool (in this case, Apache Spark™).

To take James’ demo for a spin yourself, check out his project on GitHub.

Snowflake and Redpanda

Tyler Jones, Principal Software Engineer, Snowflake

Tyler graced the screen (kindly wearing a Redpanda shirt) to present Snowpipe Streaming and how Redpanda uses it to stream data via Redpanda Connect. 

Snowpipe Streaming is a native feature that enables high-throughput ingest into a destination table, supporting up to 5 GB/s with exactly-once semantics and P99 latencies of 5 seconds or less. Simplicity is a core principle at Snowflake, so naturally, they found a kindred soul in Redpanda.

Architecture showing Redpanda Connect and Snowpipe Streaming

During Tyler’s session, he gently explained the technical concepts that make Snowpipe Streaming an engineering accomplishment, and how Redpanda uses it to send data to Snowflake tables. This all happens in Redpanda Connect, which maps topic partitions to channels and uses offset tokens to achieve exactly-once ingest.
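
The exactly-once idea can be sketched in a few lines of Python. This is a simplified in-memory model, not the Snowpipe Streaming SDK or Redpanda Connect's code, and the `Channel` class and its methods are hypothetical. Each topic partition gets its own channel, the channel remembers the last committed offset token, and any redelivered record at or below that token is skipped.

```python
class Channel:
    """Hypothetical stand-in for one ingest channel (one per topic partition)."""

    def __init__(self) -> None:
        self.offset_token = -1  # last offset committed so far
        self.rows: list[str] = []

    def insert(self, offset: int, row: str) -> bool:
        """Append the row unless this offset was already committed."""
        if offset <= self.offset_token:
            return False  # duplicate from a retry: skip it
        self.rows.append(row)
        self.offset_token = offset  # advance the token with the row
        return True

# One channel per partition, keyed by partition number.
channels = {0: Channel(), 1: Channel()}

# Partition 0 delivers offsets 0 and 1, redelivers both after a retry,
# and then delivers the new offset 2.
for offset, row in [(0, "a"), (1, "b"), (0, "a"), (1, "b"), (2, "c")]:
    channels[0].insert(offset, row)

print(channels[0].rows)          # ['a', 'b', 'c']
print(channels[0].offset_token)  # 2
```

Because the token advances together with the data, a producer that restarts can replay from its last known position without creating duplicate rows.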

“I’m really proud of what Redpanda achieved with Snowpipe Streaming. It’s been a joy to work with them and make this integration happen.” - Tyler Jones, Snowflake.

To watch these lightning talks, check out all the Streamfest sessions on-demand.

Redpanda product demos

Sovereign AI: Send the AI model to your data

Tyler Rockwood, Software Engineer, Redpanda

It’s easy to stand in awe of AI’s capabilities, but there’s a growing awareness of all the different platforms and corporations squirrelling away our data. Tyler’s session showcased the value and importance of Sovereign AI, a safer way to build where instead of sending your data to the AI model, Redpanda sends the model to your data so everything stays within your network.   

Some of the major vector databases that Redpanda Connect supports

He then launched a live demo of an e-commerce application that triggers an engagement follow-up email using RAG, ClickHouse, and Redpanda Cloud — all running locally on his own GPUs and without a single call out to a public API.

“You’re the sole steward of your data. With Redpanda’s Sovereign AI you can bring powerful open-source LLMs into your VPC and process everything locally.” — Tyler Rockwood, Redpanda.

You can find Tyler's entire demo on GitHub.

Redpanda Integration with Iceberg

Andrew Wong, Software Engineer, Redpanda

Next up was Andrew with his session on one of our most exciting Redpanda 24.3 features: Apache Iceberg Topics. Iceberg Topics dramatically simplifies how you access your streaming data from common data analytics platforms using Iceberg tables. 

In a nutshell, Iceberg Topics unites streams and tables, resulting in:

  • 1:1 mapping of topics to tables
  • A redpanda.iceberg.mode topic property that lets you choose between different modes:
    • key_value: the input record value is treated as a blob, and the output table will have a simple binary value column.
    • value_schema_id_prefix: the input record value is in the Schema Registry wire format. The output table will be structured with fields from the schema used.
    • disabled (default): Iceberg writing is disabled.
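
For context on value_schema_id_prefix: the Schema Registry wire format prefixes each serialized value with a magic byte (0) followed by a 4-byte big-endian schema ID. The Python sketch below is an illustration rather than Redpanda's implementation, and `split_wire_format` is a hypothetical helper name. It shows how a reader could separate that header from the payload.

```python
import struct

MAGIC_BYTE = 0  # first byte of the Schema Registry wire format

def split_wire_format(value: bytes) -> tuple[int, bytes]:
    """Return (schema_id, payload) from a wire-format record value."""
    if len(value) < 5 or value[0] != MAGIC_BYTE:
        raise ValueError("not in Schema Registry wire format")
    # Bytes 1-4 hold the schema ID as a big-endian 32-bit integer.
    (schema_id,) = struct.unpack(">I", value[1:5])
    return schema_id, value[5:]

# Example: schema ID 42 followed by an opaque serialized payload.
record = b"\x00" + struct.pack(">I", 42) + b"payload"
print(split_wire_format(record))  # (42, b'payload')
```

The schema ID is what allows the output table to be structured with named fields instead of a single binary column.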

Andrew then ran a demo to show you all of this in action, and reminded folks that the beta is available in 24.3 and ready to use. If you have questions about Iceberg Topics, drop into the Redpanda Community on Slack and ask the team.

Redpanda Migrator in action

Tristan Stevens, Head of Customer Success, Redpanda

Tristan briefly took off his hosting hat to present his own session on Redpanda Migrator, a tool designed to simplify migrations from any Apache Kafka® system to Redpanda.

Diagram of how the Redpanda Migrator works in Redpanda Connect

After a quick explanation of how Redpanda Migrator works, Tristan demonstrated how easy it was to move data from a Kafka cluster to a Redpanda cluster in just a few minutes. Tristan also tested it in the cloud by spinning up a Confluent cluster and moving data into a Redpanda Serverless cluster, proving beyond doubt that Redpanda Migrator lets you migrate with a click of a button.

Redpanda Connect for Cloud, BYOC, and Serverless

Ash Jeffs, Technical Lead for Redpanda Connect, Redpanda

It’s a good thing we created an event prepared for the unpredictable because next up was Ash Jeffs, Benthos founder and a one-of-a-kind panda. If you haven’t witnessed a demo by Ash before, then you’re in for a treat. 

As many of you know, Redpanda Connect is a simpler alternative to Kafka Connect when integrating data from external sources. During his demo, Ash both impressed and tickled the audience with how easy it is to run Redpanda Connect in Redpanda Cloud (BYOC, Serverless) using rpk.

To give you an idea, here’s the config he wrote: 

input:
  generate:
    interval: 1s
    mapping: |
      root.id = uuid_v4()
      root.name = fake("name")
      root.email = fake("email")
      root.age = random_int(max: 100)
      root.meow = if ( root.age % 2 ) == 0 { "meow" } else { "meooooowwwww" }
      root.email_hash = root.email.hash("sha256").encode("base64").string()

The chat wasted no time in filling up with cat GIFs during the session. To catch this curious demo and all the others, watch the Streamfest videos on-demand.


Bring Your Own Cloud (BYOC) panel discussion

As a reminder, Bring Your Own Cloud is an increasingly popular deployment option for cloud services. You get a SaaS managed service as well as full ownership of your compute and data residing in your own cloud environment.

To really dig into BYOC, this session channeled expert knowledge and experiences from an impressive panel. 

Lineup of experts leading the BYOC panel discussion

Tristan kicked things off with his first question on “why BYOC?” The responses inevitably bounced between cost-savings, scalability, and compliance.

“BYOC lets us get the latest and greatest data without compromising our privacy.” - Ragul Nandagopal, Dun & Bradstreet.

The topic of transparency also bubbled up, with ShareChat praising the much-appreciated visibility into their infrastructure usage. 

“What we want as customers is transparency. To enable cost efficiency, scalability, and ecosystem efficiency. BYOC helps with all of these because we’re not paying for some throughput metric but for the visibility into the infrastructure costs from a usage perspective.” - Arya Ketan, ShareChat.

Arya also had a thing or two to say about cost savings, considering ShareChat reduced cloud costs by 70% with Redpanda BYOC. The panel then veered into the challenges of BYOC. How hard is it to implement? While one customer, Truecaller, went from signing the contract to running BYOC in production within nine days, Ragul shared that it took them a couple of months, but added that it was “time worth spending.”

On the subject of maintenance issues, Aashish from ClickHouse noted that BYOC offers convenient maintenance windows, which allow customers to choose the time and date to upgrade their services. They can also select specific services to upgrade so they can test them first.

Lastly, the audience landed on the topic of all the different BYOC flavors now saturating the market. Yaroslav chimed in that what makes Redpanda’s BYOC different is it separates data from metadata and uses Amazon S3 as primary storage, so it’s virtually stateless. That means Redpanda can scale, manage, and update its BYOC clusters without touching the customer’s data.

“BYOC is so easy to manage, even with a small team. You’re just dealing with a deployment of stateless workloads.” - Yaroslav Tkachenko, Software Engineer & Evangelist.

Tristan then chimed in with his theory that if Redpanda were to “disappear from the face of the earth”, its BYOC clusters would still be up and running due to its isolated model. Now that’s a reassuring thought.

Partner talk and customer use cases

AI and data privacy: vector storage for agentic systems

David Myriel, Data sovereignty & vector storage, Qdrant

We love showcasing our partners, and David was a must. In his session, David gave an insightful walkthrough of vector databases and how to integrate Qdrant with Redpanda to unlock sovereign architectures for all sorts of RAG and AI applications. 

He began with the basics of AI: vectors. “Vectors are the language of AI. They’re how AI systems understand sentences and process data,” he explained. Qdrant is a vector database and similarity search engine, which ensures secure and scalable vector storage. So if you’re building with vectors, you’ll want to get familiar with Qdrant.
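
As a rough illustration of what similarity search means, the sketch below ranks a few toy vectors by cosine similarity in plain Python. The vectors and document names are made up, and this is not Qdrant's API. A vector database performs the same kind of comparison at a much larger scale.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings; real ones come from an embedding model
# and have many more dimensions.
documents = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.0],
    "return an item": [0.8, 0.2, 0.1],
}

def search(query: list[float], top_k: int = 2) -> list[str]:
    """Return the names of the top_k documents most similar to the query."""
    ranked = sorted(
        documents,
        key=lambda name: cosine_similarity(query, documents[name]),
        reverse=True,
    )
    return ranked[:top_k]

print(search([1.0, 0.0, 0.0]))  # ['refund policy', 'return an item']
```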

Happily enough, Redpanda Connect makes it easy to stream vector data with Qdrant, and with both technologies focusing heavily on data sovereignty, this powerful duo is ideal for building agentic AI systems that process a tremendous amount of sensitive data.

“With the rise of AI, privacy channels are becoming critical. Qdrant is open source, which lets organizations keep their data in-house and see how their data is handled.” - David Myriel, Qdrant.

Example of a private GenAI architecture using Redpanda and a private LLM

NYSE Cloud Streaming powered by Redpanda

Anand Pradhan, Head of AI, Intercontinental Exchange (ICE)

Intercontinental Exchange is a global financial organization that connects entrepreneurs with capital. Under its umbrella is the New York Stock Exchange (NYSE). The market data platform Anand manages typically processes over 650 billion transactions daily, so he can talk about high throughput and the importance of low latency.

Anand’s session was all about NYSE Cloud Streaming. “We wanted to take the data to the cloud for data analytics, so we built a real-time feed for the cloud for market insights,” he explained. After bumping into Redpanda at a conference, Anand decided to test it and eventually onboarded Redpanda to consolidate market data for five equity exchanges.

High-level diagram of NYSE’s cloud streaming architecture

Redpanda essentially takes market data to the data center, then to AWS, and finally to customers so they can keep track of what’s happening in the market.

“We have better control over memory, costs, and CPU usage. There was no JVM or garbage collection and we got lower latency compared to Kafka, AWS, or MSK.” - Anand Pradhan, Intercontinental Exchange (ICE).

Anand also noted how easy Redpanda is to use and that it provides everything they need to keep things running smoothly (like its own web console). Sounds like a win-win.

D&B & BYOC: data streaming and compliance with Redpanda

Raghul Nandagopal, VP of Engineering, Dun & Bradstreet

Dun & Bradstreet provides commercial data, analytics, and insights to help companies improve their business performance. In his session, Raghul shared his experience with Redpanda BYOC to stream data for real-time analytics. To start, Raghul walked through choosing the right streaming data platform for the enterprise. 

Table of the selection criteria Dun & Bradstreet used to select a streaming data platform

For Dun & Bradstreet, the priority was having the convenience of a fully managed service while protecting their data. For more reasons than one, Redpanda BYOC ticked all the boxes. In brief, choosing Redpanda led to:

  • Reduced operational risk with Redpanda in their VPC
  • Data kept safe within D&B’s own environment 
  • Freedom from worrying about managing infrastructure
  • High performance with a Tier 3 cluster handling 1B+ messages a day
  • Phenomenal collaboration and end-to-end support from Redpanda

“Our messaging infrastructure is on auto-pilot. We wouldn’t have been as successful this year without BYOC.” - Ragul Nandagopal, Dun & Bradstreet.

Hands-on workshop: Redpanda Connect and AI

After seeing everything Redpanda Connect can do, it’s only natural that developers would want to roll up their sleeves and try it themselves. So, the last session of the event was a hands-on workshop hosted by Josh Purcell, Principal Solution Architect, with the unfairly smart Denis Coady and Mihai Todor monitoring the chat to answer any questions. 

This workshop builds the foundation you need to fully appreciate how easy (and, frankly, how boring) Redpanda Connect is to use. No runtime dependency, no Java class names, no complex cluster setup, and all the observability tools you need are ready out of the box.

All you have to do is run it

But theory can only take you so far. Time to flex your fingers and play with Redpanda Connect in different AI use cases. You can access those demos on Instruqt and run them in an interactive environment.

Agenda slide for the hands-on workshop on Redpanda Connect for AI

If you like learning how to use cool tech with a generous dose of cute pandas in every slide, go watch the recording.

And the Streamfest prize winners are…

To make Streamfest even more exciting, different prizes were up for grabs throughout the event, from panda plushies to a fancy laser-powered projector. To put an end to the suspense, here are the winners:

  • Best Technical AMA question: Snehangsu De, Data Engineer at Black Piano
  • Most polls answered: Tyash Ghosh, Principal Engineer at Morgan Stanley

If you didn’t win this time, don’t worry, we’re always hosting more giveaways and competitions. To improve your chances, follow us on LinkedIn, Bluesky, and X so you know where to catch us next!  

That’s all, folks! Until we stream again

The possibilities of streaming data are swiftly transitioning from unimaginable to obvious. In five hours of action-packed sessions, Redpanda Streamfest dug into the foundations of streaming data, showed how it’s shaping the industry, and explored how it’s changing not only the way we build, but also the way we think about building data-intensive applications.

More importantly, this event was proof that all you really need to spark change is a group of passionate people who aren’t afraid to roll up their sleeves. And sometimes, your next big step starts with a bunch of 2D pandas and quirky presenters who love to share what they know.    

We hope these sessions give you a “paw-sitive” advantage in your own streaming data projects! Until next time.

