November 15, 2023
San Francisco, CA (November 15, 2023) – Streaming data pioneer Redpanda today released its inaugural State of Streaming Data Report to shed light on the trends, use cases, data volumes, technology stacks, and technical and business challenges in the rapidly growing streaming data ecosystem. The report shares what’s driving companies to migrate from batch processing to real-time systems and the challenges they face along the way. Based on a third-party survey of 300 engineering organizations familiar with streaming data, the report is the first comprehensive, independent study on the current state of the streaming data industry.
Organizations today increasingly look to enrich their applications, analytics platforms, and AI/ML models with real-time data, which requires a transformation from traditional batch processing to streaming data systems that process and analyze gigabytes of data per second. In these streaming data pipelines, real-time data flows continuously as it’s generated from sources such as sensors, devices and applications. However, streaming data adoption varies across industries, and for unprepared teams streaming data systems can be challenging to implement, manage and scale.
"At Redpanda, our mission has always been to make real-time data easier to consume," said Tristan Stevens, Director of Customer Success at Redpanda. "Toward that goal, this report offers insight into the transformative capabilities of streaming data. We commissioned the survey as a resource for comparison by organizations that have already adopted streaming data and a guide for those that are now evaluating this important technology."
The report was based on responses from 300 engineering organizations familiar with streaming data, captured in a survey conducted by insights-driven strategy firm Material. Roughly 75% of survey participants were at various stages of adoption, providing a comprehensive overview of both established and emerging use cases.
"As we see in most industries, AI and machine learning are going to shake up how companies operate, and streaming data amplifies the leverage that can be gained from these powerful tools in real-time for bigger impact," said Hilary DeCamp, Material’s Chief Methodologist.
Key findings include:
Real-time analytics and AI are driving streaming data adoption – Survey respondents identified real-time analytics (71%) as the leading current use case for streaming data systems. Looking forward, nearly three in four respondents said that the development of AI/ML systems will be the biggest driver of streaming data adoption in the next 12-24 months.
Data privacy and technical skills are barriers to adoption – Perceived technical challenges for adopting streaming data are led by concerns about data privacy (42%) and data consistency (35%). Perceived business challenges center on the cost of these systems (36%) and the in-house technical skills required to be successful with them (34%).
Companies are running both analytical and transactional workloads – The majority (58%) of current streaming data users are running both transactional and analytical workloads. Nearly all users expect to see an increase in the amount of real-time data they stream for analytical (71%) and transactional (81%) workloads.
Streaming data environments are hybrid – Organizations consistently reported navigating multiple platforms, encompassing both Apache Kafka®-compatible and non-Kafka-compatible solutions. More than half of current users stated that their data streaming infrastructure is hosted on VMs or containers and is located in a hybrid environment. AWS (57%) and Microsoft Azure (57%) were the most commonly selected cloud providers.
For a deeper dive into streaming data trends, download the full report at https://redpanda.com/state-of-streaming-data-report-2023-24. Additional topics covered in the full report include:
Typical message throughput for transactional and analytical workloads
Daily volume of streaming data based on workload types
Data retention policies in use
Most popular core components of streaming data pipelines
Most used client libraries, data processing tools and formats
Redpanda is the streaming data platform for developers. API-compatible with Apache Kafka®, Redpanda introduces a breakthrough architecture and disruptive capabilities that make it a simple, fast, reliable, and unified engine of record for both real-time and historical enterprise data. Innovators like Lacework, Jump Trading, Vodafone, Moody’s, Hotels Network and Alpaca rely on Redpanda to process hundreds of terabytes of data a day. Backed by premier venture investors Lightspeed, GV and Haystack VC, Redpanda is a diverse, people-first organization with teams distributed around the globe. To learn more, visit redpanda.com and follow the company on Twitter at @redpandadata.
Material is a global strategy partner to the world’s most recognized brands and innovators. Deeply connected to markets, culture and people through behavioral science and robust insights capabilities, Material helps companies realize critical outcomes across their growth and innovation agendas. We combine deep human insights with design and enabling technology – using a proprietary Science + Systems approach that speeds engagement and growth. We design and build customer-centric business models and experiences that create transformative relationships between businesses and the people they serve. Learn more at www.materialplus.io.