[{"data":1,"prerenderedAt":1733},["ShallowReactive",2],{"active-banner":3,"navbar-featured-partner-blog":23,"navbar-pricing-featured":304,"success-stories-solutions":1084},{"id":4,"title":5,"date":6,"dismissible":7,"extension":8,"link":9,"link2":10,"linkText":11,"linkText2":10,"meta":12,"stem":20,"variant":21,"__hash__":22},"banners\u002Fbanners\u002Fkafka-company-2025.md","Native Apache Kafka Service Is Coming Soon to StreamNative Cloud. Join the waitlist and get $1,000 in credits.","2026-04-01",true,"md","\u002Fnative-kafka-service-waitlist",null,"Join Waitlist",{"body":13},{"type":14,"value":15,"toc":16},"minimark",[],{"title":17,"searchDepth":18,"depth":18,"links":19},"",2,[],"banners\u002Fkafka-company-2025","default","IMIJszQOOWTfA_DV33eYUA5jqV7DrX1FWbBTBZfNvWc",{"id":24,"title":25,"authors":26,"body":28,"category":288,"createdAt":10,"date":289,"description":290,"extension":8,"featured":7,"image":291,"isDraft":292,"link":10,"meta":293,"navigation":7,"order":294,"path":295,"readingTime":296,"relatedResources":10,"seo":297,"stem":298,"tags":299,"__hash__":303},"blogs\u002Fblog\u002Fstreamnative-recognized-in-the-forrester-wave-streaming-data-platforms-2025.md","StreamNative Recognized as a Contender in The Forrester Wave™: Streaming Data Platforms, Q4 2025",[27],"David Kjerrumgaard",{"type":14,"value":29,"toc":275},[30,38,46,50,66,72,77,80,86,101,108,114,117,123,126,133,139,142,145,156,162,168,171,174,177,183,190,193,196,203,206,209,223,228,232,236,240,244,248,250,267,269],[31,32,34],"h3",{"id":33},"receives-highest-possible-scores-in-both-the-messaging-and-resource-optimization-criteria",[35,36,37],"em",{},"Receives Highest Possible Scores in BOTH the Messaging and Resource Optimization Criteria",[39,40,42],"h2",{"id":41},"introduction",[43,44,45],"strong",{},"Introduction",[47,48,49],"p",{},"Real-time data has become the backbone of modern innovation. As artificial intelligence (AI) and digital services demand instantaneous insights, organizations are realizing that streaming data is no longer optional – it's essential for delivering timely, context-rich experiences. StreamNative's data streaming platform is built precisely for this reality, ensuring data is immediate, reliable, and ready to power critical applications.",[47,51,52,53,62,63],{},"Today, we're excited to announce that Forrester Research has named StreamNative as a Contender in its evaluation, ",[54,55,57],"a",{"href":56},"\u002Freports\u002Frecognized-in-the-forrester-wave-tm-streaming-data-platforms-q4-2025",[35,58,59],{},[43,60,61],{},"The Forrester Wave™: Streaming Data Platforms, Q4 2025",". This report evaluated 15 top streaming data platform providers, and we're proud to share that ",[43,64,65],{},"StreamNative received the highest scores possible—5 out of 5—in both the Messaging and Resource Optimization criteria.",[47,67,68,69],{},"***Forrester's Take: ***",[35,70,71],{},"\"StreamNative is a good fit for enterprises that want an Apache Pulsar implementation that is also compatible with Kafka APIs.\"",[47,73,74],{},[35,75,76],{},"— The Forrester Wave™: Streaming Data Platforms, Q4 2025",[47,78,79],{},"Being recognized in the Forrester Wave is a proud milestone, and for us, it highlights how far StreamNative has come in enabling enterprises to unlock the power of real-time data. 
In the sections below, we'll dive into what we believe sets StreamNative apart—from our modern architecture and cloud-native design to our open-source foundation and real-time use cases—and how we see these strengths aligning with Forrester's findings.",[39,81,83],{"id":82},"trusted-by-industry-leaders",[43,84,85],{},"Trusted by Industry Leaders",[47,87,88,89,92,93,96,97,100],{},"Companies across industries are already leveraging StreamNative to drive real-time outcomes. Global enterprises like ",[43,90,91],{},"Cisco"," rely on StreamNative to handle massive IoT telemetry, supporting 245 million+ connected devices. Martech leaders such as ",[43,94,95],{},"Iterable"," process billions of events per day with StreamNative for hyper-personalized customer engagement. And in financial services, ",[43,98,99],{},"FICO"," trusts StreamNative to power its real-time fraud detection and analytics pipelines with a secure, scalable streaming backbone.",[47,102,103,104,107],{},"The Forrester report notes that, “",[35,105,106],{},"Customers appreciate the lower infrastructure costs that result from StreamNative’s cost-efficient, Kafka-compatible architecture. Customers note excellent support responsiveness…","”",[39,109,111],{"id":110},"modern-cloud-native-architecture-built-for-scale",[43,112,113],{},"Modern, Cloud-Native Architecture Built for Scale",[47,115,116],{},"From day one, StreamNative was designed with a modern architecture to meet the demanding scale and flexibility requirements of real-time data. Unlike legacy streaming systems that often rely on tightly coupled storage and compute, StreamNative's platform takes a cloud-native approach: it decouples these layers to enable elastic scalability and efficient resource utilization across any environment. The core is powered by Apache Pulsar—a distributed messaging and streaming engine—enhanced with multi-protocol support (including native Apache Kafka API compatibility) to unify diverse data streams under one roof. This means organizations can consolidate siloed messaging systems and handle both high-volume event streams and traditional message queues on a single platform, without sacrificing performance or reliability.",[47,118,119,120,107],{},"Forrester's evaluation described that “",[35,121,122],{},"StreamNative aims to provide a high-performance, multi-protocol streaming data platform: It uses Apache Pulsar with Kafka API compatibility to deliver cost-efficient, real-time applications for enterprises. It appeals to organizations that want a flexible, low-cost streaming solution, due to its focus on scalability and resource optimization, while its investments in Pulsar’s open-source ecosystem and performance optimization make it the primary platform for enterprises wishing to implement Pulsar.",[47,124,125],{},"Our cloud-first, leaderless architecture (with no single broker bottlenecks) and tiered storage model were built to maximize throughput and cost-efficiency for real-time workloads. By separating compute from storage and leveraging distributed object storage, StreamNative can retain huge volumes of event data indefinitely while keeping compute costs in check—effectively providing a flexible, low-cost streaming solution.",[47,127,128,129,132],{},"This modern design not only delivers high performance, but also ensures fault tolerance and geo-distribution out of the box, so enterprises can trust their streaming data is always available and durable. 
As Forrester’s evaluation noted, StreamNative ",[35,130,131],{},"\"excels at messaging and resource optimization\" and “Its platform supports use cases like real-time analytics and event-driven architectures with robust scalability.","” Our architecture provides the strong foundation that today's real-time applications demand, from ultra-fast data ingestion to seamless scale-out across hybrid and multi-cloud environments.",[39,134,136],{"id":135},"open-source-foundation-and-pulsar-expertise",[43,137,138],{},"Open Source Foundation and Pulsar Expertise",[47,140,141],{},"StreamNative's DNA is rooted in open source innovation. Our founders are the original creators of Apache Pulsar, and we've built our platform with the same open principles: freedom, flexibility, and community-driven innovation. For developers and data teams, this means adopting StreamNative comes with no proprietary lock-in—instead, you get a platform built on open standards and a thriving ecosystem. We offer broad API compatibility (Pulsar, Kafka, JMS, MQTT, and more) so that teams can work with familiar interfaces and integrate StreamNative into existing systems with ease.",[47,143,144],{},"StreamNative is the primary commercial contributor to the Apache Pulsar project and its surrounding ecosystem. We invest heavily in Pulsar's ongoing improvements, and our investments in Pulsar's open-source ecosystem and performance optimization bolster StreamNative's value. We also foster a vibrant community through initiatives like the Data Streaming Summit and free training resources.",[47,146,147,148,151,152,155],{},"Forrester's assessment noted that StreamNative’s “",[35,149,150],{},"events-driven agents, extensibility, and performance architecture are solid,","” and we're continuing to build on that foundation. ",[43,153,154],{},"We're actively investing in expanding our tooling for observability, governance, schema management, and developer productivity","—areas we recognize as critical for enterprise adoption and where we're committed to accelerating our roadmap.",[47,157,158,159],{},"Being open also means embracing an open ecosystem of technologies. StreamNative actively integrates with the tools and platforms that matter most to our users. We partner with industry leaders like Snowflake, Databricks, Google, and Ververica to ensure our streaming platform works seamlessly with data warehouses, lakehouse storage, and stream processing frameworks. Forrester’s evaluation observed that StreamNative’s ",[35,160,161],{},"\"investments in Pulsar’s open-source ecosystem and performance optimization make it the primary platform for enterprises wishing to implement Pulsar.\"",[39,163,165],{"id":164},"powering-real-time-use-cases-across-industries",[43,166,167],{},"Powering Real-Time Use Cases Across Industries",[47,169,170],{},"One of the greatest validations of StreamNative's approach is the success our customers are achieving with real-time data. StreamNative's platform is versatile and use-case agnostic—if an application demands high-volume, low-latency data movement, we can power it. This flexibility is why our customer base spans industries from finance and IoT to major automobile manufacturers and online gaming. 
The common thread is that these organizations need to process and react to data in milliseconds, and StreamNative is delivering the capabilities to make that possible.",[47,172,173],{},"Cisco uses StreamNative to underpin an IoT telemetry system of colossal scale, connecting hundreds of millions of devices and thousands of enterprise clients with real-time data streams. The platform's multi-tenant design and proven reliability allow Cisco to offer its customers a live feed of device data with unwavering confidence. In the financial sector, FICO has built streaming pipelines on StreamNative to detect fraud as transactions happen and to monitor systems in real time. With StreamNative's strong guarantees around message durability and ordering, FICO can catch anomalies or suspicious patterns within seconds. And in digital customer engagement, Iterable relies on StreamNative to process billions of events every day—clicks, views, purchases—so that marketers can trigger personalized campaigns instantly based on user behavior.",[47,175,176],{},"Our customers uniformly deal with mission-critical data streams, where downtime or delays are unacceptable. StreamNative's fault-tolerant, scalable infrastructure has proven equal to the task, handling scenarios like bursting to millions of events per second or seamlessly spanning multiple cloud regions. Forrester's report recognized StreamNative for supporting event-driven architectures with robust scalability—which for us is a reflection of our platform's ability to meet the most demanding enterprise requirements.",[39,178,180],{"id":179},"continuing-to-innovate-ursa-orca-and-the-road-ahead",[43,181,182],{},"Continuing to Innovate: Ursa, Orca, and the Road Ahead",[47,184,185,186,189],{},"While we are thrilled to be recognized in Forrester's Streaming Data Platforms Wave, we view this as just the beginning. StreamNative's vision has always been bold: to ",[43,187,188],{},"provide a unified platform that not only handles today's streaming needs but also anticipates the emerging requirements of tomorrow",".",[47,191,192],{},"One key area of focus is the convergence of streaming data with advanced analytics and AI. As Forrester points out in the report, technology leaders should look for platforms that natively integrate messaging, stream processing, and analytics to provide AI agents with real-time, contextualized information. We couldn't agree more. Our award-winning Ursa Engine and Orca Agent Engine are aimed at extending our platform up the stack—bridging the gap between data streams and data lakes, and between event streams and intelligent processing.",[47,194,195],{},"Our new Ursa Engine introduces a lakehouse-native approach to streaming: it can write events directly to table formats like Iceberg on cloud storage, eliminating entire classes of ETL jobs and making fresh data instantly available for analytics queries. By integrating streaming and lakehouse technologies, we help customers collapse data silos and accelerate their AI\u002FML pipelines.",[47,197,198,199,202],{},"Beyond analytics integration, we are also enhancing StreamNative with more out-of-the-box processing and governance capabilities. In the coming months, we plan to introduce new features for lightweight stream processing and transformation, making it easier to build reactive applications directly on the platform. 
We're also expanding our ecosystem of connectors and integrations, so that whether your data lands in Snowflake, Databricks, or an AI model, StreamNative will seamlessly feed it. ",[43,200,201],{},"We're investing significantly in enterprise features including security, schema registry, governance, and monitoring tooling","—capabilities that are essential for mission-critical deployments and where we're committed to continued improvement.",[47,204,205],{},"This recognition from Forrester energizes us to keep innovating at full speed. We're sharing this honor with our amazing customers, community, and partners who drive us forward every day. Your feedback and real-world challenges have helped shape StreamNative into what it is today, and together, we will shape the future of streaming data. Thank you for joining us on this journey—we're just getting started, and we can't wait to deliver even more value as we continue to evolve our platform. Onward to real-time everything!",[207,208],"hr",{},[31,210,212],{"id":211},"streamnative-in-the-forrester-wave-evaluation-findings",[43,213,214,215,222],{},"StreamNative in ",[43,216,217],{},[54,218,219],{"href":56},[43,220,221],{},"The Forrester Wave™",": Evaluation Findings",[224,225,227],"h5",{"id":226},"recognized-as-a-contender-among-15-streaming-data-platform-providers","• Recognized as a Contender among 15 streaming data platform providers",[224,229,231],{"id":230},"received-the-highest-scores-possible-50-in-both-the-messaging-and-resource-optimization-criteria","• Received the highest scores possible (5.0) in both the Messaging and Resource Optimization criteria",[224,233,235],{"id":234},"cited-as-the-primary-platform-for-enterprises-wishing-to-implement-pulsar","• Cited as the primary platform for enterprises wishing to implement Pulsar",[224,237,239],{"id":238},"noted-for-excelling-at-messaging-and-resource-optimization","• Noted for excelling at messaging and resource optimization",[224,241,243],{"id":242},"customers-cited-lower-infrastructure-costs-and-excellent-support-responsiveness","• Customers cited lower infrastructure costs and excellent support responsiveness",[224,245,247],{"id":246},"recognized-for-supporting-event-driven-architectures-with-robust-scalability","• Recognized for supporting event-driven architectures with robust scalability",[207,249],{},[251,252,254,255,258,259,189],"h6",{"id":253},"forrester-disclaimer-forrester-does-not-endorse-any-company-product-brand-or-service-included-in-its-research-publications-and-does-not-advise-any-person-to-select-the-products-or-services-of-any-company-or-brand-based-on-the-ratings-included-in-such-publications-information-is-based-on-the-best-available-resources-opinions-reflect-judgment-at-the-time-and-are-subject-to-change-for-more-information-read-about-forresters-objectivity-here","Forrester Disclaimer: ",[35,256,257],{},"Forrester does not endorse any company, product, brand, or service included in its research publications and does not advise any person to select the products or services of any company or brand based on the ratings included in such publications. Information is based on the best available resources. Opinions reflect judgment at the time and are subject to change",". 
*For more information, read about Forrester’s objectivity *",[54,260,264],{"href":261,"rel":262},"https:\u002F\u002Fwww.forrester.com\u002Fabout-us\u002Fobjectivity\u002F",[263],"nofollow",[35,265,266],{},"here",[207,268],{},[251,270,272],{"id":271},"apache-apache-pulsar-apache-kafka-apache-flink-and-other-names-are-trademarks-of-the-apache-software-foundation-no-endorsement-by-apache-or-other-third-parties-is-implied",[35,273,274],{},"Apache®, Apache Pulsar®, Apache Kafka®, Apache Flink® and other names are trademarks of The Apache Software Foundation. No endorsement by Apache or other third parties is implied.",{"title":17,"searchDepth":18,"depth":18,"links":276},[277,279,280,281,282,283,284],{"id":33,"depth":278,"text":37},3,{"id":41,"depth":18,"text":45},{"id":82,"depth":18,"text":85},{"id":110,"depth":18,"text":113},{"id":135,"depth":18,"text":138},{"id":164,"depth":18,"text":167},{"id":179,"depth":18,"text":182,"children":285},[286],{"id":211,"depth":278,"text":287},"StreamNative in The Forrester Wave™: Evaluation Findings","Company","2025-12-16","StreamNative is recognized in The Forrester Wave™: Streaming Data Platforms, Q4 2025. Discover why Forrester highlights StreamNative's high-performance messaging, efficient resource use, and cost-effective Kafka API compatibility for real-time innovation.","\u002Fimgs\u002Fblogs\u002F693bd36cf01b217dcb67278f_Streamnative_blog_thumbnail.png",false,{},0,"\u002Fblog\u002Fstreamnative-recognized-in-the-forrester-wave-streaming-data-platforms-2025","10 mins read",{"title":25,"description":290},"blog\u002Fstreamnative-recognized-in-the-forrester-wave-streaming-data-platforms-2025",[300,301,302],"Announcements","Real-Time","Forrester","sOeeJtEO3O-IIfTPJjY1AFOMawZ_rf8FOH8A98NEKgU",{"id":305,"title":306,"authors":307,"body":312,"category":1071,"createdAt":10,"date":1072,"description":1073,"extension":8,"featured":7,"image":1074,"isDraft":292,"link":10,"meta":1075,"navigation":7,"order":294,"path":1076,"readingTime":1077,"relatedResources":10,"seo":1078,"stem":1079,"tags":1080,"__hash__":1083},"blogs\u002Fblog\u002Fhow-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour.md","How We Run a 5 GB\u002Fs Kafka Workload for Just $50 per Hour",[308,309,310,311],"Matteo Meril","Neng Lu","Hang Chen","Penghui Li",{"type":14,"value":313,"toc":1041},[314,317,320,323,326,329,333,336,346,352,355,363,368,372,379,382,385,393,397,400,405,409,412,415,418,421,430,434,437,448,451,455,458,461,472,475,479,483,491,494,498,506,535,539,542,547,551,554,558,561,564,569,578,583,586,589,600,604,607,618,622,625,628,633,636,665,669,671,677,680,685,690,693,697,711,715,726,730,745,754,765,768,771,775,778,781,792,795,798,801,806,811,815,819,836,840,854,859,863,874,877,893,897,908,913,918,926,930,933,937,944,948,951,960,965,974,980,989,998,1007,1016,1025,1033],[47,315,316],{},"The rise of DeepSeek has shaken the AI infrastructure market, forcing companies to confront the escalating costs of training and deploying AI models. But the real pressure point isn’t just compute—it’s data acquisition and ingestion costs.",[47,318,319],{},"As businesses rethink their AI cost-containment strategies, real-time data streaming is emerging as a critical enabler. The growing adoption of Kafka as a standard protocol has expanded cost-efficient options, allowing companies to optimize streaming analytics while keeping expenses in check.",[47,321,322],{},"Ursa, the data streaming engine powering StreamNative’s managed Kafka service, is built for this new reality. 
With its leaderless architecture and native lakehouse storage integration, Ursa eliminates costly inter-zone network traffic for data replication and client-to-broker communication while ensuring high availability at minimal operational cost.",[47,324,325],{},"In this blog post, we benchmarked the infrastructure cost and total cost of ownership (TCO) for running a 5GB\u002Fs Kafka workload across different Kafka vendors, including Redpanda, Confluent WarpStream, and AWS MSK. Our benchmark results show that Ursa can sustain 5GB\u002Fs Kafka workloads at just 5% of the cost of traditional streaming engines like Redpanda—making it the ideal solution for high-performance, cost-efficient ingestion and data streaming for data lakehouses and AI workloads.",[47,327,328],{},"Note: We also evaluated vanilla Kafka in our benchmark; however, for simplicity, we have focused our cost comparison on vendor solutions rather than self-managed deployments. That said, it is important to highlight that both Redpanda and vanilla Kafka use a leader-based data replication approach. In a data-intensive, network-bound workload like 5GB\u002Fs streaming, with the same machine type and replication factor, Redpanda and vanilla Kafka produced nearly identical cost profiles.",[39,330,332],{"id":331},"key-benchmark-findings","Key Benchmark Findings",[47,334,335],{},"Ursa delivered 5 GB\u002Fs of sustained throughput at an infrastructure cost of just $54 per hour. For comparison:",[337,338,339,343],"ul",{},[340,341,342],"li",{},"MSK: $303 per hour → 5.6x more expensive compared to Ursa",[340,344,345],{},"Redpanda: $988 per hour → 18x more expensive compared to Ursa",[47,347,348],{},[349,350],"img",{"alt":17,"src":351},"\u002Fimgs\u002Fblogs\u002F679c71b67d9046f26edc7977_AD_4nXfvTqyBNUBu2lObdkKAx-5UNkpNP8UYULLZyOcixE6z99VMZUUEsUqWjzexI7vjyNGRNSAUoM9smYvdTP55ctAhIbrs5lmQgcSVMWdaoigbWouCl95DVSQsxooY-qqfGcYqS4g4zA.png",[47,353,354],{},"Beyond infrastructure costs, when factoring in both storage pricing, vendor pricing and operational expenses, Ursa’s total cost of ownership (TCO) for a 5GB\u002Fs workload with a 7-day retention period is:",[337,356,357,360],{},[340,358,359],{},"50% cheaper than Confluent WarpStream",[340,361,362],{},"85% cheaper than MSK and Redpanda",[47,364,365],{},[349,366],{"alt":17,"src":367},"\u002Fimgs\u002Fblogs\u002F679c602d77e9c706de5343b8_AD_4nXeDv8rrv_C1CTCCiqYo1zpvlGYbdBk1r0VEqovAPu22iFMQZgh54Hfw9PBMLzM7jDFxKwAFDxbdG0np4XVk_tGsWhEKMloLRcmmea7lvueCx-0cFsyaE3Mya4Mxc1Dox95A6JEc.png",[39,369,371],{"id":370},"ursa-highly-cost-efficient-data-streaming-at-scale","Ursa: Highly Cost-Efficient Data Streaming at Scale",[47,373,374,378],{},[54,375,377],{"href":376},"\u002Fblog\u002Fursa-reimagine-apache-kafka-for-the-cost-conscious-data-streaming","Ursa"," is a next-generation data streaming engine designed to deliver high performance at a fraction of the cost of traditional disk-based solutions. It is fully compatible with Apache Kafka and Apache Pulsar APIs, while leveraging a leaderless, lakehouse-native architecture to maximize scalability, efficiency, and cost savings.",[47,380,381],{},"Ursa’s key innovation is separating storage from compute and decoupling metadata\u002Findex operations from data operations by utilizing cloud object storage (e.g., AWS S3) instead of costly inter-zone disk-based replication. 
It also employs open lakehouse formats (Iceberg and Delta Lake), enabling columnar compression to significantly reduce storage costs while maintaining durability and availability.",[47,383,384],{},"In contrast, traditional streaming systems—like Kafka and Redpanda—depend on leader-based architectures, which drive up inter-zone traffic costs due to replication and client communication. Ursa mitigates these costs by:",[337,386,387,390],{},[340,388,389],{},"Eliminating inter-zone traffic costs via a leaderless architecture.",[340,391,392],{},"Replacing costly inter-zone replication with direct writes to cloud storage using open lakehouse formats.",[39,394,396],{"id":395},"how-ursa-eliminates-inter-zone-traffic","How Ursa Eliminates Inter-Zone Traffic",[47,398,399],{},"Ursa minimizes inter-zone traffic by leveraging a leaderless architecture, which eliminates inter-zone communication between clients and brokers, and lakehouse-native storage, which removes the need for inter-zone data replication. This approach ensures high availability and scalability while avoiding unnecessary cross-zone data movement.",[47,401,402],{},[349,403],{"alt":17,"src":404},"\u002Fimgs\u002Fblogs\u002F679c602e21b3571bb7117dca_AD_4nXd7Oahc77NjRLNvA9clLt0tsyU6MrIqVibFYv5pW5giTIcCHPr3EA_yTGzfVEUIVO3VXK56qWK8zmBCp5lY0E_4nmlWIPFrHjtHylA5NhwELjn-UB0fLG2h_kbrxrc7Cs_edvveNA.png",[31,406,408],{"id":407},"leaderless-architecture","Leaderless architecture",[47,410,411],{},"Traditional streaming engines such as Kafka, Pulsar, or RedPanda rely on a leader-based model, where each partition is assigned to a single leader broker that handles all writes and reads.",[47,413,414],{},"Pros of Leader-Based Architectures:\n✔ Maintains message ordering via local sequence IDs\n✔ Delivers low latency and high performance through message caching",[47,416,417],{},"Cons of Leader-Based Architectures:\n✖ Throughput bottlenecked by a single broker per partition\n✖ Inter-zone traffic required for high availability in multi-AZ deployments",[47,419,420],{},"While Kafka and Pulsar offer partial solutions (e.g., reading from followers, shadow topics) to reduce read-related inter-zone traffic, producers still send data to a single leader.",[47,422,423,424,429],{},"Ursa removes the concept of topic ownership, allowing any broker in the cluster to handle reads or writes for any partition. 
The primary challenge—ensuring message ordering—is solved with ",[54,425,428],{"href":426,"rel":427},"https:\u002F\u002Fgithub.com\u002Fstreamnative\u002Foxia",[263],"Oxia",", a scalable metadata and index service created by StreamNative in 2022.",[31,431,433],{"id":432},"oxia-the-metadata-layer-enabling-leaderless-architecture","Oxia: The Metadata Layer Enabling Leaderless Architecture",[47,435,436],{},"Ensuring message ordering in a leaderless architecture is complex, but Ursa solves this with Oxia:",[337,438,439,442,445],{},[340,440,441],{},"Handles millions of metadata\u002Findex operations per second",[340,443,444],{},"Generates sequential IDs to maintain strict message ordering",[340,446,447],{},"Optimized for Kubernetes with horizontal scalability",[47,449,450],{},"Producers and consumers can connect to any broker within their local AZ, eliminating inter-zone traffic costs while maintaining performance through localized caching.",[31,452,454],{"id":453},"zero-interzone-data-replication","Zero inter-zone data replication",[47,456,457],{},"In most distributed systems, data replication from a leader (primary) to followers (replicas) is crucial for fault tolerance and availability. However, replication across zones can inflate infrastructure expenses substantially.",[47,459,460],{},"Ursa avoids these costs by writing data directly to cloud storage (e.g., AWS S3, Google GCS):",[337,462,463,466,469],{},[340,464,465],{},"Built-In Resilience: Cloud storage inherently offers high availability and fault tolerance without inter-zone traffic fees.",[340,467,468],{},"Tradeoff: Slightly higher latency (sub-second, with p99 at 500 milliseconds) compared to local disk\u002FEBS (single-digit to sub-100 milliseconds), in exchange for significantly lower costs (up to 10x lower).",[340,470,471],{},"Flexible Modes: Ursa is an addition to the classic BookKeeper-based engine, providing users with the flexibility to optimize for either cost or low latency based on their workload requirements.",[47,473,474],{},"By foregoing conventional replication, Ursa slashes inter-zone traffic costs and associated complexities—making it a compelling option for organizations seeking to balance high-performance data streaming with strict budget constraints.",[39,476,478],{"id":477},"how-we-ran-a-5-gbs-test-with-ursa","How We Ran a 5 GB\u002Fs Test with Ursa",[31,480,482],{"id":481},"ursa-cluster-deployment","Ursa Cluster Deployment",[337,484,485,488],{},[340,486,487],{},"9 brokers across 3 availability zones, each on m6i.8xlarge (Fixed 12.5 Gbps bandwidth, 32 vCPU cores, 128 GB memory).",[340,489,490],{},"Oxia cluster (metadata store) with 3 nodes of m6i.8xlarge, distributed across three availability zones (AZs).",[47,492,493],{},"During peak throughput (5 GB\u002Fs), each broker’s network usage was about 10 Gbps.",[31,495,497],{"id":496},"openmessaging-benchmark-workers-configuration","OpenMessaging Benchmark Workers & Configuration",[47,499,500,501,505],{},"The OpenMessaging Benchmark (OMB) Framework is a suite of tools that make it easy to benchmark distributed messaging systems in the cloud. 
Please check ",[54,502,503],{"href":503,"rel":504},"https:\u002F\u002Fopenmessaging.cloud\u002Fdocs\u002Fbenchmarks\u002F",[263]," for details.",[337,507,508,523,532],{},[340,509,510,511,516,517,522],{},"12 OMB workers: 6 for ",[54,512,515],{"href":513,"rel":514},"https:\u002F\u002Fgist.github.com\u002Fcodelipenghui\u002Fd1094122270775e4f1580947f80c5055",[263],"producers",", 6 for ",[54,518,521],{"href":519,"rel":520},"https:\u002F\u002Fgist.github.com\u002Fcodelipenghui\u002F06bada89381fb77a7862e1b4c1d8963d",[263],"consumers"," across 3 availability zones, on m6i.8xlarge instances. Each worker is configured with 12 CPU cores and 48 GB memory.",[340,524,525,526,531],{},"Sample YAML ",[54,527,530],{"href":528,"rel":529},"https:\u002F\u002Fgist.github.com\u002Fcodelipenghui\u002F204c1f26c4d44a218ae235bf2de99904",[263],"scripts"," provided for Kafka-compatible configuration and rate limits.",[340,533,534],{},"Achieved consistent 5 GB\u002Fs publish\u002Fsubscribe throughput.",[39,536,538],{"id":537},"ursa-benchmark-tests-results","Ursa Benchmark Tests & Results",[47,540,541],{},"The following diagram demonstrates that Ursa can consistently handle 5 GB\u002Fs of traffic, fully saturating the network across all broker nodes.",[47,543,544],{},[349,545],{"alt":17,"src":546},"\u002Fimgs\u002Fblogs\u002F679c602d7b261bac1113f7d6_AD_4nXdDPsRc3koXICiFF0bqSmGWbJt_RlUy4FE3ruuWOfbCfpcqZ1dejjqGbkaCJv2hQFL1nirRouBVRW2l5uMWBvY9naMqGB_wHcLI14dBM0f85TXhmdm3UxEv1yGX9Y4hf5FttSkZew.png",[39,548,550],{"id":549},"comparing-infrastructure-cost","Comparing Infrastructure Cost",[47,552,553],{},"This benchmark first evaluates infrastructure costs of running a 5 GB\u002Fs streaming workload (1:1 producer-to-consumer ratio) across different data streaming engines, including Ursa, Redpanda, and AWS MSK, with a focus on multi-AZ deployments to ensure a fair comparison.",[31,555,557],{"id":556},"test-setup-key-assumptions","Test Setup & Key Assumptions",[47,559,560],{},"All tests use multi-AZ configurations, with clusters and clients distributed across three AWS availability zones (AZs). Cluster size scales proportionally to the number of AZs, and rack-awareness is enabled for all engines to evenly distribute topic partitions and leaders.",[47,562,563],{},"To ensure a fair comparison, we selected the same machine type capable of fully utilizing both network and storage bandwidth for Ursa and Redpanda in this 5GB\u002Fs test:",[337,565,566],{},[340,567,568],{},"9 × m6i.8xlarge instances",[47,570,571,572,577],{},"However, MSK's storage bandwidth limits vary depending on the selected instance type, with the highest allowed limit capped at 1000 MiB\u002Fs per broker, according to",[54,573,576],{"href":574,"rel":575},"https:\u002F\u002Fdocs.aws.amazon.com\u002Fmsk\u002Flatest\u002Fdeveloperguide\u002Fmsk-provision-throughput-management.html#throughput-bottlenecks",[263]," AWS documentation",". 
Given this constraint, achieving 5 GB\u002Fs throughput with a replication factor of 3 required the following setup:",[337,579,580],{},[340,581,582],{},"15 × kafka.m7g.8xlarge (32 vCPUs, 128 GB memory, 15 Gbps network, 4000 GiB EBS).",[47,584,585],{},"This configuration was necessary to work around MSK's storage bandwidth limitations, ensuring a comparable cost basis to other evaluated streaming engines.",[47,587,588],{},"Additional key assumptions include:",[337,590,591,594,597],{},[340,592,593],{},"Inter-AZ producer traffic: For leader-based engines, two-thirds of producer-to-broker traffic crosses AZs due to leader distribution.",[340,595,596],{},"Consumer optimizations: Follower fetch is enabled across all tests, eliminating inter-AZ consumer traffic.",[340,598,599],{},"Storage cost exclusions: This benchmark only evaluates streaming costs, assuming no long-term data retention.",[31,601,603],{"id":602},"inter-broker-replication-costs","Inter-Broker Replication Costs",[47,605,606],{},"Inter-broker (cross-AZ) replication is a major cost driver for data streaming engines:",[337,608,609,612,615],{},[340,610,611],{},"RedPanda: Inter-broker replication is not free, leading to substantial costs when data must be copied across multiple availability zones.",[340,613,614],{},"AWS MSK: Inter-broker replication is free, but MSK instance pricing is significantly higher (e.g., $3.264 per hour for kafka.m7g.8xlarge vs $1.306 per hour for an on-demand m7g.8xlarge). The storage price of MSK is $0.10 per GB-month which is significantly higher than st1, which costs $0.045 per GB-month. Even though replication is free, client-to-broker traffic still incurs inter-AZ charges.",[340,616,617],{},"Ursa: No inter-broker replication costs due to its leaderless architecture, eliminating inter-zone replication costs entirely.",[31,619,621],{"id":620},"zone-affinity-reducing-inter-az-costs","Zone Affinity: Reducing Inter-AZ Costs",[47,623,624],{},"We evaluated zone affinity mechanisms to further reduce inter-AZ data transfer costs.",[47,626,627],{},"Consumers:",[337,629,630],{},[340,631,632],{},"Follower fetch is enabled across all tests, ensuring consumers fetch data from replicas in their local AZ—eliminating inter-zone consumer traffic except for metadata lookups",[47,634,635],{},"Producers:",[337,637,638,647,656],{},[340,639,640,641,646],{},"Kafka protocol lacks an easy way to enforce producer AZ affinity (though ",[54,642,645],{"href":643,"rel":644},"https:\u002F\u002Fcwiki.apache.org\u002Fconfluence\u002Fdisplay\u002FKAFKA\u002FKIP-1123:+Rack-aware+partitioning+for+Kafka+Producer",[263],"KIP-1123"," aims to address this). And it only works with the default partitioner (i.e., when no record partition or record key is specified).",[340,648,649,650,655],{},"Redpanda recently introduced ",[54,651,654],{"href":652,"rel":653},"https:\u002F\u002Fdocs.redpanda.com\u002Fredpanda-cloud\u002Fdevelop\u002Fproduce-data\u002Fleader-pinning\u002F",[263],"leader pinning",", but this only benefits setups where producers are confined to a single AZ—not applicable to our multi-AZ benchmark.",[340,657,658,659,664],{},"Ursa is the only system in this test with ",[54,660,663],{"href":661,"rel":662},"https:\u002F\u002Fdocs.streamnative.io\u002Fdocs\u002Fconfig-kafka-client#eliminate-cross-az-networking-traffic",[263],"built-in zone affinity for both producers and consumers",". 
It achieves this by embedding producer AZ information in client.id, allowing metadata lookups to route clients to local-AZ brokers, eliminating inter-AZ producer traffic.",[31,666,668],{"id":667},"cost-comparison-results","Cost Comparison Results",[47,670,335],{},[337,672,673,675],{},[340,674,342],{},[340,676,345],{},[47,678,679],{},"Ursa’s leaderless architecture, zone affinity, and native cloud storage integration deliver unparalleled cost efficiency, making it the most cost-effective choice for high-throughput data streaming workloads.",[47,681,682],{},[349,683],{"alt":17,"src":684},"\u002Fimgs\u002Fblogs\u002F679c72208198ca36a352f228_AD_4nXeeZuM8T-xBlD4Vf3j67K618n08qh8wIDLLtiLJG0ssA1Wj1V26u7wIDTX9sqLrtw8mB2c299dwzarGen62CG0Vh7nWstn5qbPGFcBaKJYEepTsLr5fHWv1U8uqbg8Y0UOK6fJ7.png",[47,686,687],{},[349,688],{"alt":17,"src":689},"\u002Fimgs\u002Fblogs\u002F679c625978031f40229de484_AD_4nXdLkLLJ30KKr-_A_rN1j8akVwBYacAWIPzWHoOReJF421890kfByZoQQxkLczihVSmiw5Q9J51-V9I2SEKITbwsYnANDDTlAVL5nQ_jfaHNTe9VEWhSoa7DZooCnilDYL6l6msmJg.png",[47,691,692],{},"The detailed infrastructure cost calculations for each data streaming engine are listed below:",[31,694,696],{"id":695},"streamnative-ursa","StreamNative - Ursa",[337,698,699,702,705,708],{},[340,700,701],{},"Server EC2 costs: 9 * $1.536\u002Fhr = $14",[340,703,704],{},"Client EC2 costs: 9 * $1.536\u002Fhr =$14",[340,706,707],{},"S3 write requests costs: 1350 r\u002Fs * $0.005\u002F1000r * 3600s = $24",[340,709,710],{},"S3 read requests costs: 1350 r\u002Fs * $0.0004\u002F1000r * 3600s = $2",[31,712,714],{"id":713},"aws-msk","AWS MSK",[337,716,717,720,723],{},[340,718,719],{},"Server EC2 costs: 15 * $3.264\u002Fhr = $49",[340,721,722],{},"Client side EC2 costs: 9 * $1.536\u002Fhr =$14",[340,724,725],{},"Interzone traffic - producer to broker: 5GB\u002Fs * ⅔ * $0.02\u002FG(in+out) * 3600 = $240",[31,727,729],{"id":728},"redpanda","RedPanda",[337,731,732,734,736,739,742],{},[340,733,701],{},[340,735,704],{},[340,737,738],{},"Interzone traffic - producer to broker: 5GB\u002Fs * ⅔ * $0.02\u002FGB(in+out) * 3600 = $240",[340,740,741],{},"Interzone traffic - replication: 10GB\u002Fs * $0.02\u002FGB(in+out) * 3600 = $720",[340,743,744],{},"Interzone traffic - broker to consumer: $0 (fetch from local zone)",[47,746,747,748,753],{},"Please note that we were unable to test ",[54,749,752],{"href":750,"rel":751},"https:\u002F\u002Fwww.redpanda.com\u002Fblog\u002Fcloud-topics-streaming-data-object-storage",[263],"Redpanda with Cloud Topics",", as it remains an announced but unreleased feature and is not yet available for evaluation. Based on the limited information available, while Cloud Topics may help optimize inter-zone data replication costs, producers still need to traverse inter-availability zones to connect to the topic partition owners and incur inter-zone traffic costs of up to $240 per hour.",[337,755,756,762],{},[340,757,758,761],{},[54,759,645],{"href":643,"rel":760},[263]," (when implemented) will help mitigate producer-to-broker inter-zone traffic, but it is not yet available. And it only works with the default partitioner (no record partition or key is specified).",[340,763,764],{},"Redpanda’s leader pinning helps only when all producers for the pinned topic are confined to a single AZ. In multi-AZ environments (like our benchmark), inter-zone producer traffic remains unavoidable.",[47,766,767],{},"Additionally, Redpanda’s Cloud Topics architecture is not documented publicly. 
Their blog mentions \"leader placement rules to optimize produce latency and ingress cost,\" but it is unclear whether this represents a shift away from a leader-based architecture or if it uses techniques similar to Ursa’s zone-aware approach.",[47,769,770],{},"We may revisit this comparison as more details become available.",[39,772,774],{"id":773},"comparing-total-cost-of-ownership","Comparing Total Cost of Ownership",[47,776,777],{},"As highlighted earlier, with a BYOC Ursa setup, you can achieve 5 GB\u002Fs throughput at just 5% of the infrastructure cost of a traditional leader-based data streaming engine, such as Kafka or RedPanda, while managing the infrastructure yourself. This significant cost reduction is enabled by Ursa’s leaderless architecture and lakehouse-native storage design, which eliminate overhead costs such as inter-zone traffic and leader-based data replication. By leveraging a lakehouse-native, leaderless architecture, Ursa reduces resource requirements, enabling you to handle high data throughput efficiently and at a fraction of the cost of RedPanda.",[47,779,780],{},"Now, let’s examine the total cost comparison, evaluating Ursa alongside other vendors, including those that have adopted a leaderless architecture (e.g., Confluent WarpStream). This comparison is based on a 5GB\u002Fs workload with a 7-day retention period, factoring in both storage and vendor costs. Here are the key findings:",[337,782,783,786,789],{},[340,784,785],{},"Ursa ($164,353\u002Fmonth) is 50% cheaper than Confluent WarpStream ($337,068\u002Fmonth)",[340,787,788],{},"85% cheaper than AWS MSK ($1,115,251\u002Fmonth)",[340,790,791],{},"86% cheaper than Redpanda ($1,202,853\u002Fmonth)",[47,793,794],{},"In addition to Ursa’s architectural advantages—eliminating most inter-AZ traffic and leveraging lakehouse storage for cost-effective data retention—it also adopts a fairer, more cost-efficient pricing model: Elastic Throughput-based pricing. This approach aligns costs with actual usage, avoiding unnecessary overhead.",[47,796,797],{},"Unlike WarpStream, which charges for both storage and throughput, Ursa ensures that customers only pay for the throughput they actively use. Ursa’s pricing is based on compressed data sent by clients, meaning the more data compressed on the client side, the lower the cost. 
In contrast, WarpStream prices are based on uncompressed data, unfairly inflating expenses and failing to incentivize customers to optimize their client applications.",[47,799,800],{},"This distinction is crucial, as compressed data reduces both storage and network costs, making Ursa’s pricing model not only more cost-effective but also more transparent and predictable.",[47,802,803],{},[349,804],{"alt":17,"src":805},"\u002Fimgs\u002Fblogs\u002F679c602d194800c9206d9d58_AD_4nXcFlf755xgyz7htxhMhBV5fGrsxy642mQNodt61DTok_z1dwkw5A6lkO5hatXVneCaB0anbZPAyvLI3MlIMuQEYLEACHHvQMOr5UfaB37dfzkdqewDEvcT-20VGd_zzvJsuA00zGA.png",[47,807,808],{},[349,809],{"alt":17,"src":810},"\u002Fimgs\u002Fblogs\u002F679c62594e9c2e629fae73aa_AD_4nXeU6cOgItnjLsEZCOf13TEvMY_SHWWIxYP2OYUj-B1GUPyWO78OG08K_v03hwYSVcg06f9dqDiGmdwy76vynjmiDGL5bluZ5_XF4nSU_r59oOZdfViXndXt6s11vVOY7qwfZN8v.png",[31,812,814],{"id":813},"cost-breakdown","Cost Breakdown",[816,817,818],"h4",{"id":695},"StreamNative – Ursa",[337,820,821,824,827,830,833],{},[340,822,823],{},"EC2 (Server): 9 × $1.536\u002Fhr × 24 hr × 30 days = $9,953.28",[340,825,826],{},"S3 Write Requests: 1,350 r\u002Fs × $0.005\u002F1,000 r × 3,600 s × 24 hr × 30 days = $17,496",[340,828,829],{},"S3 Read Requests: 1,350 r\u002Fs × $0.0004\u002F1,000 r × 3,600 s × 24 hr × 30 days = $1,400",[340,831,832],{},"S3 Storage Costs: 5 GB\u002Fs × $0.021\u002FGB × 3,600 s × 24 hr × 7 days = $63,504",[340,834,835],{},"Vendor Cost: 200 ETU × $0.50\u002Fhr × 24 hr × 30 days = $72,000",[816,837,839],{"id":838},"warpstream","WarpStream",[337,841,842,845],{},[340,843,844],{},"Based on WarpStream’s pricing calculator (as of January 29, 2025), we assume a 4:1 client data compression ratio, meaning 20 GB\u002Fs of uncompressed data translates to 5 GB\u002Fs of compressed data.",[340,846,847,848,853],{},"It's important to note that WarpStream’s pricing structure has fluctuated frequently throughout January. We observed the cost reported by their calculator changing from $409,644 per month to $337,068 per month. This variability has been previously highlighted in the blog post “",[54,849,852],{"href":850,"rel":851},"https:\u002F\u002Fbigdata.2minutestreaming.com\u002Fp\u002Fthe-brutal-truth-about-apache-kafka-cost-calculators",[263],"The Brutal Truth About Kafka Cost Calculators","”. To ensure transparency, we have documented the pricing as of January 29, 2025.",[47,855,856],{},[349,857],{"alt":17,"src":858},"\u002Fimgs\u002Fblogs\u002F679c602e42713e0028e9af5e_AD_4nXcu5_VWTLu9jRYs6zX1MBAOtLQEo5gyfNSWPcbpnQHXTa8qNCFAXezRR2E8daygzYTTwd4dhJjaLaLM8C6y_3OGbu2NS7pdvEv3a8-ptNKOg7AeKnYqPQCAYvQ5EuxzuI3JYIvY.png",[816,860,862],{"id":861},"msk","MSK",[337,864,865,868,871],{},[340,866,867],{},"EC2 (Server): 15 * $3.264\u002Fhr × 24 hr × 30 days = $35,251",[340,869,870],{},"Interzone Traffic (Client-Server): 5 GB\u002Fs × ⅔ × $0.02\u002FGB (in+out) × 3,600 s × 24 hr × 30 days = $172,800",[340,872,873],{},"Storage: 5 GB\u002Fs × $0.1\u002FGB-month × 3,600 s × 24 hr × 7 days * 3 replicas = $907,200",[816,875,729],{"id":876},"redpanda-1",[337,878,879,882,884,887,890],{},[340,880,881],{},"EC2 (Server): 9 × $1.536\u002Fhr × 24 hr × 30 days = $9953",[340,883,870],{},[340,885,886],{},"Interzone Traffic (Replication): 5 GB\u002Fs × 2 × $0.02\u002FGB (in+out) × 3,600 s × 24 hr × 30 days = $518,400",[340,888,889],{},"Storage: 5 GB\u002Fs × $0.045\u002FGB-month(st1) × 3,600 s × 24 hr × 7 days * 3 replicas = $408,240",[340,891,892],{},"Vendor Cost: $93,333 per month (based on limited information. 
See additional notes below).",[816,894,896],{"id":895},"additional-notes","Additional Notes",[337,898,899],{},[340,900,901,902,907],{},"Redpanda does not publicly disclose its BYOC pricing, making it difficult to accurately assess its total costs. We refer to information from the whitepaper “",[54,903,906],{"href":904,"rel":905},"https:\u002F\u002Fwww.redpanda.com\u002Fresources\u002Fredpanda-vs-confluent-performance-tco-benchmark-report#form",[263],"Redpanda vs. Confluent: A Performance and TCO Benchmark Report by McKnight Consulting Group.","” for estimation purposes. Based on the Tier-8 pricing model in the whitepaper,  the estimated cost to support a 5GB\u002Fs workload would be $1.12 million per year ($93,333 per month). However, since this calculation is based on an estimation, we will revisit and refine the cost assessment once Redpanda publishes its BYOC pricing.",[47,909,910],{},[349,911],{"alt":17,"src":912},"\u002Fimgs\u002Fblogs\u002F679c602dc8a9859eed89a0ef_AD_4nXdbcO8vsNNPy4GtkNLlmNKf22fjxRvzLzH7CtOna1L08sTbvnZx3HhufeFqc1w4K2gEF7lxO2IR5supotxebAiGnA07Qa8Yr3Rd1pVK2LYKK4WurlJGwgdwwucZIFoF-N_2oBjY.png",[47,914,915],{},[349,916],{"alt":17,"src":917},"\u002Fimgs\u002Fblogs\u002F679c602d6bc1c2287e012540_AD_4nXfcHZnLfjbjIr3ZAgoQXT9dwP3aQCOQPmGZZJUtpNZSwE6qY6M3yehIaBxCwxEIeu5PVdUPY0zhyjnow26YfgjdYgSG4GnV9ibxu0YWTIpwng6z_F6FUGJMpERMKtpsFESzXSN_Sw.png",[337,919,920,923],{},[340,921,922],{},"When estimating the storage costs for Kafka and Redpanda, we assume the use of HDD storage at $0.045\u002FGB, based on the premise that both systems can fully utilize disk bandwidth without incurring the higher costs associated with GP2 or GP3 volumes. However, in practice, many users opt for GP2 or GP3, significantly increasing the total storage cost for Kafka and Redpanda.",[340,924,925],{},"Unlike disk-based solutions, S3 storage does not require capacity preallocation—Ursa only incurs costs for the actual data stored. This contrasts with Kafka and Redpanda, where preallocating storage can drive up expenses. As a result, the real-world storage costs for Kafka and Redpanda are often 50% higher than the estimates above.",[39,927,929],{"id":928},"conclusion","Conclusion",[47,931,932],{},"Ursa represents a transformative shift in streaming data infrastructure, offering cost efficiency, scalability, and flexibility without compromising durability or reliability. By leveraging a leaderless architecture and eliminating inter-zone data replication, Ursa reduces total cost of ownership by over 90% compared to traditional leader-based streaming engines like Kafka and Redpanda. Its direct integration with cloud storage and scalable metadata & index management via Oxia ensure high availability and simplified infrastructure management.",[31,934,936],{"id":935},"balancing-latency-and-cost","Balancing Latency and Cost",[47,938,939,943],{},[54,940,942],{"href":941},"\u002Fblog\u002Fcap-theorem-for-data-streaming","Ursa trades off slightly higher latency for ultra low cost",", making it an ideal choice for the majority of streaming workloads, especially those that prioritize throughput and cost savings over ultra-low latency. Meanwhile, StreamNative’s BookKeeper-based engine remains the preferred solution for real-time, latency-sensitive applications. 
By combining these two approaches, StreamNative empowers customers with the flexibility to choose the right engine for their specific needs—whether it's maximizing cost savings or achieving ultra low-latency real-time performance.",[31,945,947],{"id":946},"the-future-of-streaming-infrastructure","The Future of Streaming Infrastructure",[47,949,950],{},"In an era where data fuels AI, analytics, and real-time decision-making, managing infrastructure costs is critical to sustaining innovation. Ursa is not just a cost-cutting alternative—it is a forward-thinking, lakehouse-native platform that redefines how modern data streaming infrastructure should be built and operated.",[47,952,953,954,959],{},"Whether your priority is reducing costs, improving flexibility, or ingesting massive data into lakehouses, Ursa delivers a future-proof solution for the evolving demands of real-time data streaming. ",[54,955,958],{"href":956,"rel":957},"https:\u002F\u002Fconsole.streamnative.cloud\u002F",[263],"Get started"," with StreamNative Ursa today!",[961,962,964],"h1",{"id":963},"references","References",[47,966,967,970,971],{},[968,969,428],"span",{}," ",[54,972,973],{"href":973},"\u002Fblog\u002Fintroducing-oxia-scalable-metadata-and-coordination",[47,975,976,970,978],{},[968,977,377],{},[54,979,376],{"href":376},[47,981,982,970,985],{},[968,983,984],{},"StreamNative pricing",[54,986,987],{"href":987,"rel":988},"https:\u002F\u002Fdocs.streamnative.io\u002Fdocs\u002Fbilling-overview",[263],[47,990,991,970,994],{},[968,992,993],{},"WarpStream pricing",[54,995,996],{"href":996,"rel":997},"https:\u002F\u002Fwww.warpstream.com\u002Fpricing#pricingfaqs",[263],[47,999,1000,970,1003],{},[968,1001,1002],{},"AWS S3 pricing",[54,1004,1005],{"href":1005,"rel":1006},"https:\u002F\u002Faws.amazon.com\u002Fs3\u002Fpricing\u002F",[263],[47,1008,1009,970,1012],{},[968,1010,1011],{},"AWS EBS pricing",[54,1013,1014],{"href":1014,"rel":1015},"https:\u002F\u002Faws.amazon.com\u002Febs\u002Fpricing\u002F",[263],[47,1017,1018,970,1021],{},[968,1019,1020],{},"AWS MSK pricing",[54,1022,1023],{"href":1023,"rel":1024},"https:\u002F\u002Faws.amazon.com\u002Fmsk\u002Fpricing\u002F",[263],[47,1026,1027,970,1030],{},[968,1028,1029],{},"The Brutal Truth about Kafka Cost Calculators",[54,1031,850],{"href":850,"rel":1032},[263],[47,1034,1035,970,1038],{},[968,1036,1037],{},"Redpanda vs. 
Confluent: A Performance and TCO Benchmark Report by McKnight Consulting Group",[54,1039,904],{"href":904,"rel":1040},[263],{"title":17,"searchDepth":18,"depth":18,"links":1042},[1043,1044,1045,1050,1054,1055,1064,1067],{"id":331,"depth":18,"text":332},{"id":370,"depth":18,"text":371},{"id":395,"depth":18,"text":396,"children":1046},[1047,1048,1049],{"id":407,"depth":278,"text":408},{"id":432,"depth":278,"text":433},{"id":453,"depth":278,"text":454},{"id":477,"depth":18,"text":478,"children":1051},[1052,1053],{"id":481,"depth":278,"text":482},{"id":496,"depth":278,"text":497},{"id":537,"depth":18,"text":538},{"id":549,"depth":18,"text":550,"children":1056},[1057,1058,1059,1060,1061,1062,1063],{"id":556,"depth":278,"text":557},{"id":602,"depth":278,"text":603},{"id":620,"depth":278,"text":621},{"id":667,"depth":278,"text":668},{"id":695,"depth":278,"text":696},{"id":713,"depth":278,"text":714},{"id":728,"depth":278,"text":729},{"id":773,"depth":18,"text":774,"children":1065},[1066],{"id":813,"depth":278,"text":814},{"id":928,"depth":18,"text":929,"children":1068},[1069,1070],{"id":935,"depth":278,"text":936},{"id":946,"depth":278,"text":947},"StreamNative Cloud","2025-01-31","Discover how Ursa achieves 5GB\u002Fs Kafka workloads at just 5% of the cost of traditional streaming engines like Redpanda and AWS MSK. See our benchmark results comparing infrastructure costs, total cost of ownership (TCO), and performance across leading Kafka vendors.","\u002Fimgs\u002Fblogs\u002F679c6593d25099b1cdcec4ca_image-31.png",{},"\u002Fblog\u002Fhow-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour","30 min",{"title":306,"description":1073},"blog\u002Fhow-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour",[1081,1082,301],"TCO","Apache Kafka","A0o_2xdJiLI6rf6xj4RKsxJNo_A6QN2fYzCp6gaLrFw",[1085,1233,1401,1511,1652],{"id":1086,"title":1087,"authors":1088,"body":1090,"company":1216,"createdAt":10,"customerQuote":10,"date":1217,"description":1218,"extension":8,"featured":292,"image":1219,"industry":1220,"isDraft":292,"link":1203,"logo":10,"meta":1221,"navigation":7,"order":294,"path":1222,"products":1223,"readingTime":1224,"relatedResources":10,"seo":1225,"size":1226,"stem":1227,"tags":1228,"technologies":1230,"useCases":1231,"__hash__":1232},"successStories\u002Fsuccess-stories\u002Fhow-blueshift-powers-ai-driven-customer-engagement-with-apache-pulsar.md","How Blueshift Powers AI-Driven Customer Engagement with Apache Pulsar",[1089],"Meraj Bhawani",{"type":14,"value":1091,"toc":1209},[1092,1096,1099,1105,1108,1112,1115,1136,1139,1143,1146,1165,1169,1172,1186,1190,1193,1196,1199,1206],[39,1093,1095],{"id":1094},"introduction-the-challenge-of-scalable-customer-engagement","Introduction: The Challenge of Scalable Customer Engagement",[47,1097,1098],{},"Blueshift is an AI-driven customer engagement platform that combines customer data, cross-channel marketing, and AI to deliver personalized experiences at every stage of the customer journey. Achieving this vision at scale is no small feat – Blueshift ingests billions of events per day and processes over 50 terabytes of data daily for hundreds of enterprise customers. Its backend consisted of 250+ microservices and dozens of databases, handling both real-time and batch data flows. However, Blueshift’s legacy architecture began to strain under this growth. The system was tightly coupled – hardwired service dependencies meant a slowdown in one microservice could cause failures in multiple other services. 
The legacy stack relied on multiple disparate messaging systems (Kafka for streaming, NSQ for pub\u002Fsub, Sidekiq for job queues) which added complexity and operational overhead. This resulted in occasional cascading failures and reliability issues under extreme load. In short, the architecture lacked one critical quality: “resilience” – the ability to gracefully handle failures, spikes, and unpredictable conditions without collapsing.",[47,1100,1101,1102,1104],{},"Blueshift’s engineering team realized a fundamental overhaul was needed. They envisioned a new architecture with resilience as the core focus. Key requirements included adopting a fully event-driven design with asynchronous processing (to decouple services), introducing customer-level and service-level SLAs (to isolate workloads and avoid noisy neighbors), enabling data stream fan-out (so one event could feed multiple consumers), and seamless auto-scaling to handle traffic spikes. They also sought higher fault tolerance and infrastructure consolidation – replacing the tangle of Kafka\u002FNSQ\u002FSidekiq with a single unified messaging platform. In reimagining the platform, Blueshift “decided to rebuild with ",[968,1103,54],{}," new architecture” centered on these goals.",[47,1106,1107],{},"Enter Apache Pulsar. To meet the above requirements, Blueshift turned to Apache Pulsar as the event-driven backbone of its next-generation system. Pulsar offered the promise of a reliable publish\u002Fsubscribe foundation with the flexibility, scalability, and durability needed to connect hundreds of microservices in real time. The following sections describe why Blueshift chose Pulsar and how it transformed their architecture to achieve massive scale with resilience.",[39,1109,1111],{"id":1110},"why-blueshift-chose-pulsar-as-its-backbone","Why Blueshift Chose Pulsar as Its Backbone",[47,1113,1114],{},"Several factors led Blueshift to select Apache Pulsar as the messaging heart of its platform, replacing the legacy mix of systems:",[337,1116,1117,1120,1123,1126,1133],{},[340,1118,1119],{},"Unified Streaming and Queueing: Pulsar allowed Blueshift to consolidate Kafka, NSQ, and Sidekiq into a single platform, eliminating the operational overhead of managing multiple messaging technologies. With one unified cluster handling pub\u002Fsub, queuing, and streaming, the team reduced infrastructure complexity, training requirements, and costs.",[340,1121,1122],{},"Decoupling for Fault Isolation: Pulsar’s native publish-subscribe model decouples producers and consumers, enabling loose coupling between microservices. Blueshift’s services now communicate through Pulsar topics instead of direct calls, so a slowdown in one component no longer cascades to others. This event-driven architecture provides true fault isolation, vastly improving overall system resiliency. If one service lags, its messages queue in Pulsar without bringing down the entire pipeline.",[340,1124,1125],{},"Multi-Tenancy and Isolation: Apache Pulsar was designed with multi-tenancy (tenants and namespaces) which Blueshift leverages to isolate data streams per customer and service. In the new design, different teams and features operate in separate Pulsar namespaces, and each customer gets dedicated topics for their data. This prevents the “noisy neighbor” problem – one client’s traffic spikes can’t interfere with others – making per-customer SLAs technically feasible. 
Blueshift can guarantee a minimum processing throughput for each customer by segregating workloads at the topic level.",[340,1127,1128,1129,1132],{},"Durable Storage (No Data Loss): Pulsar’s segmented storage (backed by Apache BookKeeper) ensures persistence of events and guards against data loss. Unlike the old system where outages could drop events, Pulsar’s durable log keeps all data until acknowledged by consumers. Blueshift “never need",[968,1130,1131],{},"s"," to worry about message loss” anymore thanks to Pulsar’s highly durable storage architecture – a critical requirement given the volume of valuable customer interaction data being handled.",[340,1134,1135],{},"Scalable Fan-Out: Many of Blueshift’s pipelines require the same event to drive multiple actions (for example, a user activity event might update profiles, trigger a campaign, and index into search). Pulsar supports consumer fan-out, allowing multiple independent subscriptions on the same topic. Blueshift no longer needs to build duplicate data pipelines or topic clones for each new consumer. Each service simply subscribes to the relevant Pulsar topic, and Pulsar efficiently delivers a copy of each message to all subscribers. This drastically simplifies the architecture for cross-cutting data flows and ensures consistent data across services without extra overhead.",[47,1137,1138],{},"In addition to the above, Pulsar brought other out-of-the-box features that Blueshift found valuable, such as broker-side dispatch rate limiting to throttle consumers (useful for protecting downstream systems) and flexible retention policies for different data types (hot data vs. cold data). All these capabilities aligned perfectly with Blueshift’s needs for a multi-tenant, scalable, and robust messaging backbone.",[39,1140,1142],{"id":1141},"architecting-blueshift-with-pulsar-key-improvements","Architecting Blueshift with Pulsar: Key Improvements",[47,1144,1145],{},"Blueshift’s new architecture, built around Apache Pulsar, introduced several powerful patterns and operational improvements that solved the legacy challenges:",[1147,1148,1149,1152,1155,1162],"ol",{},[340,1150,1151],{},"Per-Customer Topics for Isolated SLAs: In the Pulsar model, Blueshift segregated event streams by customer account to eliminate contention. Within each Pulsar namespace (grouped by domain like “user updates” or “campaign events”), Blueshift enabled automatic topic creation so that every new customer gets their own set of topics for their data. For example, under the user-updates namespace, there might be topics named for Customer1, Customer2, Customer3, etc., created on the fly when each customer onboards. This provides strong isolation between customers’ event flows – one client’s surge won’t backlog another’s, since they are on different topics. Blueshift also applies Pulsar’s namespace policies (like per-namespace retention and rate limits) to give each data stream the appropriate SLA. High-priority events (e.g. real-time user clicks) use topics with short retention and aggressive rate limits, while less urgent data (e.g. weekly analytics) can reside in topics with longer retention and moderated throughput. 
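As a rough illustration of the namespace-level policies described above, here is a minimal sketch using Pulsar's Java admin client; the tenant, namespace names, and retention values are hypothetical and not Blueshift's actual configuration (dispatch-rate limits can be applied to the same namespaces in a similar way):

```java
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.RetentionPolicies;

public class NamespacePolicySketch {
    public static void main(String[] args) throws Exception {
        // Placeholder admin endpoint for the sketch.
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080")
                .build();

        // Hot, high-priority events: keep only a short backlog (60 minutes / 512 MB).
        admin.namespaces().setRetention("acme/user-updates",
                new RetentionPolicies(60, 512));

        // Less urgent analytics data: retain for 7 days, up to 50 GB.
        admin.namespaces().setRetention("acme/weekly-analytics",
                new RetentionPolicies(7 * 24 * 60, 50 * 1024));

        admin.close();
    }
}
```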
By organizing topics per customer and use case, Blueshift avoids noisy neighbors and can guarantee minimum service levels for each client’s data – a crucial business requirement as the company scales.",[340,1153,1154],{},"Seamless Scaling and Fault Isolation: In the revamped architecture, services are decoupled via Pulsar, so the system handles component failures gracefully. Suppose a particular microservice (e.g. the event processing service) slows down or goes offline – instead of rippling failures, Pulsar buffers incoming events in a backlog for that service’s topic. Other services continue to function normally, and the platform as a whole stays online (the status page might simply show a delay in one area, rather than a full outage). Once the affected service is restored, it can replay the queued messages and catch up. Blueshift configured auto-scaling for such scenarios: when a backlog is building up, additional consumer instances automatically spin up to drain the queue faster. Pulsar distributes the accumulated messages to the newly scaled-out consumers, and the backlog drops to normal levels without manual intervention. This elastic scaling ensures that recovery is quick and throughput can surge to meet demand, all while isolating the incident to the troubled service. The new Pulsar-driven design thus contains failures and spikes to single subsystems – a stark contrast to the old architecture where one slow database could drag everything down. Small outages that previously caused widespread downtime are now non-events, handled transparently by Pulsar’s backpressure and buffering capabilities.",[340,1156,1157,1158,1161],{},"Message Replay for Easy Recovery: Pulsar’s persistent storage and cursor management give Blueshift the ability to reprocess events on demand – a feature that has vastly improved operability. If a downstream system experiences a transient issue or data needs to be reloaded, the team can simply replay messages from Pulsar rather than building custom scripts or asking clients to resend data. For example, if the database that feeds a particular report was temporarily down, Blueshift can instruct the consumer to rewind to an earlier position or use Pulsar’s built-in replay tools to re-deliver recent events once the database recovers. This capability means no data is permanently lost or skipped due to outages. The team highlighted that they can “easily go and replay messages right from Pulsar and ",[968,1159,1160],{},"don’t"," have to involve the consumer at all” to backfill missing data. Similarly, for use cases that benefit from periodic reprocessing (say, rebuilding a machine learning feature store), they can consume past events from Pulsar’s log without impacting live ingestion. Pulsar’s replay and infinite retention options act as a safety net, making recovery and maintenance tasks far less painful than in the past.",[340,1163,1164],{},"Zero-Downtime Elasticsearch Maintenance: One striking example of Pulsar’s impact is how Blueshift revamped its Elasticsearch indexing pipeline. Blueshift’s platform relies on Elasticsearch for powering user profile search and segmentation, with hundreds of indices across multiple clusters ingesting billions of documents (user data, events, etc.). In the past, intensive maintenance tasks such as reindexing or shard reconfiguration risked performance degradation or required downtime. By integrating Apache Pulsar into its architecture, Blueshift introduced a new approach that decouples live indexing from maintenance workflows. 
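To make the replay idea concrete, the sketch below rewinds a subscription's cursor with the Pulsar Java client; the broker URL, topic, and subscription names are made up for illustration:

```java
import java.util.concurrent.TimeUnit;
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.PulsarClient;

public class ReplaySketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder broker URL
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                .topic("persistent://acme/user-updates/customer-1") // hypothetical topic
                .subscriptionName("report-builder")
                .subscribe();

        // Rewind the cursor to six hours ago and re-consume from there, e.g. after
        // a downstream database outage, without asking producers to resend anything.
        long sixHoursAgo = System.currentTimeMillis() - TimeUnit.HOURS.toMillis(6);
        consumer.seek(sixHoursAgo);

        consumer.close();
        client.close();
    }
}
```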
Pulsar’s durable, replayable message streams and native subscription fan-out model allow multiple consumers to independently process the same data streams, enabling Blueshift to run parallel maintenance or migration operations without affecting production indexing. This design ensures continuity of data during long-running background tasks, allowing index updates, rebalances, or optimizations to complete seamlessly with no service interruption or performance degradation. As a result, complex Elasticsearch operations that once required downtime can now be executed transparently with far greater operational agility.",[39,1166,1168],{"id":1167},"benefits-and-business-impact","Benefits and Business Impact",[47,1170,1171],{},"By rebuilding its data infrastructure around Apache Pulsar, Blueshift realized significant technical and business benefits:",[337,1173,1174,1177,1180,1183],{},[340,1175,1176],{},"Reduced Complexity and Cost: Simplifying from three messaging systems to one Pulsar-based platform immediately lowered Blueshift’s operational complexity and expenses. The team no longer maintains separate Kafka, NSQ, and Sidekiq clusters – a consolidated Pulsar cluster handles all streaming, queueing, and pub\u002Fsub needs. This infrastructure consolidation cuts down on maintenance effort, infrastructure footprint, and training, allowing engineers to focus on innovation rather than babysitting multiple systems.",[340,1178,1179],{},"Higher Reliability and Resilience: The Pulsar-driven event architecture has virtually eliminated cascading failures that previously caused platform outages. Services are insulated by Pulsar topics, so an issue in one area results at most in a backlog and localized delay, not a platform-wide crash. Blueshift’s platform now stays operational through hiccups like machine failures or sudden traffic spikes – precisely the resilient behavior the team set out to achieve. This improved reliability translates into better uptime and trust for Blueshift’s customers, as the system can handle unexpected disturbances “within some acceptable degradation” rather than going down.",[340,1181,1182],{},"Guaranteed Customer SLAs: Thanks to Pulsar, Blueshift can confidently offer per-customer performance guarantees. Each client’s data streams are isolated in their own set of topics, protected by Pulsar’s tenant and namespace isolation. One customer uploading millions of records will not slow down another customer’s processing. This not only avoids awkward conversations (“we’re slow because another customer overloaded the system”), but it also ensures consistent, predictable service for all clients big and small. In terms of business impact, this isolation is a competitive advantage – Blueshift can handle large enterprise workloads without letting any single tenant degrade overall platform performance.",[340,1184,1185],{},"Streamlined Operations and Recovery: Pulsar’s rich feature set has made day-to-day data operations much easier. The ability to replay data from Pulsar means the team can recover from errors or backfill data at any time, without special tooling. Complex maintenance tasks, like the Elasticsearch reindexing, are now done with zero downtime using Pulsar to keep systems in sync. Moreover, scaling up throughput is as simple as adding more consumer instances to a topic – Pulsar handles load balancing – which gives Blueshift headroom to grow on demand. 
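The fan-out that makes this possible is simply two (or more) named subscriptions on the same topic, each receiving its own copy of every message. A minimal sketch with assumed names, not Blueshift's actual topics:

```java
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.PulsarClient;

public class FanOutSketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder broker URL
                .build();

        // Subscription 1: the live indexer keeps serving production traffic.
        Consumer<byte[]> liveIndexer = client.newConsumer()
                .topic("persistent://acme/search/profile-events") // hypothetical topic
                .subscriptionName("es-live-indexer")
                .subscribe();

        // Subscription 2: an independent consumer builds the new index in the
        // background. Both subscriptions get every message; neither blocks the other.
        Consumer<byte[]> migrationIndexer = client.newConsumer()
                .topic("persistent://acme/search/profile-events")
                .subscriptionName("es-reindex")
                .subscribe();

        // ... each subscription is drained by its own service ...
        migrationIndexer.close();
        liveIndexer.close();
        client.close();
    }
}
```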
These improvements translate to less firefighting for the engineering team and more agility in rolling out new features or handling traffic peaks. Overall, Apache Pulsar has become a force multiplier for Blueshift’s developers and SREs, reducing risk and toil while improving service quality.",[39,1187,1189],{"id":1188},"conclusion-pulsar-as-the-foundation-for-a-resilient-data-platform","Conclusion: Pulsar as the Foundation for a Resilient Data Platform",[47,1191,1192],{},"Blueshift’s journey illustrates how a robust messaging backbone can unlock the full potential of a data-intensive platform. By adopting Apache Pulsar, Blueshift transformed a fragile legacy system into a scalable, event-driven architecture that powers real-time customer engagement on a global scale. Pulsar now serves as the “central nervous system” of Blueshift’s platform, connecting hundreds of microservices and data pipelines in a decoupled, reliable manner. Features like persistent storage, multi-tenancy, and flexible subscriptions enabled Blueshift to achieve a level of flexibility and resilience that would have been impractical otherwise. With Pulsar ensuring no data is lost and no service is overwhelmed, Blueshift’s team can innovate faster and deliver new capabilities knowing the backbone will scale and recover gracefully. The end result is a win-win: developers spend less time on plumbing and more on product, while customers experience a highly reliable, real-time personalization service even as data volumes explode.",[47,1194,1195],{},"Blueshift’s next-gen infrastructure, built on Pulsar, is a compelling blueprint for any organization facing growth challenges with a legacy architecture. It demonstrates that modern event streaming technology can replace brittle, monolithic designs with cloud-native resiliency – enabling mission-critical systems to meet strict SLAs and adapt to change with ease. 
As Blueshift continues to expand its AI-driven customer engagement platform, Apache Pulsar remains the backbone that ensures every message reaches its destination and every customer action is processed promptly, come what may.",[47,1197,1198],{},"To learn more about Blueshift’s Pulsar journey and architectural insights, watch the full talk from Data Streaming Summit 2025 on YouTube.",[47,1200,1201],{},[54,1202,1205],{"href":1203,"rel":1204},"https:\u002F\u002Fyoutu.be\u002F4ESnTHc2wR8",[263],"Next-Gen Data Infra: Building Resilient, Scalable Architecture with Apache Pulsar",[47,1207,1208],{},"‍",{"title":17,"searchDepth":18,"depth":18,"links":1210},[1211,1212,1213,1214,1215],{"id":1094,"depth":18,"text":1095},{"id":1110,"depth":18,"text":1111},{"id":1141,"depth":18,"text":1142},{"id":1167,"depth":18,"text":1168},{"id":1188,"depth":18,"text":1189},"Blueshift","2025-11-19","Discover how Blueshift, an AI-driven customer engagement platform, transformed its legacy architecture by adopting Apache Pulsar for unified, resilient, and scalable real-time data streaming and guaranteed customer SLAs.","\u002Fimgs\u002Fsuccess-stories\u002F691dc0e070a064ef19f490d2_Blueshift.png","MarTech",{},"\u002Fsuccess-stories\u002Fhow-blueshift-powers-ai-driven-customer-engagement-with-apache-pulsar",[],"8 min",{"title":1087,"description":1218},"50-200 employees","success-stories\u002Fhow-blueshift-powers-ai-driven-customer-engagement-with-apache-pulsar",[1229,1082],"Apache Pulsar",[1229,1082],"AI-driven customer engagement platform","aGVFOuHcHzCtt5MYZyTHeYcy-m8zSdzAFUajZGOZfVY",{"id":1234,"title":1235,"authors":1236,"body":1238,"company":1265,"createdAt":10,"customerQuote":10,"date":1386,"description":1387,"extension":8,"featured":7,"image":1388,"industry":1389,"isDraft":292,"link":1390,"logo":10,"meta":1391,"navigation":7,"order":294,"path":1390,"products":1392,"readingTime":1393,"relatedResources":10,"seo":1394,"size":1395,"stem":1396,"tags":1397,"technologies":1398,"useCases":1399,"__hash__":1400},"successStories\u002Fsuccess-stories\u002Funify-achieves-real-time-go-to-market-scale-with-apache-pulsar-and-streamnative-cloud.md","Unify Achieves Real-Time Go-To-Market Scale with Apache Pulsar and StreamNative Cloud",[1237],"Sam Waterbury",{"type":14,"value":1239,"toc":1378},[1240,1244,1255,1259,1273,1276,1280,1283,1300,1304,1307,1324,1328,1331,1351,1355,1358,1375],[39,1241,1243],{"id":1242},"executive-summary","‍Executive Summary",[337,1245,1246,1249,1252],{},[340,1247,1248],{},"Real-Time Pipeline at Scale: Unify’s AI-driven go-to-market platform ingests tens of millions of events per day in real time, enabling instantaneous lead scoring and workflow triggers that turn growth into a science.",[340,1250,1251],{},"Simplified, Reliable Architecture: By replacing batch jobs and a legacy queue system (AWS SQS) with Apache Pulsar, Unify consolidated multiple use cases (event streaming, pub-sub, and scheduling) onto one platform. This simplification improved scalability and eliminated costly cron jobs, all while providing message retention for replays and 20× faster monitoring metrics (30 seconds vs. ~10 minutes).",[340,1253,1254],{},"High Leverage & Resilience: With StreamNative Cloud, Unify achieved a stable, hands-off infrastructure that easily auto-scales to demand. 
The asynchronous architecture ensured zero downtime even during major cloud outages, and features like delayed messages and replayable logs give the engineering team peace of mind when deploying new AI-driven features.",[39,1256,1258],{"id":1257},"customer-overview","Customer Overview",[47,1260,1261,1266,1267,1272],{},[54,1262,1265],{"href":1263,"rel":1264},"https:\u002F\u002Fwww.unifygtm.com\u002F",[263],"Unify"," is a San Francisco-based, AI-native go-to-market (GTM) platform that helps revenue teams capture intent signals and engage prospects in real time. Backed by leading investors – including the OpenAI Startup Fund – Unify has rapidly grown, ",[54,1268,1271],{"href":1269,"rel":1270},"https:\u002F\u002Fwww.unifygtm.com\u002Fblog\u002Fseries-b",[263],"raising a $40 million Series B in July 2025",". The company’s platform combines data integration, AI agents, and automated outreach into one system of action. From website clicks to CRM updates, Unify tracks all buyer interactions and uses AI to personalize outreach, aiming to transform growth into repeatable, scalable science.",[47,1274,1275],{},"From day one, Unify’s founding team set out to build the product on a real-time, event-driven architecture. The vision was to avoid traditional batch processing or daily jobs in favor of continuous streams of data. This approach would allow the young startup to scale more easily and deliver instant insights to sales teams. However, achieving high-throughput, real-time data flow with a lean team presented early technical challenges.",[39,1277,1279],{"id":1278},"challenges","Challenges",[47,1281,1282],{},"Unify’s initial infrastructure relied on AWS SQS for message queuing, but as the platform’s ambitions grew, this approach revealed significant limitations:",[337,1284,1285,1288,1291,1294,1297],{},[340,1286,1287],{},"Scaling Batch Jobs: Early on, Unify realized that a pipeline built on cron-triggered batch jobs would not scale gracefully. \"When you design an architecture built on jobs, the jobs break at every order of magnitude,\" explains Sam Waterbury, Founding ML Engineer at Unify. \"It’s not simple to horizontally scale batch jobs\". The team needed an event-driven model where adding consumers could handle load increases seamlessly, without reengineering pipelines at each growth step.",[340,1289,1290],{},"Pub\u002FSub and Fan-Out: Unify’s product demanded that a single inbound event (e.g. a website visit) trigger multiple downstream actions in different services (lead scoring, AI research agent, analytics, etc.). SQS, however, is a point-to-point queue system. Implementing a publish-subscribe pattern with multiple subscribers on SQS was clunky and limited. The team sought a true pub\u002Fsub messaging system to easily fan out events to many services in parallel.",[340,1292,1293],{},"No Replay or Retention: Because Unify’s AI algorithms and workflows continuously evolve, the engineers wanted the ability to reprocess past events in case of a bug fix or a new feature. SQS does not retain messages once consumed, making it impossible to \"replay\" historical events. This lack of message retention introduced risk — a glitch in a consumer could permanently lose valuable event data with no way to recover. Unify needed a durable log of events to enable replays and backfills for peace of mind.",[340,1295,1296],{},"Operational Visibility: Monitoring and scaling on SQS proved too reactive. 
Metrics on queue length and throughput often lagged by up to 10 minutes in cloud dashboards, slowing down auto-scaling responses. In a dynamic environment with spiky web traffic, Unify required faster insight into backlog growth to scale consumer instances promptly and avoid delays.",[340,1298,1299],{},"Lean Team & Maintenance Burden: As a small, fast-growing startup, Unify was wary of managing complex infrastructure. They evaluated Apache Pulsar (the cloud-native distributed messaging platform originally created at Yahoo), which offered the features they needed. However, self-hosting a Pulsar cluster would demand significant DevOps effort. \"I spent about a day trying to self-host Pulsar,\" Sam admits, \"and I quickly realized I didn’t want to be in the business of managing that.\" The team needed a managed solution that would free them to focus on product development.",[39,1301,1303],{"id":1302},"solution","Solution",[47,1305,1306],{},"After weighing their options, Unify migrated from SQS to StreamNative Cloud - a fully managed Apache Pulsar service – providing a real-time streaming backbone without the ops overhead. This switch addressed the startup’s challenges head-on:",[337,1308,1309,1312,1315,1318,1321],{},[340,1310,1311],{},"True Multi-Subscriber Streaming: With Pulsar, Unify can publish an event once and have multiple services subscribe to it in parallel. For example, when a prospect visits a website, that single event is written to a Pulsar topic and picked up simultaneously by separate consumer services: one updates the person’s intent score, another triggers an AI research agent to qualify the lead, and yet another logs the event for analytics. Pulsar’s built-in pub\u002Fsub capability elegantly handles this fan-out, something that was manual and error-prone with SQS.",[340,1313,1314],{},"Built-in Durability and Replay: StreamNative Cloud retains all events even after consumers process them, allowing Unify to replay data if needed. If the team discovers a bug in an AI scoring algorithm, they can fix the code and reconsume the past events from Pulsar’s log to correct the scores. This replay feature provides a safety net that was absent before. \"We wanted the ability to retain messages after processing so we could replay them if something was wrong,\" Sam says. \"Pulsar gave us that peace of mind.\"",[340,1316,1317],{},"Delayed Message Scheduling: Unify took advantage of an innovative Pulsar feature – delayed message delivery – to eliminate batch jobs entirely in one core use case. The platform’s intent scoring system needs to decay a lead’s score over time if no new events occur. Instead of running a scheduled job every few hours, Unify’s scoring service simply sends a message to a Pulsar topic addressed to itself with a 24-hour delay. If no new activity happens, Pulsar delivers that message back to the service the next day, prompting it to lower the score and schedule another delayed message. This cycle continues for 30 days until the score naturally falls to zero. If a new event comes in, the delayed message is canceled and the cycle restarts. By using Pulsar as a built-in scheduler, Unify eliminated the need for external cron jobs or workflow schedulers for this task. 
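A minimal sketch of that self-scheduling pattern with the Pulsar Java client's delayed delivery is shown below; the topic name and payload are hypothetical, and delayed delivery takes effect on shared-type subscriptions:

```java
import java.util.concurrent.TimeUnit;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;

public class ScoreDecaySketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder broker URL
                .build();

        Producer<String> producer = client.newProducer(Schema.STRING)
                .topic("persistent://acme/scoring/decay-checks") // hypothetical topic
                .create();

        // Schedule a "decay check" for this lead 24 hours from now. If fresh activity
        // arrives in the meantime, the consumer can simply skip the stale check.
        producer.newMessage()
                .key("lead-42")
                .value("{\"leadId\":\"lead-42\",\"action\":\"decay\"}")
                .deliverAfter(24, TimeUnit.HOURS)
                .send();

        producer.close();
        client.close();
    }
}
```

On receipt, the consuming service can republish another delayed check, which yields the 30-day decay loop without any external scheduler.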
The result is an elegant, fully event-driven solution for a problem that traditionally required complex batch processing.",[340,1319,1320],{},"SaaS Service & Ease of Scale: Adopting StreamNative Cloud meant Unify’s small engineering team didn’t have to worry about cluster operations, updates, or scaling the messaging layer. Pulsar’s elastic scalability and StreamNative’s 24\u002F7 management ensure that as Unify’s data volumes grow, the messaging infrastructure seamlessly scales with it. The team can easily add consumer instances in new cloud regions or for new customers without re-architecting their pipeline. Furthermore, real-time metrics from Apache Pulsar (with roughly 30-second latency) feed into Unify’s Datadog dashboards, allowing them to auto-scale consumer pods within moments of a queue build-up — a drastic improvement over the laggy SQS monitoring.",[340,1322,1323],{},"Wide Feature Set for Future Needs: Pulsar’s rich feature set (beyond pub-sub and delays, it offers geo-replication, schema registry, tiered storage, and more) gives Unify confidence that the platform can adapt to new requirements. The team is already planning to leverage key-shared subscription to route all events for a given tenant to the same consumer instance. This will improve throughput by enabling better batching and reducing database record contention. They also have topic compaction on their roadmap, which will allow them to retain a long history of events but quickly reload only the latest state per key when reprocessing (ideal for rebuilding materialized views or caches). Knowing these capabilities are readily available in Pulsar means Unify can unlock new use cases without introducing additional systems.",[39,1325,1327],{"id":1326},"results","Results",[47,1329,1330],{},"By implementing StreamNative Cloud, Unify realized a robust, real-time data pipeline that underpins their AI-powered product. Key outcomes include:",[337,1332,1333,1336,1339,1342,1345,1348],{},[340,1334,1335],{},"Real-Time Customer Interactions: Unify’s platform now reacts to prospect behaviors in seconds rather than hours. Website clicks, product usage events, and CRM updates flow through Pulsar and immediately trigger personalized actions (like an AI agent researching the lead or a tailored email sequence). This real-time responsiveness has enhanced lead engagement and helped generate more pipeline for Unify’s clients. Sales reps can reach out at the perfect time with relevant context, significantly improving conversion rates.",[340,1337,1338],{},"Massive Scale & Performance: The Pulsar-based architecture easily handles tens of millions of events per day and scales horizontally. As Unify’s customer base grew and data volume surged, the event pipeline did not become a bottleneck. The team avoided the typical pains of batch-job systems that often fail or require re-engineering at high scale. \"We’ve built a real-time system that scales without breaking a sweat, thanks to Pulsar’s consumers and queues,\" Sam notes. The switch to Pulsar removed the strain that SQS and batch processes were starting to show, ensuring Unify can continue to onboard large enterprise customers and higher event loads.",[340,1340,1341],{},"Increased Engineering Leverage: Unify’s engineers gained a \"high leverage\" infrastructure where one technology addresses many needs. Instead of stitching together separate tools for queueing, pub-sub, scheduling, and reprocessing, Pulsar provided a unified platform. This simplification accelerates development of new features. 
The team spends less time on plumbing and more on product innovation (like refining their AI models and workflows). It aligns perfectly with Unify’s philosophy of doing more with fewer tools. As Sam puts it, \"We prefer technologies that give us a lot of leverage. Pulsar’s wide feature set means one tool covers many use cases for us.\"",[340,1343,1344],{},"Operational Resilience: The shift to an asynchronous, Pulsar-centric architecture has made Unify’s platform inherently more resilient to failures. During a recent internet outage — when services like Cloudflare and portions of Google Cloud went down — Unify’s system stayed online and continued processing events. Thanks to Pulsar’s durable storage, incoming data simply queued up during downstream outages and automatically caught up once connections were restored. End users experienced slightly delayed updates, but no data was lost and the application did not crash. This kind of graceful degradation and quick recovery is a major competitive advantage, ensuring Unify delivers a reliable experience. Moreover, with near-real-time visibility into event backlogs and consumer lag, the team can proactively manage throughput and avoid incidents before they escalate.",[340,1346,1347],{},"Faster Scaling and Cost Efficiency: With Pulsar’s metrics available almost instantly, Unify configured its infrastructure to auto-scale consumer services within seconds of surging load. For example, if a viral event drives a sudden spike in website traffic, Pulsar’s topic metrics in StreamNative Cloud will reflect the growing backlog almost immediately. Kubernetes can then spin up additional consumers to drain the queue, and scale back down when the burst subsides. This responsiveness not only maintains performance but also optimizes costs by right-sizing resources in real time (no need to over-provision for worst-case traffic). In contrast, the old SQS setup might lag so much that scaling actions came minutes too late, or not at all, leading to either delays or wasted compute.",[340,1349,1350],{},"Peace of Mind for Developers: Perhaps most importantly, Pulsar’s reliability and tooling have given Unify’s developers confidence in their pipeline. They know that if a bug slips through, they can rewind and reprocess data. And they trust that StreamNative Cloud will keep the system stable. \"With StreamNative managing Apache Pulsar, it’s been a very stable, hands-off experience — it fits our low-risk engineering approach,\" says Sam Waterbury. \"The ability to replay messages after they’ve been processed gives us a lot of peace of mind when we ship new code.\" This confidence lets the team move faster and innovate, without fearing that a small mistake will permanently drop data or take down the system.",[39,1352,1354],{"id":1353},"use-cases","Use Cases",[47,1356,1357],{},"Unify’s deployment of Apache Pulsar has unlocked a range of impactful use cases in their go-to-market platform:",[337,1359,1360,1363,1366,1369,1372],{},[340,1361,1362],{},"High-Throughput Data Ingestion: Pulsar serves as the central event hub for all of Unify’s data streams. Web tracking events (via JavaScript tags on clients’ sites) and CRM updates (from systems like Salesforce and HubSpot) are continuously published into Pulsar. On the busiest days, Unify processes millions of events with ease. 
Pulsar’s distributed log handles this firehose of data while ensuring each event is delivered to the appropriate services for processing.",[340,1364,1365],{},"Real-Time Intent Scoring (with Decay): Every inbound signal – a page view, a form fill, an email click – updates Unify’s proprietary intent score for that account or lead. Those updates happen instantly via Pulsar messages. If activity ceases, Unify uses delayed events in Pulsar to gradually decrease the score over time. For example, when a lead’s score is updated, the system schedules a follow-up event in Pulsar for 24 hours later. If the lead generates no new interactions by then, the delayed message triggers and lowers the score, then schedules the next decrement. This innovative use of Pulsar’s delay feature allows scores to naturally decay over 30 days of inactivity – without any nightly batch jobs. The result is a continuously up-to-date view of customer intent, powered entirely by streaming events.",[340,1367,1368],{},"AI-Driven Workflows: Unify’s \"Plays\" – multi-step, AI-driven sales workflows – are fueled by Pulsar events. A simple example is an automated workflow that engages a new website visitor: as soon as a visit event hits Pulsar, it triggers an AI research agent to qualify the company and fetch relevant info. Next, another service might draft a personalized outreach email. Because all these services subscribe to the event stream, the workflow unfolds in real time. Additionally, Pulsar’s speed ensures that if a prospect takes a key action (for instance, replies to an email or requests a demo), that event will instantly halt any redundant follow-ups. This prevents embarrassing overlaps (like sales contacting a lead who already responded) and lets Unify hand off hot leads to human reps at just the right moment. The end-to-end process is event-driven, delivering timely and context-aware responses that would be impossible to coordinate in a slower, batch-based system.",[340,1370,1371],{},"Resilient Asynchronous Processing: Unify’s use of Pulsar has made their overall architecture more fault-tolerant. Because producers and consumers are decoupled by the Pulsar queue, a failure in one component doesn’t cascade. For example, if the CRM integration service goes down temporarily, incoming CRM events are safely buffered in Pulsar until the service is restored – with no data loss. This buffering behavior proved critical during external outages; Pulsar acted as a shock absorber that kept Unify’s internal systems from being overwhelmed or dropping data. Once the downstream systems recovered, they simply caught up on the backlog. Unify’s customers benefited from consistent uptime, and the engineering team could resolve issues without firefighting live traffic in the moment.",[340,1373,1374],{},"Future Expansion with Advanced Features: As Unify continues to grow, it plans to harness more advanced Pulsar capabilities. One upcoming use is key-based ordering: partitioning topics such that all messages for a given account or tenant route to the same consumer instance. This will enable more efficient processing (e.g. aggregating events per account sequentially) and reduce database locking contention across tenants. Another planned feature is topic compaction for select event streams. In scenarios like rebuilding a lead database from event logs, compaction will allow Unify to retain a long history but quickly load only the latest state for each lead – vastly speeding up replays or backfills. 
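Key-based ordering of this kind is usually expressed with a Key_Shared subscription: producers set a key (here a tenant ID), and Pulsar delivers all messages with the same key to the same consumer instance. A sketch under assumed topic and tenant names:

```java
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;
import org.apache.pulsar.client.api.SubscriptionType;

public class KeySharedSketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder broker URL
                .build();

        // The producer tags every event with the tenant it belongs to.
        Producer<String> producer = client.newProducer(Schema.STRING)
                .topic("persistent://acme/gtm/tracking-events") // hypothetical topic
                .create();
        producer.newMessage()
                .key("tenant-123")
                .value("{\"type\":\"page_view\",\"path\":\"/pricing\"}")
                .send();

        // Each consumer in a Key_Shared subscription owns a subset of keys, so one
        // instance processes everything for a given tenant, in order.
        Consumer<String> consumer = client.newConsumer(Schema.STRING)
                .topic("persistent://acme/gtm/tracking-events")
                .subscriptionName("event-processors")
                .subscriptionType(SubscriptionType.Key_Shared)
                .subscribe();

        producer.close();
        consumer.close();
        client.close();
    }
}
```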
These features are available out-of-the-box in Pulsar, meaning Unify can adopt them without introducing new systems. The flexibility of Pulsar ensures that Unify’s messaging infrastructure will continue to meet new demands on their path to transforming go-to-market operations.",[47,1376,1377],{},"By leveraging Apache Pulsar’s rich ecosystem, Unify has built a future-proof, real-time foundation for its platform. What started as a need for a better queue became a strategic advantage: an event-driven system that delivers the right information to the right service (or salesperson) at exactly the right time. Unify’s engineering choices have enabled the company to provide its customers – fast-growing sales teams – with up-to-the-second insights and automations, all on the back of a scalable, reliable streaming core.",{"title":17,"searchDepth":18,"depth":18,"links":1379},[1380,1381,1382,1383,1384,1385],{"id":1242,"depth":18,"text":1243},{"id":1257,"depth":18,"text":1258},{"id":1278,"depth":18,"text":1279},{"id":1302,"depth":18,"text":1303},{"id":1326,"depth":18,"text":1327},{"id":1353,"depth":18,"text":1354},"2025-08-21","Discover how Unify, an AI-native GTM platform backed by the OpenAI Startup Fund, scaled to process tens of millions of real-time events daily with StreamNative Cloud and Apache Pulsar—achieving faster monitoring, seamless scheduling, and resilient scalability.","\u002Fimgs\u002Fsuccess-stories\u002F68a6f862fb6f4c46942d161f_Unity-case-study.png","AI & Machine Learning","\u002Fsuccess-stories\u002Funify-achieves-real-time-go-to-market-scale-with-apache-pulsar-and-streamnative-cloud",{},[1071],"6 min",{"title":1235,"description":1387},"11-50 employees","success-stories\u002Funify-achieves-real-time-go-to-market-scale-with-apache-pulsar-and-streamnative-cloud",[1229,1071],[1229],"Real-time go-to-market pipeline replacing batch jobs","YUFbDz3tpl9vvVZafhDEhAnj4F78wMggaDFqKjTYZ4Y",{"id":1402,"title":1403,"authors":1404,"body":1407,"company":1419,"createdAt":10,"customerQuote":1492,"date":1496,"description":1497,"extension":8,"featured":7,"image":1498,"industry":1389,"isDraft":292,"link":1499,"logo":10,"meta":1500,"navigation":7,"order":1501,"path":1502,"products":1503,"readingTime":1224,"relatedResources":10,"seo":1505,"size":10,"stem":1506,"tags":1507,"technologies":1508,"useCases":1509,"__hash__":1510},"successStories\u002Fsuccess-stories\u002Fhow-q6-cyber-tamed-85-billion-cyberthreat-records-with-apache-pulsar-streamnative-new.md","How Q6 Cyber Tamed 85+ Billion Cyberthreat Records with Apache Pulsar & StreamNative",[1405,1406],"Jeff Bolle","Daniel Shaver",{"type":14,"value":1408,"toc":1483},[1409,1413,1421,1425,1428,1431,1434,1436,1439,1442,1446,1449,1452,1454,1457,1460,1471,1475,1478,1480],[39,1410,1412],{"id":1411},"background","Background",[47,1414,1415,1420],{},[54,1416,1419],{"href":1417,"rel":1418},"https:\u002F\u002Fwww.q6cyber.com\u002F",[263],"Q6 Cyber"," is a leading provider of actionable threat intelligence, focusing specifically on financial fraud prevention. Founded in 2016, the company delivers intelligence that helps financial institutions identify and mitigate cyber threats before they result in financial losses. 
Unlike traditional threat intelligence companies that focus on general cybersecurity threats, Q6 Cyber collects data directly from the cybercriminal ecosystem—compromised credentials, malware command-and-control servers, and dark web forums—and processes this data into actionable, easy-to-consume intelligence that their clients can immediately utilize to prevent fraud.",[39,1422,1424],{"id":1423},"challenge","Challenge",[47,1426,1427],{},"As Q6 Cyber's intelligence operations expanded to collect and process billions of threat intelligence records, they faced significant scaling challenges with their data infrastructure.",[47,1429,1430],{},"The company’s Google Cloud Pub\u002FSub implementation struggled with throughput during high-volume data ingestion periods. With over 85 billion records collected, their primary OpenSearch cluster had become unwieldy to manage and was hitting performance limitations. Q6 Cyber needed a more flexible solution that could handle unpredictable volume spikes while supporting their plans to transition to a data lake architecture.",[47,1432,1433],{},"Additionally, the sensitive nature of their data required complete control over their infrastructure, making a Bring Your Own Cloud (BYOC) solution essential.",[39,1435,1303],{"id":1302},[47,1437,1438],{},"Q6 Cyber implemented StreamNative's platform with Apache Pulsar at its core to serve as the central nervous system of their data processing architecture. The solution provided high-performance messaging with better throughput and reliability than their previous implementation, while the BYOC option allowed them to maintain complete control over their sensitive data.",[47,1440,1441],{},"Pulsar's native schema management simplified data processing across diverse formats and sources, while Pulsar Functions enabled them to deploy lightweight processing logic directly within the messaging layer. Most importantly, StreamNative positioned Q6 Cyber to execute their planned migration of 85 billion records to a data lake architecture, providing the reliable transport layer needed for this massive undertaking.",[39,1443,1445],{"id":1444},"technical-journey","Technical Journey",[47,1447,1448],{},"Q6 Cyber initially adopted open-source Apache Pulsar and later transitioned to StreamNative's managed platform with the BYOC option. They began by introducing Pulsar alongside Google Pub\u002FSub for specific high-throughput use cases, allowing them to validate the technology without disrupting existing workflows.",[47,1450,1451],{},"As confidence in the platform grew, they shifted more of their data flows to Pulsar, particularly for new applications. They leveraged Pulsar's unified messaging model to simplify their architecture, transitioning from disparate systems to a more centralized approach with Pulsar at the core. With this foundation in place, Q6 Cyber began designing their data lake migration strategy, with StreamNative serving as the critical transport layer.",[39,1453,1327],{"id":1326},[47,1455,1456],{},"StreamNative's platform delivered more consistent, reliable performance for Q6 Cyber's variable workloads, particularly during high-volume ingestion periods. 
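The "lightweight processing logic directly within the messaging layer" mentioned above refers to Pulsar Functions. As a hedged illustration rather than Q6 Cyber's actual code, a function that normalizes a raw record on its way between topics might look like this (input and output topics are wired up at deploy time):

```java
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

// Illustrative Pulsar Function: receives a raw threat-intel record from an input
// topic, normalizes it, and returns it for delivery to the configured output topic.
public class NormalizeRecordFunction implements Function<String, String> {
    @Override
    public String process(String rawRecord, Context context) {
        // Placeholder normalization; a real function would parse and validate fields.
        return rawRecord.trim().toLowerCase();
    }
}
```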
The architectural flexibility allows them to easily route data to different systems as needed, with Pulsar serving as the center of their data architecture.",[47,1458,1459],{},"Additional benefits include:",[1147,1461,1462,1465,1468],{},[340,1463,1464],{},"While Q6 Cyber's initial vision for using Pulsar focused on long-term tiered storage, the flexibility of the platform allowed them to adapt their architecture as their strategy evolved.",[340,1466,1467],{},"Having schema validation at the transport layer proved more valuable than initially anticipated, reducing development complexity and improving data quality.",[340,1469,1470],{},"The platform's ability to handle variable workloads proved essential for a threat intelligence provider dealing with unpredictable surges in cybercriminal activity.",[39,1472,1474],{"id":1473},"future-prospects","Future Prospects",[47,1476,1477],{},"Q6 Cyber plans to complete their data lake migration, leveraging StreamNative to move tens of billions of records while maintaining operational continuity. They're also interested in further leveraging Pulsar Functions as improvements are made to support their high-fanout publishing scenarios, where a single input can generate numerous outputs. As they continue to scale their threat intelligence capabilities, StreamNative's platform will remain central to their data architecture.",[39,1479,929],{"id":928},[47,1481,1482],{},"StreamNative's platform has positioned Q6 Cyber to execute on their vision of a more scalable, flexible threat intelligence infrastructure. By providing a high-performance, reliable messaging layer with the security and control they require, StreamNative has enabled Q6 to focus on their core mission: delivering actionable threat intelligence to prevent financial fraud. As cyber threats continue to evolve, Q6 Cyber now has the infrastructure flexibility needed to adapt quickly, ensuring they can continue to protect financial institutions and their customers.",{"title":17,"searchDepth":18,"depth":18,"links":1484},[1485,1486,1487,1488,1489,1490,1491],{"id":1411,"depth":18,"text":1412},{"id":1423,"depth":18,"text":1424},{"id":1302,"depth":18,"text":1303},{"id":1444,"depth":18,"text":1445},{"id":1326,"depth":18,"text":1327},{"id":1473,"depth":18,"text":1474},{"id":928,"depth":18,"text":929},{"quote":1493,"name":1494,"position":1495},"StreamNative gives us the flexibility to move things where we want. It's not just performance improvements—it's enabling our long-term vision of transforming how we store, access, and deliver threat intelligence.","Jeff Boelle","CTO, Q6 Cyber","2025-04-18","Facing scaling challenges with 85+ billion cyberthreat records, Q6 Cyber overhauled its data infrastructure using Apache Pulsar and StreamNative. The solution delivered high-throughput messaging, seamless schema management, and a BYOC model—enabling their massive data lake migration while combating financial fraud. 
Today, StreamNative powers Q6’s real-time threat intelligence, proving that scalable data flows are critical in the fight against cybercrime.","\u002Fimgs\u002Fsuccess-stories\u002F6801bbdbc84a75db5a7ed2c3_SN-SuccessStories-q6cyber.png","https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=lcwu4KDB18c",{},1,"\u002Fsuccess-stories\u002Fhow-q6-cyber-tamed-85-billion-cyberthreat-records-with-apache-pulsar-streamnative-new",[1071,1504],"BYOC",{"title":1403,"description":1497},"success-stories\u002Fhow-q6-cyber-tamed-85-billion-cyberthreat-records-with-apache-pulsar-streamnative-new",[1229,1071],[1229],"Scalable fraud detection processing 85B+ threat records","bwPx36IgebGzj0IooZDuH0gmw8faM_5gB7P0487q8Wc",{"id":1512,"title":1513,"authors":1514,"body":1516,"company":1540,"createdAt":10,"customerQuote":1634,"date":1637,"description":1638,"extension":8,"featured":292,"image":1639,"industry":1389,"isDraft":292,"link":1640,"logo":10,"meta":1641,"navigation":7,"order":1501,"path":1642,"products":1643,"readingTime":1645,"relatedResources":10,"seo":1646,"size":1395,"stem":1647,"tags":1648,"technologies":1649,"useCases":1650,"__hash__":1651},"successStories\u002Fsuccess-stories\u002Fsafari-ai-cuts-cloud-costs-by-50-while-scaling-real-time-computer-vision-analytics-with-streamnative.md","Safari AI Cuts Cloud Costs by 50% While Scaling Real-Time Computer Vision Analytics with StreamNative",[1515],"Kaiwen Yuan",{"type":14,"value":1517,"toc":1624},[1518,1526,1532,1534,1542,1544,1547,1564,1566,1569,1572,1575,1589,1592,1595,1598,1602,1613,1615,1618,1621],[47,1519,1520,1521,1525],{},"Featuring: ",[54,1522,1515],{"href":1523,"rel":1524},"https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fkaiwen-yuan-12372768\u002F",[263],", Co-Founder, Head of Engineering at Safari AI",[1527,1528,1529],"blockquote",{},[47,1530,1531],{},"\"StreamNative's resilience is critical to our SaaS operations. Each quarter, we evaluate Pulsar, StreamNative, and other solutions, and StreamNative consistently emerges as the best choice—cutting our costs by more than 50% while seamlessly supporting our ML data structure requirements. We're also collaborating closely with the StreamNative team to adopt innovations like Ursa Engine, Iceberg integration, and more to sustain our growth and profitability\" - Kaiwen Yuan, Co-Founder, Head of Engineering at Safari AI",[39,1533,1412],{"id":1411},[47,1535,1536,1541],{},[54,1537,1540],{"href":1538,"rel":1539},"https:\u002F\u002Fgetsafari.ai\u002F",[263],"Safari AI"," helps enterprises understand their physical operations by digitizing real-world activities. Using a customer’s existing camera infrastructure, the company provides automated measurements of critical physical activities via Computer Vision technology, helping businesses track metrics like guest occupancy, staff engagement, parking utilization, and queue wait times. 
This real-time operational intelligence enables managers and front-line staff to focus on customer service rather than manual monitoring tasks.",[39,1543,1424],{"id":1423},[47,1545,1546],{},"Safari AI faced several critical challenges in building their computer vision platform:",[337,1548,1549,1552,1555,1558,1561],{},[340,1550,1551],{},"Processing massive amounts of video data efficiently while maintaining the company’s commitment to 90-95% accuracy.",[340,1553,1554],{},"Managing costs while scaling to handle 10,000+ pipelines and 50,000+ cameras.",[340,1556,1557],{},"Ensuring real-time data delivery without requiring video reprocessing when camera positions change.",[340,1559,1560],{},"Finding a cost-effective way to store and process ML data without maintaining expensive infrastructure.",[340,1562,1563],{},"Previous solutions like Kafka and AWS Kinesis required significant DevOps resources and were costly to maintain.",[39,1565,1303],{"id":1302},[47,1567,1568],{},"‍ Safari AI implemented StreamNative's fully managed platform as their primary data storage and streaming backbone, creating an end-to-end solution for real-time operational intelligence. StreamNative’s comprehensive platform collects ML data from edge devices and processes this information through Safari's Flink-based data engine, enabling immediate delivery of metrics and alerts to customers. By leveraging StreamNative's tiered storage capabilities, Safari AI achieved cost-effective data retention while maintaining rapid access to historical information. The platform's utilization of StreamNative's schema registry proved particularly valuable for handling structured ML data, ensuring consistent data quality and format across their growing deployment of computer vision analytics. This integrated approach enabled Safari AI to focus on delivering value to their customers without the overhead of managing complex streaming infrastructure.",[39,1570,1571],{"id":1444},"‍Technical Journey",[47,1573,1574],{},"‍Safari AI's architecture consists of four main components:",[1147,1576,1577,1580,1583,1586],{},[340,1578,1579],{},"Customer's existing camera infrastructure;",[340,1581,1582],{},"GPU server for ML services;",[340,1584,1585],{},"StreamNative Cloud for data streaming and storage; and",[340,1587,1588],{},"Safari AI Cloud for business metrics processing.",[47,1590,1591],{},"The platform processes video feeds through ML models, sends the output with timestamps and IDs to StreamNative's Pulsar, and then transforms this data into actionable business metrics through their data engine.",[39,1593,1594],{"id":1326},"‍Results",[47,1596,1597],{},"‍The implementation of StreamNative's platform delivered significant operational and financial benefits for Safari AI. Most notably, the company achieved a 50% reduction in infrastructure cost compared to their previous solutions, while maintaining consistent sub-10 second end-to-end latency for real-time metrics delivery. The switch to StreamNative's fully managed service eliminated the need for a dedicated DevOps team to manage streaming infrastructure, further reducing operational overhead. The platform's efficient storage capabilities enabled Safari AI to maintain one year of back-processing storage, providing valuable historical data access for their clients. Most importantly, the solution successfully scaled to support multiple enterprise clients through shared resources, demonstrating the platform's ability to grow with Safari AI's business needs while maintaining cost efficiency. 
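The structured ML output described above (detections carrying timestamps and IDs) is the kind of payload a topic-level schema can type-check for every producer and consumer. A minimal sketch using Pulsar's JSON schema support; the field names are assumptions for illustration, not Safari AI's actual format:

```java
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;

public class DetectionEventSketch {
    // Hypothetical detection event; the real field layout is not public.
    public static class DetectionEvent {
        public String cameraId;
        public long timestampMillis;
        public String objectType;
        public double confidence;
    }

    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder broker URL
                .build();

        // The JSON schema is registered with the topic, so producers and consumers
        // that disagree on the structure are rejected rather than silently diverging.
        Producer<DetectionEvent> producer = client
                .newProducer(Schema.JSON(DetectionEvent.class))
                .topic("persistent://acme/vision/detections") // hypothetical topic
                .create();

        DetectionEvent event = new DetectionEvent();
        event.cameraId = "cam-17";
        event.timestampMillis = System.currentTimeMillis();
        event.objectType = "person";
        event.confidence = 0.97;
        producer.send(event);

        producer.close();
        client.close();
    }
}
```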
These results validate Safari AI's decision to choose StreamNative as their core streaming infrastructure provider, enabling them to focus on their core mission of delivering computer vision analytics to their customers.",[39,1599,1601],{"id":1600},"key-takeaways","Key Takeaways",[337,1603,1604,1607,1610],{},[340,1605,1606],{},"Safari AI achieved a 50% cloud cost reduction compared to their previous solutions, while maintaining extremely low end-to-end latency (\u003C 10 seconds) and longer data retention.",[340,1608,1609],{},"Schema compatibility is crucial for ML data structure requirements.",[340,1611,1612],{},"The multi-tenancy hierarchy supports various clients while sharing underlying resources.",[39,1614,1474],{"id":1473},[47,1616,1617],{},"Safari AI plans to adopt StreamNative's Ursa Engine and Iceberg integration, which, based on its initial evaluation and investigation, shows promise in further reducing costs—by up to 90%—for new client implementations while sustaining growth and profitability. The company will continue expanding its no-code platform and auto-calibration capabilities to enhance the accessibility and scalability of computer vision analytics.",[39,1619,1620],{"id":928},"Conclusion",[47,1622,1623],{},"By choosing StreamNative over other alternatives like Confluent and Redpanda as their streaming platform, Safari AI has built a scalable, cost-effective solution for computer vision analytics that delivers consistent accuracy and performance for their enterprise customers. The partnership enables Safari AI to focus on their core mission of digitizing physical operations while maintaining reliable, real-time data delivery.",{"title":17,"searchDepth":18,"depth":18,"links":1625},[1626,1627,1628,1629,1630,1631,1632,1633],{"id":1411,"depth":18,"text":1412},{"id":1423,"depth":18,"text":1424},{"id":1302,"depth":18,"text":1303},{"id":1444,"depth":18,"text":1571},{"id":1326,"depth":18,"text":1594},{"id":1600,"depth":18,"text":1601},{"id":1473,"depth":18,"text":1474},{"id":928,"depth":18,"text":1620},{"quote":1635,"name":1515,"position":1636},"StreamNative's resilience is critical to our SaaS operations. Each quarter, we evaluate Pulsar, StreamNative, and other solutions, and StreamNative consistently emerges as the best choice—cutting our costs by more than 50% while seamlessly supporting our ML data structure requirements. 
We're also collaborating closely with the StreamNative team to adopt innovations like Ursa Engine, Iceberg integration, and more to sustain our growth and profitability","Co-Founder, Head of Engineering at Safari AI","2025-02-04","Discover how Safari AI leveraged StreamNative to cut cloud costs by 50%, achieve sub-10s real-time data delivery, and scale its computer vision analytics across 50,000+ cameras—all while eliminating DevOps overhead.","\u002Fimgs\u002Fsuccess-stories\u002F67be8dbd5b4dd01225b3f174_SN-SuccessStories-safari-ai.webp","https:\u002F\u002Fyoutu.be\u002Fagj1VBc4LyM",{},"\u002Fsuccess-stories\u002Fsafari-ai-cuts-cloud-costs-by-50-while-scaling-real-time-computer-vision-analytics-with-streamnative",[1644],"Dedicated","10 min",{"title":1513,"description":1638},"success-stories\u002Fsafari-ai-cuts-cloud-costs-by-50-while-scaling-real-time-computer-vision-analytics-with-streamnative",[1229,1071],[1229],"Real-time computer vision analytics at 50% lower cost","ofqQemO7h7hgDrmjPhmOt6CCsqppBB8Ra794PmmvtHc",{"id":1653,"title":1654,"authors":1655,"body":1657,"company":1715,"createdAt":10,"customerQuote":1716,"date":1719,"description":1720,"extension":8,"featured":292,"image":1721,"industry":1722,"isDraft":292,"link":10,"logo":10,"meta":1723,"navigation":7,"order":294,"path":1724,"products":1725,"readingTime":1645,"relatedResources":10,"seo":1726,"size":1727,"stem":1728,"tags":1729,"technologies":1730,"useCases":1731,"__hash__":1732},"successStories\u002Fsuccess-stories\u002Fdriving-logistics-innovation-how-transport-exchange-group-modernized-with-apache-pulsar-and-streamnative.md","Driving Logistics Innovation: How Transport Exchange Group Modernized with Apache Pulsar and StreamNative",[1656],"Darren Parsons",{"type":14,"value":1658,"toc":1706},[1659,1667,1670,1672,1675,1677,1680,1682,1685,1687,1690,1693,1696,1698,1701,1703],[47,1660,1661,1666],{},[54,1662,1665],{"href":1663,"rel":1664},"https:\u002F\u002Ftransportexchangegroup.com\u002F",[263],"Transport Exchange Group"," (TEG) is a 24-year-old UK-based company operating in the logistics industry. The company’s exchange platform connects businesses with goods to deliver and carriers with available capacity.",[47,1668,1669],{},"Recently, TEG expanded into finance, identity, and compliance services for the logistics sector. TEG has a diverse membership of goods distributors and carriers and has become an integral part of the UK's supply chain and logistics industry, with long-term plans for geographic expansion.",[39,1671,1424],{"id":1423},[47,1673,1674],{},"As the company achieved growth and maturity, TEG needed to modernize its architecture, transitioning from a large monolithic model to an event-driven microservices architecture. With a team of 30 data, engineering, and DevOps employees, TEG required a scalable, high-performance messaging system that could handle real-time data processing, support transactional semantics, and integrate with various data sources including telematics providers and transport management systems. The solution needed to accommodate their complex algorithm for matching suitable carriers to available loads, which creates 0.5 million notifications per day, each within seconds of a load being posted on the platform.",[39,1676,1303],{"id":1302},[47,1678,1679],{},"After evaluating options, TEG chose Apache Pulsar, implemented through StreamNative's managed service. 
This decision was driven by Pulsar's support for transactional semantics, superior scalability without disruptive rebalancing, and its ability to handle elastic workloads. TEG adopted a hybrid approach, using StreamNative's hosted solution for non-production environments and a bring-your-own-cloud (BYOC) model for production workloads, optimizing for both performance and cost-effectiveness.",[39,1681,1445],{"id":1444},[47,1683,1684],{},"TEG's migration to Pulsar began two years ago when they were using Kafka and looking to stream data from PostgreSQL. Pulsar's transactional capabilities and performance characteristics were key factors in the decision. TEG maintained some Kafka implementations for specific use cases while adopting Pulsar for new ones. They are now considering consolidating to Pulsar entirely as they move towards cloud environments, leveraging StreamNative Ursa engine’s Kafka compatibility capabilities.",[39,1686,1327],{"id":1326},[47,1688,1689],{},"Since implementing StreamNative's Pulsar solution, TEG has experienced highly reliable service with only one production issue in two years, which was reported proactively with 24\u002F7 monitoring by the StreamNative support team. Darren Parsons, Director of Architecture and Engineering, regards the platform as having been \"bullet-proof,\" handling TEG's varying workloads efficiently, including demanding and busy seasons like the pre-Christmas period and other high-volume events such as bank holidays and major sporting events.",[39,1691,1692],{"id":1600},"Key Takeaways",[47,1694,1695],{},"The adoption of StreamNative has provided TEG with a robust, scalable messaging system that meets their complex requirements. The company benefited from excellent support during onboarding and ongoing operations. TEG's experience highlights the importance of choosing a flexible, scalable solution that can grow with the business and adapt to changing needs.",[39,1697,1474],{"id":1473},[47,1699,1700],{},"TEG is considering further optimization of their infrastructure, potentially consolidating their messaging platforms by leveraging Pulsar's Kafka protocol support. They continue to evolve their cloud architecture and may explore additional StreamNative offerings to simplify their overall system architecture. The company's focus on cost optimization and performance improvement suggests ongoing refinement of their Pulsar implementation.",[39,1702,929],{"id":928},[47,1704,1705],{},"StreamNative has enabled TEG to modernize its data streaming capabilities, supporting their transition to a more flexible and scalable architecture. The reliability, performance, and support provided by StreamNative have been crucial in maintaining TEG's operations across various peak periods and handling their dynamic workload requirements. As TEG continues to grow and expand globally, their Pulsar implementation provides a solid foundation for future developments.",{"title":17,"searchDepth":18,"depth":18,"links":1707},[1708,1709,1710,1711,1712,1713,1714],{"id":1423,"depth":18,"text":1424},{"id":1302,"depth":18,"text":1303},{"id":1444,"depth":18,"text":1445},{"id":1326,"depth":18,"text":1327},{"id":1600,"depth":18,"text":1692},{"id":1473,"depth":18,"text":1474},{"id":928,"depth":18,"text":929},"TEG",{"quote":1717,"name":1656,"position":1718},"Our experience with StreamNative has been really good. From a service provider point of view, it's been bullet-proof. 
When we did have an issue, StreamNative knew about it before we did and had it fixed by the time we got into the office. That was a really positive experience.","Director of Architecture and Engineering, Transport Exchange Group","2024-12-12","Discover how Transport Exchange Group modernized its logistics platform with StreamNative’s Apache Pulsar solution, enabling real-time data streaming, scalability, and seamless carrier matching.","\u002Fimgs\u002Fsuccess-stories\u002F67942d59043fd3f235d8f295_SN-SuccessStories-teg.webp","Logistics & Transportation",{},"\u002Fsuccess-stories\u002Fdriving-logistics-innovation-how-transport-exchange-group-modernized-with-apache-pulsar-and-streamnative",[1504,1644],{"title":1654,"description":1720},"51-200 employees","success-stories\u002Fdriving-logistics-innovation-how-transport-exchange-group-modernized-with-apache-pulsar-and-streamnative",[1229,1071],[1229],"Event-driven microservices for real-time logistics matching","ZURcwyLvOTH7l6OU5ogfoIF6gmke5oKHwjmXkb0U6A0",1775235745372]