[{"data":1,"prerenderedAt":1787},["ShallowReactive",2],{"active-banner":3,"navbar-featured-partner-blog":24,"navbar-pricing-featured":306,"blog-\u002Fblog\u002Fcomparison-of-messaging-platforms-apache-pulsar-vs-rabbitmq-vs-nats-jetstream":1086,"blog-authors-\u002Fblog\u002Fcomparison-of-messaging-platforms-apache-pulsar-vs-rabbitmq-vs-nats-jetstream":1722,"related-\u002Fblog\u002Fcomparison-of-messaging-platforms-apache-pulsar-vs-rabbitmq-vs-nats-jetstream":1764},{"id":4,"title":5,"date":6,"dismissible":7,"extension":8,"link":9,"link2":10,"linkText":11,"linkText2":12,"meta":13,"stem":21,"variant":22,"__hash__":23},"banners\u002Fbanners\u002Flakestream-ufk-launch.md","StreamNative Introduces Lakestream Architecture and Launches Native Kafka Service","2026-04-07",true,"md","\u002Fblog\u002Ffrom-streams-to-lakestreams","https:\u002F\u002Fconsole.streamnative.cloud\u002Fsignup?from=banner_lakestream-launch","Read Announcement","Sign Up Now",{"body":14},{"type":15,"value":16,"toc":17},"minimark",[],{"title":18,"searchDepth":19,"depth":19,"links":20},"",2,[],"banners\u002Flakestream-ufk-launch","default","zRueBGutATZB0ZnFFHwaEV7F0Di4tnZUHhgOiI4cu6k",{"id":25,"title":26,"authors":27,"body":29,"category":289,"createdAt":290,"date":291,"description":292,"extension":8,"featured":7,"image":293,"isDraft":294,"link":290,"meta":295,"navigation":7,"order":296,"path":297,"readingTime":298,"relatedResources":290,"seo":299,"stem":300,"tags":301,"__hash__":305},"blogs\u002Fblog\u002Fstreamnative-recognized-in-the-forrester-wave-streaming-data-platforms-2025.md","StreamNative Recognized as a Contender in The Forrester Wave™: Streaming Data Platforms, Q4 2025",[28],"David Kjerrumgaard",{"type":15,"value":30,"toc":276},[31,39,47,51,67,73,78,81,87,102,109,115,118,124,127,134,140,143,146,157,163,169,172,175,178,184,191,194,197,204,207,210,224,229,233,237,241,245,249,251,268,270],[32,33,35],"h3",{"id":34},"receives-highest-possible-scores-in-both-the-messaging-and-resource-optimization-criteria",[36,37,38],"em",{},"Receives Highest Possible Scores in BOTH the Messaging and Resource Optimization Criteria",[40,41,43],"h2",{"id":42},"introduction",[44,45,46],"strong",{},"Introduction",[48,49,50],"p",{},"Real-time data has become the backbone of modern innovation. As artificial intelligence (AI) and digital services demand instantaneous insights, organizations are realizing that streaming data is no longer optional – it's essential for delivering timely, context-rich experiences. StreamNative's data streaming platform is built precisely for this reality, ensuring data is immediate, reliable, and ready to power critical applications.",[48,52,53,54,63,64],{},"Today, we're excited to announce that Forrester Research has named StreamNative as a Contender in its evaluation, ",[55,56,58],"a",{"href":57},"\u002Freports\u002Frecognized-in-the-forrester-wave-tm-streaming-data-platforms-q4-2025",[36,59,60],{},[44,61,62],{},"The Forrester Wave™: Streaming Data Platforms, Q4 2025",". 
This report evaluated 15 top streaming data platform providers, and we're proud to share that ",[44,65,66],{},"StreamNative received the highest scores possible—5 out of 5—in both the Messaging and Resource Optimization criteria.",[48,68,69,70],{},"***Forrester's Take: ***",[36,71,72],{},"\"StreamNative is a good fit for enterprises that want an Apache Pulsar implementation that is also compatible with Kafka APIs.\"",[48,74,75],{},[36,76,77],{},"— The Forrester Wave™: Streaming Data Platforms, Q4 2025",[48,79,80],{},"Being recognized in the Forrester Wave is a proud milestone, and for us, it highlights how far StreamNative has come in enabling enterprises to unlock the power of real-time data. In the sections below, we'll dive into what we believe sets StreamNative apart—from our modern architecture and cloud-native design to our open-source foundation and real-time use cases—and how we see these strengths aligning with Forrester's findings.",[40,82,84],{"id":83},"trusted-by-industry-leaders",[44,85,86],{},"Trusted by Industry Leaders",[48,88,89,90,93,94,97,98,101],{},"Companies across industries are already leveraging StreamNative to drive real-time outcomes. Global enterprises like ",[44,91,92],{},"Cisco"," rely on StreamNative to handle massive IoT telemetry, supporting 245 million+ connected devices. Martech leaders such as ",[44,95,96],{},"Iterable"," process billions of events per day with StreamNative for hyper-personalized customer engagement. And in financial services, ",[44,99,100],{},"FICO"," trusts StreamNative to power its real-time fraud detection and analytics pipelines with a secure, scalable streaming backbone.",[48,103,104,105,108],{},"The Forrester report notes that, “",[36,106,107],{},"Customers appreciate the lower infrastructure costs that result from StreamNative’s cost-efficient, Kafka-compatible architecture. Customers note excellent support responsiveness…","”",[40,110,112],{"id":111},"modern-cloud-native-architecture-built-for-scale",[44,113,114],{},"Modern, Cloud-Native Architecture Built for Scale",[48,116,117],{},"From day one, StreamNative was designed with a modern architecture to meet the demanding scale and flexibility requirements of real-time data. Unlike legacy streaming systems that often rely on tightly coupled storage and compute, StreamNative's platform takes a cloud-native approach: it decouples these layers to enable elastic scalability and efficient resource utilization across any environment. The core is powered by Apache Pulsar—a distributed messaging and streaming engine—enhanced with multi-protocol support (including native Apache Kafka API compatibility) to unify diverse data streams under one roof. This means organizations can consolidate siloed messaging systems and handle both high-volume event streams and traditional message queues on a single platform, without sacrificing performance or reliability.",[48,119,120,121,108],{},"Forrester's evaluation described that “",[36,122,123],{},"StreamNative aims to provide a high-performance, multi-protocol streaming data platform: It uses Apache Pulsar with Kafka API compatibility to deliver cost-efficient, real-time applications for enterprises. 
It appeals to organizations that want a flexible, low-cost streaming solution, due to its focus on scalability and resource optimization, while its investments in Pulsar’s open-source ecosystem and performance optimization make it the primary platform for enterprises wishing to implement Pulsar.",[48,125,126],{},"Our cloud-first, leaderless architecture (with no single broker bottlenecks) and tiered storage model were built to maximize throughput and cost-efficiency for real-time workloads. By separating compute from storage and leveraging distributed object storage, StreamNative can retain huge volumes of event data indefinitely while keeping compute costs in check—effectively providing a flexible, low-cost streaming solution.",[48,128,129,130,133],{},"This modern design not only delivers high performance, but also ensures fault tolerance and geo-distribution out of the box, so enterprises can trust their streaming data is always available and durable. As Forrester’s evaluation noted, StreamNative ",[36,131,132],{},"\"excels at messaging and resource optimization\" and “Its platform supports use cases like real-time analytics and event-driven architectures with robust scalability.","” Our architecture provides the strong foundation that today's real-time applications demand, from ultra-fast data ingestion to seamless scale-out across hybrid and multi-cloud environments.",[40,135,137],{"id":136},"open-source-foundation-and-pulsar-expertise",[44,138,139],{},"Open Source Foundation and Pulsar Expertise",[48,141,142],{},"StreamNative's DNA is rooted in open source innovation. Our founders are the original creators of Apache Pulsar, and we've built our platform with the same open principles: freedom, flexibility, and community-driven innovation. For developers and data teams, this means adopting StreamNative comes with no proprietary lock-in—instead, you get a platform built on open standards and a thriving ecosystem. We offer broad API compatibility (Pulsar, Kafka, JMS, MQTT, and more) so that teams can work with familiar interfaces and integrate StreamNative into existing systems with ease.",[48,144,145],{},"StreamNative is the primary commercial contributor to the Apache Pulsar project and its surrounding ecosystem. We invest heavily in Pulsar's ongoing improvements our investments in Pulsar's open-source ecosystem and performance optimization bolster StreamNative's value. We also foster a vibrant community through initiatives like the Data Streaming Summit and free training resources.",[48,147,148,149,152,153,156],{},"Forrester's assessment noted that StreamNative’s “",[36,150,151],{},"events-driven agents, extensibility, and performance architecture are solid,","” and we're continuing to build on that foundation. ",[44,154,155],{},"We're actively investing in expanding our tooling for observability, governance, schema management, and developer productivity","—areas we recognize as critical for enterprise adoption and where we're committed to accelerating our roadmap.",[48,158,159,160],{},"Being open also means embracing an open ecosystem of technologies. StreamNative actively integrates with the tools and platforms that matter most to our users. We partner with industry leaders like Snowflake, Databricks, Google, and Ververica to ensure our streaming platform works seamlessly with data warehouses, lakehouse storage, and stream processing frameworks. 
Forrester’s evaluation observed that StreamNative’s ",[36,161,162],{},"\"investments in Pulsar’s open-source ecosystem and performance optimization make it the primary platform for enterprises wishing to implement Pulsar.\"",[40,164,166],{"id":165},"powering-real-time-use-cases-across-industries",[44,167,168],{},"Powering Real-Time Use Cases Across Industries",[48,170,171],{},"One of the greatest validations of StreamNative's approach is the success our customers are achieving with real-time data. StreamNative's platform is versatile and use-case agnostic—if an application demands high-volume, low-latency data movement, we can power it. This flexibility is why our customer base spans industries from finance and IoT to major automobile manufacturers and online gaming. The common thread is that these organizations need to process and react to data in milliseconds, and StreamNative is delivering the capabilities to make that possible.",[48,173,174],{},"Cisco uses StreamNative to underpin an IoT telemetry system of colossal scale, connecting hundreds of millions of devices and thousands of enterprise clients with real-time data streams. The platform's multi-tenant design and proven reliability allow Cisco to offer its customers a live feed of device data with unwavering confidence. In the financial sector, FICO has built streaming pipelines on StreamNative to detect fraud as transactions happen and to monitor systems in real time. With StreamNative's strong guarantees around message durability and ordering, FICO can catch anomalies or suspicious patterns within seconds. And in digital customer engagement, Iterable relies on StreamNative to process billions of events every day—clicks, views, purchases—so that marketers can trigger personalized campaigns instantly based on user behavior.",[48,176,177],{},"Our customers uniformly deal with mission-critical data streams, where downtime or delays are unacceptable. StreamNative's fault-tolerant, scalable infrastructure has proven equal to the task, handling scenarios like bursting to millions of events per second or seamlessly spanning multiple cloud regions. Forrester's report recognized StreamNative for supporting event-driven architectures with robust scalability—which for us is a reflection of our platform's ability to meet the most demanding enterprise requirements.",[40,179,181],{"id":180},"continuing-to-innovate-ursa-orca-and-the-road-ahead",[44,182,183],{},"Continuing to Innovate: Ursa, Orca, and the Road Ahead",[48,185,186,187,190],{},"While we are thrilled to be recognized in Forrester's Streaming Data Platforms Wave, we view this as just the beginning. StreamNative's vision has always been bold: to ",[44,188,189],{},"provide a unified platform that not only handles today's streaming needs but also anticipates the emerging requirements of tomorrow",".",[48,192,193],{},"One key area of focus is the convergence of streaming data with advanced analytics and AI. As Forrester points out in the report, technology leaders should look for platforms that natively integrate messaging, stream processing, and analytics to provide AI agents with real-time, contextualized information. We couldn't agree more. 
Our award-winning Ursa Engine and Orca Agent Engine are aimed at extending our platform up the stack—bridging the gap between data streams and data lakes, and between event streams and intelligent processing.",[48,195,196],{},"Our new Ursa Engine introduces a lakehouse-native approach to streaming: it can write events directly to table formats like Iceberg on cloud storage, eliminating entire classes of ETL jobs and making fresh data instantly available for analytics queries. By integrating streaming and lakehouse technologies, we help customers collapse data silos and accelerate their AI\u002FML pipelines.",[48,198,199,200,203],{},"Beyond analytics integration, we are also enhancing StreamNative with more out-of-the-box processing and governance capabilities. In the coming months, we plan to introduce new features for lightweight stream processing and transformation, making it easier to build reactive applications directly on the platform. We're also expanding our ecosystem of connectors and integrations, so that whether your data lands in Snowflake, Databricks, or an AI model, StreamNative will seamlessly feed it. ",[44,201,202],{},"We're investing significantly in enterprise features including security, schema registry, governance, and monitoring tooling","—capabilities that are essential for mission-critical deployments and where we're committed to continued improvement.",[48,205,206],{},"This recognition from Forrester energizes us to keep innovating at full speed. We're sharing this honor with our amazing customers, community, and partners who drive us forward every day. Your feedback and real-world challenges have helped shape StreamNative into what it is today, and together, we will shape the future of streaming data. Thank you for joining us on this journey—we're just getting started, and we can't wait to deliver even more value as we continue to evolve our platform. 
Onward to real-time everything!",[208,209],"hr",{},[32,211,213],{"id":212},"streamnative-in-the-forrester-wave-evaluation-findings",[44,214,215,216,223],{},"StreamNative in ",[44,217,218],{},[55,219,220],{"href":57},[44,221,222],{},"The Forrester Wave™",": Evaluation Findings",[225,226,228],"h5",{"id":227},"recognized-as-a-contender-among-15-streaming-data-platform-providers","• Recognized as a Contender among 15 streaming data platform providers",[225,230,232],{"id":231},"received-the-highest-scores-possible-50-in-both-the-messaging-and-resource-optimization-criteria","* Received the highest scores possible (5.0) in both the Messaging and Resource Optimization criteria",[225,234,236],{"id":235},"cited-as-the-primary-platform-for-enterprises-wishing-to-implement-pulsar","• Cited as the primary platform for enterprises wishing to implement Pulsar",[225,238,240],{"id":239},"noted-for-excelling-at-messaging-and-resource-optimization","• Noted for excelling at messaging and resource optimization",[225,242,244],{"id":243},"customers-cited-lower-infrastructure-costs-and-excellent-support-responsiveness","• Customers cited lower infrastructure costs and excellent support responsiveness",[225,246,248],{"id":247},"recognized-for-supporting-event-driven-architectures-with-robust-scalability","• Recognized for supporting event-driven architectures with robust scalability",[208,250],{},[252,253,255,256,259,260,190],"h6",{"id":254},"forrester-disclaimer-forrester-does-not-endorse-any-company-product-brand-or-service-included-in-its-research-publications-and-does-not-advise-any-person-to-select-the-products-or-services-of-any-company-or-brand-based-on-the-ratings-included-in-such-publications-information-is-based-on-the-best-available-resources-opinions-reflect-judgment-at-the-time-and-are-subject-to-change-for-more-information-read-about-forresters-objectivity-here","**Forrester Disclaimer: **",[36,257,258],{},"Forrester does not endorse any company, product, brand, or service included in its research publications and does not advise any person to select the products or services of any company or brand based on the ratings included in such publications. Information is based on the best available resources. Opinions reflect judgment at the time and are subject to change",". *For more information, read about Forrester’s objectivity *",[55,261,265],{"href":262,"rel":263},"https:\u002F\u002Fwww.forrester.com\u002Fabout-us\u002Fobjectivity\u002F",[264],"nofollow",[36,266,267],{},"here",[208,269],{},[252,271,273],{"id":272},"apache-apache-pulsar-apache-kafka-apache-flink-and-other-names-are-trademarks-of-the-apache-software-foundation-no-endorsement-by-apache-or-other-third-parties-is-implied",[36,274,275],{},"Apache®, Apache Pulsar®, Apache Kafka®, Apache Flink® and other names are trademarks of The Apache Software Foundation. No endorsement by Apache or other third parties is implied.",{"title":18,"searchDepth":19,"depth":19,"links":277},[278,280,281,282,283,284,285],{"id":34,"depth":279,"text":38},3,{"id":42,"depth":19,"text":46},{"id":83,"depth":19,"text":86},{"id":111,"depth":19,"text":114},{"id":136,"depth":19,"text":139},{"id":165,"depth":19,"text":168},{"id":180,"depth":19,"text":183,"children":286},[287],{"id":212,"depth":279,"text":288},"StreamNative in The Forrester Wave™: Evaluation Findings","Company",null,"2025-12-16","StreamNative is recognized in The Forrester Wave™: Streaming Data Platforms, Q4 2025. 
Discover why Forrester highlights StreamNative's high-performance messaging, efficient resource use, and cost-effective Kafka API compatibility for real-time innovation.","\u002Fimgs\u002Fblogs\u002F693bd36cf01b217dcb67278f_Streamnative_blog_thumbnail.png",false,{},0,"\u002Fblog\u002Fstreamnative-recognized-in-the-forrester-wave-streaming-data-platforms-2025","10 mins read",{"title":26,"description":292},"blog\u002Fstreamnative-recognized-in-the-forrester-wave-streaming-data-platforms-2025",[302,303,304],"Announcements","Real-Time","Forrester","sOeeJtEO3O-IIfTPJjY1AFOMawZ_rf8FOH8A98NEKgU",{"id":307,"title":308,"authors":309,"body":314,"category":1073,"createdAt":290,"date":1074,"description":1075,"extension":8,"featured":7,"image":1076,"isDraft":294,"link":290,"meta":1077,"navigation":7,"order":296,"path":1078,"readingTime":1079,"relatedResources":290,"seo":1080,"stem":1081,"tags":1082,"__hash__":1085},"blogs\u002Fblog\u002Fhow-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour.md","How We Run a 5 GB\u002Fs Kafka Workload for Just $50 per Hour",[310,311,312,313],"Matteo Meril","Neng Lu","Hang Chen","Penghui Li",{"type":15,"value":315,"toc":1043},[316,319,322,325,328,331,335,338,348,354,357,365,370,374,381,384,387,395,399,402,407,411,414,417,420,423,432,436,439,450,453,457,460,463,474,477,481,485,493,496,500,508,537,541,544,549,553,556,560,563,566,571,580,585,588,591,602,606,609,620,624,627,630,635,638,667,671,673,679,682,687,692,695,699,713,717,728,732,747,756,767,770,773,777,780,783,794,797,800,803,808,813,817,821,838,842,856,861,865,876,879,895,899,910,915,920,928,932,935,939,946,950,953,962,967,976,982,991,1000,1009,1018,1027,1035],[48,317,318],{},"The rise of DeepSeek has shaken the AI infrastructure market, forcing companies to confront the escalating costs of training and deploying AI models. But the real pressure point isn’t just compute—it’s data acquisition and ingestion costs.",[48,320,321],{},"As businesses rethink their AI cost-containment strategies, real-time data streaming is emerging as a critical enabler. The growing adoption of Kafka as a standard protocol has expanded cost-efficient options, allowing companies to optimize streaming analytics while keeping expenses in check.",[48,323,324],{},"Ursa, the data streaming engine powering StreamNative’s managed Kafka service, is built for this new reality. With its leaderless architecture and native lakehouse storage integration, Ursa eliminates costly inter-zone network traffic for data replication and client-to-broker communication while ensuring high availability at minimal operational cost.",[48,326,327],{},"In this blog post, we benchmarked the infrastructure cost and total cost of ownership (TCO) for running a 5GB\u002Fs Kafka workload across different Kafka vendors, including Redpanda, Confluent WarpStream, and AWS MSK. Our benchmark results show that Ursa can sustain 5GB\u002Fs Kafka workloads at just 5% of the cost of traditional streaming engines like Redpanda—making it the ideal solution for high-performance, cost-efficient ingestion and data streaming for data lakehouses and AI workloads.",[48,329,330],{},"Note: We also evaluated vanilla Kafka in our benchmark; however, for simplicity, we have focused our cost comparison on vendor solutions rather than self-managed deployments. That said, it is important to highlight that both Redpanda and vanilla Kafka use a leader-based data replication approach. 
In a data-intensive, network-bound workload like 5GB\u002Fs streaming, with the same machine type and replication factor, Redpanda and vanilla Kafka produced nearly identical cost profiles.",[40,332,334],{"id":333},"key-benchmark-findings","Key Benchmark Findings",[48,336,337],{},"Ursa delivered 5 GB\u002Fs of sustained throughput at an infrastructure cost of just $54 per hour. For comparison:",[339,340,341,345],"ul",{},[342,343,344],"li",{},"MSK: $303 per hour → 5.6x more expensive compared to Ursa",[342,346,347],{},"Redpanda: $988 per hour → 18x more expensive compared to Ursa",[48,349,350],{},[351,352],"img",{"alt":18,"src":353},"\u002Fimgs\u002Fblogs\u002F679c71b67d9046f26edc7977_AD_4nXfvTqyBNUBu2lObdkKAx-5UNkpNP8UYULLZyOcixE6z99VMZUUEsUqWjzexI7vjyNGRNSAUoM9smYvdTP55ctAhIbrs5lmQgcSVMWdaoigbWouCl95DVSQsxooY-qqfGcYqS4g4zA.png",[48,355,356],{},"Beyond infrastructure costs, when factoring in both storage pricing, vendor pricing and operational expenses, Ursa’s total cost of ownership (TCO) for a 5GB\u002Fs workload with a 7-day retention period is:",[339,358,359,362],{},[342,360,361],{},"50% cheaper than Confluent WarpStream",[342,363,364],{},"85% cheaper than MSK and Redpanda",[48,366,367],{},[351,368],{"alt":18,"src":369},"\u002Fimgs\u002Fblogs\u002F679c602d77e9c706de5343b8_AD_4nXeDv8rrv_C1CTCCiqYo1zpvlGYbdBk1r0VEqovAPu22iFMQZgh54Hfw9PBMLzM7jDFxKwAFDxbdG0np4XVk_tGsWhEKMloLRcmmea7lvueCx-0cFsyaE3Mya4Mxc1Dox95A6JEc.png",[40,371,373],{"id":372},"ursa-highly-cost-efficient-data-streaming-at-scale","Ursa: Highly Cost-Efficient Data Streaming at Scale",[48,375,376,380],{},[55,377,379],{"href":378},"\u002Fblog\u002Fursa-reimagine-apache-kafka-for-the-cost-conscious-data-streaming","Ursa"," is a next-generation data streaming engine designed to deliver high performance at a fraction of the cost of traditional disk-based solutions. It is fully compatible with Apache Kafka and Apache Pulsar APIs, while leveraging a leaderless, lakehouse-native architecture to maximize scalability, efficiency, and cost savings.",[48,382,383],{},"Ursa’s key innovation is separating storage from compute and decoupling metadata\u002Findex operations from data operations by utilizing cloud object storage (e.g., AWS S3) instead of costly inter-zone disk-based replication. It also employs open lakehouse formats (Iceberg and Delta Lake), enabling columnar compression to significantly reduce storage costs while maintaining durability and availability.",[48,385,386],{},"In contrast, traditional streaming systems—like Kafka and Redpanda—depend on leader-based architectures, which drive up inter-zone traffic costs due to replication and client communication. Ursa mitigates these costs by:",[339,388,389,392],{},[342,390,391],{},"Eliminating inter-zone traffic costs via a leaderless architecture.",[342,393,394],{},"Replacing costly inter-zone replication with direct writes to cloud storage using open lakehouse formats.",[40,396,398],{"id":397},"how-ursa-eliminates-inter-zone-traffic","How Ursa Eliminates Inter-Zone Traffic",[48,400,401],{},"Ursa minimizes inter-zone traffic by leveraging a leaderless architecture, which eliminates inter-zone communication between clients and brokers, and lakehouse-native storage, which removes the need for inter-zone data replication. 
This approach ensures high availability and scalability while avoiding unnecessary cross-zone data movement.",[48,403,404],{},[351,405],{"alt":18,"src":406},"\u002Fimgs\u002Fblogs\u002F679c602e21b3571bb7117dca_AD_4nXd7Oahc77NjRLNvA9clLt0tsyU6MrIqVibFYv5pW5giTIcCHPr3EA_yTGzfVEUIVO3VXK56qWK8zmBCp5lY0E_4nmlWIPFrHjtHylA5NhwELjn-UB0fLG2h_kbrxrc7Cs_edvveNA.png",[32,408,410],{"id":409},"leaderless-architecture","Leaderless architecture",[48,412,413],{},"Traditional streaming engines such as Kafka, Pulsar, or RedPanda rely on a leader-based model, where each partition is assigned to a single leader broker that handles all writes and reads.",[48,415,416],{},"Pros of Leader-Based Architectures:\n✔ Maintains message ordering via local sequence IDs\n✔ Delivers low latency and high performance through message caching",[48,418,419],{},"Cons of Leader-Based Architectures:\n✖ Throughput bottlenecked by a single broker per partition\n✖ Inter-zone traffic required for high availability in multi-AZ deployments",[48,421,422],{},"While Kafka and Pulsar offer partial solutions (e.g., reading from followers, shadow topics) to reduce read-related inter-zone traffic, producers still send data to a single leader.",[48,424,425,426,431],{},"Ursa removes the concept of topic ownership, allowing any broker in the cluster to handle reads or writes for any partition. The primary challenge—ensuring message ordering—is solved with ",[55,427,430],{"href":428,"rel":429},"https:\u002F\u002Fgithub.com\u002Fstreamnative\u002Foxia",[264],"Oxia",", a scalable metadata and index service created by StreamNative in 2022.",[32,433,435],{"id":434},"oxia-the-metadata-layer-enabling-leaderless-architecture","Oxia: The Metadata Layer Enabling Leaderless Architecture",[48,437,438],{},"Ensuring message ordering in a leaderless architecture is complex, but Ursa solves this with Oxia:",[339,440,441,444,447],{},[342,442,443],{},"Handles millions of metadata\u002Findex operations per second",[342,445,446],{},"Generates sequential IDs to maintain strict message ordering",[342,448,449],{},"Optimized for Kubernetes with horizontal scalability",[48,451,452],{},"Producers and consumers can connect to any broker within their local AZ, eliminating inter-zone traffic costs while maintaining performance through localized caching.",[32,454,456],{"id":455},"zero-interzone-data-replication","Zero interzone data replication",[48,458,459],{},"In most distributed systems, data replication from a leader (primary) to followers (replicas) is crucial for fault tolerance and availability. 
However, replication across zones can inflate infrastructure expenses substantially.",[48,461,462],{},"Ursa avoids these costs by writing data directly to cloud storage (e.g., AWS S3, Google GCS):",[339,464,465,468,471],{},[342,466,467],{},"Built-In Resilience: Cloud storage inherently offers high availability and fault tolerance without inter-zone traffic fees.",[342,469,470],{},"Tradeoff: Slightly higher latency (sub-second, with p99 at 500 milliseconds) compared to local disk\u002FEBS (single-digit to sub-100 milliseconds), in exchange for significantly lower costs (up to 10x lower).",[342,472,473],{},"Flexible Modes: Ursa is an addition to the classic BookKeeper-based engine, providing users with the flexibility to optimize for either cost or low latency based on their workload requirements.",[48,475,476],{},"By foregoing conventional replication, Ursa slashes inter-zone traffic costs and associated complexities—making it a compelling option for organizations seeking to balance high-performance data streaming with strict budget constraints.",[40,478,480],{"id":479},"how-we-ran-a-5-gbs-test-with-ursa","How We Ran a 5 GB\u002Fs Test with Ursa",[32,482,484],{"id":483},"ursa-cluster-deployment","Ursa Cluster Deployment",[339,486,487,490],{},[342,488,489],{},"9 brokers across 3 availability zones, each on m6i.8xlarge (Fixed 12.5 Gbps bandwidth, 32 vCPU cores, 128 GB memory).",[342,491,492],{},"Oxia cluster (metadata store) with 3 nodes of m6i.8xlarge, distributed across three availability zones (AZs).",[48,494,495],{},"During peak throughput (5 GB\u002Fs), each broker’s network usage was about 10 Gbps.",[32,497,499],{"id":498},"openmessaging-benchmark-workers-configuration","OpenMessaging Benchmark Workers & Configuration",[48,501,502,503,507],{},"The OpenMessaging Benchmark(OMB) Framework is a suite of tools that make it easy to benchmark distributed messaging systems in the cloud. Please check ",[55,504,505],{"href":505,"rel":506},"https:\u002F\u002Fopenmessaging.cloud\u002Fdocs\u002Fbenchmarks\u002F",[264]," for details.",[339,509,510,525,534],{},[342,511,512,513,518,519,524],{},"12 OMB workers: 6 for ",[55,514,517],{"href":515,"rel":516},"https:\u002F\u002Fgist.github.com\u002Fcodelipenghui\u002Fd1094122270775e4f1580947f80c5055",[264],"producers",", 6 for ",[55,520,523],{"href":521,"rel":522},"https:\u002F\u002Fgist.github.com\u002Fcodelipenghui\u002F06bada89381fb77a7862e1b4c1d8963d",[264],"consumers"," across 3 availability zones, on m6i.8xlarge instances. 
Each worker is configured with 12 CPU cores and 48 GB memory.",[342,526,527,528,533],{},"Sample YAML ",[55,529,532],{"href":530,"rel":531},"https:\u002F\u002Fgist.github.com\u002Fcodelipenghui\u002F204c1f26c4d44a218ae235bf2de99904",[264],"scripts"," provided for Kafka-compatible configuration and rate limits.",[342,535,536],{},"Achieved consistent 5 GB\u002Fs publish\u002Fsubscribe throughput.",[40,538,540],{"id":539},"ursa-benchmark-tests-results","Ursa Benchmark Tests & Results",[48,542,543],{},"The following diagram demonstrates that Ursa can consistently handle 5 GB\u002Fs of traffic, fully saturating the network across all broker nodes.",[48,545,546],{},[351,547],{"alt":18,"src":548},"\u002Fimgs\u002Fblogs\u002F679c602d7b261bac1113f7d6_AD_4nXdDPsRc3koXICiFF0bqSmGWbJt_RlUy4FE3ruuWOfbCfpcqZ1dejjqGbkaCJv2hQFL1nirRouBVRW2l5uMWBvY9naMqGB_wHcLI14dBM0f85TXhmdm3UxEv1yGX9Y4hf5FttSkZew.png",[40,550,552],{"id":551},"comparing-infrastructure-cost","Comparing Infrastructure Cost",[48,554,555],{},"This benchmark first evaluates infrastructure costs of running a 5 GB\u002Fs streaming workload (1:1 producer-to-consumer ratio) across different data streaming engines, including Ursa, Redpanda, and AWS MSK, with a focus on multi-AZ deployments to ensure a fair comparison.",[32,557,559],{"id":558},"test-setup-key-assumptions","Test Setup & Key Assumptions",[48,561,562],{},"All tests use multi-AZ configurations, with clusters and clients distributed across three AWS availability zones (AZs). Cluster size scales proportionally to the number of AZs, and rack-awareness is enabled for all engines to evenly distribute topic partitions and leaders.",[48,564,565],{},"To ensure a fair comparison, we selected the same machine type capable of fully utilizing both network and storage bandwidth for Ursa and Redpanda in this 5GB\u002Fs test:",[339,567,568],{},[342,569,570],{},"9 × m6i.8xlarge instances",[48,572,573,574,579],{},"However, MSK's storage bandwidth limits vary depending on the selected instance type, with the highest allowed limit capped at 1000 MiB\u002Fs per broker, according to",[55,575,578],{"href":576,"rel":577},"https:\u002F\u002Fdocs.aws.amazon.com\u002Fmsk\u002Flatest\u002Fdeveloperguide\u002Fmsk-provision-throughput-management.html#throughput-bottlenecks",[264]," AWS documentation",". 
Given this constraint, achieving 5 GB\u002Fs throughput with a replication factor of 3 required the following setup:",[339,581,582],{},[342,583,584],{},"15 × kafka.m7g.8xlarge (32 vCPUs, 128 GB memory, 15 Gbps network, 4000 GiB EBS).",[48,586,587],{},"This configuration was necessary to work around MSK's storage bandwidth limitations, ensuring a comparable cost basis to other evaluated streaming engines.",[48,589,590],{},"Additional key assumptions include:",[339,592,593,596,599],{},[342,594,595],{},"Inter-AZ producer traffic: For leader-based engines, two-thirds of producer-to-broker traffic crosses AZs due to leader distribution.",[342,597,598],{},"Consumer optimizations: Follower fetch is enabled across all tests, eliminating inter-AZ consumer traffic.",[342,600,601],{},"Storage cost exclusions: This benchmark only evaluates streaming costs, assuming no long-term data retention.",[32,603,605],{"id":604},"inter-broker-replication-costs","Inter-Broker Replication Costs",[48,607,608],{},"Inter-broker (cross-AZ) replication is a major cost driver for data streaming engines:",[339,610,611,614,617],{},[342,612,613],{},"RedPanda: Inter-broker replication is not free, leading to substantial costs when data must be copied across multiple availability zones.",[342,615,616],{},"AWS MSK: Inter-broker replication is free, but MSK instance pricing is significantly higher (e.g., $3.264 per hour for kafka.m7g.8xlarge vs $1.306 per hour for an on-demand m7g.8xlarge). The storage price of MSK is $0.10 per GB-month which is significantly higher than st1, which costs $0.045 per GB-month. Even though replication is free, client-to-broker traffic still incurs inter-AZ charges.",[342,618,619],{},"Ursa: No inter-broker replication costs due to its leaderless architecture, eliminating inter-zone replication costs entirely.",[32,621,623],{"id":622},"zone-affinity-reducing-inter-az-costs","Zone Affinity: Reducing Inter-AZ Costs",[48,625,626],{},"We evaluated zone affinity mechanisms to further reduce inter-AZ data transfer costs.",[48,628,629],{},"Consumers:",[339,631,632],{},[342,633,634],{},"Follower fetch is enabled across all tests, ensuring consumers fetch data from replicas in their local AZ—eliminating inter-zone consumer traffic except for metadata lookups",[48,636,637],{},"Producers:",[339,639,640,649,658],{},[342,641,642,643,648],{},"Kafka protocol lacks an easy way to enforce producer AZ affinity (though ",[55,644,647],{"href":645,"rel":646},"https:\u002F\u002Fcwiki.apache.org\u002Fconfluence\u002Fdisplay\u002FKAFKA\u002FKIP-1123:+Rack-aware+partitioning+for+Kafka+Producer",[264],"KIP-1123"," aims to address this). And it only works with the default partitioner (i.e., when no record partition or record key is specified).",[342,650,651,652,657],{},"Redpanda recently introduced ",[55,653,656],{"href":654,"rel":655},"https:\u002F\u002Fdocs.redpanda.com\u002Fredpanda-cloud\u002Fdevelop\u002Fproduce-data\u002Fleader-pinning\u002F",[264],"leader pinning",", but this only benefits setups where producers are confined to a single AZ—not applicable to our multi-AZ benchmark.",[342,659,660,661,666],{},"Ursa is the only system in this test with ",[55,662,665],{"href":663,"rel":664},"https:\u002F\u002Fdocs.streamnative.io\u002Fdocs\u002Fconfig-kafka-client#eliminate-cross-az-networking-traffic",[264],"built-in zone affinity for both producers and consumers",". 
It achieves this by embedding producer AZ information in client.id, allowing metadata lookups to route clients to local-AZ brokers, eliminating inter-AZ producer traffic.",[32,668,670],{"id":669},"cost-comparison-results","Cost Comparison Results",[48,672,337],{},[339,674,675,677],{},[342,676,344],{},[342,678,347],{},[48,680,681],{},"Ursa’s leaderless architecture, zone affinity, and native cloud storage integration deliver unparalleled cost efficiency, making it the most cost-effective choice for high-throughput data streaming workloads.",[48,683,684],{},[351,685],{"alt":18,"src":686},"\u002Fimgs\u002Fblogs\u002F679c72208198ca36a352f228_AD_4nXeeZuM8T-xBlD4Vf3j67K618n08qh8wIDLLtiLJG0ssA1Wj1V26u7wIDTX9sqLrtw8mB2c299dwzarGen62CG0Vh7nWstn5qbPGFcBaKJYEepTsLr5fHWv1U8uqbg8Y0UOK6fJ7.png",[48,688,689],{},[351,690],{"alt":18,"src":691},"\u002Fimgs\u002Fblogs\u002F679c625978031f40229de484_AD_4nXdLkLLJ30KKr-_A_rN1j8akVwBYacAWIPzWHoOReJF421890kfByZoQQxkLczihVSmiw5Q9J51-V9I2SEKITbwsYnANDDTlAVL5nQ_jfaHNTe9VEWhSoa7DZooCnilDYL6l6msmJg.png",[48,693,694],{},"The detailed infrastructure cost calculations for each data streaming engine are listed below:",[32,696,698],{"id":697},"streamnative-ursa","StreamNative - Ursa",[339,700,701,704,707,710],{},[342,702,703],{},"Server EC2 costs: 9 * $1.536\u002Fhr = $14",[342,705,706],{},"Client EC2 costs: 9 * $1.536\u002Fhr =$14",[342,708,709],{},"S3 write requests costs: 1350 r\u002Fs * $0.005\u002F1000r * 3600s = $24",[342,711,712],{},"S3 read requests costs: 1350 r\u002Fs * $0.0004\u002F1000r * 3600s = $2",[32,714,716],{"id":715},"aws-msk","AWS MSK",[339,718,719,722,725],{},[342,720,721],{},"Server EC2 costs: 15 * $3.264\u002Fhr = $49",[342,723,724],{},"Client side EC2 costs: 9 * $1.536\u002Fhr =$14",[342,726,727],{},"Interzone traffic - producer to broker: 5GB\u002Fs * ⅔ * $0.02\u002FG(in+out) * 3600 = $240",[32,729,731],{"id":730},"redpanda","RedPanda",[339,733,734,736,738,741,744],{},[342,735,703],{},[342,737,706],{},[342,739,740],{},"Interzone traffic - producer to broker: 5GB\u002Fs * ⅔ * $0.02\u002FGB(in+out) * 3600 = $240",[342,742,743],{},"Interzone traffic - replication: 10GB\u002Fs * $0.02\u002FGB(in+out) * 3600 = $720",[342,745,746],{},"Interzone traffic - broker to consumer: $0 (fetch from local zone)",[48,748,749,750,755],{},"Please note that we were unable to test ",[55,751,754],{"href":752,"rel":753},"https:\u002F\u002Fwww.redpanda.com\u002Fblog\u002Fcloud-topics-streaming-data-object-storage",[264],"Redpanda with Cloud Topics",", as it remains an announced but unreleased feature and is not yet available for evaluation. Based on the limited information available, while Cloud Topics may help optimize inter-zone data replication costs, producers still need to traverse inter-availability zones to connect to the topic partition owners and incur inter-zone traffic costs of up to $240 per hour.",[339,757,758,764],{},[342,759,760,763],{},[55,761,647],{"href":645,"rel":762},[264]," (when implemented) will help mitigate producer-to-broker inter-zone traffic, but it is not yet available. And it only works with the default partitioner (no record partition or key is specified).",[342,765,766],{},"Redpanda’s leader pinning helps only when all producers for the pinned topic are confined to a single AZ. In multi-AZ environments (like our benchmark), inter-zone producer traffic remains unavoidable.",[48,768,769],{},"Additionally, Redpanda’s Cloud Topics architecture is not documented publicly. 
Their blog mentions \"leader placement rules to optimize produce latency and ingress cost,\" but it is unclear whether this represents a shift away from a leader-based architecture or if it uses techniques similar to Ursa’s zone-aware approach.",[48,771,772],{},"We may revisit this comparison as more details become available.",[40,774,776],{"id":775},"comparing-total-cost-of-ownership","Comparing Total Cost of Ownership",[48,778,779],{},"As highlighted earlier, with a BYOC Ursa setup, you can achieve 5 GB\u002Fs throughput at just 5% of the infrastructure cost of a traditional leader-based data streaming engine, such as Kafka or RedPanda, while managing the infrastructure yourself. This significant cost reduction is enabled by Ursa’s leaderless architecture and lakehouse-native storage design, which eliminate overhead costs such as inter-zone traffic and leader-based data replication. By leveraging a lakehouse-native, leaderless architecture, Ursa reduces resource requirements, enabling you to handle high data throughput efficiently and at a fraction of the cost of RedPanda.",[48,781,782],{},"Now, let’s examine the total cost comparison, evaluating Ursa alongside other vendors, including those that have adopted a leaderless architecture (e.g., Confluent WarpStream). This comparison is based on a 5GB\u002Fs workload with a 7-day retention period, factoring in both storage cost and vendor costs Here are the key findings:",[339,784,785,788,791],{},[342,786,787],{},"Ursa ($164,353\u002Fmonth) is: 50% cheaper than Confluent WarpStream ($337,068\u002Fmonth)",[342,789,790],{},"85% cheaper than AWS MSK ($1,115,251\u002Fmonth)",[342,792,793],{},"86% cheaper than Redpanda ($1,202,853\u002Fmonth)",[48,795,796],{},"In addition to Ursa’s architectural advantages—eliminating most inter-AZ traffic and leveraging lakehouse storage for cost-effective data retention—it also adopts a more fair and cost-efficient pricing model: Elastic Throughput-based pricing. This approach aligns costs with actual usage, avoiding unnecessary overhead.",[48,798,799],{},"Unlike WarpStream, which charges for both storage and throughput, Ursa ensures that customers only pay for the throughput they actively use. Ursa’s pricing is based on compressed data sent by clients, meaning the more data compressed on the client side, the lower the cost. 
In contrast, WarpStream prices are based on uncompressed data, unfairly inflating expenses and failing to incentivize customers to optimize their client applications.",[48,801,802],{},"This distinction is crucial, as compressed data reduces both storage and network costs, making Ursa’s pricing model not only more cost-effective but also more transparent and predictable.",[48,804,805],{},[351,806],{"alt":18,"src":807},"\u002Fimgs\u002Fblogs\u002F679c602d194800c9206d9d58_AD_4nXcFlf755xgyz7htxhMhBV5fGrsxy642mQNodt61DTok_z1dwkw5A6lkO5hatXVneCaB0anbZPAyvLI3MlIMuQEYLEACHHvQMOr5UfaB37dfzkdqewDEvcT-20VGd_zzvJsuA00zGA.png",[48,809,810],{},[351,811],{"alt":18,"src":812},"\u002Fimgs\u002Fblogs\u002F679c62594e9c2e629fae73aa_AD_4nXeU6cOgItnjLsEZCOf13TEvMY_SHWWIxYP2OYUj-B1GUPyWO78OG08K_v03hwYSVcg06f9dqDiGmdwy76vynjmiDGL5bluZ5_XF4nSU_r59oOZdfViXndXt6s11vVOY7qwfZN8v.png",[32,814,816],{"id":815},"cost-breakdown","Cost Breakdown",[818,819,820],"h4",{"id":697},"StreamNative – Ursa",[339,822,823,826,829,832,835],{},[342,824,825],{},"EC2 (Server): 9 × $1.536\u002Fhr × 24 hr × 30 days = $9,953.28",[342,827,828],{},"S3 Write Requests: 1,350 r\u002Fs × $0.005\u002F1,000 r × 3,600 s × 24 hr × 30 days = $17,496",[342,830,831],{},"S3 Read Requests: 1,350 r\u002Fs × $0.0004\u002F1,000 r × 3,600 s × 24 hr × 30 days = $1,400",[342,833,834],{},"S3 Storage Costs: 5 GB\u002Fs × $0.021\u002FGB × 3,600 s × 24 hr × 7 days = $63,504",[342,836,837],{},"Vendor Cost: 200 ETU × $0.50\u002Fhr × 24 hr × 30 days = $72,000",[818,839,841],{"id":840},"warpstream","WarpStream",[339,843,844,847],{},[342,845,846],{},"Based on WarpStream’s pricing calculator (as of January 29, 2025), we assume a 4:1 client data compression ratio, meaning 20 GB\u002Fs of uncompressed data translates to 5 GB\u002Fs of compressed data.",[342,848,849,850,855],{},"It's important to note that WarpStream’s pricing structure has fluctuated frequently throughout January. We observed the cost reported by their calculator changing from $409,644 per month to $337,068 per month. This variability has been previously highlighted in the blog post “",[55,851,854],{"href":852,"rel":853},"https:\u002F\u002Fbigdata.2minutestreaming.com\u002Fp\u002Fthe-brutal-truth-about-apache-kafka-cost-calculators",[264],"The Brutal Truth About Kafka Cost Calculators","”. To ensure transparency, we have documented the pricing as of January 29, 2025.",[48,857,858],{},[351,859],{"alt":18,"src":860},"\u002Fimgs\u002Fblogs\u002F679c602e42713e0028e9af5e_AD_4nXcu5_VWTLu9jRYs6zX1MBAOtLQEo5gyfNSWPcbpnQHXTa8qNCFAXezRR2E8daygzYTTwd4dhJjaLaLM8C6y_3OGbu2NS7pdvEv3a8-ptNKOg7AeKnYqPQCAYvQ5EuxzuI3JYIvY.png",[818,862,864],{"id":863},"msk","MSK",[339,866,867,870,873],{},[342,868,869],{},"EC2 (Server): 15 * $3.264\u002Fhr × 24 hr × 30 days = $35,251",[342,871,872],{},"Interzone Traffic (Client-Server): 5 GB\u002Fs × ⅔ × $0.02\u002FGB (in+out) × 3,600 s × 24 hr × 30 days = $172,800",[342,874,875],{},"Storage: 5 GB\u002Fs × $0.1\u002FGB-month × 3,600 s × 24 hr × 7 days * 3 replicas = $907,200",[818,877,731],{"id":878},"redpanda-1",[339,880,881,884,886,889,892],{},[342,882,883],{},"EC2 (Server): 9 × $1.536\u002Fhr × 24 hr × 30 days = $9953",[342,885,872],{},[342,887,888],{},"Interzone Traffic (Replication): 5 GB\u002Fs × 2 × $0.02\u002FGB (in+out) × 3,600 s × 24 hr × 30 days = $518,400",[342,890,891],{},"Storage: 5 GB\u002Fs × $0.045\u002FGB-month(st1) × 3,600 s × 24 hr × 7 days * 3 replicas = $408,240",[342,893,894],{},"Vendor Cost: $93,333 per month (based on limited information. 
See additional notes below).",[818,896,898],{"id":897},"additional-notes","Additional Notes",[339,900,901],{},[342,902,903,904,909],{},"Redpanda does not publicly disclose its BYOC pricing, making it difficult to accurately assess its total costs. We refer to information from the whitepaper “",[55,905,908],{"href":906,"rel":907},"https:\u002F\u002Fwww.redpanda.com\u002Fresources\u002Fredpanda-vs-confluent-performance-tco-benchmark-report#form",[264],"Redpanda vs. Confluent: A Performance and TCO Benchmark Report by McKnight Consulting Group.","” for estimation purposes. Based on the Tier-8 pricing model in the whitepaper,  the estimated cost to support a 5GB\u002Fs workload would be $1.12 million per year ($93,333 per month). However, since this calculation is based on an estimation, we will revisit and refine the cost assessment once Redpanda publishes its BYOC pricing.",[48,911,912],{},[351,913],{"alt":18,"src":914},"\u002Fimgs\u002Fblogs\u002F679c602dc8a9859eed89a0ef_AD_4nXdbcO8vsNNPy4GtkNLlmNKf22fjxRvzLzH7CtOna1L08sTbvnZx3HhufeFqc1w4K2gEF7lxO2IR5supotxebAiGnA07Qa8Yr3Rd1pVK2LYKK4WurlJGwgdwwucZIFoF-N_2oBjY.png",[48,916,917],{},[351,918],{"alt":18,"src":919},"\u002Fimgs\u002Fblogs\u002F679c602d6bc1c2287e012540_AD_4nXfcHZnLfjbjIr3ZAgoQXT9dwP3aQCOQPmGZZJUtpNZSwE6qY6M3yehIaBxCwxEIeu5PVdUPY0zhyjnow26YfgjdYgSG4GnV9ibxu0YWTIpwng6z_F6FUGJMpERMKtpsFESzXSN_Sw.png",[339,921,922,925],{},[342,923,924],{},"When estimating the storage costs for Kafka and Redpanda, we assume the use of HDD storage at $0.045\u002FGB, based on the premise that both systems can fully utilize disk bandwidth without incurring the higher costs associated with GP2 or GP3 volumes. However, in practice, many users opt for GP2 or GP3, significantly increasing the total storage cost for Kafka and Redpanda.",[342,926,927],{},"Unlike disk-based solutions, S3 storage does not require capacity preallocation—Ursa only incurs costs for the actual data stored. This contrasts with Kafka and Redpanda, where preallocating storage can drive up expenses. As a result, the real-world storage costs for Kafka and Redpanda are often 50% higher than the estimates above.",[40,929,931],{"id":930},"conclusion","Conclusion",[48,933,934],{},"Ursa represents a transformative shift in streaming data infrastructure, offering cost efficiency, scalability, and flexibility without compromising durability or reliability. By leveraging a leaderless architecture and eliminating inter-zone data replication, Ursa reduces total cost of ownership by over 90% compared to traditional leader-based streaming engines like Kafka and Redpanda. Its direct integration with cloud storage and scalable metadata & index management via Oxia ensure high availability and simplified infrastructure management.",[32,936,938],{"id":937},"balancing-latency-and-cost","Balancing Latency and Cost",[48,940,941,945],{},[55,942,944],{"href":943},"\u002Fblog\u002Fcap-theorem-for-data-streaming","Ursa trades off slightly higher latency for ultra low cost",", making it an ideal choice for the majority of streaming workloads, especially those that prioritize throughput and cost savings over ultra-low latency. Meanwhile, StreamNative’s BookKeeper-based engine remains the preferred solution for real-time, latency-sensitive applications. 
By combining these two approaches, StreamNative empowers customers with the flexibility to choose the right engine for their specific needs—whether it's maximizing cost savings or achieving ultra low-latency real-time performance.",[32,947,949],{"id":948},"the-future-of-streaming-infrastructure","The Future of Streaming Infrastructure",[48,951,952],{},"In an era where data fuels AI, analytics, and real-time decision-making, managing infrastructure costs is critical to sustaining innovation. Ursa is not just a cost-cutting alternative—it is a forward-thinking, lakehouse-native platform that redefines how modern data streaming infrastructure should be built and operated.",[48,954,955,956,961],{},"Whether your priority is reducing costs, improving flexibility, or ingesting massive data into lakehouses, Ursa delivers a future-proof solution for the evolving demands of real-time data streaming. ",[55,957,960],{"href":958,"rel":959},"https:\u002F\u002Fconsole.streamnative.cloud\u002F",[264],"Get started"," with StreamNative Ursa today!",[963,964,966],"h1",{"id":965},"references","References",[48,968,969,972,973],{},[970,971,430],"span",{}," ",[55,974,975],{"href":975},"\u002Fblog\u002Fintroducing-oxia-scalable-metadata-and-coordination",[48,977,978,972,980],{},[970,979,379],{},[55,981,378],{"href":378},[48,983,984,972,987],{},[970,985,986],{},"StreamNative pricing",[55,988,989],{"href":989,"rel":990},"https:\u002F\u002Fdocs.streamnative.io\u002Fdocs\u002Fbilling-overview",[264],[48,992,993,972,996],{},[970,994,995],{},"WarpStream pricing",[55,997,998],{"href":998,"rel":999},"https:\u002F\u002Fwww.warpstream.com\u002Fpricing#pricingfaqs",[264],[48,1001,1002,972,1005],{},[970,1003,1004],{},"AWS S3 pricing",[55,1006,1007],{"href":1007,"rel":1008},"https:\u002F\u002Faws.amazon.com\u002Fs3\u002Fpricing\u002F",[264],[48,1010,1011,972,1014],{},[970,1012,1013],{},"AWS EBS pricing",[55,1015,1016],{"href":1016,"rel":1017},"https:\u002F\u002Faws.amazon.com\u002Febs\u002Fpricing\u002F",[264],[48,1019,1020,972,1023],{},[970,1021,1022],{},"AWS MSK pricing",[55,1024,1025],{"href":1025,"rel":1026},"https:\u002F\u002Faws.amazon.com\u002Fmsk\u002Fpricing\u002F",[264],[48,1028,1029,972,1032],{},[970,1030,1031],{},"The Brutal Truth about Kafka Cost Calculators",[55,1033,852],{"href":852,"rel":1034},[264],[48,1036,1037,972,1040],{},[970,1038,1039],{},"Redpanda vs. 
Confluent: A Performance and TCO Benchmark Report by McKnight Consulting Group",[55,1041,906],{"href":906,"rel":1042},[264],{"title":18,"searchDepth":19,"depth":19,"links":1044},[1045,1046,1047,1052,1056,1057,1066,1069],{"id":333,"depth":19,"text":334},{"id":372,"depth":19,"text":373},{"id":397,"depth":19,"text":398,"children":1048},[1049,1050,1051],{"id":409,"depth":279,"text":410},{"id":434,"depth":279,"text":435},{"id":455,"depth":279,"text":456},{"id":479,"depth":19,"text":480,"children":1053},[1054,1055],{"id":483,"depth":279,"text":484},{"id":498,"depth":279,"text":499},{"id":539,"depth":19,"text":540},{"id":551,"depth":19,"text":552,"children":1058},[1059,1060,1061,1062,1063,1064,1065],{"id":558,"depth":279,"text":559},{"id":604,"depth":279,"text":605},{"id":622,"depth":279,"text":623},{"id":669,"depth":279,"text":670},{"id":697,"depth":279,"text":698},{"id":715,"depth":279,"text":716},{"id":730,"depth":279,"text":731},{"id":775,"depth":19,"text":776,"children":1067},[1068],{"id":815,"depth":279,"text":816},{"id":930,"depth":19,"text":931,"children":1070},[1071,1072],{"id":937,"depth":279,"text":938},{"id":948,"depth":279,"text":949},"StreamNative Cloud","2025-01-31","Discover how Ursa achieves 5GB\u002Fs Kafka workloads at just 5% of the cost of traditional streaming engines like Redpanda and AWS MSK. See our benchmark results comparing infrastructure costs, total cost of ownership (TCO), and performance across leading Kafka vendors.","\u002Fimgs\u002Fblogs\u002F679c6593d25099b1cdcec4ca_image-31.png",{},"\u002Fblog\u002Fhow-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour","30 min",{"title":308,"description":1075},"blog\u002Fhow-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour",[1083,1084,303],"TCO","Apache Kafka","A0o_2xdJiLI6rf6xj4RKsxJNo_A6QN2fYzCp6gaLrFw",{"id":1087,"title":1088,"authors":1089,"body":1093,"category":1168,"createdAt":290,"date":1711,"description":1712,"extension":8,"featured":294,"image":1713,"isDraft":294,"link":290,"meta":1714,"navigation":7,"order":296,"path":1715,"readingTime":1716,"relatedResources":290,"seo":1717,"stem":1718,"tags":1719,"__hash__":1721},"blogs\u002Fblog\u002Fcomparison-of-messaging-platforms-apache-pulsar-vs-rabbitmq-vs-nats-jetstream.md","A Comparison of Messaging Platforms: Apache Pulsar vs. RabbitMQ vs. NATS JetStream",[1090,1091,1092],"Jihyun Tornow","Elliot West","Matteo Merli",{"type":15,"value":1094,"toc":1687},[1095,1099,1102,1115,1118,1120,1126,1158,1162,1165,1169,1172,1175,1179,1182,1185,1189,1192,1195,1199,1203,1206,1210,1213,1217,1220,1224,1227,1231,1234,1238,1247,1250,1253,1257,1260,1263,1267,1270,1273,1276,1280,1283,1286,1289,1293,1296,1299,1303,1306,1309,1312,1316,1322,1328,1334,1340,1343,1346,1349,1353,1356,1359,1362,1365,1368,1371,1377,1383,1389,1395,1398,1401,1409,1412,1415,1418,1421,1424,1427,1430,1433,1436,1442,1448,1454,1460,1466,1472,1475,1483,1486,1489,1492,1495,1498,1501,1504,1507,1510,1516,1522,1528,1534,1540,1546,1549,1552,1555,1558,1561,1564,1567,1570,1573,1575,1578,1581,1584,1599,1602,1606,1609,1641,1643,1649,1654,1665,1674,1680,1685],[40,1096,1098],{"id":1097},"executive-summary","Executive Summary",[48,1100,1101],{},"When building scalable, reliable, and efficient applications, choosing the right messaging and streaming platform is critical. 
In this benchmark report, we compare the technical performances of three of the most popular messaging platforms: Apache PulsarTM, RabbitMQTM, and NATS JetStream.",[48,1103,1104,1105,1110,1111,190],{},"The tests assessed each messaging platform’s throughput and latency under varying workloads, node failures, and backlogs. Please note that Apache Kafka was not included in our benchmark as ",[55,1106,1109],{"href":1107,"rel":1108},"https:\u002F\u002Fwww.splunk.com\u002Fen_us\u002Fblog\u002Fit\u002Fcomparing-pulsar-and-kafka-unified-queuing-and-streaming.html",[264],"Kafka does not support queuing scenarios",". For more information on Kafka, please refer to the ",[55,1112,1114],{"href":1113},"\u002Fwhitepapers\u002Fapache-pulsar-vs-apache-kafka-2022-benchmark","Pulsar vs. Kafka 2022 Benchmark report",[48,1116,1117],{},"Our objective was to provide guidance on each platform’s capabilities and reliability, and help potential users choose the right technology for their specific needs. The results of these tests provide valuable insights into the performance characteristics of each platform and will be helpful for those considering using these technologies.",[32,1119,334],{"id":333},[48,1121,1122],{},[351,1123],{"alt":1124,"src":1125},"Figure 1  - Apache Pulsar, RabbitMQ, and NATS JetStream Comparison","\u002Fimgs\u002Fblogs\u002F63ff8c703c54b76ef8085574_Figure-1-key-findings.png",[339,1127,1128,1131,1134,1137,1140,1143,1146,1149,1152,1155],{},[342,1129,1130],{},"Throughput:",[342,1132,1133],{},"~ Pulsar showed a higher peak consumer throughput of 2.6M msgs\u002Fs compared to RabbitMQ’s 48K msgs\u002Fs and NATS JetStream’s 160K msgs\u002Fs.",[342,1135,1136],{},"~ Pulsar was able to support a producer rate of 1M msgs\u002Fs — 33x faster than RabbitMQ and 20x faster than NATS JetStream.",[342,1138,1139],{},"Backlog:",[342,1141,1142],{},"~ Pulsar outperformed RabbitMQ during the backlog drain with a stable publish rate of 100K msgs\u002Fs, while RabbitMQ's publish rate dropped by more than 50%.",[342,1144,1145],{},"Latency:",[342,1147,1148],{},"~ Pulsar's p99 latency was 300x better than RabbitMQ and 40x better than NATS JetStream at a topic count of 50.",[342,1150,1151],{},"Scalability:",[342,1153,1154],{},"~ Pulsar achieved 1M msgs\u002Fs up to 50 topics and provided a publish rate above 200K msgs\u002Fs for up to 20K topics.",[342,1156,1157],{},"~ RabbitMQ was able to process 20K msgs\u002Fs, and NATS was able to support 30K msgs\u002Fs for topic counts up to 500.",[40,1159,1161],{"id":1160},"background","Background",[48,1163,1164],{},"Before we dive into the benchmark tests, let’s start with a brief overview of the architecture, features, and ideal applications for each messaging platform.",[32,1166,1168],{"id":1167},"apache-pulsar","Apache Pulsar",[48,1170,1171],{},"Apache Pulsar is an open-source, cloud-native messaging and streaming platform designed for building scalable, reliable applications in elastic cloud environments. Its multi-layer architecture includes multi-tenancy with resource separation and access control, geo-replication across regions, tiered storage, and support for five official client languages. These capabilities make Pulsar an ideal choice for building applications that require scalability and reliability.",[48,1173,1174],{},"One of the standout features of Pulsar is its shared subscription, which is handy for queuing applications and natively supports delayed and scheduled messages. 
Additionally, Pulsar simplifies application architecture by supporting up to 1M unique topics, making it widely used for high-performance data pipelines, event-driven microservices, real-time analytics, and other real-time workloads. Originally developed at Yahoo! and committed to open source in 2016, Pulsar has become popular among developers and leading organizations.",[32,1176,1178],{"id":1177},"rabbitmq","RabbitMQ",[48,1180,1181],{},"RabbitMQ is a popular and mature open-source distributed messaging platform that implements the Advanced Message Queuing Protocol (AMQP) — often used for asynchronous communication between services using the pub\u002Fsub model. The core of RabbitMQ’s architecture is the message exchange, which includes direct, topic, headers, and fanout exchanges. RabbitMQ is designed to be flexible, scalable, and reliable, making it an effective tool for building distributed systems that require asynchronous message exchange.",[48,1183,1184],{},"RabbitMQ is a good choice if you have simple applications where message durability, ordering, replay, and retention are not critical factors. However, RabbitMQ has limitations in dealing with massive data distribution and may not be suitable for applications with heavy messaging traffic. In addition, the platform does not support other messaging patterns such as request\u002Fresponse or event-driven applications.",[32,1186,1188],{"id":1187},"nats-jetstream","NATS JetStream",[48,1190,1191],{},"NATS is an open-source messaging platform optimized for cloud-native and microservices applications. Its lightweight, high-performance design supports pub\u002Fsub and queue-based messaging and stream data processing. NATS JetStream is a second-generation streaming platform that integrates directly into NATS. NATS JetStream replaces the older NATS streaming platform and addresses its limitations, such as the lack of message replay, retention policies, persistent storage, stream replication, stream mirroring, and exactly-once semantics.",[48,1193,1194],{},"NATS utilizes a single-server architecture, which makes it easy to deploy and manage, particularly in resource-constrained environments. However, NATS does not support message durability and may not be suitable for applications that require this or complex message routing and transformations. Despite this, NATS offers an asynchronous, event-driven model that is well-suited for simple pub\u002Fsub and queue-based messaging patterns due to its high performance and low latencies.",[40,1196,1198],{"id":1197},"overview-of-tests","Overview of Tests",[32,1200,1202],{"id":1201},"what-we-tested","What We Tested",[48,1204,1205],{},"We conducted four benchmark tests to evaluate each platform’s performance under various conditions, such as workload variations, node failure, and backlogs. The aim was to assess each platform’s responses to these conditions and to provide insights into their capabilities in a given environment.",[818,1207,1209],{"id":1208},"_1-node-failure","1. Node failure",[48,1211,1212],{},"Failures will inevitably occur in any platform, so it’s vital to understand how each platform will respond to and recover when such events occur. This test aimed to evaluate the performance of each platform in response to a single node failure and subsequent recovery. To simulate a node failure, we performed broker terminations and resumptions via systemctl stop on the node. We then monitored the performance of the remaining nodes as they took on the workload of the failed node. 
We anticipated a decrease in producer throughput and an increase in producer latency upon failure due to the overall reduction in the cluster’s resources.",[818,1214,1216],{"id":1215},"_2-topic-counts","2. Topic counts",[48,1218,1219],{},"This test examined the relationship between peak throughput and latency and the number of topics within a platform. We measured the performance of each platform at various topic counts, from very small to very large, to understand how the platform’s performance changed as the number of topics grew. We expected that for very small topic counts, the platform would exhibit sub-par performance due to its inability to utilize available concurrency effectively. On the other hand, for very large topic counts, we expected performance to degrade as resource contention became more pronounced. This test aimed to determine the maximum number of topics each messaging platform could support while maintaining acceptable performance levels.",[818,1221,1223],{"id":1222},"_3-subscription-counts","3. Subscription counts",[48,1225,1226],{},"Scaling a messaging platform can be a challenging task. As the number of subscribers per topic increases, changes in peak throughput and latency are expected due to the read-amplification effect. The imbalance between writes and reads occurs because each message is read multiple times. Despite this, we would expect the tail reads to be relatively lightweight compared to the producer’s writes, which are most likely coming from a cache. Increased competition among consumers to access each topic may also lead to a drop in performance. This test aimed to determine the maximum number of subscriptions per topic that could be achieved on each messaging platform while maintaining acceptable performance levels. However, scaling complexity increases non-linearly and potential bottlenecks arise from shared resources.",[818,1228,1230],{"id":1229},"_4-backlog-draining","4. Backlog draining",[48,1232,1233],{},"One of the essential roles of a messaging bus is to act as a buffer between different applications or platforms. When consumers are unavailable or not enough, the platform accumulates the data for later processing. In these situations, it is vital that consumers can quickly drain the backlog of accumulated data and catch up with the newly produced data. During this catch-up process, it is crucial that the performance of existing producers is not impacted in terms of throughput and latency, either on the same topic or on other topics within the cluster. This test aimed to evaluate the ability of each messaging bus to effectively support consumers in catching up with backlog data while minimizing the impact on the producer performance.",[32,1235,1237],{"id":1236},"how-we-set-up-the-tests","How We Set Up the Tests",[48,1239,1240,1241,1246],{},"We conducted all tests using the ",[55,1242,1245],{"href":1243,"rel":1244},"https:\u002F\u002Fgithub.com\u002Fopenmessaging\u002Fbenchmark",[264],"OpenMessaging Benchmark tool"," on AWS EC2 instances. For consistency, we utilized similar instances to test each messaging platform. Our workloads used 1KB messages with randomized payloads and a single partition per topic. We had 16 producers, and 16 consumers per subscription, with one subscription in total. To ensure durability, we configured topics to have two guaranteed copies of each message, resulting in a replica count of three. 
We documented any deviations from the protocol in the individual tests.",[48,1248,1249],{},"‍",[48,1251,1252],{},"We conducted these tests at each platform’s “maximum producer rate” for the outlined hardware and workload configuration. Although the OMB tool includes an adaptive producer throughput mode, this was not found to be reliable and would often undershoot or behave erratically. Instead, we adopted a manual protocol to determine appropriate producer throughput rates. For each workload, we ran multiple test instances at different rates, narrowing down the maximum attainable producer rate that resulted in no producer errors and no accumulating producer or consumer backlog. In this scenario, we could be confident that the platform would be in a steady state of near maximum end-to-end throughput. Given the discrete nature of this protocol, it is possible that real-world maximum producer rates could be slightly higher and have greater variability than those determined for the tests.",[818,1254,1256],{"id":1255},"infrastructure-topology","Infrastructure topology",[48,1258,1259],{},"Client instances:\t\t4 × m5n.8xlarge",[48,1261,1262],{},"Broker instances:\t\t3 × i3en.6xlarge",[818,1264,1266],{"id":1265},"platform-versions","Platform versions",[48,1268,1269],{},"Apache Pulsar:\t\t2.11.0",[48,1271,1272],{},"RabbitMQ:\t\t\t3.10.7",[48,1274,1275],{},"NATS JetStream:\t\t2.9.6",[818,1277,1279],{"id":1278},"platform-specific-caveats","Platform-specific caveats",[48,1281,1282],{},"Pulsar – Our Pulsar setup had the broker and bookies co-located on the same VM, 3 × i3en.6xlarge topology. The ZooKeeper instance was set up separately with 3 × i3en.2xlarge topology.",[48,1284,1285],{},"RabbitMQ – We conducted tests using Quorum Queues, the recommended method for implementing durable and replicated messaging. While the results indicated that this operating mode in RabbitMQ has slightly lower performance than the “classic” mode, it offers better resilience against single-node failures.",[48,1287,1288],{},"‍NATS JetStream – During our tests, we attempted to follow the recommended practices for deliverGroups and deliverSubjects in NATS, but encountered difficulties. Our NATS subscriptions failed to act in a shared mode and instead exhibited a fan-out behavior, resulting in a significant read amplification of 16 times. This likely significantly impacted the overall publisher performance in the subscription count test. Despite our best efforts, we were unable to resolve this issue.",[40,1290,1292],{"id":1291},"benchmark-parameters-results","Benchmark Parameters & Results",[48,1294,1295],{},"All reported message rates are platform aggregates, not for individual topics, producers, subscriptions, or consumers.",[32,1297,1209],{"id":1298},"_1-node-failure-1",[818,1300,1302],{"id":1301},"test-parameters","Test Parameters",[48,1304,1305],{},"In a departure from the standard test parameters, in this test we employed five broker nodes instead of three and five client nodes instead of three — two producing and three consuming. We made this change to satisfy the requirement for three replicas of a topic, even when one cluster node is absent.",[48,1307,1308],{},"In each case, messages were produced onto 100 topics by 16 producers per topic. 
Messages were consumed using a single subscription per topic, shared between 16 consumers.",[48,1310,1311],{},"We adopted the following test protocol: five minutes of warm-up traffic, clean termination of a single broker node, five minutes of reduced capacity operation, resumption of the terminated broker, and five minutes of normal operation. The broker was intentionally terminated using the systemctl stop command on the node to simulate a failure, and later restarted to simulate recovery.",[818,1313,1315],{"id":1314},"test-results","Test Results",[48,1317,1318],{},[351,1319],{"alt":1320,"src":1321},"Figure 2 - Node Failure and Recovery - Producer Throughput (msgs\u002Fs)","\u002Fimgs\u002Fblogs\u002F63ff8c221baa29f4bc1cd25c_Figure-2-node-failure-and-recovery-producer-throughput.png",[48,1323,1324],{},[351,1325],{"alt":1326,"src":1327},"Results for average producer throughput before, during, and after node failure","\u002Fimgs\u002Fblogs\u002F63ff8bc1f5f5f52071333290_Screen-Shot-2023-03-01-at-9.30.13-AM.png",[48,1329,1330],{},[351,1331],{"alt":1332,"src":1333},"Figure 3 - Node Failure and Recovery - Producer P99 Latency (ms)","\u002Fimgs\u002Fblogs\u002F63ff8d07f5f5f5b22033ba0f_Figure-3-node-failure-and-recovery-producer-p99-latency.png",[48,1335,1336],{},[351,1337],{"alt":1338,"src":1339},"Table showing results for producer p99 latency before, during, and after node failure","\u002Fimgs\u002Fblogs\u002F63ff8d40045efd6e074a1c73_Screen-Shot-2023-03-01-at-9.36.50-AM.png",[48,1341,1342],{},"Pulsar – Given that Pulsar separates compute and storage, we ran two experiments to test the behavior in the event of a failed broker and a failed bookie. We consistently observed the expected publisher failover in both cases, with an average publish rate of 260K msgs\u002Fs. There was no noticeable decline in publish rate, while latency increased from 113 milliseconds to 147 milliseconds when running on fewer nodes. Our results for both broker and bookie termination scenarios were very similar.",[48,1344,1345],{},"RabbitMQ – In the test with RabbitMQ, we noted a successful failover of producers from the terminated node, maintaining an average publish rate of 45K msgs\u002Fs. At the time of node failure, the publish latency increased from 6.6 seconds to 7.6 seconds. However, upon restart, RabbitMQ did not rebalance traffic back onto the restarted node, resulting in a degraded publish latency of 8.2 seconds. We suspect this behavior is attributable to the absence of a load balancer in the default configuration used. Nevertheless, it should be possible to implement an external load-balancing mechanism.",[48,1347,1348],{},"NATS JetStream – During the test with NATS, we observed successful failover of producers from the terminated node, with an average publish rate of 45K msgs\u002Fs. When we attempted to reach higher publish rates, however, the failover did not always occur, resulting in a corresponding increase in publish errors. The producers switched over to the alternate node within approximately 20 seconds of the broker termination. The publisher rates remained stable with minimal disruptions throughout the test. Despite this, there was an increase in p99 publish latency (as seen in Figure 3), rising from 15 milliseconds to 40 milliseconds. 
This latency increase persisted for the test's duration, even after the terminated broker was resumed.",[818,1350,1352],{"id":1351},"analysis","Analysis",[48,1354,1355],{},"All platforms successfully transferred the work of a failed broker to other nodes and maintained the target publisher rate. It’s important to note that NATS JetStream did not achieve this consistently. Both RabbitMQ and NATS JetStream showed an increase in p99 publish latency, which was expected, but they did not recover after the reintroduction of the terminated broker. This suggests that the platforms did not effectively redistribute the work to the resumed broker.",[48,1357,1358],{},"In contrast, Pulsar was the only platform that consistently and successfully transferred the work to other nodes and maintained an unaffected publish rate with a slight increase in p99 latency. Moreover, Pulsar was able to achieve an average publish rate of 260K msgs\u002Fs when running on fewer nodes, demonstrating its ability to scale efficiently even in the face of node failures.",[32,1360,1216],{"id":1361},"_2-topic-counts-1",[818,1363,1302],{"id":1364},"test-parameters-1",[48,1366,1367],{},"In this test, we ran multiple iterations on each platform, varying the number of independent topics in each instance and measuring the publish throughput and latency.",[818,1369,1315],{"id":1370},"test-results-1",[48,1372,1373],{},[351,1374],{"alt":1375,"src":1376},"Figure 4 - Maximum Producer Throughput (msgs\u002Fs) by Number of Topics","\u002Fimgs\u002Fblogs\u002F63ff8d9d4c5dcf0387288ace_Figure-4-maximum-producer-throughput.png",[48,1378,1379],{},[351,1380],{"alt":1381,"src":1382},"Table showing maximum producer throughput by number of topics","\u002Fimgs\u002Fblogs\u002F63ff8dc88c4561a15cda7c31_Screen-Shot-2023-03-01-at-9.39.12-AM.png",[48,1384,1385],{},[351,1386],{"alt":1387,"src":1388},"Figure 5 - Producer P99 Latency (ms) by Number of Topics","\u002Fimgs\u002Fblogs\u002F63ff8e0984f06281f056c7b3_Figure-5-producer-p99-latency.png",[48,1390,1391],{},[351,1392],{"alt":1393,"src":1394},"Table showing producer p99 latency by number of topics","\u002Fimgs\u002Fblogs\u002F63ff8e3675c3e91338887177_Screen-Shot-2023-03-01-at-9.41.00-AM.png",[48,1396,1397],{},"Pulsar – The platform achieved an aggregate publisher throughput of 1M msgs\u002Fs with a topic count between 10 and 50. Across thousands of topics, Pulsar maintained low publisher p99 latency, ranging from single-digit milliseconds to low hundreds of milliseconds (~7 ms to 300 ms).",[48,1399,1400],{},"We can see from the chart that there was a negative inflection point in the throughput when the number of topics exceeded 100. This variation can be attributed to the effectiveness of batching at different topic counts:",[339,1402,1403,1406],{},[342,1404,1405],{},"With fewer topics, the throughput per topic is relatively high, which makes for a very high batching ratio (messages\u002Fbatch). This means that it’s very efficient to move a large number of messages through the platform in a small number of batches. In these conditions, the bottleneck is typically on the I\u002FO system.",[342,1407,1408],{},"With more topics, we are spreading the throughput over a larger number of them. The per-topic throughput is therefore lower and the batching ratio decreases, until we end up with just one message per batch. 
At this point, the bottleneck has shifted to the CPU cost instead of the I\u002FO system.",[48,1410,1411],{},"RabbitMQ – The publisher throughput fluctuated between 20K and 40K msgs\u002Fs across the range of topics. Meanwhile, the p99 publish latency rose significantly. These latencies often exceeded multiple seconds, ranging from 344 milliseconds to nearly 14 seconds. Testing was stopped after 500 topics as it became challenging to construct the topics in a reasonable amount of time.",[48,1413,1414],{},"NATS JetStream – The best performance was observed when using 10 to 50 topics, with a rate of 50K msgs\u002Fs. As the number of topics increased beyond 50, the throughput gradually decreased. The p99 publisher latencies also started to increase, starting from 75 milliseconds at 10 topics to over one second at 100 topics. The testing was stopped at 500 topics due to the difficulty in constructing additional topics, but the system could still handle 30K msgs\u002Fs at this configuration.",[818,1416,1352],{"id":1417},"analysis-1",[48,1419,1420],{},"The results suggest that all of the platforms tested could handle larger topic counts in real-world scenarios where topics accumulate gradually over time, rather than the time-consuming process of generating test topics. Despite this, RabbitMQ and NATS JetStream demonstrated a performance decline when concurrently publishing a very large number of topics.",[48,1422,1423],{},"On the other hand, Pulsar outperformed RabbitMQ and NATS JetStream in the number of topics, publish rate, and latency. The results show that Pulsar could handle 10 times more topics. Pulsar achieved up to 1M msgs\u002Fs, surpassing RabbitMQ by 33 times and NATS JetStream by 20 times. Pulsar also demonstrated exceptional latency performance, with p99 latency 300 times better than RabbitMQ and 40 times better than NATS JetStream at 50 topics. Pulsar was able to maintain producer throughput of 200K msgs\u002Fs at 20K topics.",[32,1425,1223],{"id":1426},"_3-subscription-counts-1",[818,1428,1302],{"id":1429},"test-parameters-2",[48,1431,1432],{},"In this test, we expected a significant boost in reads with a larger number of subscribers. 
To achieve this, we limited the number of concurrent topics to 50, assigned a single consumer to each subscription, and set a minimum aggregate publish rate of 1K msgs\u002Fs for the platform.",[818,1434,1315],{"id":1435},"test-results-2",[48,1437,1438],{},[351,1439],{"alt":1440,"src":1441},"Figure 6 - Maximum Producer Throughput (msgs\u002Fs) by Number of Subscriptions","\u002Fimgs\u002Fblogs\u002F63ff8e7dc3b31433fb68acbf_Figure-6-maximum-producer-throughput-by-subscriptions.png",[48,1443,1444],{},[351,1445],{"alt":1446,"src":1447},"Table showing maximum producer throughput by number of subscriptions","\u002Fimgs\u002Fblogs\u002F63ff8ecfdff9895f71dc2cf6_Screen-Shot-2023-03-01-at-9.43.28-AM.png",[48,1449,1450],{},[351,1451],{"alt":1452,"src":1453},"Figure 7 - Maximum Consumer Throughput (msgs\u002Fs) by Number of Subscriptions","\u002Fimgs\u002Fblogs\u002F63ff8f1fdff98968d3dc5988_Figure-7-maximum-consumer-throughput-by-subscriptions.png",[48,1455,1456],{},[351,1457],{"alt":1458,"src":1459},"Table showing maximum consumer throughput by number of subscriptions","\u002Fimgs\u002Fblogs\u002F63ff8f66f3768148dfb0469d_Screen-Shot-2023-03-01-at-9.46.04-AM.png",[48,1461,1462],{},[351,1463],{"alt":1464,"src":1465},"Figure 8 - Producer P99 Latency (ms) by Number of Subscriptions","\u002Fimgs\u002Fblogs\u002F63ff8fba7516a596b978ecbd_Figure-8-producer-p99-latency.png",[48,1467,1468],{},[351,1469],{"alt":1470,"src":1471},"Table showing producer p99 latency by number of subscriptions","\u002Fimgs\u002Fblogs\u002F63ff8fe31baa296fba221c94_Screen-Shot-2023-03-01-at-9.48.10-AM.png",[48,1473,1474],{},"Pulsar – We were able to achieve a maximum of 5K subscriptions per topic before consumers started to fall behind. However, even with higher subscription numbers, the publish latency remained low. In fact, we measured peak consumer throughput at an impressive 2.6M msgs\u002Fs.",[48,1476,1477,1478,190],{},"During our test, we identified an issue with many concurrent I\u002FO threads competing for the same resource. However, we were able to address this in Pulsar version 2.11.1. For more information on this issue, please refer to the ",[55,1479,1482],{"href":1480,"rel":1481},"https:\u002F\u002Fgithub.com\u002Fapache\u002Fpulsar\u002Fpull\u002F19341",[264],"GitHub PR #19341",[48,1484,1485],{},"RabbitMQ – The maximum number of successful subscriptions per topic achieved was 64. Beyond that, the publish rate dropped to around 500 msgs\u002Fs, and the p99 publish latency increased significantly to tens of seconds. Additionally, the clients became unresponsive beyond 64 subscriptions. However, the aggregate consumer throughput remained around 35K msgs\u002Fs and reached a peak of 48K msgs\u002Fs when there were eight subscriptions.",[48,1487,1488],{},"NATS JetStream – We achieved a maximum of 128 subscriptions per topic. As the number of subscriptions increased, there was an increase in publisher errors and lagging consumers. Despite this, the publish latency remained consistently low, ranging from 3 milliseconds to 34 milliseconds across all subscriptions. The highest consumer throughput was recorded at 160K msgs\u002Fs during eight to 32 subscriptions.",[818,1490,1352],{"id":1491},"analysis-2",[48,1493,1494],{},"As expected in this test case, end-to-end throughput became limited by the consumer. Pulsar was able to support hundreds of subscriptions per topic while maintaining very low publish latency. 
RabbitMQ and NATS JetStream achieved fewer subscriptions, and RabbitMQ experienced a significant increase in publish latency as the number of subscriptions increased. Pulsar stood out as the most efficient platform, demonstrating a publish rate and an aggregate consumer throughput that were both an order of magnitude higher than the other platforms.",[32,1496,1230],{"id":1497},"_4-backlog-draining-1",[818,1499,1302],{"id":1500},"test-parameters-3",[48,1502,1503],{},"In this test, the conditions were set to generate a backlog of messages before consumer activity began. Once the desired backlog size was reached, consumers were started, and messages continued to be produced at the specified rate. The backlog size was set to 300GB, larger than the available RAM of the brokers, simulating a scenario in which reads would need to come from slower disks rather than memory-resident caches. This was done to evaluate the platform's ability to handle catch-up reads, a common challenge in real-world scenarios.",[48,1505,1506],{},"During the tests, messages were produced on 100 topics, with 16 producers per topic. Messages were consumed using a single subscription per topic, shared between 16 consumers.",[818,1508,1315],{"id":1509},"test-results-3",[48,1511,1512],{},[351,1513],{"alt":1514,"src":1515},"Figure 9 - Queue Backlog and Recovery - Producer Throughput (msgs\u002Fs)","\u002Fimgs\u002Fblogs\u002F63ff9029d2771029701f5d6a_Figure-9-queue-backlog-and-recovery.png",[48,1517,1518],{},[351,1519],{"alt":1520,"src":1521},"Table showing average producer throughput before, during, and after backlog drain","\u002Fimgs\u002Fblogs\u002F63ff904a2e4e1f9932eaa005_Screen-Shot-2023-03-01-at-9.49.53-AM.png",[48,1523,1524],{},[351,1525],{"alt":1526,"src":1527},"Figure 10 - Queue Backlog and Recovery - Consumer Throughput (msgs\u002Fs)","\u002Fimgs\u002Fblogs\u002F63ff90957fa1cc8947942ec2_Figure-10-queue-backlog-and-recovery.png",[48,1529,1530],{},[351,1531],{"alt":1532,"src":1533},"Table showing average consumer throughput before, during, and after backlog drain","\u002Fimgs\u002Fblogs\u002F63ff90bdca4b648ff37d4eb5_Screen-Shot-2023-03-01-at-9.51.47-AM.png",[48,1535,1536],{},[351,1537],{"alt":1538,"src":1539},"Figure 11  - Queue Backlog and Recovery - Producer P99 Latency (ms)","\u002Fimgs\u002Fblogs\u002F63ff91007223588f223a0a4e_Figure-11-backlog-drain-p99-latency.png",[48,1541,1542],{},[351,1543],{"alt":1544,"src":1545},"Figure showing average producer p99 latency before, during, and after backlog drain","\u002Fimgs\u002Fblogs\u002F63ff9126fffc706c6af719ef_Screen-Shot-2023-03-01-at-9.53.32-AM.png",[48,1547,1548],{},"Pulsar – In this test, Pulsar delivered impressive results in terms of producer and catch-up read rates. The producer rate remained stable at 100K msgs\u002Fs before, during, and after the drain, and catch-up reads averaged 200K msgs\u002Fs. The drain itself was completed in approximately 45 minutes.",[48,1550,1551],{},"During the backlog drain phase, a slight increase in p99 publish latency from 4.7 milliseconds to 5.3 milliseconds was observed. However, this was expected due to the increased contention between producers and consumers.",[48,1553,1554],{},"One of the most noteworthy findings of the test was that Pulsar’s consumer throughput returned to its pre-drain level after the drain was complete. 
This showcased Pulsar’s ability to handle high volumes of data without compromising performance.",[48,1556,1557],{},"RabbitMQ – RabbitMQ was able to achieve its target producer rate of 30K msgs\u002Fs, but the platform faced a challenge when reads dominated during backlog production, stealing IOPS and hindering message production. This resulted in a reduction of the producer rate to 12.5K msgs\u002Fs, with a threefold latency increase from 11 to 34 seconds. However, the catch-up reads were swift, starting at 80K msgs\u002Fs and steadily rising to 200K msgs\u002Fs. After 50 minutes, most of the backlog had been drained, and the producer throughput was regained, with the latency returning to approximately 13 seconds. Despite a consistent yet small consumer backlog, the platform remained stable.",[48,1559,1560],{},"NATS JetStream – Unfortunately, NATS could not produce any results in this test. The clients encountered OOM errors while building the backlog, which we suspect might be due to a potential issue in the jnats library.",[818,1562,1352],{"id":1563},"analysis-3",[48,1565,1566],{},"Pulsar demonstrated impressive producer and catch-up read rates during the test, with stable performance before, during, and after the drain. Pulsar's consumer throughput returned to its pre-drain level, showcasing its ability to handle high volumes of data without compromising performance. Pulsar also outperformed RabbitMQ by being 3.3 times faster in producing and consuming, and the drain would have been completed even faster if Pulsar had been set to a 30K msgs\u002Fs producer rate.",[48,1568,1569],{},"RabbitMQ demonstrated some impressive consumer rates when reading the backlog. However, this came at the cost of message production, as the consumers had clear priority. In a real-world scenario, applications would be unable to produce during the catch-up read and would have to either drop messages or take other mitigating actions.",[48,1571,1572],{},"It would have been interesting to see how NATS JetStream performed in this area, but further work will be needed to investigate and resolve the suspected client issue.",[40,1574,931],{"id":930},[48,1576,1577],{},"The benchmark tests showed that Pulsar can handle significantly larger workloads than RabbitMQ and NATS JetStream and remain highly performant in various scenarios. Pulsar proved its reliability in the presence of node failure and its high scalability for both topics and subscriptions. Conversely, RabbitMQ and NATS JetStream both showed a decline in performance when concurrently publishing a large number of topics.",[48,1579,1580],{},"The results suggest that while all three platforms are suitable for real-world scenarios, it is crucial to carefully evaluate and choose the technology that best aligns with the specific needs and priorities of the application.",[48,1582,1583],{},"Key findings summarizing Pulsar’s performance:",[1585,1586,1587,1590,1593,1596],"ol",{},[342,1588,1589],{},"Pulsar maintained high publish rates despite broker or bookie failure. No degradation in rates occurred when running on fewer nodes, with 5 times greater maximum publish rates than RabbitMQ and NATS JetStream.",[342,1591,1592],{},"Pulsar achieved high performance with 1M msgs\u002Fs, surpassing RabbitMQ by 33 times and NATS JetStream by 20 times. With a topic count of 50, p99 latency was 300 times better than RabbitMQ and 40 times better than NATS JetStream. Pulsar was able to maintain a producer throughput of 200K msgs\u002Fs at 20K topics. 
In contrast, RabbitMQ and NATS JetStream failed to construct topics beyond 500 counts.",[342,1594,1595],{},"Pulsar supported 1,024 subscriptions per topic without impacting consumer performance, while maintaining low publish latency and achieving a peak consumer throughput of 2.6M msgs\u002Fs. This was 54 times faster than RabbitMQ and 43 times faster than NATS JetStream.",[342,1597,1598],{},"Pulsar achieved stable publish rates and an average catch-up read throughput of 200K msgs\u002Fs during the backlog drain test case. In comparison, RabbitMQ’s publish rate dropped by over 50 percent during draining and resulted in an increase in publish latency by three times.",[48,1600,1601],{},"RabbitMQ may be a suitable option for applications with a small number of topics and a consistent publisher throughput, as the platform struggles to deal with node failures and large backlogs. NATS may be a good choice for applications with lower message rates and a limited number of topics (less than 50). Overall, the results show Pulsar outperforms RabbitMQ and NATS JetStream in terms of throughput, latency, and scalability, making Pulsar a strong candidate for large-scale messaging applications.",[32,1603,1605],{"id":1604},"want-to-learn-more","Want to Learn More?",[48,1607,1608],{},"For more on Pulsar, check out the resources below.",[1585,1610,1611,1618,1625,1633],{},[342,1612,1613,1614,190],{},"Learn more about how leading organizations are using Pulsar by checking out ",[55,1615,1617],{"href":1616},"\u002Fsuccess-stories","the latest Pulsar success stories",[342,1619,1620,1621,190],{},"Use StreamNative Cloud to spin up a Pulsar cluster in minutes. ",[55,1622,1624],{"href":1623},"\u002Fdeployment","Get started today",[342,1626,1627,1628,190],{},"Engage with the Pulsar community by joining the ",[55,1629,1632],{"href":1630,"rel":1631},"https:\u002F\u002Fcommunityinviter.com\u002Fapps\u002Fapache-pulsar\u002Fapache-pulsar",[264],"Pulsar Slack channel",[342,1634,1635,1636,190],{},"Expand your Pulsar knowledge today with free, on-demand courses and live training from ",[55,1637,1640],{"href":1638,"rel":1639},"https:\u002F\u002Fwww.academy.streamnative.io\u002F",[264],"StreamNative Academy",[32,1642,966],{"id":965},[48,1644,1645,1648],{},[970,1646,1647],{},"1"," Comparing Pulsar and Kafka: Unified Queuing and Streaming:",[48,1650,1651],{},[55,1652,1107],{"href":1107,"rel":1653},[264],[48,1655,1656,1659,1660],{},[970,1657,1658],{},"2"," The Linux Foundation Open Messaging Benchmark suite: ",[55,1661,1664],{"href":1662,"rel":1663},"http:\u002F\u002Fopenmessaging.cloud\u002Fdocs\u002Fbenchmarks\u002F",[264],"http:\u002F\u002Fopenmessaging.cloud\u002Fdocs\u002Fbenchmarks",[48,1666,1667,1670,1671],{},[970,1668,1669],{},"3"," The Open Messaging Benchmark Github repo: ",[55,1672,1243],{"href":1243,"rel":1673},[264],[48,1675,1676,1679],{},[970,1677,1678],{},"4"," GitHub Pull Request 
#19341:",[48,1681,1682],{},[55,1683,1480],{"href":1480,"rel":1684},[264],[48,1686,1249],{},{"title":18,"searchDepth":19,"depth":19,"links":1688},[1689,1692,1697,1701,1707],{"id":1097,"depth":19,"text":1098,"children":1690},[1691],{"id":333,"depth":279,"text":334},{"id":1160,"depth":19,"text":1161,"children":1693},[1694,1695,1696],{"id":1167,"depth":279,"text":1168},{"id":1177,"depth":279,"text":1178},{"id":1187,"depth":279,"text":1188},{"id":1197,"depth":19,"text":1198,"children":1698},[1699,1700],{"id":1201,"depth":279,"text":1202},{"id":1236,"depth":279,"text":1237},{"id":1291,"depth":19,"text":1292,"children":1702},[1703,1704,1705,1706],{"id":1298,"depth":279,"text":1209},{"id":1361,"depth":279,"text":1216},{"id":1426,"depth":279,"text":1223},{"id":1497,"depth":279,"text":1230},{"id":930,"depth":19,"text":931,"children":1708},[1709,1710],{"id":1604,"depth":279,"text":1605},{"id":965,"depth":279,"text":966},"2023-03-01","Our comparison of messaging platforms looks at the performance, architecture, features, and ideal applications for Apache Pulsar, RabbitMQ, and NATS JetStream.","\u002Fimgs\u002Fblogs\u002F63ff931387c5e89f84e91fb0_Pulsar-Rabbitmq-benchmark.png",{},"\u002Fblog\u002Fcomparison-of-messaging-platforms-apache-pulsar-vs-rabbitmq-vs-nats-jetstream","14 min read",{"title":1088,"description":1712},"blog\u002Fcomparison-of-messaging-platforms-apache-pulsar-vs-rabbitmq-vs-nats-jetstream",[1178,1084,1168,1720],"Benchmarks","-fBdQzc6VEazpJwZvZpe1PmbRdTzoJGD9_qpfRoGvmI",[1723,1735,1749],{"id":1724,"title":1090,"bioSummary":1725,"email":290,"extension":8,"image":1726,"linkedinUrl":290,"meta":1727,"position":1732,"stem":1733,"twitterUrl":290,"__hash__":1734},"authors\u002Fauthors\u002Fjihyun-tornow.md","Director of Product Marketing at StreamNative. She is a strategic and adaptive Product and Business Development leader with over 16 years experience driving cross-functional programs and global teams. Jihyun is located in San Jose, California.","\u002Fimgs\u002Fauthors\u002Fjihyun-tornow.png",{"body":1728},{"type":15,"value":1729,"toc":1730},[],{"title":18,"searchDepth":19,"depth":19,"links":1731},[],"Director of Product Marketing, StreamNative","authors\u002Fjihyun-tornow","jt1x-lrIWA9NwbJ48Lebor4nXY00vnRHM8Tq6Jyr7yI",{"id":1736,"title":1091,"bioSummary":1737,"email":290,"extension":8,"image":1738,"linkedinUrl":290,"meta":1739,"position":1746,"stem":1747,"twitterUrl":290,"__hash__":1748},"authors\u002Fauthors\u002Felliot-west.md","Elliot West is a Platform Engineer at StreamNative. He has been working with large-scale data platforms ever since creating systems with Hadoop for an early UK-based music social network. More recently he built an organisation-wide platform to deliver self-service streaming capabilities to Expedia Group’s family of data scientists, engineers, and analysts.","\u002Fimgs\u002Fauthors\u002Felliot-west.webp",{"body":1740},{"type":15,"value":1741,"toc":1744},[1742],[48,1743,1737],{},{"title":18,"searchDepth":19,"depth":19,"links":1745},[],"Platform Engineer at StreamNative","authors\u002Felliot-west","M9CUkPle0NFZY_uQfAosK-B6OMYbVheoiIUkEA5h3C4",{"id":1750,"title":1092,"bioSummary":1751,"email":290,"extension":8,"image":1752,"linkedinUrl":1753,"meta":1754,"position":1761,"stem":1762,"twitterUrl":290,"__hash__":1763},"authors\u002Fauthors\u002Fmatteo-merli.md","Matteo is the CTO at StreamNative, where he brings rich experience in distributed pub-sub messaging platforms. Matteo was one of the co-creators of Apache Pulsar during his time at Yahoo!. 
Matteo worked to create a global, distributed messaging system for Yahoo!, which would later become Apache Pulsar. Matteo is the PMC Chair of Apache Pulsar, where he helps to guide the community and ensure the success of the Pulsar project. He is also a PMC member for Apache BookKeeper. Matteo lives in Menlo Park, California.","\u002Fimgs\u002Fauthors\u002Fmatteo-merli.webp","https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fmatteomerli\u002F",{"body":1755},{"type":15,"value":1756,"toc":1759},[1757],[48,1758,1751],{},{"title":18,"searchDepth":19,"depth":19,"links":1760},[],"CTO, StreamNative & Co-Creator and PMC Chair Apache Pulsar","authors\u002Fmatteo-merli","MRLEjDgpe8SqHBoftSh_eiNGg-1oCJ30t7iV3Bb2NzQ",[1765,1773,1780],{"path":1766,"title":1767,"date":1768,"image":1769,"link":-1,"collection":1770,"resourceType":1771,"score":1772,"id":1766},"\u002Freports\u002F2023-messaging-benchmark-report-apache-pulsar-vs-rabbitmq-vs-nats-jetstream","2023 Messaging Benchmark Report: Apache Pulsar vs. RabbitMQ vs. NATS JetStream","2023-02-27","\u002Fimgs\u002Fwhitepapers\u002F63fcedfc3312794b99ea5f27_social-1200x627.png","reports","Report",0.825,{"path":1774,"title":1775,"date":1776,"image":1777,"link":-1,"collection":1778,"resourceType":1779,"score":1772,"id":1774},"\u002Fsuccess-stories\u002Fhow-apache-pulsar-helping-iterable-scale-its-customer-engagement-platform","How Apache Pulsar is Helping Iterable Scale its Customer Engagement Platform","2022-12-22","\u002Fimgs\u002Fsuccess-stories\u002F67942deea5d4e9499e9436b2_SN-SuccessStories-iterable.webp","successStories","Case Study",{"path":1781,"title":1782,"date":1783,"image":-1,"link":-1,"collection":1784,"resourceType":1785,"score":1786,"id":1781},"\u002Fblog\u002Fbenchmarking-pulsar-and-kafka-report-2020","Benchmarking Pulsar and Kafka - The Full Benchmark Report - 2020","2020-11-09","blogs","Blog",0.75,1775716418799]