[{"data":1,"prerenderedAt":1794},["ShallowReactive",2],{"active-banner":3,"navbar-featured-partner-blog":23,"navbar-pricing-featured":304,"blog-\u002Fblog\u002Fguide-apache-pulsar-compare-features-architecture-to-apache-kafka":1084,"blog-authors-\u002Fblog\u002Fguide-apache-pulsar-compare-features-architecture-to-apache-kafka":1728,"related-\u002Fblog\u002Fguide-apache-pulsar-compare-features-architecture-to-apache-kafka":1776},{"id":4,"title":5,"date":6,"dismissible":7,"extension":8,"link":9,"link2":10,"linkText":11,"linkText2":10,"meta":12,"stem":20,"variant":21,"__hash__":22},"banners\u002Fbanners\u002Fkafka-company-2025.md","Native Apache Kafka Service Is Coming Soon to StreamNative Cloud. Join the waitlist and get $1,000 in credits.","2026-04-01",true,"md","\u002Fnative-kafka-service-waitlist",null,"Join Waitlist",{"body":13},{"type":14,"value":15,"toc":16},"minimark",[],{"title":17,"searchDepth":18,"depth":18,"links":19},"",2,[],"banners\u002Fkafka-company-2025","default","IMIJszQOOWTfA_DV33eYUA5jqV7DrX1FWbBTBZfNvWc",{"id":24,"title":25,"authors":26,"body":28,"category":288,"createdAt":10,"date":289,"description":290,"extension":8,"featured":7,"image":291,"isDraft":292,"link":10,"meta":293,"navigation":7,"order":294,"path":295,"readingTime":296,"relatedResources":10,"seo":297,"stem":298,"tags":299,"__hash__":303},"blogs\u002Fblog\u002Fstreamnative-recognized-in-the-forrester-wave-streaming-data-platforms-2025.md","StreamNative Recognized as a Contender in The Forrester Wave™: Streaming Data Platforms, Q4 2025",[27],"David Kjerrumgaard",{"type":14,"value":29,"toc":275},[30,38,46,50,66,72,77,80,86,101,108,114,117,123,126,133,139,142,145,156,162,168,171,174,177,183,190,193,196,203,206,209,223,228,232,236,240,244,248,250,267,269],[31,32,34],"h3",{"id":33},"receives-highest-possible-scores-in-both-the-messaging-and-resource-optimization-criteria",[35,36,37],"em",{},"Receives Highest Possible Scores in BOTH the Messaging and Resource Optimization 
Criteria",[39,40,42],"h2",{"id":41},"introduction",[43,44,45],"strong",{},"Introduction",[47,48,49],"p",{},"Real-time data has become the backbone of modern innovation. As artificial intelligence (AI) and digital services demand instantaneous insights, organizations are realizing that streaming data is no longer optional – it's essential for delivering timely, context-rich experiences. StreamNative's data streaming platform is built precisely for this reality, ensuring data is immediate, reliable, and ready to power critical applications.",[47,51,52,53,62,63],{},"Today, we're excited to announce that Forrester Research has named StreamNative as a Contender in its evaluation, ",[54,55,57],"a",{"href":56},"\u002Freports\u002Frecognized-in-the-forrester-wave-tm-streaming-data-platforms-q4-2025",[35,58,59],{},[43,60,61],{},"The Forrester Wave™: Streaming Data Platforms, Q4 2025",". This report evaluated 15 top streaming data platform providers, and we're proud to share that ",[43,64,65],{},"StreamNative received the highest scores possible—5 out of 5—in both the Messaging and Resource Optimization criteria.",[47,67,68,69],{},"***Forrester's Take: ***",[35,70,71],{},"\"StreamNative is a good fit for enterprises that want an Apache Pulsar implementation that is also compatible with Kafka APIs.\"",[47,73,74],{},[35,75,76],{},"— The Forrester Wave™: Streaming Data Platforms, Q4 2025",[47,78,79],{},"Being recognized in the Forrester Wave is a proud milestone, and for us, it highlights how far StreamNative has come in enabling enterprises to unlock the power of real-time data. 
In the sections below, we'll dive into what we believe sets StreamNative apart—from our modern architecture and cloud-native design to our open-source foundation and real-time use cases—and how we see these strengths aligning with Forrester's findings.",[39,81,83],{"id":82},"trusted-by-industry-leaders",[43,84,85],{},"Trusted by Industry Leaders",[47,87,88,89,92,93,96,97,100],{},"Companies across industries are already leveraging StreamNative to drive real-time outcomes. Global enterprises like ",[43,90,91],{},"Cisco"," rely on StreamNative to handle massive IoT telemetry, supporting 245 million+ connected devices. Martech leaders such as ",[43,94,95],{},"Iterable"," process billions of events per day with StreamNative for hyper-personalized customer engagement. And in financial services, ",[43,98,99],{},"FICO"," trusts StreamNative to power its real-time fraud detection and analytics pipelines with a secure, scalable streaming backbone.",[47,102,103,104,107],{},"The Forrester report notes that, “",[35,105,106],{},"Customers appreciate the lower infrastructure costs that result from StreamNative’s cost-efficient, Kafka-compatible architecture. Customers note excellent support responsiveness…","”",[39,109,111],{"id":110},"modern-cloud-native-architecture-built-for-scale",[43,112,113],{},"Modern, Cloud-Native Architecture Built for Scale",[47,115,116],{},"From day one, StreamNative was designed with a modern architecture to meet the demanding scale and flexibility requirements of real-time data. Unlike legacy streaming systems that often rely on tightly coupled storage and compute, StreamNative's platform takes a cloud-native approach: it decouples these layers to enable elastic scalability and efficient resource utilization across any environment. The core is powered by Apache Pulsar—a distributed messaging and streaming engine—enhanced with multi-protocol support (including native Apache Kafka API compatibility) to unify diverse data streams under one roof. 
This means organizations can consolidate siloed messaging systems and handle both high-volume event streams and traditional message queues on a single platform, without sacrificing performance or reliability.",[47,118,119,120,107],{},"Forrester's evaluation described that “",[35,121,122],{},"StreamNative aims to provide a high-performance, multi-protocol streaming data platform: It uses Apache Pulsar with Kafka API compatibility to deliver cost-efficient, real-time applications for enterprises. It appeals to organizations that want a flexible, low-cost streaming solution, due to its focus on scalability and resource optimization, while its investments in Pulsar’s open-source ecosystem and performance optimization make it the primary platform for enterprises wishing to implement Pulsar.",[47,124,125],{},"Our cloud-first, leaderless architecture (with no single broker bottlenecks) and tiered storage model were built to maximize throughput and cost-efficiency for real-time workloads. By separating compute from storage and leveraging distributed object storage, StreamNative can retain huge volumes of event data indefinitely while keeping compute costs in check—effectively providing a flexible, low-cost streaming solution.",[47,127,128,129,132],{},"This modern design not only delivers high performance, but also ensures fault tolerance and geo-distribution out of the box, so enterprises can trust their streaming data is always available and durable. 
As Forrester’s evaluation noted, StreamNative ",[35,130,131],{},"\"excels at messaging and resource optimization\" and “Its platform supports use cases like real-time analytics and event-driven architectures with robust scalability.","” Our architecture provides the strong foundation that today's real-time applications demand, from ultra-fast data ingestion to seamless scale-out across hybrid and multi-cloud environments.",[39,134,136],{"id":135},"open-source-foundation-and-pulsar-expertise",[43,137,138],{},"Open Source Foundation and Pulsar Expertise",[47,140,141],{},"StreamNative's DNA is rooted in open source innovation. Our founders are the original creators of Apache Pulsar, and we've built our platform with the same open principles: freedom, flexibility, and community-driven innovation. For developers and data teams, this means adopting StreamNative comes with no proprietary lock-in—instead, you get a platform built on open standards and a thriving ecosystem. We offer broad API compatibility (Pulsar, Kafka, JMS, MQTT, and more) so that teams can work with familiar interfaces and integrate StreamNative into existing systems with ease.",[47,143,144],{},"StreamNative is the primary commercial contributor to the Apache Pulsar project and its surrounding ecosystem. We invest heavily in Pulsar's ongoing improvements, and these investments in Pulsar's open-source ecosystem and performance optimization bolster StreamNative's value. We also foster a vibrant community through initiatives like the Data Streaming Summit and free training resources.",[47,146,147,148,151,152,155],{},"Forrester's assessment noted that StreamNative’s “",[35,149,150],{},"events-driven agents, extensibility, and performance architecture are solid,","” and we're continuing to build on that foundation. 
",[43,153,154],{},"We're actively investing in expanding our tooling for observability, governance, schema management, and developer productivity","—areas we recognize as critical for enterprise adoption and where we're committed to accelerating our roadmap.",[47,157,158,159],{},"Being open also means embracing an open ecosystem of technologies. StreamNative actively integrates with the tools and platforms that matter most to our users. We partner with industry leaders like Snowflake, Databricks, Google, and Ververica to ensure our streaming platform works seamlessly with data warehouses, lakehouse storage, and stream processing frameworks. Forrester’s evaluation observed that StreamNative’s ",[35,160,161],{},"\"investments in Pulsar’s open-source ecosystem and performance optimization make it the primary platform for enterprises wishing to implement Pulsar.\"",[39,163,165],{"id":164},"powering-real-time-use-cases-across-industries",[43,166,167],{},"Powering Real-Time Use Cases Across Industries",[47,169,170],{},"One of the greatest validations of StreamNative's approach is the success our customers are achieving with real-time data. StreamNative's platform is versatile and use-case agnostic—if an application demands high-volume, low-latency data movement, we can power it. This flexibility is why our customer base spans industries from finance and IoT to major automobile manufacturers and online gaming. The common thread is that these organizations need to process and react to data in milliseconds, and StreamNative is delivering the capabilities to make that possible.",[47,172,173],{},"Cisco uses StreamNative to underpin an IoT telemetry system of colossal scale, connecting hundreds of millions of devices and thousands of enterprise clients with real-time data streams. The platform's multi-tenant design and proven reliability allow Cisco to offer its customers a live feed of device data with unwavering confidence. 
In the financial sector, FICO has built streaming pipelines on StreamNative to detect fraud as transactions happen and to monitor systems in real time. With StreamNative's strong guarantees around message durability and ordering, FICO can catch anomalies or suspicious patterns within seconds. And in digital customer engagement, Iterable relies on StreamNative to process billions of events every day—clicks, views, purchases—so that marketers can trigger personalized campaigns instantly based on user behavior.",[47,175,176],{},"Our customers uniformly deal with mission-critical data streams, where downtime or delays are unacceptable. StreamNative's fault-tolerant, scalable infrastructure has proven equal to the task, handling scenarios like bursting to millions of events per second or seamlessly spanning multiple cloud regions. Forrester's report recognized StreamNative for supporting event-driven architectures with robust scalability—which for us is a reflection of our platform's ability to meet the most demanding enterprise requirements.",[39,178,180],{"id":179},"continuing-to-innovate-ursa-orca-and-the-road-ahead",[43,181,182],{},"Continuing to Innovate: Ursa, Orca, and the Road Ahead",[47,184,185,186,189],{},"While we are thrilled to be recognized in Forrester's Streaming Data Platforms Wave, we view this as just the beginning. StreamNative's vision has always been bold: to ",[43,187,188],{},"provide a unified platform that not only handles today's streaming needs but also anticipates the emerging requirements of tomorrow",".",[47,191,192],{},"One key area of focus is the convergence of streaming data with advanced analytics and AI. As Forrester points out in the report, technology leaders should look for platforms that natively integrate messaging, stream processing, and analytics to provide AI agents with real-time, contextualized information. We couldn't agree more. 
Our award-winning Ursa Engine and Orca Agent Engine are aimed at extending our platform up the stack—bridging the gap between data streams and data lakes, and between event streams and intelligent processing.",[47,194,195],{},"Our new Ursa Engine introduces a lakehouse-native approach to streaming: it can write events directly to table formats like Iceberg on cloud storage, eliminating entire classes of ETL jobs and making fresh data instantly available for analytics queries. By integrating streaming and lakehouse technologies, we help customers collapse data silos and accelerate their AI\u002FML pipelines.",[47,197,198,199,202],{},"Beyond analytics integration, we are also enhancing StreamNative with more out-of-the-box processing and governance capabilities. In the coming months, we plan to introduce new features for lightweight stream processing and transformation, making it easier to build reactive applications directly on the platform. We're also expanding our ecosystem of connectors and integrations, so that whether your data lands in Snowflake, Databricks, or an AI model, StreamNative will seamlessly feed it. ",[43,200,201],{},"We're investing significantly in enterprise features including security, schema registry, governance, and monitoring tooling","—capabilities that are essential for mission-critical deployments and where we're committed to continued improvement.",[47,204,205],{},"This recognition from Forrester energizes us to keep innovating at full speed. We're sharing this honor with our amazing customers, community, and partners who drive us forward every day. Your feedback and real-world challenges have helped shape StreamNative into what it is today, and together, we will shape the future of streaming data. Thank you for joining us on this journey—we're just getting started, and we can't wait to deliver even more value as we continue to evolve our platform. 
Onward to real-time everything!",[207,208],"hr",{},[31,210,212],{"id":211},"streamnative-in-the-forrester-wave-evaluation-findings",[43,213,214,215,222],{},"StreamNative in ",[43,216,217],{},[54,218,219],{"href":56},[43,220,221],{},"The Forrester Wave™",": Evaluation Findings",[224,225,227],"h5",{"id":226},"recognized-as-a-contender-among-15-streaming-data-platform-providers","• Recognized as a Contender among 15 streaming data platform providers",[224,229,231],{"id":230},"received-the-highest-scores-possible-50-in-both-the-messaging-and-resource-optimization-criteria","* Received the highest scores possible (5.0) in both the Messaging and Resource Optimization criteria",[224,233,235],{"id":234},"cited-as-the-primary-platform-for-enterprises-wishing-to-implement-pulsar","• Cited as the primary platform for enterprises wishing to implement Pulsar",[224,237,239],{"id":238},"noted-for-excelling-at-messaging-and-resource-optimization","• Noted for excelling at messaging and resource optimization",[224,241,243],{"id":242},"customers-cited-lower-infrastructure-costs-and-excellent-support-responsiveness","• Customers cited lower infrastructure costs and excellent support responsiveness",[224,245,247],{"id":246},"recognized-for-supporting-event-driven-architectures-with-robust-scalability","• Recognized for supporting event-driven architectures with robust scalability",[207,249],{},[251,252,254,255,258,259,189],"h6",{"id":253},"forrester-disclaimer-forrester-does-not-endorse-any-company-product-brand-or-service-included-in-its-research-publications-and-does-not-advise-any-person-to-select-the-products-or-services-of-any-company-or-brand-based-on-the-ratings-included-in-such-publications-information-is-based-on-the-best-available-resources-opinions-reflect-judgment-at-the-time-and-are-subject-to-change-for-more-information-read-about-forresters-objectivity-here","**Forrester Disclaimer: **",[35,256,257],{},"Forrester does not endorse any company, product, brand, or service 
included in its research publications and does not advise any person to select the products or services of any company or brand based on the ratings included in such publications. Information is based on the best available resources. Opinions reflect judgment at the time and are subject to change",". *For more information, read about Forrester’s objectivity *",[54,260,264],{"href":261,"rel":262},"https:\u002F\u002Fwww.forrester.com\u002Fabout-us\u002Fobjectivity\u002F",[263],"nofollow",[35,265,266],{},"here",[207,268],{},[251,270,272],{"id":271},"apache-apache-pulsar-apache-kafka-apache-flink-and-other-names-are-trademarks-of-the-apache-software-foundation-no-endorsement-by-apache-or-other-third-parties-is-implied",[35,273,274],{},"Apache®, Apache Pulsar®, Apache Kafka®, Apache Flink® and other names are trademarks of The Apache Software Foundation. No endorsement by Apache or other third parties is implied.",{"title":17,"searchDepth":18,"depth":18,"links":276},[277,279,280,281,282,283,284],{"id":33,"depth":278,"text":37},3,{"id":41,"depth":18,"text":45},{"id":82,"depth":18,"text":85},{"id":110,"depth":18,"text":113},{"id":135,"depth":18,"text":138},{"id":164,"depth":18,"text":167},{"id":179,"depth":18,"text":182,"children":285},[286],{"id":211,"depth":278,"text":287},"StreamNative in The Forrester Wave™: Evaluation Findings","Company","2025-12-16","StreamNative is recognized in The Forrester Wave™: Streaming Data Platforms, Q4 2025. 
Discover why Forrester highlights StreamNative's high-performance messaging, efficient resource use, and cost-effective Kafka API compatibility for real-time innovation.","\u002Fimgs\u002Fblogs\u002F693bd36cf01b217dcb67278f_Streamnative_blog_thumbnail.png",false,{},0,"\u002Fblog\u002Fstreamnative-recognized-in-the-forrester-wave-streaming-data-platforms-2025","10 mins read",{"title":25,"description":290},"blog\u002Fstreamnative-recognized-in-the-forrester-wave-streaming-data-platforms-2025",[300,301,302],"Announcements","Real-Time","Forrester","sOeeJtEO3O-IIfTPJjY1AFOMawZ_rf8FOH8A98NEKgU",{"id":305,"title":306,"authors":307,"body":312,"category":1071,"createdAt":10,"date":1072,"description":1073,"extension":8,"featured":7,"image":1074,"isDraft":292,"link":10,"meta":1075,"navigation":7,"order":294,"path":1076,"readingTime":1077,"relatedResources":10,"seo":1078,"stem":1079,"tags":1080,"__hash__":1083},"blogs\u002Fblog\u002Fhow-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour.md","How We Run a 5 GB\u002Fs Kafka Workload for Just $50 per Hour",[308,309,310,311],"Matteo Meril","Neng Lu","Hang Chen","Penghui Li",{"type":14,"value":313,"toc":1041},[314,317,320,323,326,329,333,336,346,352,355,363,368,372,379,382,385,393,397,400,405,409,412,415,418,421,430,434,437,448,451,455,458,461,472,475,479,483,491,494,498,506,535,539,542,547,551,554,558,561,564,569,578,583,586,589,600,604,607,618,622,625,628,633,636,665,669,671,677,680,685,690,693,697,711,715,726,730,745,754,765,768,771,775,778,781,792,795,798,801,806,811,815,819,836,840,854,859,863,874,877,893,897,908,913,918,926,930,933,937,944,948,951,960,965,974,980,989,998,1007,1016,1025,1033],[47,315,316],{},"The rise of DeepSeek has shaken the AI infrastructure market, forcing companies to confront the escalating costs of training and deploying AI models. 
But the real pressure point isn’t just compute—it’s data acquisition and ingestion costs.",[47,318,319],{},"As businesses rethink their AI cost-containment strategies, real-time data streaming is emerging as a critical enabler. The growing adoption of Kafka as a standard protocol has expanded cost-efficient options, allowing companies to optimize streaming analytics while keeping expenses in check.",[47,321,322],{},"Ursa, the data streaming engine powering StreamNative’s managed Kafka service, is built for this new reality. With its leaderless architecture and native lakehouse storage integration, Ursa eliminates costly inter-zone network traffic for data replication and client-to-broker communication while ensuring high availability at minimal operational cost.",[47,324,325],{},"In this blog post, we benchmarked the infrastructure cost and total cost of ownership (TCO) for running a 5GB\u002Fs Kafka workload across different Kafka vendors, including Redpanda, Confluent WarpStream, and AWS MSK. Our benchmark results show that Ursa can sustain 5GB\u002Fs Kafka workloads at just 5% of the cost of traditional streaming engines like Redpanda—making it the ideal solution for high-performance, cost-efficient ingestion and data streaming for data lakehouses and AI workloads.",[47,327,328],{},"Note: We also evaluated vanilla Kafka in our benchmark; however, for simplicity, we have focused our cost comparison on vendor solutions rather than self-managed deployments. That said, it is important to highlight that both Redpanda and vanilla Kafka use a leader-based data replication approach. In a data-intensive, network-bound workload like 5GB\u002Fs streaming, with the same machine type and replication factor, Redpanda and vanilla Kafka produced nearly identical cost profiles.",[39,330,332],{"id":331},"key-benchmark-findings","Key Benchmark Findings",[47,334,335],{},"Ursa delivered 5 GB\u002Fs of sustained throughput at an infrastructure cost of just $54 per hour. 
For comparison:",[337,338,339,343],"ul",{},[340,341,342],"li",{},"MSK: $303 per hour → 5.6x more expensive compared to Ursa",[340,344,345],{},"Redpanda: $988 per hour → 18x more expensive compared to Ursa",[47,347,348],{},[349,350],"img",{"alt":17,"src":351},"\u002Fimgs\u002Fblogs\u002F679c71b67d9046f26edc7977_AD_4nXfvTqyBNUBu2lObdkKAx-5UNkpNP8UYULLZyOcixE6z99VMZUUEsUqWjzexI7vjyNGRNSAUoM9smYvdTP55ctAhIbrs5lmQgcSVMWdaoigbWouCl95DVSQsxooY-qqfGcYqS4g4zA.png",[47,353,354],{},"Beyond infrastructure costs, when factoring in both storage pricing, vendor pricing and operational expenses, Ursa’s total cost of ownership (TCO) for a 5GB\u002Fs workload with a 7-day retention period is:",[337,356,357,360],{},[340,358,359],{},"50% cheaper than Confluent WarpStream",[340,361,362],{},"85% cheaper than MSK and Redpanda",[47,364,365],{},[349,366],{"alt":17,"src":367},"\u002Fimgs\u002Fblogs\u002F679c602d77e9c706de5343b8_AD_4nXeDv8rrv_C1CTCCiqYo1zpvlGYbdBk1r0VEqovAPu22iFMQZgh54Hfw9PBMLzM7jDFxKwAFDxbdG0np4XVk_tGsWhEKMloLRcmmea7lvueCx-0cFsyaE3Mya4Mxc1Dox95A6JEc.png",[39,369,371],{"id":370},"ursa-highly-cost-efficient-data-streaming-at-scale","Ursa: Highly Cost-Efficient Data Streaming at Scale",[47,373,374,378],{},[54,375,377],{"href":376},"\u002Fblog\u002Fursa-reimagine-apache-kafka-for-the-cost-conscious-data-streaming","Ursa"," is a next-generation data streaming engine designed to deliver high performance at a fraction of the cost of traditional disk-based solutions. It is fully compatible with Apache Kafka and Apache Pulsar APIs, while leveraging a leaderless, lakehouse-native architecture to maximize scalability, efficiency, and cost savings.",[47,380,381],{},"Ursa’s key innovation is separating storage from compute and decoupling metadata\u002Findex operations from data operations by utilizing cloud object storage (e.g., AWS S3) instead of costly inter-zone disk-based replication. 
It also employs open lakehouse formats (Iceberg and Delta Lake), enabling columnar compression to significantly reduce storage costs while maintaining durability and availability.",[47,383,384],{},"In contrast, traditional streaming systems—like Kafka and Redpanda—depend on leader-based architectures, which drive up inter-zone traffic costs due to replication and client communication. Ursa mitigates these costs by:",[337,386,387,390],{},[340,388,389],{},"Eliminating inter-zone traffic costs via a leaderless architecture.",[340,391,392],{},"Replacing costly inter-zone replication with direct writes to cloud storage using open lakehouse formats.",[39,394,396],{"id":395},"how-ursa-eliminates-inter-zone-traffic","How Ursa Eliminates Inter-Zone Traffic",[47,398,399],{},"Ursa minimizes inter-zone traffic by leveraging a leaderless architecture, which eliminates inter-zone communication between clients and brokers, and lakehouse-native storage, which removes the need for inter-zone data replication. 
This approach ensures high availability and scalability while avoiding unnecessary cross-zone data movement.",[47,401,402],{},[349,403],{"alt":17,"src":404},"\u002Fimgs\u002Fblogs\u002F679c602e21b3571bb7117dca_AD_4nXd7Oahc77NjRLNvA9clLt0tsyU6MrIqVibFYv5pW5giTIcCHPr3EA_yTGzfVEUIVO3VXK56qWK8zmBCp5lY0E_4nmlWIPFrHjtHylA5NhwELjn-UB0fLG2h_kbrxrc7Cs_edvveNA.png",[31,406,408],{"id":407},"leaderless-architecture","Leaderless architecture",[47,410,411],{},"Traditional streaming engines such as Kafka, Pulsar, or RedPanda rely on a leader-based model, where each partition is assigned to a single leader broker that handles all writes and reads.",[47,413,414],{},"Pros of Leader-Based Architectures:\n✔ Maintains message ordering via local sequence IDs\n✔ Delivers low latency and high performance through message caching",[47,416,417],{},"Cons of Leader-Based Architectures:\n✖ Throughput bottlenecked by a single broker per partition\n✖ Inter-zone traffic required for high availability in multi-AZ deployments",[47,419,420],{},"While Kafka and Pulsar offer partial solutions (e.g., reading from followers, shadow topics) to reduce read-related inter-zone traffic, producers still send data to a single leader.",[47,422,423,424,429],{},"Ursa removes the concept of topic ownership, allowing any broker in the cluster to handle reads or writes for any partition. 
The primary challenge—ensuring message ordering—is solved with ",[54,425,428],{"href":426,"rel":427},"https:\u002F\u002Fgithub.com\u002Fstreamnative\u002Foxia",[263],"Oxia",", a scalable metadata and index service created by StreamNative in 2022.",[31,431,433],{"id":432},"oxia-the-metadata-layer-enabling-leaderless-architecture","Oxia: The Metadata Layer Enabling Leaderless Architecture",[47,435,436],{},"Ensuring message ordering in a leaderless architecture is complex, but Ursa solves this with Oxia:",[337,438,439,442,445],{},[340,440,441],{},"Handles millions of metadata\u002Findex operations per second",[340,443,444],{},"Generates sequential IDs to maintain strict message ordering",[340,446,447],{},"Optimized for Kubernetes with horizontal scalability",[47,449,450],{},"Producers and consumers can connect to any broker within their local AZ, eliminating inter-zone traffic costs while maintaining performance through localized caching.",[31,452,454],{"id":453},"zero-interzone-data-replication","Zero interzone data replication",[47,456,457],{},"In most distributed systems, data replication from a leader (primary) to followers (replicas) is crucial for fault tolerance and availability. 
However, replication across zones can inflate infrastructure expenses substantially.",[47,459,460],{},"Ursa avoids these costs by writing data directly to cloud storage (e.g., AWS S3, Google Cloud Storage):",[337,462,463,466,469],{},[340,464,465],{},"Built-In Resilience: Cloud storage inherently offers high availability and fault tolerance without inter-zone traffic fees.",[340,467,468],{},"Tradeoff: Slightly higher latency (sub-second, with p99 at 500 milliseconds) compared to local disk\u002FEBS (single-digit to sub-100 milliseconds), in exchange for significantly lower costs (up to 10x lower).",[340,470,471],{},"Flexible Modes: Ursa is an addition to the classic BookKeeper-based engine, providing users with the flexibility to optimize for either cost or low latency based on their workload requirements.",[47,473,474],{},"By forgoing conventional replication, Ursa slashes inter-zone traffic costs and associated complexities—making it a compelling option for organizations seeking to balance high-performance data streaming with strict budget constraints.",[39,476,478],{"id":477},"how-we-ran-a-5-gbs-test-with-ursa","How We Ran a 5 GB\u002Fs Test with Ursa",[31,480,482],{"id":481},"ursa-cluster-deployment","Ursa Cluster Deployment",[337,484,485,488],{},[340,486,487],{},"9 brokers across 3 availability zones, each on m6i.8xlarge (Fixed 12.5 Gbps bandwidth, 32 vCPU cores, 128 GB memory).",[340,489,490],{},"Oxia cluster (metadata store) with 3 nodes of m6i.8xlarge, distributed across three availability zones (AZs).",[47,492,493],{},"During peak throughput (5 GB\u002Fs), each broker’s network usage was about 10 Gbps.",[31,495,497],{"id":496},"openmessaging-benchmark-workers-configuration","OpenMessaging Benchmark Workers & Configuration",[47,499,500,501,505],{},"The OpenMessaging Benchmark (OMB) Framework is a suite of tools that make it easy to benchmark distributed messaging systems in the cloud. 
Please check ",[54,502,503],{"href":503,"rel":504},"https:\u002F\u002Fopenmessaging.cloud\u002Fdocs\u002Fbenchmarks\u002F",[263]," for details.",[337,507,508,523,532],{},[340,509,510,511,516,517,522],{},"12 OMB workers: 6 for ",[54,512,515],{"href":513,"rel":514},"https:\u002F\u002Fgist.github.com\u002Fcodelipenghui\u002Fd1094122270775e4f1580947f80c5055",[263],"producers",", 6 for ",[54,518,521],{"href":519,"rel":520},"https:\u002F\u002Fgist.github.com\u002Fcodelipenghui\u002F06bada89381fb77a7862e1b4c1d8963d",[263],"consumers"," across 3 availability zones, on m6i.8xlarge instances. Each worker is configured with 12 CPU cores and 48 GB memory.",[340,524,525,526,531],{},"Sample YAML ",[54,527,530],{"href":528,"rel":529},"https:\u002F\u002Fgist.github.com\u002Fcodelipenghui\u002F204c1f26c4d44a218ae235bf2de99904",[263],"scripts"," provided for Kafka-compatible configuration and rate limits.",[340,533,534],{},"Achieved consistent 5 GB\u002Fs publish\u002Fsubscribe throughput.",[39,536,538],{"id":537},"ursa-benchmark-tests-results","Ursa Benchmark Tests & Results",[47,540,541],{},"The following diagram demonstrates that Ursa can consistently handle 5 GB\u002Fs of traffic, fully saturating the network across all broker nodes.",[47,543,544],{},[349,545],{"alt":17,"src":546},"\u002Fimgs\u002Fblogs\u002F679c602d7b261bac1113f7d6_AD_4nXdDPsRc3koXICiFF0bqSmGWbJt_RlUy4FE3ruuWOfbCfpcqZ1dejjqGbkaCJv2hQFL1nirRouBVRW2l5uMWBvY9naMqGB_wHcLI14dBM0f85TXhmdm3UxEv1yGX9Y4hf5FttSkZew.png",[39,548,550],{"id":549},"comparing-infrastructure-cost","Comparing Infrastructure Cost",[47,552,553],{},"This benchmark first evaluates infrastructure costs of running a 5 GB\u002Fs streaming workload (1:1 producer-to-consumer ratio) across different data streaming engines, including Ursa, Redpanda, and AWS MSK, with a focus on multi-AZ deployments to ensure a fair comparison.",[31,555,557],{"id":556},"test-setup-key-assumptions","Test Setup & Key Assumptions",[47,559,560],{},"All tests use multi-AZ 
configurations, with clusters and clients distributed across three AWS availability zones (AZs). Cluster size scales proportionally to the number of AZs, and rack-awareness is enabled for all engines to evenly distribute topic partitions and leaders.",[47,562,563],{},"To ensure a fair comparison, we selected the same machine type capable of fully utilizing both network and storage bandwidth for Ursa and Redpanda in this 5GB\u002Fs test:",[337,565,566],{},[340,567,568],{},"9 × m6i.8xlarge instances",[47,570,571,572,577],{},"However, MSK's storage bandwidth limits vary depending on the selected instance type, with the highest allowed limit capped at 1000 MiB\u002Fs per broker, according to",[54,573,576],{"href":574,"rel":575},"https:\u002F\u002Fdocs.aws.amazon.com\u002Fmsk\u002Flatest\u002Fdeveloperguide\u002Fmsk-provision-throughput-management.html#throughput-bottlenecks",[263]," AWS documentation",". Given this constraint, achieving 5 GB\u002Fs throughput with a replication factor of 3 required the following setup:",[337,579,580],{},[340,581,582],{},"15 × kafka.m7g.8xlarge (32 vCPUs, 128 GB memory, 15 Gbps network, 4000 GiB EBS).",[47,584,585],{},"This configuration was necessary to work around MSK's storage bandwidth limitations, ensuring a comparable cost basis to other evaluated streaming engines.",[47,587,588],{},"Additional key assumptions include:",[337,590,591,594,597],{},[340,592,593],{},"Inter-AZ producer traffic: For leader-based engines, two-thirds of producer-to-broker traffic crosses AZs due to leader distribution.",[340,595,596],{},"Consumer optimizations: Follower fetch is enabled across all tests, eliminating inter-AZ consumer traffic.",[340,598,599],{},"Storage cost exclusions: This benchmark only evaluates streaming costs, assuming no long-term data retention.",[31,601,603],{"id":602},"inter-broker-replication-costs","Inter-Broker Replication Costs",[47,605,606],{},"Inter-broker (cross-AZ) replication is a major cost driver for data streaming 
engines:",[337,608,609,612,615],{},[340,610,611],{},"RedPanda: Inter-broker replication is not free, leading to substantial costs when data must be copied across multiple availability zones.",[340,613,614],{},"AWS MSK: Inter-broker replication is free, but MSK instance pricing is significantly higher (e.g., $3.264 per hour for kafka.m7g.8xlarge vs $1.306 per hour for an on-demand m7g.8xlarge). The storage price of MSK is $0.10 per GB-month which is significantly higher than st1, which costs $0.045 per GB-month. Even though replication is free, client-to-broker traffic still incurs inter-AZ charges.",[340,616,617],{},"Ursa: No inter-broker replication costs due to its leaderless architecture, eliminating inter-zone replication costs entirely.",[31,619,621],{"id":620},"zone-affinity-reducing-inter-az-costs","Zone Affinity: Reducing Inter-AZ Costs",[47,623,624],{},"We evaluated zone affinity mechanisms to further reduce inter-AZ data transfer costs.",[47,626,627],{},"Consumers:",[337,629,630],{},[340,631,632],{},"Follower fetch is enabled across all tests, ensuring consumers fetch data from replicas in their local AZ—eliminating inter-zone consumer traffic except for metadata lookups",[47,634,635],{},"Producers:",[337,637,638,647,656],{},[340,639,640,641,646],{},"Kafka protocol lacks an easy way to enforce producer AZ affinity (though ",[54,642,645],{"href":643,"rel":644},"https:\u002F\u002Fcwiki.apache.org\u002Fconfluence\u002Fdisplay\u002FKAFKA\u002FKIP-1123:+Rack-aware+partitioning+for+Kafka+Producer",[263],"KIP-1123"," aims to address this). 
And it only works with the default partitioner (i.e., when no record partition or record key is specified).",[340,648,649,650,655],{},"Redpanda recently introduced ",[54,651,654],{"href":652,"rel":653},"https:\u002F\u002Fdocs.redpanda.com\u002Fredpanda-cloud\u002Fdevelop\u002Fproduce-data\u002Fleader-pinning\u002F",[263],"leader pinning",", but this only benefits setups where producers are confined to a single AZ—not applicable to our multi-AZ benchmark.",[340,657,658,659,664],{},"Ursa is the only system in this test with ",[54,660,663],{"href":661,"rel":662},"https:\u002F\u002Fdocs.streamnative.io\u002Fdocs\u002Fconfig-kafka-client#eliminate-cross-az-networking-traffic",[263],"built-in zone affinity for both producers and consumers",". It achieves this by embedding producer AZ information in client.id, allowing metadata lookups to route clients to local-AZ brokers, eliminating inter-AZ producer traffic.",[31,666,668],{"id":667},"cost-comparison-results","Cost Comparison Results",[47,670,335],{},[337,672,673,675],{},[340,674,342],{},[340,676,345],{},[47,678,679],{},"Ursa’s leaderless architecture, zone affinity, and native cloud storage integration deliver unparalleled cost efficiency, making it the most cost-effective choice for high-throughput data streaming workloads.",[47,681,682],{},[349,683],{"alt":17,"src":684},"\u002Fimgs\u002Fblogs\u002F679c72208198ca36a352f228_AD_4nXeeZuM8T-xBlD4Vf3j67K618n08qh8wIDLLtiLJG0ssA1Wj1V26u7wIDTX9sqLrtw8mB2c299dwzarGen62CG0Vh7nWstn5qbPGFcBaKJYEepTsLr5fHWv1U8uqbg8Y0UOK6fJ7.png",[47,686,687],{},[349,688],{"alt":17,"src":689},"\u002Fimgs\u002Fblogs\u002F679c625978031f40229de484_AD_4nXdLkLLJ30KKr-_A_rN1j8akVwBYacAWIPzWHoOReJF421890kfByZoQQxkLczihVSmiw5Q9J51-V9I2SEKITbwsYnANDDTlAVL5nQ_jfaHNTe9VEWhSoa7DZooCnilDYL6l6msmJg.png",[47,691,692],{},"The detailed infrastructure cost calculations for each data streaming engine are listed below:",[31,694,696],{"id":695},"streamnative-ursa","StreamNative - 
Ursa",[337,698,699,702,705,708],{},[340,700,701],{},"Server EC2 costs: 9 * $1.536\u002Fhr = $14",[340,703,704],{},"Client EC2 costs: 9 * $1.536\u002Fhr = $14",[340,706,707],{},"S3 write request costs: 1350 r\u002Fs * $0.005\u002F1000r * 3600s = $24",[340,709,710],{},"S3 read request costs: 1350 r\u002Fs * $0.0004\u002F1000r * 3600s = $2",[31,712,714],{"id":713},"aws-msk","AWS MSK",[337,716,717,720,723],{},[340,718,719],{},"Server EC2 costs: 15 * $3.264\u002Fhr = $49",[340,721,722],{},"Client EC2 costs: 9 * $1.536\u002Fhr = $14",[340,724,725],{},"Interzone traffic - producer to broker: 5 GB\u002Fs * ⅔ * $0.02\u002FGB (in+out) * 3600s = $240",[31,727,729],{"id":728},"redpanda","Redpanda",[337,731,732,734,736,739,742],{},[340,733,701],{},[340,735,704],{},[340,737,738],{},"Interzone traffic - producer to broker: 5 GB\u002Fs * ⅔ * $0.02\u002FGB (in+out) * 3600s = $240",[340,740,741],{},"Interzone traffic - replication: 10 GB\u002Fs * $0.02\u002FGB (in+out) * 3600s = $720",[340,743,744],{},"Interzone traffic - broker to consumer: $0 (fetch from local zone)",[47,746,747,748,753],{},"Please note that we were unable to test ",[54,749,752],{"href":750,"rel":751},"https:\u002F\u002Fwww.redpanda.com\u002Fblog\u002Fcloud-topics-streaming-data-object-storage",[263],"Redpanda with Cloud Topics",", as it remains an announced but unreleased feature and is not yet available for evaluation. Based on the limited information available, Cloud Topics may help reduce inter-zone data replication costs, but producers must still cross availability zones to reach the topic partition owners, incurring inter-zone traffic costs of up to $240 per hour.",[337,755,756,762],{},[340,757,758,761],{},[54,759,645],{"href":643,"rel":760},[263]," (when implemented) will help mitigate producer-to-broker inter-zone traffic, but it is not yet available. 
And it only works with the default partitioner (no record partition or key is specified).",[340,763,764],{},"Redpanda’s leader pinning helps only when all producers for the pinned topic are confined to a single AZ. In multi-AZ environments (like our benchmark), inter-zone producer traffic remains unavoidable.",[47,766,767],{},"Additionally, Redpanda’s Cloud Topics architecture is not documented publicly. Their blog mentions \"leader placement rules to optimize produce latency and ingress cost,\" but it is unclear whether this represents a shift away from a leader-based architecture or if it uses techniques similar to Ursa’s zone-aware approach.",[47,769,770],{},"We may revisit this comparison as more details become available.",[39,772,774],{"id":773},"comparing-total-cost-of-ownership","Comparing Total Cost of Ownership",[47,776,777],{},"As highlighted earlier, with a BYOC Ursa setup, you can achieve 5 GB\u002Fs throughput at just 5% of the infrastructure cost of a traditional leader-based data streaming engine, such as Kafka or RedPanda, while managing the infrastructure yourself. This significant cost reduction is enabled by Ursa’s leaderless architecture and lakehouse-native storage design, which eliminate overhead costs such as inter-zone traffic and leader-based data replication. By leveraging a lakehouse-native, leaderless architecture, Ursa reduces resource requirements, enabling you to handle high data throughput efficiently and at a fraction of the cost of RedPanda.",[47,779,780],{},"Now, let’s examine the total cost comparison, evaluating Ursa alongside other vendors, including those that have adopted a leaderless architecture (e.g., Confluent WarpStream). 
This comparison is based on a 5 GB\u002Fs workload with a 7-day retention period, factoring in both storage and vendor costs. Here are the key findings:",[337,782,783,786,789],{},[340,784,785],{},"Ursa ($164,353\u002Fmonth) is 50% cheaper than Confluent WarpStream ($337,068\u002Fmonth)",[340,787,788],{},"Ursa is 85% cheaper than AWS MSK ($1,115,251\u002Fmonth)",[340,790,791],{},"Ursa is 86% cheaper than Redpanda ($1,202,853\u002Fmonth)",[47,793,794],{},"In addition to Ursa’s architectural advantages (eliminating most inter-AZ traffic and leveraging lakehouse storage for cost-effective data retention), it also adopts a fairer and more cost-efficient pricing model: Elastic Throughput-based pricing. This approach aligns costs with actual usage, avoiding unnecessary overhead.",[47,796,797],{},"Unlike WarpStream, which charges for both storage and throughput, Ursa ensures that customers pay only for the throughput they actively use. Ursa’s pricing is based on compressed data sent by clients, meaning the more data is compressed on the client side, the lower the cost. 
In contrast, WarpStream prices are based on uncompressed data, unfairly inflating expenses and failing to incentivize customers to optimize their client applications.",[47,799,800],{},"This distinction is crucial, as compressed data reduces both storage and network costs, making Ursa’s pricing model not only more cost-effective but also more transparent and predictable.",[47,802,803],{},[349,804],{"alt":17,"src":805},"\u002Fimgs\u002Fblogs\u002F679c602d194800c9206d9d58_AD_4nXcFlf755xgyz7htxhMhBV5fGrsxy642mQNodt61DTok_z1dwkw5A6lkO5hatXVneCaB0anbZPAyvLI3MlIMuQEYLEACHHvQMOr5UfaB37dfzkdqewDEvcT-20VGd_zzvJsuA00zGA.png",[47,807,808],{},[349,809],{"alt":17,"src":810},"\u002Fimgs\u002Fblogs\u002F679c62594e9c2e629fae73aa_AD_4nXeU6cOgItnjLsEZCOf13TEvMY_SHWWIxYP2OYUj-B1GUPyWO78OG08K_v03hwYSVcg06f9dqDiGmdwy76vynjmiDGL5bluZ5_XF4nSU_r59oOZdfViXndXt6s11vVOY7qwfZN8v.png",[31,812,814],{"id":813},"cost-breakdown","Cost Breakdown",[816,817,818],"h4",{"id":695},"StreamNative – Ursa",[337,820,821,824,827,830,833],{},[340,822,823],{},"EC2 (Server): 9 × $1.536\u002Fhr × 24 hr × 30 days = $9,953.28",[340,825,826],{},"S3 Write Requests: 1,350 r\u002Fs × $0.005\u002F1,000 r × 3,600 s × 24 hr × 30 days = $17,496",[340,828,829],{},"S3 Read Requests: 1,350 r\u002Fs × $0.0004\u002F1,000 r × 3,600 s × 24 hr × 30 days = $1,400",[340,831,832],{},"S3 Storage Costs: 5 GB\u002Fs × $0.021\u002FGB × 3,600 s × 24 hr × 7 days = $63,504",[340,834,835],{},"Vendor Cost: 200 ETU × $0.50\u002Fhr × 24 hr × 30 days = $72,000",[816,837,839],{"id":838},"warpstream","WarpStream",[337,841,842,845],{},[340,843,844],{},"Based on WarpStream’s pricing calculator (as of January 29, 2025), we assume a 4:1 client data compression ratio, meaning 20 GB\u002Fs of uncompressed data translates to 5 GB\u002Fs of compressed data.",[340,846,847,848,853],{},"It's important to note that WarpStream’s pricing structure has fluctuated frequently throughout January. 
We observed the cost reported by their calculator changing from $409,644 per month to $337,068 per month. This variability has been previously highlighted in the blog post “",[54,849,852],{"href":850,"rel":851},"https:\u002F\u002Fbigdata.2minutestreaming.com\u002Fp\u002Fthe-brutal-truth-about-apache-kafka-cost-calculators",[263],"The Brutal Truth About Kafka Cost Calculators","”. To ensure transparency, we have documented the pricing as of January 29, 2025.",[47,855,856],{},[349,857],{"alt":17,"src":858},"\u002Fimgs\u002Fblogs\u002F679c602e42713e0028e9af5e_AD_4nXcu5_VWTLu9jRYs6zX1MBAOtLQEo5gyfNSWPcbpnQHXTa8qNCFAXezRR2E8daygzYTTwd4dhJjaLaLM8C6y_3OGbu2NS7pdvEv3a8-ptNKOg7AeKnYqPQCAYvQ5EuxzuI3JYIvY.png",[816,860,862],{"id":861},"msk","MSK",[337,864,865,868,871],{},[340,866,867],{},"EC2 (Server): 15 * $3.264\u002Fhr × 24 hr × 30 days = $35,251",[340,869,870],{},"Interzone Traffic (Client-Server): 5 GB\u002Fs × ⅔ × $0.02\u002FGB (in+out) × 3,600 s × 24 hr × 30 days = $172,800",[340,872,873],{},"Storage: 5 GB\u002Fs × $0.1\u002FGB-month × 3,600 s × 24 hr × 7 days * 3 replicas = $907,200",[816,875,729],{"id":876},"redpanda-1",[337,878,879,882,884,887,890],{},[340,880,881],{},"EC2 (Server): 9 × $1.536\u002Fhr × 24 hr × 30 days = $9953",[340,883,870],{},[340,885,886],{},"Interzone Traffic (Replication): 5 GB\u002Fs × 2 × $0.02\u002FGB (in+out) × 3,600 s × 24 hr × 30 days = $518,400",[340,888,889],{},"Storage: 5 GB\u002Fs × $0.045\u002FGB-month(st1) × 3,600 s × 24 hr × 7 days * 3 replicas = $408,240",[340,891,892],{},"Vendor Cost: $93,333 per month (based on limited information. See additional notes below).",[816,894,896],{"id":895},"additional-notes","Additional Notes",[337,898,899],{},[340,900,901,902,907],{},"Redpanda does not publicly disclose its BYOC pricing, making it difficult to accurately assess its total costs. 
We refer to information from the whitepaper “",[54,903,906],{"href":904,"rel":905},"https:\u002F\u002Fwww.redpanda.com\u002Fresources\u002Fredpanda-vs-confluent-performance-tco-benchmark-report#form",[263],"Redpanda vs. Confluent: A Performance and TCO Benchmark Report by McKnight Consulting Group.","” for estimation purposes. Based on the Tier-8 pricing model in the whitepaper,  the estimated cost to support a 5GB\u002Fs workload would be $1.12 million per year ($93,333 per month). However, since this calculation is based on an estimation, we will revisit and refine the cost assessment once Redpanda publishes its BYOC pricing.",[47,909,910],{},[349,911],{"alt":17,"src":912},"\u002Fimgs\u002Fblogs\u002F679c602dc8a9859eed89a0ef_AD_4nXdbcO8vsNNPy4GtkNLlmNKf22fjxRvzLzH7CtOna1L08sTbvnZx3HhufeFqc1w4K2gEF7lxO2IR5supotxebAiGnA07Qa8Yr3Rd1pVK2LYKK4WurlJGwgdwwucZIFoF-N_2oBjY.png",[47,914,915],{},[349,916],{"alt":17,"src":917},"\u002Fimgs\u002Fblogs\u002F679c602d6bc1c2287e012540_AD_4nXfcHZnLfjbjIr3ZAgoQXT9dwP3aQCOQPmGZZJUtpNZSwE6qY6M3yehIaBxCwxEIeu5PVdUPY0zhyjnow26YfgjdYgSG4GnV9ibxu0YWTIpwng6z_F6FUGJMpERMKtpsFESzXSN_Sw.png",[337,919,920,923],{},[340,921,922],{},"When estimating the storage costs for Kafka and Redpanda, we assume the use of HDD storage at $0.045\u002FGB, based on the premise that both systems can fully utilize disk bandwidth without incurring the higher costs associated with GP2 or GP3 volumes. However, in practice, many users opt for GP2 or GP3, significantly increasing the total storage cost for Kafka and Redpanda.",[340,924,925],{},"Unlike disk-based solutions, S3 storage does not require capacity preallocation—Ursa only incurs costs for the actual data stored. This contrasts with Kafka and Redpanda, where preallocating storage can drive up expenses. 
As a result, the real-world storage costs for Kafka and Redpanda are often 50% higher than the estimates above.",[39,927,929],{"id":928},"conclusion","Conclusion",[47,931,932],{},"Ursa represents a transformative shift in streaming data infrastructure, offering cost efficiency, scalability, and flexibility without compromising durability or reliability. By leveraging a leaderless architecture and eliminating inter-zone data replication, Ursa reduces total cost of ownership by over 90% compared to traditional leader-based streaming engines like Kafka and Redpanda. Its direct integration with cloud storage and scalable metadata & index management via Oxia ensure high availability and simplified infrastructure management.",[31,934,936],{"id":935},"balancing-latency-and-cost","Balancing Latency and Cost",[47,938,939,943],{},[54,940,942],{"href":941},"\u002Fblog\u002Fcap-theorem-for-data-streaming","Ursa trades off slightly higher latency for ultra low cost",", making it an ideal choice for the majority of streaming workloads, especially those that prioritize throughput and cost savings over ultra-low latency. Meanwhile, StreamNative’s BookKeeper-based engine remains the preferred solution for real-time, latency-sensitive applications. By combining these two approaches, StreamNative empowers customers with the flexibility to choose the right engine for their specific needs—whether it's maximizing cost savings or achieving ultra low-latency real-time performance.",[31,945,947],{"id":946},"the-future-of-streaming-infrastructure","The Future of Streaming Infrastructure",[47,949,950],{},"In an era where data fuels AI, analytics, and real-time decision-making, managing infrastructure costs is critical to sustaining innovation. 
Ursa is not just a cost-cutting alternative—it is a forward-thinking, lakehouse-native platform that redefines how modern data streaming infrastructure should be built and operated.",[47,952,953,954,959],{},"Whether your priority is reducing costs, improving flexibility, or ingesting massive data into lakehouses, Ursa delivers a future-proof solution for the evolving demands of real-time data streaming. ",[54,955,958],{"href":956,"rel":957},"https:\u002F\u002Fconsole.streamnative.cloud\u002F",[263],"Get started"," with StreamNative Ursa today!",[961,962,964],"h1",{"id":963},"references","References",[47,966,967,970,971],{},[968,969,428],"span",{}," ",[54,972,973],{"href":973},"\u002Fblog\u002Fintroducing-oxia-scalable-metadata-and-coordination",[47,975,976,970,978],{},[968,977,377],{},[54,979,376],{"href":376},[47,981,982,970,985],{},[968,983,984],{},"StreamNative pricing",[54,986,987],{"href":987,"rel":988},"https:\u002F\u002Fdocs.streamnative.io\u002Fdocs\u002Fbilling-overview",[263],[47,990,991,970,994],{},[968,992,993],{},"WarpStream pricing",[54,995,996],{"href":996,"rel":997},"https:\u002F\u002Fwww.warpstream.com\u002Fpricing#pricingfaqs",[263],[47,999,1000,970,1003],{},[968,1001,1002],{},"AWS S3 pricing",[54,1004,1005],{"href":1005,"rel":1006},"https:\u002F\u002Faws.amazon.com\u002Fs3\u002Fpricing\u002F",[263],[47,1008,1009,970,1012],{},[968,1010,1011],{},"AWS EBS pricing",[54,1013,1014],{"href":1014,"rel":1015},"https:\u002F\u002Faws.amazon.com\u002Febs\u002Fpricing\u002F",[263],[47,1017,1018,970,1021],{},[968,1019,1020],{},"AWS MSK pricing",[54,1022,1023],{"href":1023,"rel":1024},"https:\u002F\u002Faws.amazon.com\u002Fmsk\u002Fpricing\u002F",[263],[47,1026,1027,970,1030],{},[968,1028,1029],{},"The Brutal Truth about Kafka Cost Calculators",[54,1031,850],{"href":850,"rel":1032},[263],[47,1034,1035,970,1038],{},[968,1036,1037],{},"Redpanda vs. 
Confluent: A Performance and TCO Benchmark Report by McKnight Consulting Group",[54,1039,904],{"href":904,"rel":1040},[263],{"title":17,"searchDepth":18,"depth":18,"links":1042},[1043,1044,1045,1050,1054,1055,1064,1067],{"id":331,"depth":18,"text":332},{"id":370,"depth":18,"text":371},{"id":395,"depth":18,"text":396,"children":1046},[1047,1048,1049],{"id":407,"depth":278,"text":408},{"id":432,"depth":278,"text":433},{"id":453,"depth":278,"text":454},{"id":477,"depth":18,"text":478,"children":1051},[1052,1053],{"id":481,"depth":278,"text":482},{"id":496,"depth":278,"text":497},{"id":537,"depth":18,"text":538},{"id":549,"depth":18,"text":550,"children":1056},[1057,1058,1059,1060,1061,1062,1063],{"id":556,"depth":278,"text":557},{"id":602,"depth":278,"text":603},{"id":620,"depth":278,"text":621},{"id":667,"depth":278,"text":668},{"id":695,"depth":278,"text":696},{"id":713,"depth":278,"text":714},{"id":728,"depth":278,"text":729},{"id":773,"depth":18,"text":774,"children":1065},[1066],{"id":813,"depth":278,"text":814},{"id":928,"depth":18,"text":929,"children":1068},[1069,1070],{"id":935,"depth":278,"text":936},{"id":946,"depth":278,"text":947},"StreamNative Cloud","2025-01-31","Discover how Ursa achieves 5GB\u002Fs Kafka workloads at just 5% of the cost of traditional streaming engines like Redpanda and AWS MSK. 
See our benchmark results comparing infrastructure costs, total cost of ownership (TCO), and performance across leading Kafka vendors.","\u002Fimgs\u002Fblogs\u002F679c6593d25099b1cdcec4ca_image-31.png",{},"\u002Fblog\u002Fhow-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour","30 min",{"title":306,"description":1073},"blog\u002Fhow-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour",[1081,1082,301],"TCO","Apache Kafka","A0o_2xdJiLI6rf6xj4RKsxJNo_A6QN2fYzCp6gaLrFw",{"id":1085,"title":1086,"authors":1087,"body":1091,"category":1716,"createdAt":10,"date":1717,"description":1718,"extension":8,"featured":292,"image":1719,"isDraft":292,"link":10,"meta":1720,"navigation":7,"order":294,"path":1721,"readingTime":1722,"relatedResources":10,"seo":1723,"stem":1724,"tags":1725,"__hash__":1727},"blogs\u002Fblog\u002Fguide-apache-pulsar-compare-features-architecture-to-apache-kafka.md","A Guide to Apache Pulsar: Compare Features and Architecture to Apache Kafka",[1088,1089,1090],"Carolyn King","Sijie Guo","Addison Higham",{"type":14,"value":1092,"toc":1685},[1093,1099,1102,1127,1130,1138,1144,1148,1152,1155,1161,1164,1167,1171,1174,1180,1183,1186,1189,1193,1196,1200,1204,1207,1221,1224,1227,1230,1276,1279,1283,1286,1334,1338,1341,1344,1347,1351,1354,1363,1389,1393,1408,1431,1434,1448,1452,1456,1459,1468,1476,1480,1483,1494,1502,1506,1510,1513,1516,1519,1522,1526,1535,1544,1548,1551,1554,1558,1562,1565,1598,1602,1605,1608,1612,1615,1618,1620,1623,1626,1629,1632,1636,1639,1643],[47,1094,1095],{},[349,1096],{"alt":1097,"src":1098},"logo pulsar and kafka on blue an black background","\u002Fimgs\u002Fblogs\u002F63a3758fe1d5c02d6ae8dc76_top.png",[47,1100,1101],{},"The shift to real-time streaming technologies has bolstered the adoption of Pulsar and there has been a marked increase in both the interest and adoption of Pulsar. 
With Pulsar being sought out by companies developing messaging and event-streaming applications — from Fortune 100 companies to forward-thinking start-ups — and so much growth around the Pulsar project, it has garnered a lot of recent press and attention.",[47,1103,1104,1105,1110,1111,1110,1115,1120,1121,1126],{},"For the most part, the recent press and articles have helped to provide valuable education and transparency into Pulsar’s use cases and capabilities. Companies such as ",[54,1106,1109],{"href":1107,"rel":1108},"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=FXQvsHz_S1A",[263],"Verizon Media",", ",[54,1112,95],{"href":1113,"rel":1114},"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=NrDvSNewNT0",[263],[54,1116,1119],{"href":1117,"rel":1118},"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=zAHxgG_U67Q",[263],"Nutanix",", and ",[54,1122,1125],{"href":1123,"rel":1124},"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=pmaCG1SHAW8",[263],"Overstock.com",", are just a handful of companies who have recently presented their Pulsar use cases and shared insights into how they are leveraging Pulsar to achieve their business goals.",[47,1128,1129],{},"However, not all recent press has been entirely accurate and we have received a number of requests from the Pulsar community to address a recent Confluent blog comparing Kafka, Pulsar, and RabbitMQ. We appreciate that Pulsar is a quickly growing and evolving technology and we would like to take this opportunity to provide a deep dive into Pulsar’s capabilities.",[47,1131,1132,1133,1137],{},"In today’s post, we will leverage in-depth knowledge of the Pulsar technology, community, and ecosystem to provide a more balanced and holistic picture of the event-streaming landscape. This post will be the first in a two-part series and here we will concentrate on the differences between Pulsar and Kafka in terms of performance, architecture, and features. 
In the ",[54,1134,1136],{"href":1135},"\u002Fblog\u002Fpulsar-vs-kafka-part-2-adoption-use-cases-differentiators-and-community","second post",", we focus on adoption, use cases, support, and community.",[1139,1140,1141],"blockquote",{},[47,1142,1143],{},"Note: Given that Kafka is more widely known and has widespread documentation available, we will focus our efforts on providing education and transparency into the lesser-known Pulsar technology.",[39,1145,1147],{"id":1146},"pulsar-fundamentals","Pulsar Fundamentals",[31,1149,1151],{"id":1150},"components-of-a-pulsar-cluster","Components of a Pulsar Cluster",[47,1153,1154],{},"Pulsar is composed of 3 main components: a broker, which is a stateless service that clients connect to for core messaging, and two stateful services, Apache BookKeeper and Apache ZooKeeper. BookKeeper nodes (bookies) store the actual messages and cursor positions, while ZooKeeper is used strictly for metadata storage by both brokers and bookies. Additionally, BookKeeper leverages RocksDB as an embedded database, which is used to store internal indices, but it is not managed independently of BookKeeper.",[47,1156,1157],{},[349,1158],{"alt":1159,"src":1160},"An illustration of the main components of a Pulsar cluster","\u002Fimgs\u002Fblogs\u002F63a3758ff0583901aeac52c6_1.png",[47,1162,1163],{},"Unlike Kafka, which employs a monolithic architecture model that tightly couples serving and storage, Pulsar leverages a multi-layer design which allows it to manage these functions in separate layers. Pulsar’s broker performs computing on one layer and the bookie manages stateful storage on another.",[47,1165,1166],{},"While, on the surface, it may seem like Pulsar’s architecture is more complicated than Kafka’s, the reality is more nuanced. Architectural decisions come with trade-offs, and Pulsar’s inclusion of BookKeeper enables it to provide more flexible scalability, a lower operational burden, and faster, more consistent performance. 
We will talk in more detail about each of these benefits later on.",[31,1168,1170],{"id":1169},"pulsars-storage-architecture","Pulsar's Storage Architecture",[47,1172,1173],{},"The architectural differences in Pulsar also extend to how Pulsar stores data. Pulsar breaks topic partitions into segments and then distributes the segments across the storage nodes in Apache BookKeeper to get better performance, scalability, and availability.",[47,1175,1176],{},[349,1177],{"alt":1178,"src":1179},"storage architecture of pulsar","\u002Fimgs\u002Fblogs\u002F63a1eb9b2deb275c1af75e32_pulsar-partition-log-segment.png",[47,1181,1182],{},"Pulsar’s infinite distributed log is segment centric and implemented by leveraging scale-out log storage (via Apache BookKeeper) with built-in tiered storage support which enables segments to be distributed evenly across storage nodes. Because the data associated with any given topic is not tied to any specific storage node, it is easy to replace nodes and to scale up or down. Moreover, the smallest or slowest node in the cluster cannot impose any storage or bandwidth limitations.",[47,1184,1185],{},"Pulsar’s partition-rebalance-free architecture ensures instant scalability and higher availability. Both of these factors are extremely important and make Pulsar well-suited for building mission-critical services such as billing platforms for financial use cases, transaction processing systems for e-commerce and retailers, and real-time risk control systems for financial institutions.",[47,1187,1188],{},"By leveraging the powerful Netty framework, data is zero-copied when it is transferred from producers to brokers to bookies. This works extremely well for all streaming use cases because the data is transferred directly over the network or to disk without any performance penalties.",[31,1190,1192],{"id":1191},"message-consumption-on-pulsar","Message Consumption on Pulsar",[47,1194,1195],{},"Pulsar’s consumption model takes a streaming-pull approach. 
This is an enhanced version of long-polling as it eliminates the wait time between individual calls and requests and provides bi-directional message streaming. The streaming-pull model enables Pulsar to achieve lower end-to-end latency than existing long-polling-based messaging solutions, such as Kafka.",[39,1197,1199],{"id":1198},"ease-of-use","Ease of Use",[31,1201,1203],{"id":1202},"operational-simplicity","Operational Simplicity",[47,1205,1206],{},"When evaluating the operational simplicity of a given technology, it’s important to consider not only the initial set-up but also its long-term maintenance and scalability. Helpful questions to consider include:",[337,1208,1209,1212,1215,1218],{},[340,1210,1211],{},"How quickly and simply can you scale your cluster to keep up with your business growth?",[340,1213,1214],{},"Does your cluster provide out-of-the-box features for multi-tenancy that map well to multiple teams and users?",[340,1216,1217],{},"Will operational tasks, such as replacing hardware, require maintenance that can potentially impact the availability and reliability of your business?",[340,1219,1220],{},"Can your system easily replicate data for geographic redundancy or different access patterns?",[47,1222,1223],{},"Long-time Kafka users will know these are not easy questions to answer when operating Kafka. Most of these tasks require a suite of tools external to Kafka, such as Cruise Control for managing rebalancing of clusters and Kafka MirrorMaker\u002FReplicator for any replication needs.",[47,1225,1226],{},"Many organizations also develop tooling for provisioning and managing multiple distinct clusters, as Kafka can be difficult to share across teams. These types of tools are critical to running Kafka at scale successfully but also add to its complexity. The most capable tools for managing Kafka clusters have been developed as proprietary, closed-source tooling. 
It is no surprise that Kafka’s complex overhead and operations have pushed many businesses to use Confluent.",[47,1228,1229],{},"By contrast, Pulsar’s goal is to streamline operations and scalability. Below we respond to the same questions with respect to Pulsar’s capabilities:",[337,1231,1232,1234,1243,1245,1254,1256,1265,1267],{},[340,1233,1211],{},[340,1235,1236,1237,1242],{},"New compute and storage capacity is automatically and immediately utilized with Pulsar’s automatic ",[54,1238,1241],{"href":1239,"rel":1240},"https:\u002F\u002Fpulsar.apache.org\u002Fdocs\u002Fen\u002Fadministration-load-balance\u002F",[263],"load balancing",". This allows migrating topics to equalize load among brokers and new bookie nodes immediately receiving write traffic for new segments, with no manual rebalancing or broker management required.",[340,1244,1214],{},[340,1246,1247,1248,1253],{},"Pulsar provides a hierarchical structure of ",[54,1249,1252],{"href":1250,"rel":1251},"https:\u002F\u002Fpulsar.apache.org\u002Fdocs\u002Fen\u002Fnext\u002Fconcepts-multi-tenancy\u002F",[263],"tenants and namespaces"," which map logically to organizations and teams, with these same constructs allowing for simple ACLs, quotas, self-service controls, and even resources isolation to allow cluster operators to confidently manage shared clusters.",[340,1255,1217],{},[340,1257,1258,1259,1264],{},"The stateless ",[54,1260,1263],{"href":1261,"rel":1262},"https:\u002F\u002Fpulsar.apache.org\u002Fdocs\u002Fen\u002Fconcepts-architecture-overview\u002F#brokers",[263],"broker"," of Pulsar is able to be replaced easily, as there is no risk of data loss. 
Bookie nodes will automatically replicate any under-replicated segments of data, and tools for decommissioning and replacing nodes are built in and easily automatable.",[340,1266,1220],{},[340,1268,1269,1270,1275],{},"Pulsar has built-in replication, which can be used to seamlessly ",[54,1271,1274],{"href":1272,"rel":1273},"https:\u002F\u002Fpulsar.apache.org\u002Fdocs\u002Fen\u002Fadministration-geo\u002F",[263],"span geographic regions"," or replicate data to additional clusters for other purposes (disaster recovery, analytics, and so on).",[47,1277,1278],{},"In comparison to Kafka, Pulsar’s batteries-included approach provides a more complete solution to the real-world problems of streaming data. With this added perspective, the overall simplicity of use favors Pulsar, as it offers a more complete core feature set and allows operators and developers to focus on the core needs of their business.",[31,1280,1282],{"id":1281},"documentation-and-learning","Documentation and Learning",[47,1284,1285],{},"Pulsar has been rapidly building out its documentation and training resources. 
Here are some of the most notable accomplishments:",[337,1287,1288,1296,1304,1311,1319,1326],{},[340,1289,1290,1295],{},[54,1291,1294],{"href":1292,"rel":1293},"https:\u002F\u002Fpulsar-summit.org\u002F",[263],"7 Pulsar Summits"," across North America, Asia, and Europe, featuring hundreds of sessions with speakers from top companies such as Google, AWS, Intuit, and Databricks, and attracting thousands of attendee sign-ups.",[340,1297,1298,1303],{},[54,1299,1302],{"href":1300,"rel":1301},"https:\u002F\u002Fwww.academy.streamnative.io\u002F",[263],"On-demand Pulsar courses, tutorials, and hands-on labs"," for developers, operators, and business leaders.",[340,1305,1306,1310],{},[54,1307,1309],{"href":1308},"\u002Ftraining","Instructor-led, hands-on Pulsar training"," for developers and operators.",[340,1312,1313,1318],{},[54,1314,1317],{"href":1315,"rel":1316},"https:\u002F\u002Fyoutube.com\u002Fplaylist?list=PLqRma1oIkcWhfmUuJrMM5YIG8hjju62Ev",[263],"Meetups and webinars"," featuring speakers from adjacent communities like Flink and NiFi and companies including Splunk, Uber, and Elastic.",[340,1320,1321,1325],{},[54,1322,1324],{"href":1323},"\u002Fresources","eBooks, whitepapers, and case studies"," from Iterable, Tencent, Weibo, Tuya, and more.",[340,1327,1328,1333],{},[54,1329,1332],{"href":1330,"rel":1331},"https:\u002F\u002Fpulsar.apache.org\u002Fdocs\u002F2.11.x\u002F",[263],"Documentation portal"," that holds a variety of topics, tutorials, guides, and reference material to help you work with Pulsar.",[31,1335,1337],{"id":1336},"enterprise-support","Enterprise Support",[47,1339,1340],{},"Kafka and Pulsar both have enterprise-grade support offerings. Kafka has enterprise-grade support offerings from multiple large vendors, including Confluent. Pulsar has enterprise-grade support from StreamNative, a newer entrant on the scene. 
StreamNative offers fully managed Pulsar services for enterprises as well as enterprise-grade support for Pulsar.",[47,1342,1343],{},"StreamNative has a fast-growing and highly-experienced team with deep roots in the messaging and event-streaming space. StreamNative was founded by the core team of Pulsar and BookKeeper. In just a few short years, StreamNative has helped to significantly grow the Pulsar ecosystem — more on this in our next post — including garnering the support of committed strategic partners who are helping to further Pulsar development to meet the needs of a wide number of use cases.",[47,1345,1346],{},"Some major recent developments include the launch of Kafka-on-Pulsar, or KoP, which was launched in March 2020 by OVHCloud and StreamNative. By adding the KoP protocol handler to an existing Pulsar cluster, you can now migrate existing Kafka applications and services to Pulsar without modifying the code. In June 2020, China Mobile and StreamNative announced the launch of another major platform upgrade, AMQP on Pulsar (AoP). This enables RabbitMQ applications to leverage Pulsar’s powerful features, such as infinite event stream retention with Apache BookKeeper and tiered storage. We will talk about each of these in more detail in our next post.",[31,1348,1350],{"id":1349},"integrations","Integrations",[47,1352,1353],{},"Alongside the rapid growth in the number of Pulsar adoptions, we have seen the Pulsar community develop into a large, highly-engaged, and global user community. This active Pulsar community has played a key role in driving growth in the number of integrations in the ecosystem. 
In just the past six months, the number of officially supported connectors in the Pulsar ecosystem has grown tremendously.",[47,1355,1356,1357,1362],{},"To further support this community effort, StreamNative recently launched ",[54,1358,1361],{"href":1359,"rel":1360},"https:\u002F\u002Fhub.streamnative.io\u002F",[263],"StreamNative Hub",", which provides a convenient central location where users can find and download integrations. This resource will help accelerate the growth of Pulsar’s connector and plug-in ecosystem.",[47,1364,1365,1366,1371,1372,1377,1378,1383,1384,1388],{},"The Pulsar community has also been actively working with other communities on integrating with their projects. For example, Pulsar has been working closely with the Flink community on developing the ",[54,1367,1370],{"href":1368,"rel":1369},"https:\u002F\u002Fgithub.com\u002Fstreamnative\u002Fpulsar-flink",[263],"Pulsar-Flink Connector"," as part of ",[54,1373,1376],{"href":1374,"rel":1375},"https:\u002F\u002Fcwiki.apache.org\u002Fconfluence\u002Fdisplay\u002FFLINK\u002FFLIP-72%3A+Introduce+Pulsar+Connector",[263],"FLIP-72",". ",[54,1379,1382],{"href":1380,"rel":1381},"https:\u002F\u002Fgithub.com\u002Fstreamnative\u002Fpulsar-spark",[263],"Pulsar-Spark Connector"," provides developers the capability of using Apache Spark to process events in Apache Pulsar. ",[54,1385,1387],{"href":1386},"\u002Fblog\u002Fuse-apache-skywalking-to-trace-apache-pulsar-messages","SkyWalking Pulsar Plugin"," integrates Apache SkyWalking with Apache Pulsar, allowing people to trace Pulsar messages using SkyWalking. These are just a few examples of a large collection of integrations the Pulsar community is currently working on.",[31,1390,1392],{"id":1391},"client-library-diversity","Client Library Diversity",[47,1394,1395,1396,1401,1402,1407],{},"Pulsar currently supports 7 languages officially, compared with Kafka’s 1 language. 
While the Confluent post reported that Kafka currently supports 22 languages, it is important to note that most of the 22 languages Confluent referred to are not official clients, and many are no longer actively maintained. At last count, ",[54,1397,1400],{"href":1398,"rel":1399},"https:\u002F\u002Fgithub.com\u002Fapache\u002Fkafka\u002Ftree\u002Ftrunk\u002Fclients\u002Fsrc\u002Fmain\u002Fjava\u002Forg\u002Fapache\u002Fkafka\u002Fclients",[263],"the Apache Kafka project had only one officially released client",", compared with the ",[54,1403,1406],{"href":1404,"rel":1405},"http:\u002F\u002Fpulsar.apache.org\u002Fdocs\u002Fen\u002Fclient-libraries\u002F",[263],"seven officially supported by Apache Pulsar",":",[337,1409,1410,1413,1416,1419,1422,1425,1428],{},[340,1411,1412],{},"Java",[340,1414,1415],{},"C",[340,1417,1418],{},"C++",[340,1420,1421],{},"Python",[340,1423,1424],{},"Go",[340,1426,1427],{},".NET",[340,1429,1430],{},"Node",[47,1432,1433],{},"Pulsar also supports a rapidly growing list of community developed clients, which includes the following:",[337,1435,1436,1439,1442,1445],{},[340,1437,1438],{},"Rust",[340,1440,1441],{},"Scala",[340,1443,1444],{},"Ruby",[340,1446,1447],{},"Erlang",[39,1449,1451],{"id":1450},"performance-and-availability","Performance and Availability",[31,1453,1455],{"id":1454},"throughput-latency-and-scale","Throughput, Latency, and Scale",[47,1457,1458],{},"Both Pulsar and Kafka have successfully been leveraged in a number of enterprise use cases and each system has its advantages, with both systems being capable of handling large amounts of traffic with similar amounts of hardware. One common misconception of Pulsar is that because it has more components, it must require more servers to achieve the same performance. 
While this may be true in some hardware configurations, in many configurations Pulsar can get more from the same resources.",[47,1460,1461,1462,1467],{},"As an example, Splunk recently shared that one of the reasons they chose Pulsar over Kafka is that ",[54,1463,1466],{"href":1464,"rel":1465},"https:\u002F\u002Fwww.slideshare.net\u002Fstreamnative\u002Fwhy-splunk-chose-pulsarkarthik-ramasamy",[263],"Pulsar is 1.5x - 2x lower in CAPEX cost with 5x - 50x improvement in latency and 2x - 3x lower in OPEX due to layered architecture"," (from slide 34). Splunk attributed this to Pulsar making better use of disk IO, with lower CPU utilization and better control over memory.",[47,1469,1470,1471,1475],{},"More generally, companies such as Tencent have chosen Pulsar in large part due to its performance attributes. As discussed in a recent whitepaper, ",[54,1472,1474],{"href":1473},"\u002Fwhitepaper\u002Fcase-study-apache-pulsar-tencent-billing","Tencent’s billing platform, which serves over a million merchants and manages 30 billion escrow accounts",", is currently using Pulsar to process hundreds of millions of dollars in revenue per day. Tencent chose Pulsar over Kafka for its predictable low latency, stronger consistency, and durability guarantees.",[31,1477,1479],{"id":1478},"ordering-guarantees","Ordering Guarantees",[47,1481,1482],{},"Apache Pulsar offers four distinct subscription modes. The four modes and their associated ordering guarantees are described below. 
An individual application’s ordering and consumption scalability requirements determine which subscription mode is appropriate for that application.",[337,1484,1485,1488,1491],{},[340,1486,1487],{},"Both the Exclusive and Failover subscription modes provide very strong ordering guarantees at a partition level even when consuming a topic in parallel across many consumers.",[340,1489,1490],{},"Shared mode allows you to scale the number of consumers beyond the number of partitions, thus making this mode well-suited for worker queue use cases.",[340,1492,1493],{},"Key_Shared mode combines the advantages of the other subscription modes. It allows scaling the number of consumers beyond the number of partitions and provides a strong ordering guarantee at a key level.",[47,1495,1496,1497,189],{},"For more information about Pulsar’s subscription types and their associated ordering guarantees, see ",[54,1498,1501],{"href":1499,"rel":1500},"http:\u002F\u002Fpulsar.apache.org\u002Fdocs\u002Fen\u002Fconcepts-messaging\u002F#subscriptions",[263],"subscriptions",[39,1503,1505],{"id":1504},"feature","Feature",[31,1507,1509],{"id":1508},"built-in-stream-processing","Built-In Stream Processing",[47,1511,1512],{},"Pulsar and Kafka have two different goals when it comes to built-in stream processing. Pulsar integrates with Flink and Spark, two mature, full-fledged stream processing frameworks, for more complex stream processing needs and developed Pulsar Functions to focus on lightweight computation. Kafka developed Kafka Streams with the goal of providing a full-fledged stream processing engine.",[47,1514,1515],{},"As a result, Kafka Streams is more complex. 
Users need to figure out where and how to run the Kafka Streams application, and the framework is unnecessarily complicated for most lightweight computing use cases.",[47,1517,1518],{},"Pulsar Functions, on the other hand, makes lightweight computing use cases easy to implement and enables developers to create complex processing logic without deploying a separate neighboring system. Additionally, it provides a language-native, easy-to-use API. Developers don’t have to learn a complicated API in order to start writing event streaming applications.",[47,1520,1521],{},"A Pulsar Improvement Proposal (PIP) was recently submitted to the Pulsar project to introduce Function Mesh. Function Mesh is a serverless event-streaming framework that combines multiple Pulsar Functions to facilitate building complex event-streaming applications.",[31,1523,1525],{"id":1524},"exactly-once-processing","Exactly-Once Processing",[47,1527,1528,1529,1534],{},"Pulsar currently supports exactly-once producers via ",[54,1530,1533],{"href":1531,"rel":1532},"https:\u002F\u002Fgithub.com\u002Fapache\u002Fpulsar\u002Fwiki\u002FPIP-6:-Guaranteed-Message-Deduplication",[263],"broker-side deduplication",", and we are happy to share that a major upgrade is presently in development and will be available soon!",[47,1536,1537,1538,1543],{},"Support for transactional message streaming started in ",[54,1539,1542],{"href":1540,"rel":1541},"https:\u002F\u002Fgithub.com\u002Fapache\u002Fpulsar\u002Fwiki\u002FPIP-31:-Transaction-Support",[263],"PIP-31"," and is currently in development. This feature will improve Pulsar’s message delivery semantics and processing guarantees. With transactional streaming, each message is written or processed exactly once with no duplication or data loss, even when a broker or function instance fails. Transactional messaging not only makes it easier to write applications using Pulsar or Pulsar Functions, but it also expands the scope of the use cases that Pulsar can support. 
We are making rapid progress on this feature, and it will be included in Pulsar 2.7.0, which is scheduled for release in September 2020.",[31,1545,1547],{"id":1546},"topic-log-compaction","Topic (Log) Compaction",[47,1549,1550],{},"Pulsar was designed to provide users a choice of formats for consuming data. Applications can choose to consume either raw data or compacted data, as appropriate. By doing this, Pulsar lets non-compacted data have a retention policy, which keeps unbounded growth under control, while still allowing periodic compaction to generate the most recent materialized view. The built-in tiered storage feature also allows Pulsar to offload the non-compacted data from BookKeeper to cloud storage and makes it much cheaper to store events for a much longer period.",[47,1552,1553],{},"Unlike Pulsar, Kafka does not offer users the option to consume raw data. Kafka removes raw data immediately after it is compacted.",[39,1555,1557],{"id":1556},"use-case","Use Case",[31,1559,1561],{"id":1560},"event-streaming","Event Streaming",[47,1563,1564],{},"Pulsar was originally developed as a unified pub\u002Fsub messaging platform at Yahoo! (known as Cloud Messaging). However, Pulsar has grown beyond a messaging platform and become a unified messaging and event streaming platform. Pulsar includes a complete set of tools as part of the platform to provide all the fundamentals necessary for building event streaming applications. Pulsar encompasses the following event streaming capabilities:",[337,1566,1567,1570,1573,1586,1589,1592,1595],{},[340,1568,1569],{},"Infinite event stream storage makes it possible to store events at scale by leveraging scale-out log storage (via Apache BookKeeper) with built-in tiered storage support to cost-effective systems like S3, HDFS, and so on.",[340,1571,1572],{},"Unified pub\u002Fsub messaging model allows developers to add messaging to their applications easily. 
This model can be scaled both based on traffic and on the user’s needs.",[340,1574,1575,1576,1580,1581,1585],{},"Protocol handler framework and protocol compatibility with Kafka (via ",[54,1577,1579],{"href":1578},"\u002Fblog\u002Ftech\u002F2020-03-24-bring-native-kafka-protocol-support-to-apache-pulsar","Kafka-on-Pulsar",") and AMQP (via ",[54,1582,1584],{"href":1583},"\u002Fblog\u002Ftech\u002F2020-06-15-announcing-aop-on-pulsar","AMQP-on-Pulsar",") allow applications to produce and consume events from anywhere using any existing protocols.",[340,1587,1588],{},"Pulsar IO provides a set of connectors integrating larger ecosystems, allowing users to ingest data from external systems without writing any code.",[340,1590,1591],{},"Integration with Flink enables comprehensive event processing.",[340,1593,1594],{},"Pulsar Functions offers a lightweight serverless framework for processing events as they arrive.",[340,1596,1597],{},"Integration with Presto (Pulsar SQL) allows data scientists and developers to use ANSI-compliant SQL to gain insights into their data and business.",[31,1599,1601],{"id":1600},"message-routing","Message Routing",[47,1603,1604],{},"Pulsar provides comprehensive routing capabilities through Pulsar IO, Pulsar Functions, and Pulsar Protocol Handler. Pulsar’s routing capabilities include content-based routing, message transformation, and message enrichment.",[47,1606,1607],{},"Pulsar has more robust routing capabilities compared with Kafka. Pulsar provides a flexible deployment model for connectors and functions. These can be run within a broker, allowing for easy deployment. Alternatively, they can be run in a dedicated pool of nodes (similar to Kafka Streams) which allows for massive scale-out. Pulsar also integrates natively with Kubernetes. 
In addition, Pulsar can be configured to schedule function and connector workloads as pods, thus fully leveraging the elasticity of Kubernetes.",[31,1609,1611],{"id":1610},"message-queuing","Message Queuing",[47,1613,1614],{},"As noted above, Pulsar was originally developed as a unified pub\u002Fsub messaging platform. The Pulsar team learned a lot about the pros and cons of operating existing open-source messaging systems and applied their experiences to designing Pulsar’s unified messaging model. The Pulsar messaging API combines both queuing and streaming capabilities. It not only allows implementing a worker queue that delivers messages round-robin to competing consumers (via Shared subscription) but also supports event streaming by delivering messages based on the order of messages in a partition (via Failover subscription) or a key range (via Key_Shared subscription). Developers are able to build both messaging and event streaming applications on the same set of data without duplicating it to different siloed systems.",[47,1616,1617],{},"Additionally, the Pulsar community is working on bringing native support for different messaging protocols (such as AoP and KoP) to Apache Pulsar to extend Pulsar’s messaging capabilities.",[39,1619,929],{"id":928},[47,1621,1622],{},"This is an exhilarating time marked by tremendous growth and change in the Pulsar community. Pulsar’s ecosystem is developing and expanding as its technology continues to evolve and new use cases are added.",[47,1624,1625],{},"Pulsar offers many advantages that make it an attractive choice for companies seeking to adopt a unified messaging and event streaming platform. Compared with Kafka, Pulsar is more resilient and less complex to operate and scale.",[47,1627,1628],{},"Like any new technology, Pulsar can take time to roll out and adopt. However, it provides a turnkey solution that is ready for production upon installation with lower ongoing maintenance costs. 
Pulsar covers all the fundamentals necessary for building event streaming applications and incorporates many built-in features, including a rich set of tools. Pulsar’s tools are available for immediate use and do not require additional installation steps.",[47,1630,1631],{},"At StreamNative, we are continuously working on developing new features and enhancements to strengthen Pulsar’s capabilities and grow the community.",[39,1633,1635],{"id":1634},"special-thanks","Special Thanks",[47,1637,1638],{},"We would be remiss not to thank the many members across the Pulsar community who contributed to this article. Namely, Jerry Peng, Jesse Anderson, Joe Francis, Matteo Merli, Sanjeev Kulkarni, and Addison Higham.",[39,1640,1642],{"id":1641},"more-resources","More Resources",[337,1644,1645,1653,1661,1670,1678],{},[340,1646,1647,1648,1652],{},"Read the ",[54,1649,1651],{"href":1650},"\u002Fwhitepapers\u002Fapache-pulsar-vs-apache-kafka-2022-benchmark","2022 Pulsar vs. Kafka benchmark"," for a side-by-side comparison of Pulsar and Kafka performance, including tests on throughput, latency, and more.",[340,1654,1655,1656,1660],{},"Watch sessions from ",[54,1657,1659],{"href":1658},"\u002Fpulsar-summit","Pulsar Summit San Francisco 2022"," for best practices and the future of messaging and event streaming technologies.",[340,1662,1663,1664,1669],{},"Join the ",[54,1665,1668],{"href":1666,"rel":1667},"https:\u002F\u002Fcommunityinviter.com\u002Fapps\u002Fapache-pulsar\u002Fapache-pulsar",[263],"Pulsar Slack Channel"," to connect with the community.",[340,1671,1672,1677],{},[54,1673,1676],{"href":1674,"rel":1675},"https:\u002F\u002Fhubs.ly\u002FQ016_Wgd0",[263],"Sign up"," for the monthly StreamNative Newsletter for Apache Pulsar.",[340,1679,1680,1684],{},[54,1681,1683],{"href":1300,"rel":1682},[263],"Learn Pulsar"," from the original creators of Pulsar. 
Watch on-demand videos, enroll in self-paced courses, and complete our certification program to demonstrate your Pulsar knowledge.",{"title":17,"searchDepth":18,"depth":18,"links":1686},[1687,1692,1699,1703,1708,1713,1714,1715],{"id":1146,"depth":18,"text":1147,"children":1688},[1689,1690,1691],{"id":1150,"depth":278,"text":1151},{"id":1169,"depth":278,"text":1170},{"id":1191,"depth":278,"text":1192},{"id":1198,"depth":18,"text":1199,"children":1693},[1694,1695,1696,1697,1698],{"id":1202,"depth":278,"text":1203},{"id":1281,"depth":278,"text":1282},{"id":1336,"depth":278,"text":1337},{"id":1349,"depth":278,"text":1350},{"id":1391,"depth":278,"text":1392},{"id":1450,"depth":18,"text":1451,"children":1700},[1701,1702],{"id":1454,"depth":278,"text":1455},{"id":1478,"depth":278,"text":1479},{"id":1504,"depth":18,"text":1505,"children":1704},[1705,1706,1707],{"id":1508,"depth":278,"text":1509},{"id":1524,"depth":278,"text":1525},{"id":1546,"depth":278,"text":1547},{"id":1556,"depth":18,"text":1557,"children":1709},[1710,1711,1712],{"id":1560,"depth":278,"text":1561},{"id":1600,"depth":278,"text":1601},{"id":1610,"depth":278,"text":1611},{"id":928,"depth":18,"text":929},{"id":1634,"depth":18,"text":1635},{"id":1641,"depth":18,"text":1642},"Apache Pulsar","2020-07-08","Learn the differences between Pulsar and Kafka in architecture, ease of use, performance and availability, and use cases.","\u002Fimgs\u002Fblogs\u002F63d0699ba635bf2a05fe2088_Screen-Shot-2023-01-24-at-3.28.16-PM.png",{},"\u002Fblog\u002Fguide-apache-pulsar-compare-features-architecture-to-apache-kafka","11 min 
read",{"title":1086,"description":1718},"blog\u002Fguide-apache-pulsar-compare-features-architecture-to-apache-kafka",[1082,1726],"Intro","DR1YpMnF5cqiU7ZaBqBheTrRQZUc6mK8uPsBg0nVnoo",[1729,1745,1761],{"id":1730,"title":1088,"bioSummary":1731,"email":10,"extension":8,"image":1732,"linkedinUrl":1733,"meta":1734,"position":1741,"stem":1742,"twitterUrl":1743,"__hash__":1744},"authors\u002Fauthors\u002Fcarolyn-king.md","Carolyn has dedicated the past 15 years to helping companies develop growth strategies to drive customer acquisition and revenue. At StreamNative, she leads all things growth, including global Marketing and Community, global Training and Documentation, Developer Relations and Sales for the US and EMEA. She holds an MBA from UCLA Anderson and a BA in Business-Economics from UCLA. Carolyn lives in Santa Monica, California.","\u002Fimgs\u002Fauthors\u002Fcarolyn-king.webp","https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fcarolynnicoleking\u002F",{"body":1735},{"type":14,"value":1736,"toc":1739},[1737],[47,1738,1731],{},{"title":17,"searchDepth":18,"depth":18,"links":1740},[],"Former VP of Growth, StreamNative","authors\u002Fcarolyn-king","https:\u002F\u002Ftwitter.com\u002Fcarolynking22","yTyJgeMMyV9lLQiJEQq9_me9Vb3o5cMqh8lVfZBceDY",{"id":1746,"title":1089,"bioSummary":1747,"email":10,"extension":8,"image":1748,"linkedinUrl":1749,"meta":1750,"position":1757,"stem":1758,"twitterUrl":1759,"__hash__":1760},"authors\u002Fauthors\u002Fsijie-guo.md","Sijie’s journey with Apache Pulsar began at Yahoo! where he was part of the team working to develop a global messaging platform for the company. He then went to Twitter, where he led the messaging infrastructure group and co-created DistributedLog and Twitter EventBus. In 2017, he co-founded Streamlio, which was acquired by Splunk, and in 2019 he founded StreamNative. He is one of the original creators of Apache Pulsar and Apache BookKeeper, and remains VP of Apache BookKeeper and PMC Member of Apache Pulsar. 
Sijie lives in the San Francisco Bay Area of California.","\u002Fimgs\u002Fauthors\u002Fsijie-guo.webp","https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fsijieg\u002F",{"body":1751},{"type":14,"value":1752,"toc":1755},[1753],[47,1754,1747],{},{"title":17,"searchDepth":18,"depth":18,"links":1756},[],"CEO and Co-Founder, StreamNative, Apache Pulsar PMC Member","authors\u002Fsijie-guo","https:\u002F\u002Ftwitter.com\u002Fsijieg","krzMgsbADqGZT1TnpWTVzT4HJ9U7oZB9hzOMiDT5Wd0",{"id":1762,"title":1090,"bioSummary":1763,"email":10,"extension":8,"image":1764,"linkedinUrl":10,"meta":1765,"position":1772,"stem":1773,"twitterUrl":1774,"__hash__":1775},"authors\u002Fauthors\u002Faddison-higham.md","Addison Higham has deep experience with streaming technologies such as Flink and Spark. Seeking a new stream storage technology for his previous company, Instructure, Addison discovered Pulsar and quickly became a Pulsar champion and drove the company’s adoption of the technology. Addison then joined StreamNative, where he leads development of StreamNative Cloud and helps customers to successfully adopt Pulsar. 
Addison lives in Salt Lake City, Utah.","\u002Fimgs\u002Fauthors\u002Faddison-higham.webp",{"body":1766},{"type":14,"value":1767,"toc":1770},[1768],[47,1769,1763],{},{"title":17,"searchDepth":18,"depth":18,"links":1771},[],"Chief Architect, StreamNative","authors\u002Faddison-higham","https:\u002F\u002Ftwitter.com\u002Faddisonjh?lang=en","jzIyP69DmPgDfuOwbZvvSU4LsXYSvJn9n31qhQCFqBg",[1777,1785,1790],{"path":1778,"title":1779,"date":1780,"image":1781,"link":-1,"collection":1782,"resourceType":1783,"score":1784,"id":1778},"\u002Fblog\u002Fchallenges-in-kafka-the-data-retention-stories-of-kevin-and-patricia","Challenges in Kafka: the Data Retention Stories of Kevin and Patricia","2024-03-15","\u002Fimgs\u002Fblogs\u002F65f42b471a81499b808ac93c_kevin-patrici-1200x630.png","blogs","Blog",1,{"path":1786,"title":1787,"date":1788,"image":1789,"link":-1,"collection":1782,"resourceType":1783,"score":1784,"id":1786},"\u002Fblog\u002Funderstanding-pulsar-10-minutes-guide-kafka-users","Understanding Pulsar in 10 Minutes: A Guide for Kafka Users","2022-09-15","\u002Fimgs\u002Fblogs\u002F63c7c1eeff0c0c587c486628_63b537a7338d4e7215582ac8_kp-top.png",{"path":1791,"title":1792,"date":1793,"image":-1,"link":-1,"collection":1782,"resourceType":1783,"score":1784,"id":1791},"\u002Fblog\u002Fperspective-on-pulsars-performance-compared-to-kafka","A More Accurate Perspective on Pulsar’s Performance Compared to Kafka","2020-11-09",1775235690092]