January 30, 2025

30 min

How We Run a 5 GB/s Kafka Workload for Just $50 per Hour

Matteo Merli

Co-Founder and CTO, StreamNative

Neng Lu

Director of Platform, StreamNative

Hang Chen

Director of Storage, StreamNative & Apache Pulsar PMC Member

Penghui Li

Director of Streaming, StreamNative & Apache Pulsar PMC Member

The rise of DeepSeek has shaken the AI infrastructure market, forcing companies to confront the escalating costs of training and deploying AI models. But the real pressure point isn’t just compute—it’s data acquisition and ingestion costs.

As businesses rethink their AI cost-containment strategies, real-time data streaming is emerging as a critical enabler. The growing adoption of Kafka as a standard protocol has expanded cost-efficient options, allowing companies to optimize streaming analytics while keeping expenses in check.

Ursa, the data streaming engine powering StreamNative’s managed Kafka service, is built for this new reality. With its leaderless architecture and native lakehouse storage integration, Ursa eliminates costly inter-zone network traffic for data replication and client-to-broker communication while ensuring high availability at minimal operational cost.

In this blog post, we benchmarked the infrastructure cost and total cost of ownership (TCO) for running a 5GB/s Kafka workload across different Kafka vendors, including Redpanda, Confluent WarpStream, and AWS MSK. Our benchmark results show that Ursa can sustain 5GB/s Kafka workloads at just 5% of the cost of traditional streaming engines like Redpanda—making it the ideal solution for high-performance, cost-efficient ingestion and data streaming for data lakehouses and AI workloads.

Note: We also evaluated vanilla Kafka in our benchmark; however, for simplicity, we have focused our cost comparison on vendor solutions rather than self-managed deployments. That said, it is important to highlight that both Redpanda and vanilla Kafka use a leader-based data replication approach. In a data-intensive, network-bound workload like 5GB/s streaming, with the same machine type and replication factor, Redpanda and vanilla Kafka produced nearly identical cost profiles.

Key Benchmark Findings

Ursa delivered 5 GB/s of sustained throughput at an infrastructure cost of just $54 per hour. For comparison:

MSK: $303 per hour → 5.6x more expensive compared to Ursa
Redpanda: $988 per hour → 18x more expensive compared to Ursa

Beyond infrastructure costs, when factoring in storage pricing, vendor pricing, and operational expenses, Ursa’s total cost of ownership (TCO) for a 5 GB/s workload with a 7-day retention period is:

50% cheaper than Confluent WarpStream
85% cheaper than MSK and Redpanda

Ursa: Highly Cost-Efficient Data Streaming at Scale

Ursa is a next-generation data streaming engine designed to deliver high performance at a fraction of the cost of traditional disk-based solutions. It is fully compatible with Apache Kafka and Apache Pulsar APIs, while leveraging a leaderless, lakehouse-native architecture to maximize scalability, efficiency, and cost savings.

Ursa’s key innovation is separating storage from compute and decoupling metadata/index operations from data operations by utilizing cloud object storage (e.g., AWS S3) instead of costly inter-zone disk-based replication. It also employs open lakehouse formats (Iceberg and Delta Lake), enabling columnar compression to significantly reduce storage costs while maintaining durability and availability.

In contrast, traditional streaming systems—like Kafka and Redpanda—depend on leader-based architectures, which drive up inter-zone traffic costs due to replication and client communication. Ursa mitigates these costs by:

Eliminating inter-zone traffic costs via a leaderless architecture.
Replacing costly inter-zone replication with direct writes to cloud storage using open lakehouse formats.

How Ursa Eliminates Inter-Zone Traffic

Ursa minimizes inter-zone traffic by leveraging a leaderless architecture, which eliminates inter-zone communication between clients and brokers, and lakehouse-native storage, which removes the need for inter-zone data replication. This approach ensures high availability and scalability while avoiding unnecessary cross-zone data movement.

Leaderless architecture

Traditional streaming engines such as Kafka, Pulsar, or Redpanda rely on a leader-based model, where each partition is assigned to a single leader broker that handles all writes and reads.

Pros of Leader-Based Architectures:
✔ Maintains message ordering via local sequence IDs
✔ Delivers low latency and high performance through message caching

Cons of Leader-Based Architectures:
✖ Throughput bottlenecked by a single broker per partition
✖ Inter-zone traffic required for high availability in multi-AZ deployments

While Kafka and Pulsar offer partial solutions (e.g., reading from followers, shadow topics) to reduce read-related inter-zone traffic, producers still send data to a single leader.

Ursa removes the concept of topic ownership, allowing any broker in the cluster to handle reads or writes for any partition. The primary challenge—ensuring message ordering—is solved with Oxia, a scalable metadata and index service created by StreamNative in 2022.

Oxia: The Metadata Layer Enabling Leaderless Architecture

Ensuring message ordering in a leaderless architecture is complex, but Ursa solves this with Oxia:

Handles millions of metadata/index operations per second
Generates sequential IDs to maintain strict message ordering
Optimized for Kubernetes with horizontal scalability

Producers and consumers can connect to any broker within their local AZ, eliminating inter-zone traffic costs while maintaining performance through localized caching.
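To make the idea concrete, here is a minimal Python sketch of how a shared sequencer can let any broker accept writes for a partition while still producing one total order. The `Sequencer` and `Broker` classes are hypothetical stand-ins for illustration only; they are not Oxia's or Ursa's actual API, and an in-process lock and dictionary take the place of a replicated metadata service and object storage.

```python
import threading
from collections import defaultdict


class Sequencer:
    """Hypothetical stand-in for a metadata service that hands out offsets."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_offset = defaultdict(int)  # partition -> next free offset

    def reserve(self, partition, count):
        """Atomically reserve `count` consecutive offsets; return the first."""
        with self._lock:
            first = self._next_offset[partition]
            self._next_offset[partition] = first + count
            return first


class Broker:
    """Any broker may write to any partition; order comes from the sequencer."""

    def __init__(self, name, sequencer, log):
        self.name = name
        self.sequencer = sequencer
        self.log = log  # shared dict standing in for object storage

    def produce(self, partition, messages):
        first = self.sequencer.reserve(partition, len(messages))
        for i, msg in enumerate(messages):
            self.log[(partition, first + i)] = (self.name, msg)
        return first


sequencer = Sequencer()
shared_log = {}
broker_a = Broker("broker-az1", sequencer, shared_log)
broker_b = Broker("broker-az2", sequencer, shared_log)

# Two brokers in different zones write to the same partition.
broker_a.produce("orders-0", ["m1", "m2"])
broker_b.produce("orders-0", ["m3"])
broker_a.produce("orders-0", ["m4"])

# Reading back by offset yields a single total order, whoever wrote it.
ordered = [shared_log[("orders-0", off)][1] for off in range(4)]
print(ordered)
```

The point of the sketch is that ordering is a property of the offset allocation, not of which broker received the write, so no partition needs a dedicated owner.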

Zero inter-zone data replication

In most distributed systems, data replication from a leader (primary) to followers (replicas) is crucial for fault tolerance and availability. However, replication across zones can inflate infrastructure expenses substantially.

Ursa avoids these costs by writing data directly to cloud object storage (e.g., AWS S3, Google Cloud Storage):

Built-In Resilience: Cloud storage inherently offers high availability and fault tolerance without inter-zone traffic fees.
Tradeoff: Slightly higher latency (sub-second, with p99 at 500 milliseconds) compared to local disk/EBS (single-digit to sub-100 milliseconds), in exchange for significantly lower costs (up to 10x lower).
Flexible Modes: Ursa is an addition to the classic BookKeeper-based engine, providing users with the flexibility to optimize for either cost or low latency based on their workload requirements.

By forgoing conventional replication, Ursa slashes inter-zone traffic costs and the associated complexity, making it a compelling option for organizations that need to balance high-performance data streaming with strict budget constraints.

How We Ran a 5 GB/s Test with Ursa

Ursa Cluster Deployment

9 brokers across 3 availability zones, each on m6i.8xlarge (Fixed 12.5 Gbps bandwidth, 32 vCPU cores, 128 GB memory).
Oxia cluster (metadata store) with 3 nodes of m6i.8xlarge, distributed across three availability zones (AZs).

During peak throughput (5 GB/s), each broker’s network usage was about 10 Gbps.

OpenMessaging Benchmark Workers & Configuration

The OpenMessaging Benchmark (OMB) Framework is a suite of tools that makes it easy to benchmark distributed messaging systems in the cloud. See https://openmessaging.cloud/docs/benchmarks/ for details.

12 OMB workers: 6 for producers, 6 for consumers across 3 availability zones, on m6i.8xlarge instances. Each worker is configured with 12 CPU cores and 48 GB memory.
Sample YAML scripts provided for Kafka-compatible configuration and rate limits.
Achieved consistent 5 GB/s publish/subscribe throughput.

Ursa Benchmark Tests & Results

The following diagram demonstrates that Ursa can consistently handle 5 GB/s of traffic, fully saturating the network across all broker nodes.

Comparing Infrastructure Cost

This benchmark first evaluates infrastructure costs of running a 5 GB/s streaming workload (1:1 producer-to-consumer ratio) across different data streaming engines, including Ursa, Redpanda, and AWS MSK, with a focus on multi-AZ deployments to ensure a fair comparison.

Test Setup & Key Assumptions

All tests use multi-AZ configurations, with clusters and clients distributed across three AWS availability zones (AZs). Cluster size scales proportionally to the number of AZs, and rack-awareness is enabled for all engines to evenly distribute topic partitions and leaders.

To ensure a fair comparison, we selected the same machine type capable of fully utilizing both network and storage bandwidth for Ursa and Redpanda in this 5GB/s test:

9 × m6i.8xlarge instances

However, MSK's storage bandwidth limits vary depending on the selected instance type, with the highest allowed limit capped at 1000 MiB/s per broker, according to AWS documentation. Given this constraint, achieving 5 GB/s throughput with a replication factor of 3 required the following setup:

15 × kafka.m7g.8xlarge (32 vCPUs, 128 GB memory, 15 Gbps network, 4000 GiB EBS).

This configuration was necessary to work around MSK's storage bandwidth limitations, ensuring a comparable cost basis to other evaluated streaming engines.
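As a rough sizing check, the Python sketch below shows one way a 15-broker figure falls out of the numbers above. It assumes that "5 GB/s" means decimal gigabytes, that every replica write counts against the per-broker storage cap, and that load is spread perfectly evenly; these are simplifying assumptions for illustration, not AWS's sizing guidance.

```python
import math

MIB = 1024 * 1024  # bytes in one MiB


def brokers_needed(ingress_bytes_per_s, replication_factor, per_broker_limit):
    """Brokers required so replicated writes fit under the per-broker cap."""
    total_write = ingress_bytes_per_s * replication_factor
    return math.ceil(total_write / per_broker_limit)


ingress = 5 * 10**9   # 5 GB/s, assuming decimal gigabytes
limit = 1000 * MIB    # 1000 MiB/s per broker, the cap cited above
msk_brokers = brokers_needed(ingress, 3, limit)
print(msk_brokers)
```

Under these assumptions the replicated write load is 15 GB/s, which needs a little over 14 brokers' worth of storage bandwidth and therefore rounds up to 15.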

Additional key assumptions include:

Inter-AZ producer traffic: For leader-based engines, two-thirds of producer-to-broker traffic crosses AZs due to leader distribution.
Consumer optimizations: Follower fetch is enabled across all tests, eliminating inter-AZ consumer traffic.
Storage cost exclusions: This benchmark only evaluates streaming costs, assuming no long-term data retention.
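The two-thirds figure, and the replication volume used later in the cost tables, follow from simple counting. The Python sketch below is a minimal illustration that assumes partition leaders are spread evenly across AZs, producers pick partitions uniformly, and each of the three replicas lives in a different AZ; the function names are ours.

```python
from fractions import Fraction


def cross_az_produce_fraction(num_azs):
    """Share of produce traffic that leaves the producer's AZ when leaders
    are spread evenly and producers pick partitions uniformly."""
    return Fraction(num_azs - 1, num_azs)


def replication_throughput(ingress_gb_per_s, replication_factor):
    """Leader-to-follower traffic: each byte is copied to RF - 1 followers."""
    return ingress_gb_per_s * (replication_factor - 1)


produce_fraction = cross_az_produce_fraction(3)
cross_az_produce = 5 * produce_fraction   # GB/s crossing AZs on produce
replication = replication_throughput(5, 3)  # GB/s crossing AZs on replication
print(produce_fraction, float(cross_az_produce), replication)
```

With three AZs, a producer finds the partition leader in its own zone only one time in three, so two-thirds of produce traffic crosses zones, and a replication factor of 3 adds twice the ingress rate in follower traffic.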

Inter-Broker Replication Costs

Inter-broker (cross-AZ) replication is a major cost driver for data streaming engines:

Redpanda: Inter-broker replication is not free, leading to substantial costs when data must be copied across multiple availability zones.
AWS MSK: Inter-broker replication is free, but MSK instance pricing is significantly higher (e.g., $3.264 per hour for kafka.m7g.8xlarge vs. $1.306 per hour for an on-demand m7g.8xlarge). MSK storage costs $0.10 per GB-month, more than twice the $0.045 per GB-month of st1 volumes. Even though replication is free, client-to-broker traffic still incurs inter-AZ charges.
Ursa: No inter-broker replication costs due to its leaderless architecture, eliminating inter-zone replication costs entirely.

Zone Affinity: Reducing Inter-AZ Costs

We evaluated zone affinity mechanisms to further reduce inter-AZ data transfer costs.

Consumers:

Follower fetch is enabled across all tests, ensuring consumers fetch data from replicas in their local AZ. This eliminates inter-zone consumer traffic except for metadata lookups.

Producers:

The Kafka protocol lacks an easy way to enforce producer AZ affinity. KIP-1123 aims to address this, but it only works with the default partitioner (i.e., when no record partition or record key is specified).
Redpanda recently introduced leader pinning, but this only benefits setups where producers are confined to a single AZ—not applicable to our multi-AZ benchmark.
Ursa is the only system in this test with built-in zone affinity for both producers and consumers. It achieves this by embedding producer AZ information in client.id, allowing metadata lookups to route clients to local-AZ brokers, eliminating inter-AZ producer traffic.
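As an illustration of the routing idea, the Python sketch below parses an AZ out of a client.id and returns only the brokers in that zone. The `zone_id=` key, the `;` separator, and both function names are assumptions made for this example; the actual client.id format and broker-side logic used by Ursa may differ.

```python
def parse_client_zone(client_id):
    """Extract the AZ from a client.id such as 'zone_id=us-east-1a;app=orders'.
    The 'zone_id=' key and ';' separator are assumptions for this sketch."""
    for part in client_id.split(";"):
        key, sep, value = part.partition("=")
        if sep and key.strip() == "zone_id":
            return value.strip()
    return None


def brokers_for_client(client_id, brokers):
    """Return brokers in the client's AZ, or all brokers if the AZ is
    unknown or has no brokers, so the client can still connect."""
    zone = parse_client_zone(client_id)
    local = [b for b in brokers if b["zone"] == zone]
    return local or list(brokers)


brokers = [
    {"host": "broker-0", "zone": "us-east-1a"},
    {"host": "broker-1", "zone": "us-east-1b"},
    {"host": "broker-2", "zone": "us-east-1c"},
]

local = brokers_for_client("zone_id=us-east-1b;app=orders", brokers)
fallback = brokers_for_client("app=orders", brokers)
print([b["host"] for b in local], len(fallback))
```

Because a leaderless broker can serve any partition, answering the metadata request with zone-local brokers is enough to keep produce traffic inside the AZ; a leader-based system cannot do this without also moving the partition leader.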

Cost Comparison Results

Ursa delivered 5 GB/s of sustained throughput at an infrastructure cost of just $54 per hour. For comparison:

MSK: $303 per hour → 5.6x more expensive compared to Ursa
Redpanda: $988 per hour → 18x more expensive compared to Ursa

Ursa’s leaderless architecture, zone affinity, and native cloud storage integration deliver unparalleled cost efficiency, making it the most cost-effective choice for high-throughput data streaming workloads.

The detailed infrastructure cost calculations for each data streaming engine are listed below:

StreamNative - Ursa

Server EC2 costs: 9 * $1.536/hr = $14
Client EC2 costs: 9 * $1.536/hr = $14
S3 write requests costs: 1350 r/s * $0.005/1000r * 3600s = $24
S3 read requests costs: 1350 r/s * $0.0004/1000r * 3600s = $2

AWS MSK

Server EC2 costs: 15 * $3.264/hr = $49
Client EC2 costs: 9 * $1.536/hr = $14
Interzone traffic - producer to broker: 5GB/s * ⅔ * $0.02/GB(in+out) * 3600 = $240

Redpanda

Server EC2 costs: 9 * $1.536/hr = $14
Client EC2 costs: 9 * $1.536/hr = $14
Interzone traffic - producer to broker: 5GB/s * ⅔ * $0.02/GB(in+out) * 3600 = $240
Interzone traffic - replication: 10GB/s * $0.02/GB(in+out) * 3600 = $720
Interzone traffic - broker to consumer: $0 (fetch from local zone)
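The arithmetic above can be reproduced with a short Python sketch. All prices and rates are the figures listed above; the constant and function names are ours, chosen for illustration.

```python
SECONDS_PER_HOUR = 3600

EC2_M6I_8XLARGE = 1.536    # $/hr, on-demand
MSK_M7G_8XLARGE = 3.264    # $/hr, kafka.m7g.8xlarge
INTER_AZ_PER_GB = 0.02     # $/GB, in + out combined
S3_PUT_PER_1000 = 0.005    # $ per 1,000 write requests
S3_GET_PER_1000 = 0.0004   # $ per 1,000 read requests

THROUGHPUT_GB_S = 5
S3_REQUESTS_PER_S = 1350


def inter_az_cost(gb_per_s):
    """Hourly inter-AZ transfer cost for a sustained rate in GB/s."""
    return gb_per_s * INTER_AZ_PER_GB * SECONDS_PER_HOUR


def s3_request_cost(price_per_1000):
    """Hourly S3 request cost at the benchmark's request rate."""
    return S3_REQUESTS_PER_S * price_per_1000 / 1000 * SECONDS_PER_HOUR


clients = 9 * EC2_M6I_8XLARGE
produce_cross_az = THROUGHPUT_GB_S * 2 / 3      # two-thirds leaves the AZ
replication_cross_az = THROUGHPUT_GB_S * 2      # two followers per leader

hourly = {
    "Ursa": 9 * EC2_M6I_8XLARGE + clients
            + s3_request_cost(S3_PUT_PER_1000)
            + s3_request_cost(S3_GET_PER_1000),
    "MSK": 15 * MSK_M7G_8XLARGE + clients + inter_az_cost(produce_cross_az),
    "Redpanda": 9 * EC2_M6I_8XLARGE + clients
                + inter_az_cost(produce_cross_az)
                + inter_az_cost(replication_cross_az),
}

for name, cost in hourly.items():
    ratio = cost / hourly["Ursa"]
    print(f"{name}: ${cost:,.0f}/hr ({ratio:.1f}x Ursa)")
```

Swapping in a different throughput, AZ count, or transfer price makes it easy to see that inter-zone traffic, not compute, dominates the cost of the leader-based setups.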

Please note that we were unable to test Redpanda with Cloud Topics, as it remains an announced but unreleased feature and is not yet available for evaluation. Based on the limited information available, while Cloud Topics may help optimize inter-zone data replication costs, producers still need to traverse inter-availability zones to connect to the topic partition owners and incur inter-zone traffic costs of up to $240 per hour.

KIP-1123 (when implemented) will help mitigate producer-to-broker inter-zone traffic, but it is not yet available, and it only works with the default partitioner (that is, when no record partition or key is specified).
Redpanda’s leader pinning helps only when all producers for the pinned topic are confined to a single AZ. In multi-AZ environments (like our benchmark), inter-zone producer traffic remains unavoidable.

Additionally, Redpanda’s Cloud Topics architecture is not documented publicly. Their blog mentions "leader placement rules to optimize produce latency and ingress cost," but it is unclear whether this represents a shift away from a leader-based architecture or if it uses techniques similar to Ursa’s zone-aware approach.

We may revisit this comparison as more details become available.

Comparing Total Cost of Ownership

As highlighted earlier, with a BYOC Ursa setup, you can achieve 5 GB/s throughput at just 5% of the infrastructure cost of a traditional leader-based data streaming engine, such as Kafka or Redpanda, while managing the infrastructure yourself. This cost reduction comes from Ursa’s leaderless architecture and lakehouse-native storage design, which eliminate overhead costs such as inter-zone traffic and leader-based data replication and reduce the resources needed to sustain high data throughput.

Now, let’s examine the total cost comparison, evaluating Ursa alongside other vendors, including those that have adopted a leaderless architecture (e.g., Confluent WarpStream). This comparison is based on a 5 GB/s workload with a 7-day retention period, factoring in both storage costs and vendor costs. Here are the key findings:

Ursa ($164,353/month) is:
- 50% cheaper than Confluent WarpStream ($337,068/month)
- 85% cheaper than AWS MSK ($1,115,251/month)
- 86% cheaper than Redpanda ($1,202,853/month)
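For readers who want to check the percentages, the Python sketch below derives the savings from the monthly totals quoted above. The dictionary and function names are ours; the dollar figures are taken directly from the list.

```python
# Monthly TCO figures as listed above, in USD.
monthly_tco = {
    "Ursa": 164_353,
    "Confluent WarpStream": 337_068,
    "AWS MSK": 1_115_251,
    "Redpanda": 1_202_853,
}


def savings_vs(baseline, candidate):
    """Fraction by which `candidate` is cheaper than `baseline`."""
    return 1 - candidate / baseline


savings = {
    name: savings_vs(cost, monthly_tco["Ursa"])
    for name, cost in monthly_tco.items()
    if name != "Ursa"
}

for name, fraction in savings.items():
    print(f"Ursa vs {name}: {fraction:.1%} cheaper")
```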

In addition to its architectural advantages (eliminating most inter-AZ traffic and leveraging lakehouse storage for cost-effective data retention), Ursa adopts a fairer and more cost-efficient pricing model: Elastic Throughput-based pricing. This approach aligns costs with actual usage, avoiding unnecessary overhead.

Unlike WarpStream, which charges for both storage and throughput, Ursa ensures that customers only pay for the throughput they actively use. Ursa’s pricing is based on compressed data sent by clients, meaning the more data compressed on the client side, the lower the cost. In contrast, WarpStream prices are based on uncompressed data, unfairly inflating expenses and failing to incentivize customers to optimize their client applications.

This distinction is crucial, as compressed data reduces both storage and network costs, making Ursa’s pricing model not only more cost-effective but also more transparent and predictable.

Cost Breakdown

StreamNative – Ursa

EC2 (Server): 9 × $1.536/hr × 24 hr × 30 days = $9,953.28
S3 Write Requests: 1,350 r/s × $0.005/1,000 r × 3,600 s × 24 hr × 30 days = $17,496
S3 Read Requests: 1,350 r/s × $0.0004/1,000 r × 3,600 s × 24 hr × 30 days = $1,400
S3 Storage Costs: 5 GB/s × $0.021/GB × 3,600 s × 24 hr × 7 days = $63,504
Vendor Cost: 200 ETU × $0.50/hr × 24 hr × 30 days = $72,000

WarpStream

Based on WarpStream’s pricing calculator (as of January 29, 2025), we assume a 4:1 client data compression ratio, meaning 20 GB/s of uncompressed data translates to 5 GB/s of compressed data.
It's important to note that WarpStream’s pricing structure has fluctuated frequently throughout January. We observed the cost reported by their calculator changing from $409,644 per month to $337,068 per month. This variability has been previously highlighted in the blog post “The Brutal Truth About Kafka Cost Calculators”. To ensure transparency, we have documented the pricing as of January 29, 2025.

MSK

EC2 (Server): 15 × $3.264/hr × 24 hr × 30 days = $35,251
Interzone Traffic (Client-Server): 5 GB/s × ⅔ × $0.02/GB (in+out) × 3,600 s × 24 hr × 30 days = $172,800
Storage: 5 GB/s × $0.10/GB-month × 3,600 s × 24 hr × 7 days × 3 replicas = $907,200

Redpanda

EC2 (Server): 9 × $1.536/hr × 24 hr × 30 days = $9,953
Interzone Traffic (Client-Server): 5 GB/s × ⅔ × $0.02/GB (in+out) × 3,600 s × 24 hr × 30 days = $172,800
Interzone Traffic (Replication): 5 GB/s × 2 × $0.02/GB (in+out) × 3,600 s × 24 hr × 30 days = $518,400
Storage: 5 GB/s × $0.045/GB-month (st1) × 3,600 s × 24 hr × 7 days × 3 replicas = $408,240
Vendor Cost: $93,333 per month (based on limited information. See additional notes below).
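The breakdown above can be recomputed with the following Python sketch, which uses only the line items listed (WarpStream is omitted because its figure comes from a pricing calculator, not from line items). Names are ours. With the inputs exactly as listed, the Ursa and MSK sums match the published totals to the dollar, and the Redpanda sum lands within about $130 of the published $1,202,853.

```python
HOURS_PER_MONTH = 24 * 30
SECONDS_PER_MONTH = 3600 * HOURS_PER_MONTH
RETAINED_GB = 5 * 3600 * 24 * 7   # 5 GB/s kept for 7 days


def inter_az_month(gb_per_s, price_per_gb=0.02):
    """Monthly inter-AZ transfer cost for a sustained rate in GB/s."""
    return gb_per_s * price_per_gb * SECONDS_PER_MONTH


def s3_requests_month(requests_per_s, price_per_1000):
    """Monthly S3 request cost for a sustained request rate."""
    return requests_per_s * price_per_1000 / 1000 * SECONDS_PER_MONTH


ursa = sum([
    9 * 1.536 * HOURS_PER_MONTH,          # EC2 brokers
    s3_requests_month(1350, 0.005),       # S3 writes
    s3_requests_month(1350, 0.0004),      # S3 reads
    RETAINED_GB * 0.021,                  # S3 storage
    200 * 0.50 * HOURS_PER_MONTH,         # vendor cost (200 ETU)
])

msk = sum([
    15 * 3.264 * HOURS_PER_MONTH,         # MSK brokers
    inter_az_month(5 * 2 / 3),            # producer-to-broker traffic
    RETAINED_GB * 0.10 * 3,               # storage, 3 replicas
])

redpanda = sum([
    9 * 1.536 * HOURS_PER_MONTH,          # EC2 brokers
    inter_az_month(5 * 2 / 3),            # producer-to-broker traffic
    inter_az_month(5 * 2),                # replication traffic
    RETAINED_GB * 0.045 * 3,              # st1 storage, 3 replicas
    1_120_000 / 12,                       # estimated vendor cost
])

for name, total in [("Ursa", ursa), ("MSK", msk), ("Redpanda", redpanda)]:
    print(f"{name}: ${total:,.0f}/month")
```

Note how storage dominates the MSK total and inter-zone traffic dominates the Redpanda total, while Ursa's largest items are S3 storage and the vendor fee.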

Additional Notes

Redpanda does not publicly disclose its BYOC pricing, making it difficult to accurately assess its total costs. For estimation purposes, we refer to the whitepaper “Redpanda vs. Confluent: A Performance and TCO Benchmark Report by McKnight Consulting Group”. Based on the Tier-8 pricing model in the whitepaper, the estimated cost to support a 5 GB/s workload would be $1.12 million per year ($93,333 per month). Since this figure is an estimate, we will revisit and refine the cost assessment once Redpanda publishes its BYOC pricing.

When estimating the storage costs for Kafka and Redpanda, we assume the use of HDD (st1) storage at $0.045/GB-month, based on the premise that both systems can fully utilize disk bandwidth without incurring the higher costs associated with GP2 or GP3 volumes. However, in practice, many users opt for GP2 or GP3, significantly increasing the total storage cost for Kafka and Redpanda.
Unlike disk-based solutions, S3 storage does not require capacity preallocation—Ursa only incurs costs for the actual data stored. This contrasts with Kafka and Redpanda, where preallocating storage can drive up expenses. As a result, the real-world storage costs for Kafka and Redpanda are often 50% higher than the estimates above.

Conclusion

Ursa represents a transformative shift in streaming data infrastructure, offering cost efficiency, scalability, and flexibility without compromising durability or reliability. By leveraging a leaderless architecture and eliminating inter-zone data replication, Ursa reduces infrastructure cost by about 95% and total cost of ownership by about 85% in this benchmark compared to traditional leader-based streaming engines like Kafka and Redpanda. Its direct integration with cloud storage and scalable metadata and index management via Oxia ensure high availability and simplified infrastructure management.

Balancing Latency and Cost

Ursa accepts slightly higher latency in exchange for ultra-low cost, making it an ideal choice for the majority of streaming workloads, especially those that prioritize throughput and cost savings over ultra-low latency. Meanwhile, StreamNative’s BookKeeper-based engine remains the preferred solution for real-time, latency-sensitive applications. By combining these two approaches, StreamNative gives customers the flexibility to choose the right engine for their specific needs, whether that is maximizing cost savings or achieving ultra-low-latency real-time performance.

The Future of Streaming Infrastructure

In an era where data fuels AI, analytics, and real-time decision-making, managing infrastructure costs is critical to sustaining innovation. Ursa is not just a cost-cutting alternative—it is a forward-thinking, lakehouse-native platform that redefines how modern data streaming infrastructure should be built and operated.

Whether your priority is reducing costs, improving flexibility, or ingesting massive data into lakehouses, Ursa delivers a future-proof solution for the evolving demands of real-time data streaming. Get started with StreamNative Ursa today!

References

[Oxia] https://streamnative.io/blog/introducing-oxia-scalable-metadata-and-coordination

[Ursa] https://streamnative.io/blog/ursa-reimagine-apache-kafka-for-the-cost-conscious-data-streaming

[StreamNative pricing] https://docs.streamnative.io/docs/billing-overview

[WarpStream pricing] https://www.warpstream.com/pricing#pricingfaqs

[AWS S3 pricing] https://aws.amazon.com/s3/pricing/

[AWS EBS pricing] https://aws.amazon.com/ebs/pricing/

[AWS MSK pricing] https://aws.amazon.com/msk/pricing/

[The Brutal Truth about Kafka Cost Calculators] https://bigdata.2minutestreaming.com/p/the-brutal-truth-about-apache-kafka-cost-calculators

[Redpanda vs. Confluent: A Performance and TCO Benchmark Report by McKnight Consulting Group] https://www.redpanda.com/resources/redpanda-vs-confluent-performance-tco-benchmark-report#form

Matteo Merli

Matteo is the CTO at StreamNative, where he brings rich experience in distributed pub-sub messaging platforms. Matteo was one of the co-creators of Apache Pulsar during his time at Yahoo!. Matteo worked to create a global, distributed messaging system for Yahoo!, which would later become Apache Pulsar. Matteo is the PMC Chair of Apache Pulsar, where he helps to guide the community and ensure the success of the Pulsar project. He is also a PMC member for Apache BookKeeper. Matteo lives in Menlo Park, California.

Neng Lu

Neng Lu is currently the Director of Platform at StreamNative, where he leads the engineering team in developing the StreamNative ONE Platform and the next-generation Ursa engine. As an Apache Pulsar Committer, he specializes in advancing Pulsar Functions and Pulsar IO Connectors, contributing to the evolution of real-time data streaming technologies. Prior to joining StreamNative, Neng was a Senior Software Engineer at Twitter, where he focused on the Heron project, a cutting-edge real-time computing framework. He holds a Master's degree in Computer Science from the University of California, Los Angeles (UCLA) and a Bachelor's degree from Zhejiang University.

Hang Chen

Hang Chen, an Apache Pulsar and BookKeeper PMC member, is Director of Storage at StreamNative, where he leads the design of next-generation storage architectures and Lakehouse integrations. His work delivers scalable, high-performance infrastructure powering modern cloud-native event streaming platforms.

Penghui Li

Penghui Li is passionate about helping organizations to architect and implement messaging services. Prior to StreamNative, Penghui was a Software Engineer at Zhaopin.com, where he was the leading Pulsar advocate and helped the company adopt and implement the technology. He is an Apache Pulsar Committer and PMC member.
