Video | May 29, 2025 | 25 min

Efficient Kafka Topic Compaction at Scale on Ursa

Session Overview

Discover cost-effective topic compaction strategies using Ursa. Learn to efficiently manage data in large-scale streaming environments while minimizing storage costs.

TL;DR

In large-scale data streaming environments, efficiently managing stateful applications while controlling storage costs is a major challenge. Ursa, an S3-based, Kafka-compatible server, addresses this challenge with advanced topic compaction techniques that leverage S3 object storage for cost-effective durability while maintaining essential Kafka guarantees. This approach allows for scalable, cost-effective topic compaction in cloud-native architectures.

Opening

Imagine managing a massive stream of data without the crushing costs of traditional storage systems. That's the challenge Ursa seeks to solve with its compaction techniques. By leveraging S3 object storage, Ursa offers a scalable way to maintain the latest state of the data, keeping only the most recent update for each key as values change over time. This approach reduces storage costs while relying on S3 for the durability and availability of the data.

What You'll Learn (Key Takeaways)

  • Efficient Topic Compaction – Ursa employs advanced topic compaction techniques using S3 object storage, which ensures data durability while minimizing storage costs.
  • Cost-Effective Storage Management – By intelligently minimizing S3 requests through minor and major compactions, Ursa effectively balances storage cost and compaction efficiency.
  • Scalable Cloud-Native Architecture – Ursa’s design supports large volumes of data and numerous keys, making it a robust choice for scalable cloud-native data streaming.

Q&A Highlights

Q: How does Ursa manage the cost associated with S3 API calls? A: Ursa employs minor and major compaction strategies to reduce the frequency of S3 API calls, striking a balance between storage efficiency and cost-effectiveness.
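To make the trade-off in this answer concrete, here is a minimal sketch of a generic two-tier compaction scheme. It is not Ursa's implementation: the `Store` class, the `fan_in` parameter, and the request counters are all hypothetical, and the model assumes one GET per object read and one PUT per object written.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy stand-in for an object store that counts requests (hypothetical)."""
    objects: list[dict[str, str]] = field(default_factory=list)  # oldest first
    gets: int = 0
    puts: int = 0

    def merge(self, start: int, end: int) -> None:
        """Merge objects[start:end] into one object, keeping the newest value per key."""
        if end - start < 2:
            return  # nothing to merge
        merged: dict[str, str] = {}
        for obj in self.objects[start:end]:
            self.gets += 1      # one read request per input object
            merged.update(obj)  # newer objects overwrite older values
        self.puts += 1          # one write request for the merged object
        self.objects[start:end] = [merged]


def minor_compaction(store: Store, fan_in: int = 4) -> None:
    """Merge only the newest `fan_in` objects (cheap, frequent)."""
    n = len(store.objects)
    store.merge(max(0, n - fan_in), n)


def major_compaction(store: Store) -> None:
    """Merge every object into one (expensive, infrequent)."""
    store.merge(0, len(store.objects))


store = Store(objects=[{"a": "1"}, {"b": "1"}, {"a": "2"}, {"c": "1"}, {"b": "2"}])
minor_compaction(store, fan_in=3)
print(len(store.objects), store.gets, store.puts)
major_compaction(store)
print(store.objects, store.gets, store.puts)
```

In this toy model a minor compaction touches only a few recent objects, so it costs a handful of requests, while a major compaction rereads everything but leaves a single object for later readers to fetch. How Ursa actually schedules and sizes the two kinds of compaction is covered in the talk, not here.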

Q: What guarantees does Ursa provide regarding data durability? A: Ursa ensures that the latest value for each key is retained and is not removed until it is explicitly deleted, maintaining critical Kafka-compatible guarantees.
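The "latest value per key" guarantee can be illustrated with a small sketch of Kafka-style log compaction semantics. This is a simplified in-memory model, not Ursa's code: the `Record` tuple and the `compact` function are hypothetical names, and a `None` value is assumed to act as a delete marker (a tombstone), as it does in Kafka compacted topics.

```python
from typing import Iterable, Optional

# (offset, key, value); a value of None is treated as a tombstone (delete marker)
Record = tuple[int, str, Optional[str]]


def compact(log: Iterable[Record], drop_tombstones: bool = False) -> list[Record]:
    """Keep only the highest-offset record for each key, in offset order."""
    latest: dict[str, Record] = {}
    for offset, key, value in log:
        prev = latest.get(key)
        if prev is None or offset > prev[0]:
            latest[key] = (offset, key, value)
    survivors = sorted(latest.values(), key=lambda r: r[0])
    if drop_tombstones:
        # After a tombstone has been retained long enough, the key disappears entirely.
        survivors = [r for r in survivors if r[2] is not None]
    return survivors


log = [(0, "a", "1"), (1, "b", "2"), (2, "a", "3"), (3, "b", None)]
print(compact(log))
print(compact(log, drop_tombstones=True))
```

Key `a` survives with its latest value because nothing deleted it, while key `b` is removed only because an explicit delete marker was written for it.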

Q: How does Ursa handle large volumes of keys in data streams? A: By leveraging its unique compaction service and indexing system, Ursa efficiently manages high volumes of data, ensuring scalable performance in cloud-native environments.

About Speaker

Penghui Li

Penghui Li is passionate about helping organizations architect and implement messaging services. Prior to StreamNative, Penghui was a Software Engineer at Zhaopin.com, where he was the leading Pulsar advocate and helped the company adopt and implement the technology. He is an Apache Pulsar Committer and PMC member.