
TL;DR
StreamNative's Ursa engine tackles the cost and scale challenges of streaming data into lakehouses. It writes directly to object storage, avoiding expensive inter-zone replication traffic, and registers streaming data with catalogs such as Amazon S3 Tables, Snowflake Open Catalog, and Databricks Unity Catalog. The result is lower cost and complexity, with real-time applications able to query streaming data as tables.
Opening
In the rapidly evolving landscape of AI and data-driven applications, organizations face the dual challenge of scaling their data infrastructure while managing costs. Traditional streaming solutions often carry hidden expenses from inter-zone network traffic and complex data pipelines. StreamNative's Ursa engine addresses this by streaming data directly into popular lakehouse formats like Apache Iceberg and Delta Lake, and by integrating with leading catalog services for data discoverability and governance.
What You'll Learn (Key Takeaways)
- Efficient Data Streaming: StreamNative's Ursa engine cuts costs by eliminating inter-zone traffic, using a leaderless architecture that writes directly to object storage.
- Seamless Catalog Integration: Built-in support for catalogs such as Amazon S3 Tables, Snowflake Open Catalog, and Databricks Unity Catalog simplifies data management by automatically registering streaming data as queryable tables.
- Real-time Data Availability: Ursa enables real-time applications to access and query streaming data in table formats, facilitating immediate insights and decision-making.
- Scalable Architecture: The engine is 100% compatible with the Kafka API, supporting scalable and flexible data streaming solutions tailored to modern cloud environments.
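Because Ursa is Kafka-API compatible, existing Kafka clients can produce to it unchanged. The sketch below shows how an application might serialize an event for such a producer; the endpoint, topic, and security settings are hypothetical placeholders, not values from this article.

```python
# Sketch: producing events to a Kafka-API-compatible Ursa cluster.
# The bootstrap server, topic name, and credentials below are placeholders.
import json

# Hypothetical connection settings -- any standard Kafka client config works,
# since Ursa speaks the Kafka protocol.
producer_config = {
    "bootstrap.servers": "my-ursa-cluster.example.com:9093",  # placeholder
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
}

def encode_event(event: dict) -> bytes:
    """Serialize an event to the JSON bytes a Kafka producer would send."""
    return json.dumps(event, sort_keys=True).encode("utf-8")

event = {"order_id": 42, "amount": 19.99}
payload = encode_event(event)

# With confluent_kafka installed and a live cluster, the send would be:
#   from confluent_kafka import Producer
#   Producer(producer_config).produce("orders", value=payload)
print(payload.decode("utf-8"))  # -> {"amount": 19.99, "order_id": 42}
```

Once produced, the same data surfaces as a table through the configured catalog, with no separate connector pipeline to maintain.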
Q&A Highlights
Q: Who manages the table lifecycle, Ursa or the catalog provider?
A: It depends on whether the tables are external or internal. External tables are managed by the catalog provider; internal tables are managed by StreamNative, which maintains a single copy of the data that backs both streaming and historical reads.
Q: Can you use your own compaction tools?
A: Custom compaction tools may not be supported for internal tables, but StreamNative collaborates with customers to optimize data maintenance and compaction strategies.
Q: How is compatibility with schema evolution managed?
A: Schema evolution is not yet supported but is on the roadmap; StreamNative is prioritizing it based on customer feedback and evolving needs.
Q: What permissions are needed for setup?
A: Setup requires permissions for catalog integration, for StreamNative to write to object storage, and for catalog providers such as Databricks or Snowflake to read the data. Step-by-step setup guides are available through StreamNative Academy.
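As an illustration of the object-storage write permission, a minimal AWS IAM policy statement might look like the following. This is a sketch only: the bucket name is a placeholder, and the exact actions and resources depend on your storage provider and deployment, so consult the official setup guides.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "UrsaObjectStorageWrite",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-lakehouse-bucket",
        "arn:aws:s3:::my-lakehouse-bucket/*"
      ]
    }
  ]
}
```

A corresponding read-only policy would be granted to the catalog provider's querying principal.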