Streaming Lakehouse
Real‑time streams. Open tables. One system.
Write streaming data directly into open table formats (Iceberg/Delta) and query it in seconds—without Kafka/Pulsar connector pipelines or dual systems. Drop‑in Kafka/Pulsar compatibility included.
Core Benefits
<1s ingest-to-consume
Immediacy
Sub-second ingest-to-consume freshness—events become visible within ~1s, enabling fraud detection, recommendations, and operational monitoring to run on fresh data (see the sketch below).
1 copy of data
Architectural simplicity
Eliminate Kafka-to-Lakehouse ETL, offset juggling, and recovery playbooks. One system powers streams and tables with unified metadata, data storage, and governance.
95% lower cost
Cost efficiency
Streams land in object storage as Iceberg/Delta tables—no broker disks, duplicate copies, or inter-AZ replication—so compute scales independently and costs drop by up to 95%.
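Concretely, the whole path can be exercised with standard tooling. A minimal sketch, assuming a Kafka-compatible endpoint and an Iceberg catalog named `lake` (the endpoint, topic, and table names are placeholders, not real defaults):

```python
from confluent_kafka import Producer
from pyspark.sql import SparkSession

# Produce with a stock Kafka client. The endpoint below is an
# illustrative placeholder for your Kafka-compatible address.
producer = Producer({"bootstrap.servers": "ursa.example.com:9092"})
producer.produce("payments", key=b"txn-42", value=b'{"amount": 99.5}')
producer.flush()  # returns once the event is durably written

# Seconds later the same event is a table row, readable by any
# Iceberg/Delta-capable SQL engine (Spark shown here).
spark = SparkSession.builder.getOrCreate()
spark.sql("SELECT * FROM lake.payments LIMIT 10").show()
```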
Why Streaming Lakehouse
End the "two‑systems" tax
Traditional stacks run Kafka/Pulsar beside a Lakehouse and sync via connectors. That doubles operations, creates stale data, and breaks the "single source of truth." Streaming Lakehouse folds streaming into the Lakehouse so new data is immediately queryable as a table—no staging topics, no micro-batches, no fragile connectors.
Stream‑table duality
Each event is both an ordered stream record and a row in an Iceberg/Delta table—one copy, two views. Low-latency consumers read the stream while SQL engines query the same bytes immediately, with consistent offsets, schema, debugging, and replay.
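A hedged sketch of the two views: a stock Kafka consumer tails the stream while any SQL engine scans the same data as a table (endpoint, group, and topic names are illustrative):

```python
from confluent_kafka import Consumer

# Stream view: ordinary offsets, consumer groups, and replay semantics.
consumer = Consumer({
    "bootstrap.servers": "ursa.example.com:9092",  # placeholder endpoint
    "group.id": "fraud-detector",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["payments"])

msg = consumer.poll(5.0)
if msg is not None and msg.error() is None:
    print(msg.offset(), msg.value())  # the same bytes a SQL scan would read

# Table view: no connector in between, e.g. (engine-specific):
#   SELECT count(*) FROM lake.payments
consumer.close()
```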
Anatomy of a Streaming Lakehouse
A three-layer model: Data · Metadata · Protocol
Learn more →
1. Data layer: Stream format (WAL → Parquet, open tables)
Events land durably in a write-ahead log and compact into Parquet with atomic catalog updates. Query engines see fresh + historical data through a union read path. Choose latency- or cost-optimized mode per stream.
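The union read path can be pictured as one continuous scan stitched from two sources: Parquet files committed to the catalog, then the uncompacted WAL tail. A purely illustrative sketch, not Ursa's actual internals; `snapshot` and `wal` are hypothetical objects:

```python
def union_read(snapshot, wal):
    """Illustrative union read: committed Parquet rows first, then the
    fresh WAL tail, presented to the query engine as one table scan."""
    # 1) Rows already compacted into Parquet and committed atomically
    #    to the catalog (the historical portion).
    for row in snapshot.scan_parquet():
        yield row
    # 2) Events still only in the write-ahead log. The snapshot records
    #    the last offset it covers, so no row is returned twice.
    for record in wal.read(start=snapshot.last_offset + 1):
        yield record.as_row()
```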
2. Metadata layer: Stream catalog (offset index + governance)
A streaming-aware catalog tracks schemas and a streaming offset index that maps offsets to WAL/Parquet files. That enables high-performance ingestion and unified governance across streams and tables.
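Conceptually, the offset index is a range map from offset intervals to physical files: WAL segments for fresh data, Parquet files once compacted. A toy sketch with made-up paths:

```python
from bisect import bisect_right

# Each entry: (first_offset_in_file, file_path, kind). Offsets are
# contiguous, so the entry covering an offset is the last one whose
# start is at or below it. All paths here are made up.
index = [
    (0,      "s3://lake/payments/data/part-000.parquet", "parquet"),
    (10_000, "s3://lake/payments/data/part-001.parquet", "parquet"),
    (20_000, "s3://lake/payments/wal/segment-17.wal",    "wal"),
]

def locate(offset: int):
    """Map a stream offset to the file that holds it."""
    starts = [first for first, _, _ in index]
    i = bisect_right(starts, offset) - 1
    if i < 0:
        raise KeyError(f"offset {offset} precedes retained data")
    return index[i]

print(locate(15_321))  # -> (10000, 's3://.../part-001.parquet', 'parquet')
```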
3. Protocol layer: Streaming API (stateless, multi-protocol)
Stateless services speak Kafka or Pulsar protocols and translate client calls to storage operations. Because brokers are stateless, you scale compute and storage independently, add capacity in seconds, and keep drop-in client compatibility.
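Multi-protocol means the same stream is reachable through either client library; the protocol layer maps both onto the same storage operations. A hedged sketch with a placeholder Pulsar endpoint:

```python
import pulsar

# A stock Pulsar client produces to the same underlying stream that
# Kafka clients and SQL engines read; the service URL is a placeholder.
client = pulsar.Client("pulsar://ursa.example.com:6650")
producer = client.create_producer("persistent://public/default/payments")
producer.send(b'{"amount": 12.0}')
client.close()
```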
Powered by StreamNative Ursa
StreamNative Ursa is the reference implementation of the Streaming Lakehouse blueprint—pairing leaderless, stateless brokers with an object-store WAL to turn Kafka streams into Iceberg/Delta tables with high-performance ingestion, elastic scale, predictable latency, and lower cost.
Read the Ursa paper →
Leaderless & Diskless
Ursa decouples compute from storage—brokers are leaderless and hold no local disks, eliminating elections, rebalancing, and hot partitions. Failover is instant, scale is elastic, and durability comes from a shared WAL plus object storage for predictable latency and lower cost.
Stream-as-table storage
Events land in a durable write-ahead log and compact into Parquet with atomic catalog commits. A range-based offset index keeps streams and tables in lockstep, enabling exactly-once ingestion, time travel, and a single copy of data in Iceberg/Delta.
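Because the result is a plain Iceberg/Delta table, standard snapshot features apply unchanged. For instance, Iceberg time travel through Spark SQL (table name, timestamp, and snapshot id are illustrative):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Standard Iceberg time travel: read the table as of an earlier moment...
spark.sql(
    "SELECT count(*) FROM lake.payments TIMESTAMP AS OF '2025-01-01 00:00:00'"
).show()

# ...or pin an exact snapshot id from the table's commit history.
spark.sql("SELECT * FROM lake.payments VERSION AS OF 123456789").show()
```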
Kafka-compatible, flexible ingest
Drop in with existing Kafka clients—idempotent producers, transactions, consumer groups—while stateless brokers translate protocol calls to the storage engine. Choose latency-optimized WAL or cost-optimized direct object-store writes to meet each workload’s SLOs and budget.
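Those client features work through the standard Kafka API. A hedged sketch of an idempotent, transactional producer (endpoint and ids are placeholders):

```python
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "ursa.example.com:9092",  # placeholder endpoint
    "enable.idempotence": True,                    # de-duplicated writes
    "transactional.id": "orders-writer",           # enables transactions
})

producer.init_transactions()
producer.begin_transaction()
producer.produce("orders", key=b"o-1001", value=b'{"status": "paid"}')
producer.commit_transaction()  # atomically commits the batch
```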
Compare Streaming Architectures
How Streaming Lakehouse stacks up against Kafka → ETL → Lakehouse, Kafka tiered storage, and streaming databases across data copies, freshness, compatibility, analytics, and ops.
| Feature            | Streaming Lakehouse      | Kafka → ETL → Lakehouse  | Kafka Tiered Storage     | Streaming DBs            |
|--------------------|--------------------------|--------------------------|--------------------------|--------------------------|
| Data copies        | 1 (open table)           | Multiple                 | 1 (proprietary log)      | 2 (replicate to tables)  |
| Freshness to query | Seconds                  | Minutes–hours            | Not table-native         | Varies                   |
| Ingestion protocol | Kafka or Pulsar          | Kafka or Pulsar          | Kafka only               | Custom                   |
| Query engine       | Any SQL on Iceberg/Delta | SQL after ETL            | Not columnar             | Vendor-specific          |
| Ops                | Single system            | Two systems + connectors | Kafka ops + object store | New system + replication |
Why Streaming Augmented Lakehouse (SAL) wins
One system, one copy. SAL writes streams directly to Iceberg/Delta as Parquet and keeps a unified catalog + streaming offset index, so the same bytes serve streams and tables. No connectors, less to break, lower cost, and immediate analytics.
More to explore
On-demand webinar: Streaming Lakehouse
A quick walkthrough of the three-layer architecture, real-world use cases, and live demos.
Watch now →
Blog series: De-composing Streaming Systems
Why streams need their Iceberg moment—data, metadata, and protocol explained with diagrams.
Read the series →
Talk to an expert
Get 1:1 guidance on architecture, migration paths, and POC scoping for your environment.
Contact us →
FAQ
Don’t see an answer to your question? Check our docs, or contact us directly.
Can I use my existing Kafka clients?
Yes. StreamNative Ursa is Kafka-protocol compatible, so you can migrate apps without code changes.
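In practice the migration is usually a configuration change only (hostnames below are placeholders):

```python
# Before: pointing at a self-managed Kafka cluster.
conf = {"bootstrap.servers": "kafka-1.internal:9092,kafka-2.internal:9092"}

# After: the unchanged application targets the Kafka-compatible endpoint.
conf = {"bootstrap.servers": "ursa.example.com:9092"}
```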
Do I still need Kafka Connect or Pulsar IO?
Not to land streams in open tables—Ursa writes directly to Delta/Iceberg. Use Kafka Connect or Pulsar IO when integrating with other external systems.
Which table formats do you support?
Delta Lake and Apache Iceberg.
What latency can I expect?
Classic Engine uses low-latency BookKeeper storage for latency-sensitive workloads; Ursa’s cost-optimized S3 WAL targets sub-second writes, typically ~200–500 ms, trading a bit of latency for major cost savings.
How do you handle failover and scaling?
Ursa’s brokers are leaderless and stateless—any broker can serve produce/fetch—so failover is immediate, capacity can be added in seconds without partition rebalancing, and inter-AZ replication traffic is reduced.
Build your Streaming Lakehouse.
Unify streams and tables on open formats. Ship real‑time products faster—at a fraction of the cost.