November 11, 2025
8 min read

Data Streaming Summit 2025 — On-Demand Is Live

Kathy Song
Operations at StreamNative

We wrapped the Data Streaming Summit San Francisco on September 30. Since then, we’ve been editing and polishing every talk so you can revisit the ideas—or catch the ones you missed. Today, the full set of DSS 2025 session videos is live on demand, and we’re spotlighting the morning Keynote that set the tone for the day.

Watch the Keynote

“Streaming cost is very high. Streaming and analytics live in silos. And real-time AI is now a real requirement.” With that, StreamNative CEO Sijie Guo opened the second in-person Data Streaming Summit and framed the morning around three concrete goals: scale without runaway bills, erase the boundary between streams and tables, and prepare infrastructure for agents that operate on live data.

Watch the DSS 2025 Keynote Now ▶︎ 

Keynote Recap: From Data to Intelligence — Cost, Lakehouse, and AI Agents

What followed was a tightly connected story that doubles as a blueprint for data streaming in the Agentic Era. The pattern is intentionally simple and repeatable: stream the data, accelerate the insights, and empower the agents. StreamNative showed how Ursa, a lakehouse‑native engine, is production‑ready and now plugs directly into classic Pulsar clusters, giving teams a way to bend the cloud cost curve without rewriting applications. StreamNative Cloud added managed Apache Iceberg tables in Databricks Unity Catalog, turning topics into queryable, governed tables the moment events arrive—no connectors, no cron jobs, no duplicate copies to keep in sync. And the new Orca Agent Engine places AI agents in the event fabric with state, governance, delayed delivery, and replay built in. Leaders from LinkedIn and OpenAI rounded out the morning with hard‑earned patterns for scaling beyond partition‑era limits, simplifying consumption for product teams, and preparing for the next 10×.

Sijie centered the conversation on three pressures nearly every team feels. Cost comes first: in public clouds, inter‑AZ data transfer, disk‑based replication, and monolithic cluster sizing conspire to make steady‑state streaming expensive. Silos come next: moving data from topics to tables saddles teams with a connector tax—extra compute, extra network hops, and duplicated data—just to make streams usable for analytics. And AI is no longer speculative: agents, retrieval, online features, and human‑in‑the‑loop workflows now depend on fresh signals, clean decoupling, and end‑to‑end governance. The keynote answered each of these with incremental architecture—choices you can adopt at your own pace—that carry data from motion, to rest, to action.

Scale Without Runaway Cost — Ursa, the Lakehouse‑Native Engine

Matteo Merli, StreamNative Co-founder and CTO, began by retracing the pain that motivated Ursa. Crossing availability‑zone boundaries for replication and connectors incurs hefty inter‑AZ cost; disk‑based replication forces brokers to chat constantly; and monolithic, partition‑bound clusters must be over‑provisioned to survive peaks. Downstream, every new sink process adds connector compute, network overhead, and duplicate storage. Ursa addresses these directly by treating object storage as primary for lakehouse‑native streams while preserving a disk write‑ahead path for low‑latency topics. The effect is one pipeline with two profiles—cost‑optimized and latency‑optimized—that still lands one canonical copy in the lakehouse so analytics never fall behind.
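A back-of-envelope calculation makes the replication pressure concrete. The per-GB rate and replica count below are illustrative assumptions, not numbers from the talk, but they show why steady-state inter-AZ traffic dominates a streaming bill:

```python
# Back-of-envelope inter-AZ replication cost. The rate and replication
# factor are illustrative assumptions, not figures from the keynote.
ingest_gbps = 5.0      # sustained ingest in GB/s (matches the benchmark scale)
extra_replicas = 2     # copies written into other AZs (RF=3, leader local)
inter_az_rate = 0.02   # assumed USD per GB crossing an AZ boundary (in + out)

gb_per_month = ingest_gbps * 60 * 60 * 24 * 30
monthly_cost = gb_per_month * extra_replicas * inter_az_rate
print(f"~${monthly_cost:,.0f}/month just for replication traffic")
# ~$518,400/month -- before connector hops cross AZ boundaries again
```

Routing that same traffic through object storage, as Ursa does for lakehouse-native topics, takes the cross-AZ multiplier out of the steady state entirely.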

Earlier this year Ursa moved from preview to GA, gained production proof, and earned a Best Industry Paper award for its architecture. The most important operational detail is adoption: Ursa now ships as a storage extension inside classic Pulsar, so operators can enable lakehouse integration per namespace or per topic, keep disk where latency is critical, and let everything else flow to object storage over time—no migration day. In a 5 GB/s benchmark, the design removed inter‑AZ churn, trimmed over‑provisioned compute, and eliminated disk‑replication tax, yielding dramatic cost reductions.

Make Streams First‑Class Tables — Unity Catalog + Iceberg, Natively Governed

Kundan from StreamNative and Michelle from Databricks made the stream‑to‑table path feel built‑in. Over the past year, StreamNative Cloud added Iceberg REST catalogs, Delta Lake with Unity Catalog, Snowflake Open Catalog, and Amazon S3 Tables. On stage, the integration matured again: managed Iceberg tables in Unity Catalog now sit alongside Delta, so teams can choose their open format without changing their pipeline. In practice, you register a catalog once, point a cluster at it, and select the topics that should materialize as tables.

The “Acme Commerce” demo showed orders, products, and customers streaming into StreamNative Cloud; those topics immediately surfaced as Iceberg tables in Unity Catalog and landed as Parquet and manifests in S3. In Catalog Explorer, attribute‑based access control masked PII as soon as it arrived, and Genie answered natural‑language questions by generating SQL to confirm shape and freshness. The takeaway is strategic and simple: governed, open tables should be the default target for streaming data, and catalogs should manage access, discovery, lineage, metrics, and quality from the first write. This is how you erase the stream/warehouse divide and run a streaming lakehouse that keeps analytics in lockstep with operations.
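To sketch what the read side could look like, here is a hypothetical PyIceberg client pointed at a Unity Catalog Iceberg REST endpoint. The workspace URI, token, and table name are placeholders, not values from the demo:

```python
from pyiceberg.catalog import load_catalog

# Illustrative only: workspace URI, token, and table name are placeholders.
catalog = load_catalog(
    "unity",
    type="rest",
    uri="https://<workspace>.cloud.databricks.com/api/2.1/unity-catalog/iceberg",
    token="<databricks-token>",
)

# Topics materialized as Iceberg tables read like any other governed table.
orders = catalog.load_table("acme_commerce.orders")
print(orders.scan(limit=10).to_arrow())
```

The point of the pattern is that nothing stream-specific survives to the consumer: analysts and engines see an ordinary catalog entry, governed by the same access controls as everything else.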

Data in Action — Orca Places Agents in the Event Fabric

Neng Lu, Director of Platform Engineering at StreamNative, introduced the Orca Agent Engine with the keynote’s central premise: agents only become reliable when they are event‑driven. Orca is Python‑first and bring‑your‑own‑agent; if you’ve built on the OpenAI SDK or Google ADK, you package the agent and deploy it into the stream. The agent subscribes to topics, calls tools, and emits events. Under the hood, Orca guarantees at‑least‑once delivery, supports delayed messages for scheduled work, enables replay for backfill and recovery, enforces rate limits, and honors RBAC through StreamNative’s GA role‑based controls. Because Orca speaks MCP, agents can discover and invoke other tools and agents dynamically, turning brittle point‑to‑point chains into composable, event‑driven systems.
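Orca's packaging API wasn't shown in detail, but the shape of an event-driven agent is easy to sketch. The hypothetical loop below pairs the Pulsar Python client with the OpenAI Agents SDK; the topic names, agent definition, and ack-after-emit discipline are illustrative, not Orca's actual API:

```python
import pulsar
from agents import Agent, Runner  # OpenAI Agents SDK (pip install openai-agents)

# Hypothetical event-driven agent loop; names and topics are illustrative.
triage = Agent(
    name="order-triage",
    instructions="Classify each order event and flag anything anomalous.",
)

client = pulsar.Client("pulsar://localhost:6650")
consumer = client.subscribe("persistent://public/default/orders",
                            subscription_name="order-triage-agent")
producer = client.create_producer("persistent://public/default/order-flags")

while True:
    msg = consumer.receive()
    try:
        result = Runner.run_sync(triage, msg.data().decode("utf-8"))
        producer.send(result.final_output.encode("utf-8"))
        consumer.acknowledge(msg)           # ack only after emit: at-least-once
    except Exception:
        consumer.negative_acknowledge(msg)  # redeliver on failure
```

The engine's job is everything around this loop: delivery guarantees, delayed messages, replay, rate limits, and RBAC, so the agent code stays small.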

The live demo kept the code small and the lesson clear. A “weather agent” was zipped with a short YAML file and deployed, scaled from one to two replicas on command, and—after a restart—re‑hydrated context from persistent memory. A second agent queried the MCP registry, discovered the weather tool, and called it—proof that agents can cooperate through events rather than tight RPCs. Patterns like scheduled triggers, automatic retries, parallel fan‑out, agent meshes, and policy‑driven governance fall naturally out of this design, and Orca is available today across Serverless, Dedicated, and BYOC.

Architectures That Validate the Blueprint — LinkedIn and OpenAI

LinkedIn unveiled Northguard, a next‑generation log store built for 32+ trillion records/day, 17+ PB/day, roughly 400,000 topics, and ~10,000 brokers across ~150 clusters. The shift is to make the segment—not the partition—the unit of replication. Topics are composed of ranges (sequences of segments). When a segment seals, the next segment chooses a fresh replica set—often including new brokers—so capacity is used immediately and clusters self‑balance without shuffling history. Metadata is sharded across Raft‑backed vnodes arranged on a consistent‑hash ring, so there is no single hot controller, and brokers only hold minimal global state. Operationally, brokers can be added without Cruise‑Control‑style moves, producer failover is near‑instant as a newly sealed segment takes over, and acknowledgments correspond to fsync on all replicas—on the order of every 10 ms, 20,000 records, or 10 MB.
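The vnode idea is easiest to see in miniature. The toy ring below hashes broker vnodes onto a circle and assigns each metadata shard to the next point clockwise; it is a simplified illustration of the placement scheme, not Northguard's implementation:

```python
import bisect
import hashlib

# Toy consistent-hash ring: metadata shards map to vnodes spread across
# brokers, so no single controller owns all of it. A simplified sketch.
def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class VnodeRing:
    def __init__(self, brokers: list[str], vnodes_per_broker: int = 64):
        points = [(_hash(f"{b}#{i}"), b)
                  for b in brokers for i in range(vnodes_per_broker)]
        points.sort()
        self._keys = [h for h, _ in points]
        self._owners = [b for _, b in points]

    def owner(self, shard_key: str) -> str:
        # A shard belongs to the first vnode clockwise from its hash.
        i = bisect.bisect(self._keys, _hash(shard_key)) % len(self._keys)
        return self._owners[i]

ring = VnodeRing([f"broker-{n}" for n in range(8)])
print(ring.owner("metadata-shard:topic/orders/range-17"))
```

Adding a broker just adds vnodes to the ring, which is why capacity can be absorbed without wholesale rebalancing.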

OpenAI showed how streaming powers the company’s data flywheel—usage events train better models which drive more usage—plus experimentation, model distillation, conversation search and memory extraction, rate limiting, counters, and ML features (via Chronon). The stack pairs Kafka for durable storage with Flink for processing, deployed across regions with a developer experience that hides infrastructure complexity. A publish proxy called Prism distributes writes across clusters; on the other side, uForwarder (from Uber) pulls from Kafka and pushes to consumers, handling retries and DLQs while fanning out beyond partition counts. The trade‑offs are explicit—no global ordering or cross‑cluster partitioning, at‑least‑once delivery by default—while the remedies are practical: sort by logical clocks downstream when needed, and rely on idempotent sinks or de‑duplication for exactly‑once behavior. Adoption tells the story: 200+ processors, roughly 80 GB/s peak throughput, and growth around 3× per quarter. The team is exploring tiered storage, disaggregated brokers, self‑healing control planes, and a native lakehouse injection path to further collapse the stream/warehouse divide.
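The de-duplication remedy is simple to express. Here is a minimal sketch, assuming producer-assigned event IDs and an idempotent sink; in production the seen-set would live in a keyed state store or be enforced by the sink's unique-key constraint:

```python
# Exactly-once effect over at-least-once delivery: drop duplicates before
# an idempotent write. Minimal in-memory sketch for illustration only.
seen_ids: set[str] = set()

def apply_once(event: dict, sink_write) -> None:
    event_id = event["id"]        # producer-assigned unique ID (assumed)
    if event_id in seen_ids:
        return                    # redelivered duplicate: ignore
    sink_write(event)             # idempotent or transactional write
    seen_ids.add(event_id)        # mark only after the write succeeds

# Usage: apply_once({"id": "evt-42", "user": "u1"}, print)
```

Pushing this logic to the sink keeps producers and brokers simple, which is exactly the trade-off the talk endorsed.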

Proof in the Wild — Customer Impact That Counts

Motorq, a connected‑vehicle intelligence platform, outlined a clear target: elastic scale without hand‑rolled isolation, native multi‑tenancy, lower cost, native lakehouse writes, and schema contracts to prevent drift. With StreamNative and Ursa, Motorq reports about 50% lower streaming cost, lakehouse latency down from about an hour to minutes, and ingestion cost ~60% lower. The pipeline itself is simpler—no custom connectors or sync jobs—while Schema Registry catches errors early. Features like Key_Shared subscriptions scale consumers without giving up per‑key ordering, and Iceberg tables (via the REST catalog) expose insights to both internal teams and external partners.
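Key_Shared is a standard Pulsar subscription type, so this part is concrete: consumers in the same subscription split traffic by key while each key stays ordered. A minimal consumer, with an illustrative topic name and a stub for the application logic:

```python
import pulsar

def process(payload: bytes) -> None:
    """Application logic, keyed by vehicle ID (stub for illustration)."""
    print(payload)

# Each consumer in a Key_Shared subscription receives a disjoint set of
# keys, so throughput scales out while per-key ordering is preserved.
client = pulsar.Client("pulsar://localhost:6650")
consumer = client.subscribe(
    "persistent://motorq/fleet/vehicle-telemetry",  # illustrative topic
    subscription_name="telemetry-processors",
    consumer_type=pulsar.ConsumerType.KeyShared,
)

while True:
    msg = consumer.receive()
    try:
        process(msg.data())
        consumer.acknowledge(msg)
    except Exception:
        consumer.negative_acknowledge(msg)  # redeliver, same key affinity
```

Adding a replica of this consumer is all it takes to scale out; Pulsar reassigns key ranges automatically.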

Where the Lakehouse Goes Next — A Fireside on Streaming, Lakehouse, and Agents

In the closing conversation, Sijie sat down with Reynold Xin, Databricks co‑founder and chief architect. Reynold revisited why the lakehouse exists: put data on open tables in object storage and run every workload there. He emphasized single‑file commit semantics—baked into Delta from the start and now surfacing across open formats—as essential for low‑latency ingestion. He argued that governance must reach streams, not just tables, and predicted that Parquet will evolve to be more stream‑friendly and faster to decode as the lakehouse becomes the default home for hot data. For agents, he offered a practical test: if you want them to act safely, infrastructure needs git‑for‑data—branching, checkpoint, replay—so exploration and automation don’t endanger production. Looking forward, he encouraged leaders to keep an open mind, experiment through the hype cycle, and optimize not only for throughput and cost but also for provisioning speed, elastic scale, and rapid branching—capabilities that matter when thousands of agents collaborate at machine bandwidth.

Real-World Insights: Breakout Sessions and Use Cases

With the stage set by the keynote, the Summit’s breakout sessions dove into practical challenges and innovations from across the industry. With four tracks running in parallel, attendees had to choose from a smorgasbord of topics—but a few talks truly stood out.

OpenAI — “Streaming to Scale: Real-Time Infrastructure for AI.” A deep look at how OpenAI’s engineers manage streaming data pipelines to serve AI models in production. The talk covered the architectural choices that ensure AI workloads get the data they need with minimal latency, and how streaming fits into an AI-driven organization’s stack.

Netflix — “Kafka Under Pressure.” An eye-opening tale of pushing Apache Kafka to its limits. Netflix shared the trials and triumphs of operating Kafka at massive scale, what happens when you max out throughput, and how they addressed bottlenecks—from broker tuning to architectural guardrails—to keep the platform reliable.

Salesforce — “Streaming 300B+ Telemetry Events per Day with Flink.” Yes, 300 billion daily events. Salesforce discussed their unified observability pipeline on Apache Flink—stateful processing, exactly-once guarantees, and the operational practices that keep services performant and monitored in real time.

Uber — “Safe Streams at Scale.” A masterclass in reliability: guaranteeing delivery across geo-distributed datacenters and implementing guardrails that prevent bad data or spikes from cascading into outages.

Blueshift — “Building a Scalable Customer Engagement Pipeline with Pulsar.” A startup perspective on moving from legacy queues to Pulsar for event ingestion and notifications—lower latency, higher fault tolerance, and patterns any team can borrow when modernizing messaging.

Google — “Beyond Stream Ingestion with Just SQL.” An exploration of streaming analytics with familiar tools. The team showed how far standard SQL (on Pub/Sub + BigQuery and friends) can go—and when “just SQL” simplifies problems that once demanded bespoke stream processors.

These are just a few highlights—30+ sessions covered everything from emerging streaming benchmarks to fintech, IoT, and AI case studies. A common thread ran through the day: operationalizing streaming. It’s not only about fast pipelines; it’s about making them cost-efficient, reliable, and integrated with the rest of the data estate—lakehouse tables, ML features, governance, and quality. The community vibe matched the content: Pulsar committers chatting with Kafka veterans, cloud engineers swapping tips with AI researchers.

Start Watching

Begin with the Keynote to see where data streaming is headed and how teams are delivering it in production. From there, dive into the full DSS 2025 on-demand playlist for the Netflix, OpenAI, and Uber architecture talks, Motorq’s customer story, and deep dives across Pulsar, Kafka, Flink, and Iceberg.

If you operate real-time systems, the path forward is more pragmatic than ever. Ursa lets you keep latency-sensitive topics on disk and move everything else to object storage while landing a consistent copy in the lakehouse. Unity Catalog integration removes the connector tax and brings governance to the moment of arrival. And Orca puts agents where they belong—in the stream—so they can perceive, reason, and act with the guarantees and controls you already trust.

Watch the Keynote, then explore the sessions that matter most to your roadmap. The videos are live; the ideas are yours to ship.
