keynote
120 mins
Keynote - Data Streaming Summit San Francisco 2025
Sijie Guo
Matteo Merli
Kundan Vyas
Neng Lu
Aravind Suresh
Ashwin Raja
Onur Karaman
Michelle Leon
Reynold Xin

From Data to Intelligence — Cost, Lakehouse, and AI Agents | Data Streaming Summit 2025 Keynote

How do we bridge streaming, analytics, and AI into a single, intelligent system that scales without runaway cost? At Data Streaming Summit 2025, StreamNative presented a bold new blueprint for the Agentic Era — a time when real-time data, lakehouse architectures, and AI agents converge into one unified, incremental stack.

This keynote session brings together leaders from StreamNative, Databricks, LinkedIn, and OpenAI to explore the technologies, architectures, and success stories shaping the future of intelligent data infrastructure.

🔷 A Unified Blueprint for the Agentic Era

The keynote opens with a connected story that doubles as a roadmap for the next decade of data infrastructure:
Stream the data. Accelerate the insights. Empower the agents.
From ingestion to decision-making, this architecture bridges motion (streaming), rest (lakehouse), and action (AI agents) — redefining how real-time intelligence is built and delivered.

⚙️ Ursa — The Lakehouse-Native Streaming Engine

StreamNative CTO Matteo Merli unveils Ursa, a new lakehouse-native engine built to bend the cloud cost curve. Ursa unifies low-latency streaming with cost-optimized object storage, eliminating connector overhead, cross-AZ replication costs, and complex synchronization layers.
Now production-ready as a Pulsar storage extension, Ursa enables enterprise workloads to scale efficiently — achieving high throughput, low latency, and significant cost savings.

🧩 Unity Catalog + Iceberg — Streams as First-Class Tables

Kundan Vyas (StreamNative) and Michelle Leon (Databricks) reveal how the Unity Catalog and Apache Iceberg integration transforms streaming topics into instantly queryable, governed tables.
This native integration allows organizations to unify governance, enforce fine-grained access control, and eliminate data duplication — effectively bridging the gap between streaming and the lakehouse.

🤖 Orca — Bringing AI Agents into the Event Fabric

Neng Lu, Director of Platform at StreamNative, introduces Orca, a streaming-native runtime for AI agents. Python-first and framework-agnostic, Orca allows agents to be deployed and coordinated directly inside event streams, bringing state, governance, replay, and dynamic tool discovery to real-time AI systems.
A live demo showcases how agents scale, recover, and collaborate autonomously — turning event streams into an intelligent, self-evolving fabric.

🏗️ Architectures That Validate the Vision — LinkedIn & OpenAI

Two global leaders shared how they are redefining large-scale streaming architectures:

  • LinkedIn unveiled Northguard, a log store handling 32 trillion records per day, with elasticity, self-balancing clusters, and segment-based replication.
  • OpenAI demonstrated how real-time streaming powers model training, experimentation, and conversational AI, achieving 80 GB/s throughput and 3× quarterly growth using Kafka, Flink, and custom infrastructure.
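To put the two headline figures on a common footing, here is a minimal back-of-the-envelope sketch. It assumes the 32 trillion records per day are spread evenly across the day and that the 80 GB/s rate is sustained rather than a peak, neither of which the talks confirm. The function and variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(records_per_day: float) -> float:
    """Average records per second for a given daily record count."""
    return records_per_day / SECONDS_PER_DAY


def petabytes_per_day(gigabytes_per_second: float) -> float:
    """Daily volume in decimal petabytes (1 PB = 1,000,000 GB) at a sustained rate."""
    return gigabytes_per_second * SECONDS_PER_DAY / 1_000_000


# Figures quoted in the keynote summary; even distribution is an assumption.
linkedin_rps = per_second(32e12)        # 32 trillion records per day
openai_pb_day = petabytes_per_day(80)   # 80 GB/s, assumed sustained

print(f"LinkedIn: about {linkedin_rps / 1e6:.0f} million records/s on average")
print(f"OpenAI: about {openai_pb_day:.1f} PB/day if 80 GB/s were sustained")
```

Under those assumptions the daily record count works out to roughly 370 million records per second on average, and 80 GB/s corresponds to about 6.9 PB per day.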

🚗 Proof in the Wild — Motorq’s Transformation Story

Motorq, a connected-vehicle intelligence platform, showcased tangible results after adopting StreamNative and Ursa:

  • 50% lower streaming costs
  • Faster lakehouse ingestion
  • Near real-time analytics with simplified pipelines and reduced sync overhead

These results show how the new architecture delivers measurable efficiency and performance at scale.

🔥 Fireside Chat — Where the Lakehouse Goes Next

The keynote closes with a conversation between Sijie Guo (StreamNative CEO) and Reynold Xin (Databricks Co-founder & Chief Architect).
They discuss how governance, open formats, and single-file commit semantics will shape the future of the lakehouse — and why streaming-native architectures are essential for reliable, large-scale AI and agentic systems.

🎯 Takeaways:

  • Understand how real-time streaming, lakehouse, and AI agents converge
  • Learn cost-optimization strategies for large-scale cloud systems
  • Explore how leading organizations build next-generation data infrastructure
  • Gain insight into open standards and design principles for the Agentic Era

Whether you’re a data architect, AI engineer, or system designer, this keynote provides a comprehensive, forward-looking view of how data becomes intelligence — from motion to action, and from cost to insight.

Sijie Guo
CEO and Co-Founder, StreamNative, Apache Pulsar PMC Member

Sijie’s journey with Apache Pulsar began at Yahoo!, where he was part of the team developing a global messaging platform for the company. He then went to Twitter, where he led the messaging infrastructure group and co-created DistributedLog and Twitter EventBus. In 2017, he co-founded Streamlio, which was acquired by Splunk, and in 2019 he founded StreamNative. He is one of the original creators of Apache Pulsar and Apache BookKeeper, and remains VP of Apache BookKeeper and a PMC member of Apache Pulsar. Sijie lives in the San Francisco Bay Area of California.

Matteo Merli
Co-Founder and CTO, StreamNative

Matteo is the CTO at StreamNative, where he brings rich experience in distributed pub-sub messaging platforms. During his time at Yahoo!, he worked to create a global, distributed messaging system that would later become Apache Pulsar, making him one of its co-creators. Matteo is the PMC Chair of Apache Pulsar, where he helps to guide the community and ensure the success of the Pulsar project. He is also a PMC member for Apache BookKeeper. Matteo lives in Menlo Park, California.

Kundan Vyas
Staff Product Manager, StreamNative

Kundan is a Staff Product Manager at StreamNative, where he spearheads StreamNative Cloud, Lakehouse Storage, and the compute platform for connectivity, functions, and stream processing. Kundan also leads Partner Strategy at StreamNative, focusing on building strong, mutually beneficial relationships that enhance the company's offerings and reach.

Neng Lu
Director of Platform, StreamNative

Neng Lu is currently the Director of Platform at StreamNative, where he leads the engineering team in developing the StreamNative ONE Platform and the next-generation Ursa engine. As an Apache Pulsar Committer, he specializes in advancing Pulsar Functions and Pulsar IO Connectors, contributing to the evolution of real-time data streaming technologies. Prior to joining StreamNative, Neng was a Senior Software Engineer at Twitter, where he focused on the Heron project, a cutting-edge real-time computing framework. He holds a Master's degree in Computer Science from the University of California, Los Angeles (UCLA) and a Bachelor's degree from Zhejiang University.

Aravind Suresh
Member of Technical Staff, OpenAI

Aravind Suresh leads the real-time infrastructure team at OpenAI, where he builds large-scale streaming, real-time, and ML infrastructure that powers AI products like ChatGPT and Sora. Previously, he led infrastructure efforts at Uber to enable exabyte-scale data analytics and AI initiatives across Rides, Eats, and Groceries. With over seven years of experience, Aravind specializes in designing and operating mission-critical, high-throughput data platforms for real-time analytics and machine learning systems.

Ashwin Raja
Co-Founder & CTO, Motorq

Ashwin Raja is the Co-Founder & CTO of Motorq, a leading SaaS company transforming connected car data into actionable insights. With over two decades in technology, he has built world-class engineering teams and scalable platforms at Microsoft, HBO, and multiple startups. At Motorq, Ashwin drives AI-powered telemetry solutions that help global customers unlock efficiencies and create new business models. A champion of innovation, ownership mindset, and mentorship, he has grown Motorq’s development centers into industry models. Ashwin is passionate about nurturing the next generation of tech leaders while delivering impactful solutions for the automotive and mobility ecosystem.

Onur Karaman
Northguard tech lead, LinkedIn

Onur is a Sr Staff Engineer at LinkedIn with an interest in distributed systems. He's the tech lead of Northguard, a log storage system with a focus on scalability and operability. Prior to Northguard, Onur was a committer to Apache Kafka, where he focused on Kafka's scalability. He redesigned the cluster's controller, made the controller use ZooKeeper's async APIs, and worked on the group coordinator and consumer group management protocol.

Michelle Leon
Product Manager, Databricks

Michelle is a Product Manager at Databricks, focusing on Unity Catalog and Lakehouse storage. She is based in San Francisco.

Reynold Xin
Co-founder and Chief Architect, Databricks

Reynold Xin is a co-founder and Chief Architect at Databricks, where he leads the development of core data systems including Apache Spark, Delta Lake, Photon, and Databricks SQL. He holds a PhD in Computer Science from the University of California, Berkeley, where he specialized in large-scale data systems.
