October 28, 2025
8 min read

What is Orca Agent Engine?

Neng Lu
Director of Platform, StreamNative

Introducing Orca Agent Engine

Autonomous AI agents are moving from research labs to real-world production, but until now the infrastructure to support them at enterprise scale has been lacking. Many teams have tried building AI agents in notebooks or demos using various frameworks, only to hit walls in production due to fragmented data, brittle pipelines, and siloed agent processes. Orca Agent Engine (formerly called StreamNative Agent Engine) is our answer to this challenge – an event-driven runtime and infrastructure designed for always-on, real-time AI agents. It provides a unified streaming backbone so developers can bring their own AI agents and run them with live data in a robust, scalable way.

Orca is not just another agent framework or library – it’s a streaming-native infrastructure layer for deploying, coordinating, and scaling AI agents in production. Think of it as the “missing backbone” that takes you from a prototype agent in a notebook to a production-grade autonomous service. Built on Apache Pulsar’s battle-tested serverless computing foundation (Pulsar Functions), Orca enhances agents with event-driven capabilities. You simply package your existing agent code – whether built with Google’s Agent Development Kit (ADK), OpenAI’s agent APIs, or even plain Python – and deploy it on Orca. Once deployed, the agent automatically joins a shared event bus and registry, immediately tapping into live data streams, maintaining its own state, and emitting actions under the platform’s governance and observability.
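To make the deployment model concrete, here is a minimal sketch of the Pulsar Functions pattern that Orca builds on. The `MyNotebookAgent` class and its `run()` method are hypothetical stand-ins for agent code you already have, and Orca’s actual packaging interface may differ; the point is that the agent logic stays ordinary Python while the wrapper is what the platform invokes for each event.

```python
# A sketch of the Pulsar Functions model underlying Orca.
# `MyNotebookAgent` and run() are hypothetical stand-ins for an
# agent you already built (ADK, OpenAI, or plain Python).
from pulsar import Function


class MyNotebookAgent:
    """Placeholder for existing agent logic."""

    def run(self, text: str) -> str:
        return f"decision for: {text}"


class AgentFunction(Function):
    """Event-driven wrapper: invoked once per event on the input topic."""

    def __init__(self):
        self.agent = MyNotebookAgent()

    def process(self, input, context):
        context.get_logger().info("event received: %s", input)
        # The return value is published to the function's output topic,
        # where it can trigger downstream agents.
        return self.agent.run(input)
```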

In essence, Orca Agent Engine provides the always-on “nervous system” that modern AI agents need but traditional setups lack. Instead of isolated bots operating on stale snapshots of data, each agent connects to a real-time stream of events that delivers fresh, millisecond-level context. The shared event bus acts as a live context layer and communication channel for agents, so they can react to new events instantly and even talk to each other by publishing events. Orca also gives agents a built-in memory: each agent maintains a persistent, distributed state that is continually updated and externalized as events, available for recall or audit. No more “black-box” agents with hidden state – an agent’s observations and decisions become part of an event log that can be inspected and traced later, providing much-needed transparency.

Crucially, Orca’s architecture is cloud-native, scalable, and resilient by design. Because it leverages a streaming data platform under the hood, agents benefit from horizontal scaling, load balancing, and fault tolerance out of the box. Agents run as distributed functions across a cluster, so there is no single choke point. If one instance goes down, others seamlessly take over, preventing any single agent failure from breaking the workflow. In short, Orca handles the hard parts of running always-on, distributed agents, providing the operational backbone for AI agents much as Kubernetes did for microservices. Developers can focus on their agents’ logic and goals while the platform takes care of real-time data plumbing, scaling, and reliability.

Key Capabilities of Orca Agent Engine

Orca introduces a new paradigm for building real-time, AI-driven applications. Its core capabilities include:

  • Event-Driven Streaming Runtime: Agents are “always on,” continuously listening to event streams and emitting new events. Rather than waiting for HTTP requests or periodic batches, agents subscribe to Apache Pulsar or Apache Kafka topics and react the moment events occur. This streaming-first design lets AI agents operate on up-to-the-second information – perfect for scenarios where data never sleeps. One agent’s output can trigger other agents by publishing events, forming an asynchronous pipeline of decisions and actions driven entirely by data flows (a consumer-loop sketch of this pattern appears after this list).
  • Shared Event Bus (Unified Nervous System): All agents (and other workflows or applications) communicate over a unified event bus, eliminating silos. This bus provides a shared context layer for your AI ecosystem: agents no longer poll for updates or run in isolation, but receive a live feed of context (e.g. sensor readings, user actions, database changes, or other agents’ outputs) and can broadcast their own insights or alerts to others. The result is a network of agents that collaborate in real time, share facts, and avoid redundant work. The event bus comes with built-in features like message ordering, persistence, back-pressure handling, and replay – thanks to Pulsar’s log storage and the Kafka-compatible Ursa engine – so agents can even “time-travel” by replaying past events to recover context or test new logic (a short replay sketch appears after this list).
  • Persistent Streaming Memory: Each agent can maintain stateful memory beyond a single prompt-response cycle. Backed by a distributed state store, an agent’s intermediate results or important observations are logged as events and stored for future reference. In practice, this means an agent can “remember” context over long conversations or continually learn from new data, rather than being stateless between requests. Because this memory is externalized to the event stream, you gain full visibility into what the agent knows – every piece of state or decision rationale can be audited and replayed later. This tackles one of the biggest challenges of agentic AI: making their decision-making process transparent and reproducible (a state-API sketch follows this list).
  • MCP Integration: In modern agent systems, functions are tools, and Orca streamlines safe tool use via the Model Context Protocol (MCP). Introduced by Anthropic, MCP provides a uniform, secure way for agents to invoke external tools and access data. Orca embraces MCP so agents can call REST APIs, query databases, read from live streams, invoke cloud services, or even manage infrastructure (e.g., Pulsar clusters) through a single interface. Behind the scenes, StreamNative’s open-source MCP Server bridges Pulsar/Kafka with external systems and exposes integrations as on-demand, discoverable functions. Define a tool once, with its schema and authorization, and any agent can use it without custom glue code or credential sprawl. Combined with Orca’s unified registry, tools and even other agents become callable MCP components, with dynamic discovery keeping capabilities up to date. The result is a governed, auditable tool ecosystem that expands what agents can do, from vector lookups to workflow execution, while preserving security and control (see the MCP tool sketch after this list).
  • Modular, Composable Agents: Unlike monolithic chain-of-thought scripts, Orca encourages a decomposed, microservices-like approach to building complex agents. Complex tasks can be split into multiple specialized agents or functions that each handle a sub-task and communicate via events. For example, a “fast path” agent might apply quick rule-based decisions on incoming events, while a “smart path” agent performs deeper LLM-powered analysis on trickier cases – both orchestrated through the event bus. This modular design makes workflows dynamic and evolvable: agents can decide at runtime to invoke different tools or even spawn other agents based on the situation. You can add, remove, or update individual agents (much like updating microservices) without rewriting a giant centralized program. In essence, Orca enables building a collaborative agent mesh – a collection of agents that discover and call each other as needed to solve a problem together (the fast-path/smart-path sketch after this list illustrates the split).
  • Unified Registry and Tool Directory: Every agent deployed via Orca is registered in a central registry alongside other components like connectors and functions. This acts as a directory of all “brains” (agents) and available tools, along with their metadata (interfaces, versions, owners, etc.). The benefit is twofold: (a) Operators get one control plane to manage and monitor all agents – you can see what agents are running, their status, update them, set permissions, etc. in one place. (b) Agents themselves can perform dynamic lookup of tools or peer agents at runtime. For instance, an orchestrator agent might query the registry to find a specialized “expert” agent or a function, then invoke it as a sub-task. This makes it much easier to build composed workflows where agents use other agents or services as tools, without hard-coding all integrations. The combination of the registry and the event bus enables late-binding and discovery of capabilities at runtime, adding tremendous flexibility.
  • Bring Your Own Agent (Framework-Agnostic): One of Orca’s biggest strengths is its openness to existing AI ecosystems. Orca doesn’t force you to rewrite your logic in a new DSL or adhere to a proprietary “agent” API. Instead, you can plug in the agents you’ve already built with the tools you love. Whether your agent is powered by Google’s Agent Development Kit, OpenAI’s Agents API, or just custom Python code, it can run on Orca without modification. This means developers can leverage popular frameworks and models (LangChain, LangGraph, and others are on the roadmap) while still benefiting from Orca’s event-driven runtime and governance. In practice, you can bring an existing agent (for example, an OpenAI agent you’ve already built) and see it immediately run on live streaming data. This lowers the barrier to moving from prototype to production: there’s no need to rebuild your agent from scratch; simply deploy it on Orca and gain the streaming “superpowers” of the platform.
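To ground the event-driven runtime described in the first bullet, here is a standalone sketch using the open-source pulsar-client Python library. The broker URL, topic names, and the `analyze()` rule are illustrative assumptions rather than Orca APIs; on Orca, the subscribe-and-publish wiring below is handled by the runtime itself.

```python
# Sketch: subscribe to a topic, react per event, publish results that
# can trigger other agents. Topics and the rule are illustrative.
import json
import pulsar


def analyze(event: dict) -> dict:
    # Stand-in for real agent logic (rules, an LLM call, etc.).
    return {"order_id": event.get("order_id"),
            "suspicious": event.get("amount", 0) > 10_000}


client = pulsar.Client("pulsar://localhost:6650")
consumer = client.subscribe("persistent://public/default/orders",
                            subscription_name="fraud-agent")
producer = client.create_producer("persistent://public/default/fraud-alerts")

while True:
    msg = consumer.receive()  # block until the next event arrives
    try:
        verdict = analyze(json.loads(msg.data()))
        if verdict["suspicious"]:
            # One agent's output becomes another agent's input.
            producer.send(json.dumps(verdict).encode("utf-8"))
        consumer.acknowledge(msg)
    except Exception:
        consumer.negative_acknowledge(msg)  # let Pulsar redeliver on failure
```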
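Replay, mentioned in the event-bus bullet, can be pictured with the same client library: rewinding a subscription replays history through the agent. Again, the topic and subscription names are illustrative.

```python
# Sketch: "time-travel" by rewinding a subscription to the earliest event.
import pulsar

client = pulsar.Client("pulsar://localhost:6650")
consumer = client.subscribe("persistent://public/default/agent-decisions",
                            subscription_name="replay-audit")
consumer.seek(pulsar.MessageId.earliest)  # rewind and replay history

for _ in range(100):  # inspect the first 100 replayed events
    msg = consumer.receive()
    print(msg.data().decode("utf-8"))
    consumer.acknowledge(msg)

client.close()
```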
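The persistent-memory bullet builds on the Pulsar Functions state API, sketched below with illustrative state keys; Orca’s actual memory interface may differ.

```python
# Sketch: durable agent memory via the Pulsar Functions state API.
# The state keys and counting scheme are illustrative assumptions.
from pulsar import Function


class MemoryAgent(Function):
    def process(self, input, context):
        key = context.get_current_message_topic_name()
        # Recall how many events this agent has already seen...
        seen = context.get_counter(f"events:{key}")
        # ...then externalize the new observation as durable state,
        # where it can be audited or replayed later.
        context.incr_counter(f"events:{key}", 1)
        context.put_state(f"last:{key}", input)
        return f"seen {seen + 1} events on {key}; last = {input}"
```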
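For the MCP bullet, here is what defining a tool once looks like with the open-source `mcp` Python SDK (FastMCP). The `check_stock` tool and its data are made up; StreamNative’s MCP Server ships its own integrations, and this sketch only shows the define-once, discover-anywhere shape.

```python
# Sketch: define a tool once over MCP; any MCP-capable agent can then
# discover and call it. The tool below is a made-up example.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("inventory-tools")


@mcp.tool()
def check_stock(sku: str) -> int:
    """Return units on hand for a SKU (illustrative stub)."""
    inventory = {"widget-1": 42, "widget-2": 0}
    return inventory.get(sku, 0)


if __name__ == "__main__":
    mcp.run()  # serves the tool over MCP (stdio transport by default)
```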
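Finally, the fast-path/smart-path split from the modular-agents bullet might look like two small functions whose topics are wired together at deploy time. The threshold rule, escalation topic, and `call_llm()` helper are hypothetical.

```python
# Sketch: two cooperating agents on the event bus. The rule threshold,
# escalation topic, and call_llm() helper are hypothetical.
from pulsar import Function


def call_llm(prompt: str) -> str:
    # Stand-in for a real model call (OpenAI, ADK, etc.).
    return f"llm verdict for: {prompt}"


class FastPathAgent(Function):
    """Cheap rule-based triage applied to every incoming event."""

    def process(self, input, context):
        if float(input) < 1_000:  # input assumed to be a numeric string
            return "approve"      # quick verdict on the output topic
        # Escalate tricky cases to the smart path via the event bus.
        context.publish("persistent://public/default/needs-review", input)
        return None


class SmartPathAgent(Function):
    """Deeper LLM-powered analysis, subscribed to the escalation topic."""

    def process(self, input, context):
        return call_llm(f"Review this transaction: {input}")
```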

Operational Backbone for AI Agents

In summary, Orca Agent Engine transforms AI agents from stateless functions or chatbots into always-on, event-driven services with persistent memory and enterprise-grade observability. By leveraging Apache Pulsar or Kafka as a shared event bus, it enables agents to collaborate and discover each other via the Model Context Protocol (MCP) – creating an “agent mesh” where autonomous agents can coordinate actions and share context in real time. And with built-in audit logs and governance, every agent decision and action can be traced end-to-end, which is critical for trust and compliance in production AI systems. Orca Agent Engine provides a simple, neutral, and future-proof backbone for organizations looking to operationalize AI agents on live data streams. It bridges the gap between cutting-edge AI logic and reliable, scalable infrastructure, allowing developers and architects to focus on the what (the agent’s goals and logic) while Orca handles the how (the real-time data integration, scaling, fault tolerance, and oversight).

Neng Lu
Neng Lu is currently the Director of Platform at StreamNative, where he leads the engineering team in developing the StreamNative ONE Platform and the next-generation Ursa engine. As an Apache Pulsar Committer, he specializes in advancing Pulsar Functions and Pulsar IO Connectors, contributing to the evolution of real-time data streaming technologies. Prior to joining StreamNative, Neng was a Senior Software Engineer at Twitter, where he focused on the Heron project, a cutting-edge real-time computing framework. He holds a Master's degree in Computer Science from the University of California, Los Angeles (UCLA) and a Bachelor's degree from Zhejiang University.
