May 13, 2025

12 min

Introducing the StreamNative MCP Server: Connecting Streaming Data to AI Agents

Neng Lu

Director of Platform, StreamNative

Rui Fu

Staff Software Engineer, StreamNative

Over the past few weeks in our Agentic AI blog series, our CEO has explored the immense potential of integrating AI agents with real-time data streams.

In the last blog, we examined the Model Context Protocol (MCP), an open protocol introduced by Anthropic. It’s designed to enable seamless, secure, and standardized connections between AI models – especially large language models (LLMs) – and a wide range of external data sources, tools, and environments. With the protocol, AI agents can access and interact with external data sources in a universal, consistent way.

Today, we’re thrilled to unveil the StreamNative MCP Server and share it with all streaming enthusiasts as an open-source project. It connects any Kafka or Pulsar service to AI agents over MCP, whether or not that service runs on StreamNative Cloud. With the MCP Server, users can instruct agents in natural language to access fresh, real-time Kafka/Pulsar data and manage cluster resources, performing tasks such as configuring topics, publishing and consuming data, or even writing and submitting Pulsar Functions, all without wrestling with complex commands.

In the following sections, we’ll introduce the StreamNative MCP Server, explain how it works, and show how it connects Apache Pulsar, Apache Kafka, and StreamNative Cloud streams to AI in a unified, developer-friendly way.

What is the StreamNative MCP Server?

The StreamNative MCP Server (aka “streamnative-mcp-server” or “snmcp”) is an open-source implementation of the Model Context Protocol designed specifically to bring real-time streaming platforms, Apache Kafka and Apache Pulsar, closer to LLMs and AI agents. By running the MCP Server, you can securely expose a Pulsar or Kafka deployment – whether it’s on-premises, in StreamNative Cloud, or in another streaming vendor’s cloud – to any MCP-compatible AI client. This enables an LLM-based agent to read from, write to, and administer streams through a single standardized interface without any custom integration code. It significantly lowers the barrier to adopting streaming platforms and helps democratize streaming technology.

The server speaks MCP on one side and native Pulsar/Kafka protocols on the other. Because it adheres to the open MCP spec, it works out-of-the-box with any compliant client. Developers don’t need to reinvent protocols or worry about the underlying cluster details – the server abstracts those away using the familiar “tools,” “resources,” and “prompts” vocabulary that AI agents understand.

We're excited to open source the MCP Server under the Apache 2.0 license, making it freely available for everyone to use, inspect, deploy, and extend. We believe this is a key step in unlocking real-time streaming for AI and in accelerating innovation at the intersection of streaming and AI.

How It Works: Tools, Resources, and Prompts

To understand how the MCP Server enables AI-to-streaming integration, let’s briefly review the core MCP concepts it implements. In MCP, servers don’t simply expose raw data – they offer structured capabilities that the AI can utilize. The three primary capability types are:

Resources – Read-only data that the server makes available to clients and LLMs. Resources include files or data snippets that an AI agent can pull in as context. These resources provide structured data without additional computation needed.
Prompts – Predefined prompt templates or workflows that the server provides. Prompts serve as shortcuts for common interactions or tasks. Think of them as stored queries or conversation templates that the AI can invoke.
Tools – Tools are executable actions that the MCP Server provides to AI agents, representing the most powerful capability of the platform. Through tools, the MCP Server empowers AI agents to perform operations on streaming platforms and related systems with appropriate permissions and oversight. Each tool is essentially a function that an AI can invoke via the MCP protocol.

Under the hood, the MCP Server implements these concepts according to the MCP specification. When an AI agent connects, it can query the server for available tools, resources, and prompts (using standard MCP requests like tools/list and resources/list). The server advertises everything it can do in a discoverable way. Then, during an AI dialogue, the agent may choose to invoke a tool or retrieve a resource to fulfill the user’s request. The MCP Server receives those requests (formatted as JSON-RPC messages over the MCP connection) and translates them into actions on the Pulsar or Kafka protocol.
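The discovery and invocation exchange described above can be sketched as plain JSON-RPC 2.0 messages. The method names tools/list and tools/call come from the MCP specification, and the tool name is the one used in this post's example. The argument keys (resource, operation, topic) and the topic name are assumptions for illustration and may not match the server's real input schema.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique per connection


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# 1. Discovery: ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# 2. Invocation: call one tool by name with structured arguments.
call_tool = jsonrpc_request(
    "tools/call",
    {
        "name": "pulsar-admin-topics",
        "arguments": {  # hypothetical argument names
            "resource": "topic",
            "operation": "stats",
            "topic": "persistent://public/default/orders",
        },
    },
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

An MCP client library would normally build and send these messages for you; the point here is only that each agent action is a small, inspectable JSON document.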

For example, if a user asks the AI agent, “How many events per second are flowing through Pulsar topic X right now?”, the agent (via its MCP client) gathers the required parameters and calls a pulsar-admin-topics tool on the MCP Server to get topic stats. The server, in turn, uses Pulsar’s admin API to fetch the metrics for topic X, then returns that data to the AI agent, which incorporates it into a natural-language answer. All of this happens through the standardized MCP interface – the agent never needs to know the Pulsar protocol. It simply selects a tool by its name and description and invokes it on the MCP Server. This model aligns well with modern AI agent frameworks like ReAct (Reason+Act): the agent focuses on reasoning and deciding what tool action is needed (e.g., call pulsar-admin-topics), while the MCP Server handles how that action is executed against the streaming backend and returns the observation (the topic stats).
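On the server side, this flow amounts to mapping a tool name to a handler and wrapping the handler's output in a JSON-RPC response. The sketch below is not the StreamNative implementation: fetch_topic_stats is a hypothetical stand-in for a call to Pulsar's admin API and returns canned numbers, although msgRateIn and msgRateOut are field names that Pulsar reports in topic stats.

```python
import json


def fetch_topic_stats(topic):
    """Hypothetical stand-in for a Pulsar admin API call; returns canned data."""
    return {"topic": topic, "msgRateIn": 1250.0, "msgRateOut": 1248.5}


# Tool name -> handler. A real server would register many tools here.
TOOLS = {
    "pulsar-admin-topics": lambda args: fetch_topic_stats(args["topic"]),
}


def handle_request(request):
    """Translate one tools/call request into a JSON-RPC response."""
    params = request["params"]
    handler = TOOLS.get(params["name"])
    if handler is None:
        # -32602 is the JSON-RPC "Invalid params" error code.
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -32602, "message": "Unknown tool: " + params["name"]},
        }
    result = handler(params.get("arguments", {}))
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {
            "content": [{"type": "text", "text": json.dumps(result)}],
            "isError": False,
        },
    }


request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {
        "name": "pulsar-admin-topics",
        "arguments": {"topic": "persistent://public/default/orders"},
    },
}
response = handle_request(request)

# The agent side: turn the observation into a natural-language answer.
stats = json.loads(response["result"]["content"][0]["text"])
answer = f"Topic {stats['topic']} is receiving about {stats['msgRateIn']:.0f} msg/s."
print(answer)
```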

How it Connects: Agents to Streams with Safety

The Model Context Protocol, implemented by the StreamNative MCP Server, provides the essential building blocks – Tools, Resources, and Prompts – that fundamentally expand what AI Agents can achieve when interacting with streaming data platforms. By leveraging these MCP primitives, agents gain two critical advantages: the ability to perceive and react to the world in real-time (connecting to streams), and the capacity to act within a framework of unified, secure administration (with safety).

First, MCP Resources and Tools directly address the limitation of static LLM knowledge by granting agents access to live data streams. Agents can use specific tools to query current state, consume messages, or even subscribe to continuous data feeds. This closes the gap between the agent's knowledge cutoff and the "here-and-now" reality reflected in platforms like Kafka and Pulsar, enabling context-aware agents to make timely decisions based on the latest events. In practice, agents can perform real-time monitoring or provide interactive diagnostics based on the current state of the system and platform.

Second, the structured nature of MCP Tools and the ability to define accessible Resources provide the necessary foundation for governed agent actions. Administrators gain fine-grained control by selectively exposing specific tools and data resources to different agents. This allows AI agents to perform meaningful actions – like managing topics or checking real-time platform status with the authorized tools – while ensuring they operate within secure, predefined boundaries aligned with organizational policies. This capability is crucial for confidently deploying agents in enterprise environments, expanding their roles from passive information retrievers to active, yet controlled, participants in managing and interacting with streaming systems.

Therefore, the StreamNative MCP Server translates the potential of the Model Context Protocol into practice for your Kafka and Pulsar clusters. By providing controlled access to streaming capabilities and data, our server significantly enhances agent scope and reliability, enabling trustworthy, real-time AI applications. The next section details the specific features and capabilities built into the StreamNative MCP Server to deliver this value.

Key Features and Capabilities

Let’s drill into some of the technical highlights of the StreamNative MCP Server and what makes it developer-friendly:

🚀 30+ Built-In Tools and Actions

The MCP Server includes an elegantly designed toolkit of over 30 powerful tools that comprehensively cover the capabilities of modern streaming platforms.

Instead of building hundreds of single-purpose tools, we adopt an efficient approach by using 'Resource' and 'Operation' parameters within each tool, enabling one tool to handle multiple related functions. For example, the single `pulsar_admin_brokers` tool can list active brokers, check health status, and manage configurations through different parameter combinations. The toolkit supports a broad range of functionalities, including data operations (e.g., publishing or consuming messages), administrative tasks (e.g., creating topics, managing subscriptions, and monitoring broker statistics), and StreamNative Cloud resources management capabilities.
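A rough sketch of the 'Resource' and 'Operation' pattern: a single tool entry point dispatches on a (resource, operation) pair instead of registering a separate tool for each action. Only the tool name pulsar_admin_brokers comes from the text; the resource and operation names, the handlers, and their return values are made up for illustration.

```python
def list_brokers(args):
    """Canned data standing in for a real broker listing."""
    return ["broker-0:8080", "broker-1:8080"]


def check_health(args):
    return {"healthy": True}


def get_config(args):
    return {"name": args["name"], "value": "example"}


# (resource, operation) -> handler; all keys here are hypothetical.
BROKER_OPERATIONS = {
    ("brokers", "list"): list_brokers,
    ("health", "get"): check_health,
    ("config", "get"): get_config,
}


def pulsar_admin_brokers(resource, operation, **args):
    """One tool, many functions: route on the resource/operation pair."""
    try:
        handler = BROKER_OPERATIONS[(resource, operation)]
    except KeyError:
        supported = sorted(f"{r}/{o}" for r, o in BROKER_OPERATIONS)
        raise ValueError(
            f"Unsupported {resource}/{operation}; supported: {supported}"
        ) from None
    return handler(args)


print(pulsar_admin_brokers("brokers", "list"))
print(pulsar_admin_brokers("config", "get", name="maxMessageSize"))
```

Keeping the tool count low this way also keeps the tool list that the agent has to read during discovery short, at the cost of a slightly richer parameter schema per tool.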

With this library, an AI agent can perform a wide range of tasks on the data streaming platform. It can "create a new topic for user logs," "increase the retention of topic Y to 7 days," or "write and run a Pulsar Function to process data," selecting the right tools to carry out each request. Each tool accepts input parameters (defined by JSON schemas) and returns results, with actions subject to host application approval.
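To make schema-defined inputs and host approval concrete, here is a minimal sketch of a tool definition plus a host-side approval gate. The name, description, and inputSchema fields follow the MCP tool definition format, but the tool set_topic_retention, its parameters, and the approve callback are hypothetical. The validation covers only the required and type keywords, not full JSON Schema.

```python
TOOL = {
    "name": "set_topic_retention",  # hypothetical tool
    "description": "Set the retention time for a topic.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "topic": {"type": "string"},
            "retention_days": {"type": "integer"},
        },
        "required": ["topic", "retention_days"],
    },
}

_PY_TYPES = {"string": str, "integer": int, "boolean": bool}


def _matches(value, type_name):
    """Check a value against a JSON Schema primitive type name."""
    if type_name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, _PY_TYPES[type_name])


def validate(arguments, schema):
    """Return a list of problems; an empty list means this partial check passed."""
    problems = [f"missing: {key}" for key in schema.get("required", [])
                if key not in arguments]
    for key, value in arguments.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unexpected: {key}")
        elif not _matches(value, spec["type"]):
            problems.append(f"wrong type: {key}")
    return problems


def call_tool(tool, arguments, approve):
    """Validate the arguments, ask the host for approval, then 'execute'."""
    problems = validate(arguments, tool["inputSchema"])
    if problems:
        raise ValueError("; ".join(problems))
    if not approve(tool["name"], arguments):
        return {"status": "rejected"}
    # A real server would perform the action against the cluster here.
    return {"status": "executed", "tool": tool["name"], "arguments": arguments}


print(call_tool(TOOL, {"topic": "Y", "retention_days": 7}, lambda name, args: True))
```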

🔒 Secure by Design

Security is a fundamental consideration in the StreamNative MCP Server design. It employs a defense-in-depth approach to ensure safe and governed agent interactions with your system. The server integrates with your cluster's existing authorization model via specified service accounts for granular access control. A strict read-only mode (--read-only) can also be enabled for added protection in sensitive environments. Administrators also have fine-grained control through selective feature enablement (--features) to limit the agent's operational scope based on least privilege. Complementing these controls, the server's built-in prompts often incorporate their own restrictions, adding another layer of guidance to keep AI agents interacting within intended boundaries. This multi-layered security supports strict policies and minimizes the risk of unauthorized access or data manipulation.
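One way to picture how a read-only switch and a feature allow-list could combine is as two filters over the tool registry. This is an illustrative model, not the server's code: the feature names, tool names, and the mutating attribute are invented here, and the real --read-only and --features flags may behave differently in detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    feature: str    # hypothetical feature group the tool belongs to
    mutating: bool  # True if the tool changes cluster state or data


REGISTRY = [
    Tool("topic_stats", "pulsar-admin", False),
    Tool("create_topic", "pulsar-admin", True),
    Tool("consume", "pulsar-client", False),
    Tool("produce", "pulsar-client", True),
    Tool("kafka_list_topics", "kafka-admin", False),
]


def exposed_tools(registry, features=None, read_only=False):
    """Return the names of tools an agent would be allowed to see."""
    selected = []
    for tool in registry:
        if features is not None and tool.feature not in features:
            continue  # feature group not enabled
        if read_only and tool.mutating:
            continue  # read-only mode hides every state-changing tool
        selected.append(tool.name)
    return selected


print(exposed_tools(REGISTRY, features={"pulsar-admin"}, read_only=True))
```

Filtering at registration time means a disallowed tool never appears in the agent's tool list at all, which is a stronger guarantee than rejecting the call after the fact.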

🔌 Connector Integration

The StreamNative MCP Server is designed to work with the Universal Connect (UniConn) framework, which means AI agents can leverage the rich ecosystem of Pulsar IO and Kafka Connect connectors through MCP. If your cluster is already ingesting or sinking data via connectors (e.g., from databases, cloud storage, etc.), the MCP Server can expose those as tools or resources as well. For instance, the MCP Server can spin up a Debezium MySQL → Pulsar pipeline on demand and then let the AI agent tap that stream to pull the latest change event or an entire batch of recent transactions. UniConn provides a unified interface for connectors on Pulsar and Kafka, and those connectors effectively become extensions of the AI’s reach. This opens up a world of external systems (SQL, NoSQL, SaaS APIs, etc.) to the AI agent through the same MCP Server. The agent could ask something like “What’s the latest record in our analytics DB?” and, via a connector tool, fetch that in real time. No custom code is needed to integrate these external sources – if there’s a connector, the MCP Server can likely expose it.

🗄️ Dynamic Topic Management

Beyond simply reading or writing data, the MCP Server lets AI agents create, configure, and manage topics and subscriptions on the fly. An agent can spin up a brand-new stream (“Create a topic for sensor-XYZ data”), which maps to a pulsar-admin-topics call, or tweak retention, partition counts, and subscription properties using the same toolset. All changes respect cluster governance – quotas, ACLs, and policies still apply – but the agent can carry them out from natural-language requests instead of a CLI.

🧩 Serverless Function Management

Moreover, we integrated Pulsar Functions support, enabling the agent to deploy serverless functions or connectors by submitting function code or connector configs via a tool. Imagine telling your AI agent, “Deploy a function that scans for sensitive data, e.g., SSN, and masks it”, and the agent uses an MCP tool to submit the Pulsar Function to the cluster. This drastically lowers the barrier to deploying stream processing logic, as the AI can act as your DevOps helper for streaming jobs. All changes remain subject to your cluster’s governance – the AI won’t bypass quotas or authorization – but it provides a natural-language interface to tasks previously handled via CLI or GUI.
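As a sketch of what such a masking function might look like, the snippet below uses the language-native Python style of Pulsar Functions, where the function is a plain process(input) callable. The regular expression is a simplification: it assumes SSNs appear in the NNN-NN-NNNN form and that messages arrive as strings, so a production masker would need broader patterns.

```python
import re

# Matches SSN-shaped tokens such as 123-45-6789 (simplified assumption).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def process(input):
    """Mask every SSN-shaped substring in the incoming message."""
    return SSN_PATTERN.sub("***-**-****", input)


print(process("customer ssn=123-45-6789 updated"))
```

An agent would submit a file like this through a function-management tool; the exact tool name and deployment parameters depend on the server and are not shown here.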

📊 Streaming Data as First-Class Context

The StreamNative MCP Server supports streaming outputs using MCP’s event streaming features (currently based on JSON-RPC, with Server-Sent Events support coming soon). This means that when an AI agent subscribes to a topic via a tool, the server can feed data continuously to the client in a streaming fashion, rather than sending only one-off responses. The MCP protocol supports sending incremental results, so an agent could effectively “listen” to a topic. This real-time push of data is crucial for truly live agentic behavior – your agent could, for example, monitor a stream of user transactions and proactively flag anomalies during a conversation. Under MCP, the client side (the agent host) can choose to display or use streaming responses as they arrive. The key takeaway: real-time data isn’t just a one-shot query – it’s a continuous feed, and our MCP Server fully supports that mode.
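Since the text mentions Server-Sent Events, here is a small sketch of how messages consumed from a topic could be framed as SSE events and pushed incrementally, with a simple anomaly flag applied along the way. The consume generator is a hypothetical stand-in for a real Kafka or Pulsar consumer, and the JSON payload shape is an assumption rather than the MCP wire format.

```python
import json


def consume(topic):
    """Hypothetical stand-in for a consumer; yields a few canned messages."""
    for offset, amount in enumerate([12.5, 9800.0, 40.0]):
        yield {"topic": topic, "offset": offset, "amount": amount}


def flag_anomalies(messages, threshold=1000.0):
    """Pass messages through, tagging those whose amount exceeds the threshold."""
    for message in messages:
        yield {**message, "anomaly": message["amount"] > threshold}


def sse_events(messages):
    """Frame each message as one Server-Sent Event: a data line plus a blank line."""
    for message in messages:
        yield "data: " + json.dumps(message) + "\n\n"


# Generators are lazy, so each event is produced only when the client reads it.
events = list(sse_events(flag_anomalies(consume("transactions"))))
for event in events:
    print(event, end="")
```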

Interacting with Pulsar & Kafka via StreamNative MCP Server

Here are a few examples that showcase the StreamNative MCP Server’s capabilities; you can find additional demos in the StreamNative MCP Server playlist on YouTube.

Produce and Consume Kafka Messages with Avro Schema in Ursa

📺 Watch here

Create Kafka topic
Produce Kafka messages with Avro schema
Consume Kafka messages
Examine the message in Databricks
Delete resources

Managing Pulsar Tenants, Namespaces, and Topics

📺 Watch here

Create tenant
Create namespace
Create partitioned topic
Test topic
Set namespace TTL
Delete resources

Create, Deploy, and Test Python Pulsar Function

📺 Watch here

Create Python Pulsar Function with vibe coding
Deploy with MCP Server
Test with MCP Server
Delete resources

Laying the Foundation for Real-Time Enterprise AI Agents

The open-source release of the StreamNative MCP Server marks a significant milestone: it provides the foundation for what we envision as Real-Time Enterprise AI Agents, a complete environment for running AI agents natively with streaming data. With the MCP Server in place, AI agents can now connect to streaming systems to:

Retrieve up-to-the-second data for more accurate decision-making
Trigger transformations and pipelines via Pulsar Functions, ensuring the ability to enrich data on the fly
Tap into existing connectors to instantly access 200+ data sources without writing new integration code
Automate resource management and provisioning through natural language, reducing operational overhead and simplifying DevOps workflows

Future enhancements to the StreamNative MCP Server will unlock even more capabilities for fast, intelligent AI agents across diverse data landscapes.

Going Further with Ursa

Ursa, StreamNative’s next-generation, lakehouse-native data streaming engine, brings together real-time streaming data and lakehouse tables. Through MCP, AI agents gain unified access to both historical datasets (in Apache Iceberg or Delta Lake tables) and ongoing event streams – all from a single interface. This means no more relying on stale snapshots – agents can respond to live data, correlate it with archived knowledge, and deliver timely, context-rich insights.

Leveraging Pulsar Functions

Many users already rely on Pulsar Functions for real-time data processing and transformation. These business logic functions can now be directly utilized – or even dynamically created and updated – by AI agents through MCP. As a result, agents can perform in-flight analytics or adapt data pipelines based on changing requirements, making your event-driven architecture more intelligent and responsive.

Harnessing Connectors

StreamNative’s robust connector ecosystem, which covers everything from enterprise systems to SaaS platforms and databases, ensures that AI agents can connect to virtually any data source without custom coding. By removing the need for specialized integrations, developers save time and can focus on enhancing their AI-driven workflows.

Get Involved – Try it Out Today

The StreamNative MCP Server is available now on GitHub (under the StreamNative organization). We invite all streaming enthusiasts, data engineers, and curious tinkerers to download the code, read the docs, and play with it. We’ve provided instructions that show how to connect an AI client – such as the Claude Desktop app – to your own MCP Server and start issuing tool commands against a local Pulsar or Kafka topic.

Because this is an early release, we’re actively seeking feedback and contributions from the community. Join the conversation on GitHub to ask questions, share use cases, and get help from our engineers and fellow early adopters.

This launch is an invitation to explore the cutting edge of real-time AI integration. Whether you want to build:

An AI ops assistant that manages your streaming platform
An intelligent monitoring agent that watches your event data
Or a new breed of data-driven chatbot that can act on the information it retrieves

…the tools are now in your hands.

We believe Agentic AI – AI agents empowered with real-time context – will unlock a new class of applications. With the StreamNative MCP Server, connecting streaming data to AI is no longer theoretical – it’s something you can implement today.

Feel free to explore the repo, launch the StreamNative MCP Server, and unleash your AI agents on live data. We can’t wait to see what you create, and we look forward to building the future of real-time AI together with the community.

Neng Lu

Neng Lu is currently the Director of Platform at StreamNative, where he leads the engineering team in developing the StreamNative ONE Platform and the next-generation Ursa engine. As an Apache Pulsar Committer, he specializes in advancing Pulsar Functions and Pulsar IO Connectors, contributing to the evolution of real-time data streaming technologies. Prior to joining StreamNative, Neng was a Senior Software Engineer at Twitter, where he focused on the Heron project, a cutting-edge real-time computing framework. He holds a Master's degree in Computer Science from the University of California, Los Angeles (UCLA) and a Bachelor's degree from Zhejiang University.

Rui Fu

Rui Fu is a software engineer at StreamNative. Before joining StreamNative, he was a platform engineer at the Energy Internet Research Institute of Tsinghua University, where he led work on stream data processing and IoT platform development. Rui received his postgraduate degree from HKUST and an undergraduate degree from The University of Sheffield.
