One Bus, Many Voices: Why Protocol Flexibility Matters for AI Agents

AI agent ecosystems are rarely homogeneous – they often involve a mix of languages, frameworks, and device types, each with its own preferred communication protocol. You might have edge IoT sensors speaking MQTT, web services using REST or AMQP, and data pipelines built on Kafka. Integrating all these “voices” into a cohesive system is daunting if your messaging infrastructure is inflexible. In this final post of our series, we explore how Apache Pulsar’s pluggable protocol architecture lets multiple protocols (Pulsar, Kafka, MQTT, and more) coexist on a single event bus. We’ll see how this flexibility reduces system sprawl and accelerates development of AI agents, compared to Apache Kafka’s single-protocol model, which often requires bolting on additional components.
(Earlier in this series, we discussed Pulsar’s support for multiple messaging patterns and its robust delivery guarantees (Streams vs Queues: Why Your Agents Need Both—and Why Pulsar Protocol Delivers and Reliability That Thinks Ahead: How Pulsar Helps Agents Stay Resilient). Now we look at another dimension of flexibility: multiple messaging protocols on one platform.)
Diverse Agents, Diverse Protocols
Let’s set the scene with an example: imagine a smart city AI system with various agents:
- IoT sensors (traffic cameras, weather stations) that send data via MQTT – a lightweight pub/sub protocol common in IoT.
- Backend analytics microservices written in Java using a Kafka client library (because the team has Kafka experience).
- Legacy systems or edge devices using AMQP (the protocol behind RabbitMQ and other message brokers) for certain messaging needs.
- Perhaps some mobile apps or web dashboards that communicate via WebSockets or REST.
In a traditional setup, you might deploy Kafka for the analytics pipeline, RabbitMQ for the AMQP devices, and an MQTT broker (like EMQX or Mosquitto) for the sensors. You’d then stitch these together: e.g., use Kafka Connect or custom bridges to pipe MQTT data into Kafka, and vice versa, or have services subscribe to multiple systems. This “many systems” approach leads to what we call system sprawl – multiple messaging infrastructures to operate and integrate. It introduces latency at the boundaries, increased ops overhead, and more points of failure.
Apache Kafka’s approach: Kafka uses its own custom binary protocol for client communication. Out of the box, Kafka speaks only the Kafka protocol. If you have non-Kafka clients (MQTT, AMQP, etc.), Kafka by itself cannot talk to them. You’d typically deploy auxiliary services:
- For MQTT, one approach is a bridge or proxy: for instance, Confluent (the company founded by Kafka’s creators) provides an MQTT proxy that translates MQTT to Kafka, or you run a separate MQTT broker and use a Kafka Connect source/sink to move data between MQTT and Kafka (a sketch of such a connector config appears below).
- For AMQP (e.g., RabbitMQ), you might have to consume from RabbitMQ and republish to Kafka (or vice versa) via a custom connector or application.
- Each additional protocol usually means an additional layer or service to translate. This not only adds complexity, but can also limit functionality. For example, if you bridge MQTT to Kafka, features like MQTT’s persistent sessions or Kafka’s exactly-once might not translate perfectly through the bridge.
In short, Kafka’s single-protocol design means that if everything isn’t speaking Kafka, you need glue code or middleware. Many architectures with Kafka end up with a patchwork of brokers: Kafka + a message queue + an MQTT broker, etc., which is exactly what we want to avoid if possible.
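For concreteness, here is roughly what one piece of that glue looks like: a Kafka Connect source connector that copies MQTT messages into a Kafka topic. The property names below follow the Confluent MQTT source connector; treat the broker address and topic names as illustrative placeholders, and check the connector’s documentation for your version.

```json
{
  "name": "mqtt-to-kafka-bridge",
  "config": {
    "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
    "mqtt.server.uri": "tcp://mosquitto.example.com:1883",
    "mqtt.topics": "sensor/temperature",
    "kafka.topic": "sensor-temperature"
  }
}
```

Every such bridge is one more deployment to size, secure, monitor, and upgrade – and a place where message semantics can be lost in translation.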
Apache Pulsar’s approach: Pulsar supports pluggable protocol handlers, which let the broker natively speak multiple protocols. In practice, this means you can configure a Pulsar cluster to understand Kafka’s protocol, MQTT, AMQP, and more – all while storing and delivering messages through Pulsar’s backend. The Pulsar community has developed KoP (Kafka-on-Pulsar), MoP (MQTT-on-Pulsar), and AoP (AMQP-on-Pulsar), among other plugins. With these enabled, a Pulsar broker effectively “speaks” the respective protocol:
- KoP: Kafka-on-Pulsar allows Kafka clients (producers/consumers using the Kafka API) to connect to Pulsar as if it were a Kafka broker. The Pulsar broker listens on Kafka’s port (e.g., 9092) and understands Kafka protocol messages. An existing application coded against the Kafka Java client can switch to Pulsar just by pointing it at the Pulsar cluster (with KoP enabled) – no code changes (see the sketch after this list). The data it produces and consumes is actually stored in Pulsar topics, not Kafka logs, but the application is none the wiser. This dramatically eases migrations – teams can move to Pulsar without rewriting their whole codebase at once. Moreover, once on Pulsar, those apps gain access to Pulsar’s features such as multi-tenancy and effectively unbounded retention on cheaper storage tiers – capabilities Kafka has historically lacked or required add-ons for.
- MoP: MQTT-on-Pulsar works similarly for IoT scenarios. Your swarm of MQTT devices can connect to Pulsar brokers (with MoP enabled) using the standard MQTT protocol. They publish and subscribe as if to a regular MQTT broker; under the hood, Pulsar stores those messages in its distributed log. This means you don’t need a dedicated MQTT broker for your sensors – Pulsar handles it, and all of Pulsar’s features (durability, geo-replication, tiered storage of old data) become available to the MQTT streams as well. MQTT deployments often rely on ephemeral brokers that can lose data when a consumer isn’t online; Pulsar’s storage ensures that even if an IoT device goes offline, its data is retained until it reconnects, or can be replayed later.
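To make the “no code changes” point concrete, here is a minimal sketch of a completely ordinary Kafka producer in Java. Nothing in it is Pulsar-specific; the only assumptions (ours, for illustration) are that KoP is enabled on the cluster, listening on the conventional Kafka port 9092, and that a topic named city-events exists.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CityEventsProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Point at the Pulsar brokers instead of a Kafka cluster.
        // With KoP enabled, Pulsar answers on the Kafka wire protocol.
        props.put("bootstrap.servers", "pulsar-broker.example.com:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // An ordinary Kafka send; the record actually lands in a Pulsar topic.
            producer.send(new ProducerRecord<>("city-events",
                    "intersection-42", "{\"congestion\": 0.73}"));
        }
    }
}
```

Swapping `bootstrap.servers` is the entire migration from the application’s point of view.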
Beyond these, Pulsar’s design makes adding other protocols relatively straightforward. There is also WebSocket support, and there have been experiments with other systems (for example, an integration called RocketMQ-on-Pulsar). The key point is that Pulsar’s brokers translate each protocol into Pulsar’s internal message format and back. All messages, regardless of how they arrive, end up in the same durable, scalable storage and can be routed to any consumer. This unified bus can drastically simplify an AI architecture – and enabling it is a matter of broker configuration, as the sketch below shows.
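Here is a minimal sketch of what enabling the handlers might look like in broker.conf, assuming the KoP and MoP plugin archives have been dropped into the protocols directory. The property names match the handlers’ published configuration as we know it, but verify them (and the defaults) against the documentation for your Pulsar and plugin versions.

```properties
# broker.conf – load the Kafka and MQTT protocol handlers
# (plugin .nar files placed in the directory below)
messagingProtocols=kafka,mqtt
protocolHandlerDirectory=./protocols

# KoP: where the broker listens for Kafka-protocol clients
kafkaListeners=PLAINTEXT://0.0.0.0:9092

# MoP: where the broker listens for MQTT clients
mqttListeners=mqtt://0.0.0.0:1883
```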
Why Does This Matter for AI Agents?
1. Easier integration of heterogeneous components: AI systems are evolving rapidly, and new tools or services come with their own interfaces. With Pulsar, you don’t have to constrain every component to one protocol. If your robotics team likes MQTT for device telemetry and your data science team likes Spark consuming from Kafka topics, that’s fine – both can work with the same Pulsar cluster. The MQTT devices publish to Pulsar (via MoP) and the Spark job (via KoP) can subscribe to that data, all in real time. No need to maintain a bridge or duplicate the data in two systems. This means you can plug in new agent components faster. The learning curve is lower too: developers can use the client libraries they are already familiar with (Kafka client, MQTT client, etc.) to interface with Pulsar. It lowers the barrier to adoption for various teams contributing to the agent ecosystem.
2. Reduced system sprawl and cost: Running one Pulsar cluster to handle multiple messaging needs is generally more efficient than running 2–3 separate systems (Kafka + RabbitMQ + MQTT broker). There’s less hardware overhead and fewer subsystems to monitor. For architects, this means fewer single-purpose data silos. Pulsar can act as a “single source of truth” event bus where all agent communications converge, even if they speak different protocols. Maintenance and scaling efforts focus on one system. It’s worth noting that Pulsar’s multi-protocol support doesn’t significantly degrade its performance; in many cases, the overhead of protocol translation is small compared to network and IO costs. So you can simplify your stack without sacrificing throughput.
3. Protocol-agnostic data flow: Because Pulsar decouples message storage from the protocol, an event produced via one protocol can be consumed via another. For instance, an MQTT sensor publishes a message on topic “sensor/temperature,” which is stored in Pulsar. A Kafka client can subscribe to the equivalent Pulsar topic (through KoP) and receive those temperature events as if they came from Kafka (see the sketch after this list). This inter-protocol bridging is automatic in Pulsar – the topic is the common denominator. In the Kafka world, such bridging usually requires writing a Kafka connector or a custom adapter service that reads from one system and writes to another, introducing additional latency and points of failure. Pulsar’s unified approach enables more direct, real-time data sharing across heterogeneous agents.
4. Future-proofing and innovation: With Pulsar’s plugin model, you’re less likely to hit a dead end when new tech comes along. If tomorrow a new standard protocol gains popularity in the AI/agents space, there’s a path to support it on Pulsar by writing a new protocol handler. With Kafka, in contrast, you might have to wait for the ecosystem to build a stable connector or gateway, or run that new system separately. Pulsar’s flexibility thus acts as a hedge against changing technology. It also means you can transition systems gradually: for example, run Pulsar with KoP to serve your existing Kafka-based apps, and over time migrate those apps to Pulsar’s native API if desired (for even more features). During the migration, everything continues to interoperate. This “have your cake and eat it” approach speeds up adoption – Tencent, for instance, has used Pulsar to replace Kafka under the hood for certain use cases, precisely because it could do so without requiring every upstream and downstream app to change at once.
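As a concrete illustration of point 3, here is a hypothetical round trip in Java: a “sensor” publishes a reading over MQTT using the Eclipse Paho client, and an analytics job reads it back with a stock Kafka consumer – both talking to the same Pulsar cluster. The endpoints and topic name are placeholders, and Kafka topic names cannot contain slashes, so the sketch uses “sensor-temperature” rather than the MQTT-style “sensor/temperature”; the actual topic-name mapping between MoP and KoP depends on handler configuration and version.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public class CrossProtocolDemo {
    public static void main(String[] args) throws Exception {
        // 1. A "sensor" publishes over MQTT to the Pulsar broker (MoP, port 1883).
        MqttClient sensor = new MqttClient("tcp://pulsar-broker.example.com:1883", "sensor-17");
        sensor.connect();
        sensor.publish("sensor-temperature", new MqttMessage("21.5".getBytes()));
        sensor.disconnect();

        // 2. An analytics job reads the same topic with a stock Kafka consumer (KoP, port 9092).
        Properties props = new Properties();
        props.put("bootstrap.servers", "pulsar-broker.example.com:9092");
        props.put("group.id", "analytics");
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("sensor-temperature"));
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(5))) {
                System.out.println("temperature = " + record.value()); // "21.5"
            }
        }
    }
}
```

No bridge, connector, or adapter service sits between the two halves: the Pulsar topic itself is the hand-off point.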
Let’s illustrate with a scenario: suppose our smart city project initially used Kafka for aggregating events at the city level and an MQTT broker for field devices. As it grows, the team finds maintaining two systems cumbersome, so they decide to consolidate on Pulsar. They enable MoP and point all devices at the Pulsar endpoint (speaking MQTT) – the devices don’t notice the difference, except perhaps improved reliability. They enable KoP and redirect existing Kafka clients (data sinks, analytics jobs) to Pulsar – those applications continue running as before. Now all data flows through one platform. Immediately, they notice benefits: data from devices reaches Kafka-based consumers with lower latency (no intermediate bridge needed). When a new AI agent service is developed in Python, the developers have options: they can use Pulsar’s native Python client, or stick with a Kafka or MQTT client library they already know (a Java sketch of the native-client option follows below). Either way, they tap into the same live data streams. Operational complexity drops, and development agility increases: each team works in the environment that suits it, while the system integrators ensure everything connects through Pulsar.
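For completeness, here is what the native-client option looks like – shown in Java to stay consistent with the earlier snippets (the Python client follows the same shape). The service URL, topic, and subscription name are illustrative. Note the Shared subscription type, which gives queue-style load balancing across multiple instances of the same agent – one of the native features discussed earlier in this series.

```java
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.SubscriptionType;

public class NativeAgent {
    public static void main(String[] args) throws Exception {
        // Connect over Pulsar's native binary protocol (default port 6650).
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://pulsar-broker.example.com:6650")
                .build();

        // A Shared subscription spreads messages across all agent instances
        // that subscribe under the same name – classic work-queue semantics.
        Consumer<byte[]> consumer = client.newConsumer()
                .topic("sensor-temperature")
                .subscriptionName("ai-agent")
                .subscriptionType(SubscriptionType.Shared)
                .subscribe();

        Message<byte[]> msg = consumer.receive();
        System.out.println(new String(msg.getData()));
        consumer.acknowledge(msg); // per-message acks, a native-API feature

        consumer.close();
        client.close();
    }
}
```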
Meanwhile, Apache Kafka by itself would have pushed the team towards either writing a lot of integration code or standardizing on one protocol (often forcing everything into Kafka’s orbit). Some teams do standardize on Kafka for all components, using Kafka clients everywhere. That can work in certain cases, but in contexts like IoT or edge AI, Kafka’s client library may be too heavy for small devices, or it may lack features such as MQTT’s lightweight subscribe semantics or HTTP-based ingestion. Pulsar avoids that “one size must fit all” trap by natively embracing multiple standards.
Key Takeaways:
- Pulsar’s multi-protocol support (KoP, MoP, AoP) allows a single Pulsar cluster to natively handle Kafka, MQTT, and AMQP clients, among others. AI agents and devices can communicate using their protocol of choice while sharing a common event bus.
- Easier integration and migration: Kafka clients can migrate to Pulsar without code changes and immediately leverage Pulsar’s advanced features. MQTT devices can connect directly to Pulsar and benefit from its durable storage and scaling. This flexibility accelerates deployment of new agent components and integration of legacy systems.
- Reduced complexity: Instead of running separate messaging systems for different parts of your AI platform (and maintaining bridges between them), Pulsar provides a unified infrastructure. Fewer moving parts lead to lower latency and easier operations. For example, integrating MQTT with Kafka otherwise requires connectors or proxies, adding operational burden – Pulsar eliminates that by doing it natively.
- Protocol transparency: In Pulsar, an event doesn’t care how it was produced or consumed. A message from an MQTT device can be consumed by a Kafka client or vice versa through the Pulsar broker, enabling cross-ecosystem data flow with no extra code. Your AI agents can thus share information more freely, which is vital for building collaborative, real-time intelligent systems.
- Future-proof and extensible: Pulsar’s design anticipates that the tech landscape is varied. As your agent architecture evolves, Pulsar can adapt – supporting new protocols or standards as needed. It gives architects confidence that adopting Pulsar means adopting a platform, not just a single-protocol tool.
In summary, Apache Pulsar serves as “one bus for many voices.” It lets all the players in your AI system – be they tiny IoT sensors or heavyweight data-crunching services – communicate through a common medium without forcing them all to speak the same dialect. This reduces friction and speeds up development, because you can choose the best protocol or tool for each job and rely on Pulsar to bridge the gaps. By contrast, Kafka’s more siloed approach often means additional layers or a push to consolidate on Kafka’s API, which isn’t always practical.
For developers and system architects, this protocol agility can be a revelation. It becomes significantly easier to incorporate diverse components into your real-time AI platform. Need to plug in a new third-party service that only knows how to write to Kafka? No problem – point it at Pulsar KoP and you’re done. Want to ingest data from an existing MQTT broker network? Pulsar can be that broker. The end result is an accelerated deployment cycle for AI agents: you spend less time building glue code or deploying connectors, and more time on the agents’ logic and insights.
This concludes our three-part exploration of why Apache Pulsar offers unique advantages for building reasoning and reactive AI agents. We’ve seen how Pulsar’s unified approach to streams and queues, its resilient delivery guarantees, and its protocol flexibility all contribute to a more powerful and adaptable infrastructure. For teams pushing the boundaries of AI applications – where real-time data and robust messaging are key – Pulsar provides a solid foundation that can evolve with your needs.