June 10, 2025
6 min read

Streams vs Queues: Why Your Agents Need Both—and Why Pulsar Protocol Delivers

Matteo Merli
CTO, StreamNative & Co-Creator and PMC Chair, Apache Pulsar

Developers building reasoning and reactive AI agents often grapple with two messaging patterns: streaming (event streams) and queuing (work/task queues). It’s crucial to understand the difference, because effective AI agents typically need both patterns in their architecture. In this first post, we’ll clarify stream vs. queue semantics and show how Apache Pulsar uniquely delivers both out of the box, unlike Apache Kafka, which was designed around streams. We’ll use practical examples (imagine continuous sensor inputs alongside discrete task-execution requests) to illustrate why agents demand both patterns and how Pulsar handles them natively.

Stream vs. Queue Semantics 101

In streaming message systems, producers append data to an unbounded, ordered log (the stream). Consumers then read from this log in sequence, maintaining an offset (position) in the stream. Order is guaranteed per partition, and messages aren’t removed on consumption (they remain for a retention period). This is great when event order matters – for example, time-series sensor data or user click events should be processed in the exact order produced. Apache Kafka is the classic example of a streaming platform: it provides high throughput and strict ordering by partitions, which makes it ideal for ingesting ordered event streams.
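The mechanics above can be sketched in a few lines of plain Python. This is a toy model of stream semantics, not any real Kafka or Pulsar API: an append-only log, with each consumer tracking its own offset, so reading never removes data and the stream can be replayed.

```python
# Toy model of stream semantics (illustrative only, not a broker API):
# producers append to an ordered log; consumers track their own offset.

class Stream:
    def __init__(self):
        self.log = []          # unbounded, ordered log of events

    def append(self, event):
        self.log.append(event)

class StreamConsumer:
    def __init__(self, stream):
        self.stream = stream
        self.offset = 0        # each consumer keeps its own position

    def poll(self):
        if self.offset < len(self.stream.log):
            event = self.stream.log[self.offset]
            self.offset += 1   # advance; the event stays in the log
            return event
        return None

stream = Stream()
for reading in ["t1:20.1C", "t2:20.4C", "t3:20.3C"]:
    stream.append(reading)

c = StreamConsumer(stream)
ordered = [c.poll() for _ in range(3)]

# Reading removed nothing: a second consumer (or a rewound offset)
# replays the same ordered history.
replay = StreamConsumer(stream)
assert [replay.poll() for _ in range(3)] == ordered
```

The key properties to notice: order is preserved, consumption is non-destructive, and each consumer's progress is just a single integer offset.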

In queuing message systems, producers send messages to a queue, and each message is processed by only one consumer (even if many consumers are listening). Consumers pull from the queue and acknowledge each message when done, upon which it’s removed from the queue. Queues excel at distributing tasks or jobs that can be done in parallel without a global ordering requirement. This pattern is common for background work: e.g. an agent that needs to perform independent tasks (send emails, execute API calls) can put those tasks on a queue, and a pool of workers will split them up. Systems like RabbitMQ or Amazon SQS embody queue semantics – focusing on one-message-per-consumer with robust features like message retries and dead-lettering.
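Queue semantics can be sketched the same way. The following is a hypothetical toy broker, not RabbitMQ's or SQS's actual API: each message is delivered to exactly one consumer, removed only on acknowledgment, and requeued on a negative acknowledgment so another worker can retry it.

```python
from collections import deque

# Toy model of queue semantics (illustrative, not a real broker API):
# one delivery per message, removal on ack, redelivery on nack.

class Queue:
    def __init__(self):
        self.ready = deque()       # undelivered messages
        self.unacked = {}          # msg_id -> message, in flight
        self._next_id = 0

    def send(self, msg):
        self._next_id += 1
        self.ready.append((self._next_id, msg))

    def receive(self):
        if not self.ready:
            return None
        msg_id, msg = self.ready.popleft()
        self.unacked[msg_id] = msg  # held until acked
        return msg_id, msg

    def ack(self, msg_id):
        del self.unacked[msg_id]    # done: gone from the queue for good

    def nack(self, msg_id):
        # failure: requeue so another worker can pick it up
        self.ready.appendleft((msg_id, self.unacked.pop(msg_id)))

q = Queue()
for task in ["send_email", "call_api"]:
    q.send(task)

id1, task1 = q.receive()   # worker A takes one task
id2, task2 = q.receive()   # worker B takes the other
q.nack(id1)                # worker A fails; task goes back on the queue
q.ack(id2)                 # worker B finishes; task is removed
```

Note the contrast with the stream model: progress here is tracked per message (ack/nack), not as a single offset, which is what makes parallel, order-free task processing natural.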

Why do agents need both? Because AI agents operate in real-time environments and must perform reliable actions. For instance, consider a robotic agent: it ingests a stream of sensor readings (continuous, ordered data) while also handling discrete commands or tasks (which can be processed independently). A streaming pipeline ensures the robot’s perception of the world stays ordered (you don’t want to react to events out of sequence). A queue ensures the robot can execute tasks concurrently or retry failures without halting all other work. In practice, the most powerful systems leverage both patterns – streaming for live data feeds and queueing for task execution. Real-world examples include an IoT monitoring agent that uses streaming for sensor telemetry, plus queueing to distribute analysis jobs or alerts based on those sensor events.

How Pulsar and Kafka Handle These Patterns

Apache Kafka was originally built around the stream model. It provides high-performance ordered logs, but it doesn’t natively implement traditional queue semantics. You can use Kafka like a queue to some extent – for example, by creating a topic with multiple partitions and a consumer group so that each message goes to one consumer. However, because Kafka enforces per-partition order, this approach comes with caveats. If one message in a partition takes a long time to process, or a consumer instance crashes, subsequent messages in that partition are blocked until the slow message is handled (since Kafka consumers can only mark progress by committing the offset up to the last processed message). In effect, a slow or stuck message can stall that “queue.” A common workaround in Kafka is to have the application catch a failed message and publish it to a separate topic (acting as a manual dead-letter or retry queue). But this introduces extra complexity: developers need custom logic for rerouting or reprocessing failed events, must manage multiple topics for what is conceptually one queue, and may have to re-order results that arrive via retries.
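The head-of-line blocking comes directly from offset-based acknowledgment. A consumer can only say "everything up to offset N is done"; there is no way to express "message 2 failed, but 3 and 4 succeeded." The sketch below is a toy model of that constraint, not the Kafka client API:

```python
# Why offset commits cause head-of-line blocking (toy model):
# the highest safely-committable offset is the first unprocessed gap,
# regardless of how much later work has already finished.

partition = ["msg0", "msg1", "msg2-stuck", "msg3", "msg4"]
processed = {0: True, 1: True, 2: False, 3: True, 4: True}

def committable_offset(processed):
    offset = 0
    while processed.get(offset):
        offset += 1
    return offset

# msg3 and msg4 are finished, but the stuck msg2 pins the commit at 2.
# After a restart, msg2, msg3, AND msg4 would all be redelivered.
assert committable_offset(processed) == 2
```

This is exactly the gap that per-message acknowledgment (as in a queue system) closes: completed work past the gap would not need to be redone.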

Apache Pulsar was designed to natively support both the streaming and queuing paradigms. Pulsar topics are append-only logs like Kafka’s, but Pulsar’s consumer model is more flexible. Pulsar supports multiple subscription types on a topic: for example, a Shared subscription lets multiple consumers fetch from the same topic in a round-robin fashion (each message goes to one consumer, like a work queue). This enables true distributed queuing on a single topic – you can have, say, 10 consumers all pulling tasks from one Pulsar topic, and the broker will balance the load among them. Crucially, each consumer individually acknowledges messages in Pulsar, so the system knows exactly which messages were processed. If one consumer is slow or fails, it doesn’t hold up others – unacknowledged messages can be redelivered to another worker as needed (more on this in Post 2). The Exclusive/Failover subscription modes, on the other hand, let only one consumer (or one primary with a hot standby) consume a topic, preserving total order like Kafka’s semantics. And Pulsar even has a Key_Shared mode where messages are distributed but ordering is maintained per key – effectively a hybrid that ensures all messages for a given entity go to the same consumer in order, while still load-balancing different keys across consumers.
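The Key_Shared hybrid is worth a small illustration. The sketch below shows the routing idea only – hash the message key to pick a consumer, so each key's messages always land on the same consumer in order while different keys spread across the pool. It is a simplification; the real Pulsar broker assigns hash ranges to consumers rather than using a bare modulo, and the names here are made up.

```python
import zlib

# Sketch of Key_Shared-style routing (illustrative, not Pulsar's
# actual hash-range implementation): key -> consistent consumer.

consumers = ["worker-a", "worker-b", "worker-c"]

def route(key):
    return consumers[zlib.crc32(key.encode()) % len(consumers)]

messages = [("robot-1", "turn"), ("robot-2", "stop"),
            ("robot-1", "lift"), ("robot-3", "scan")]

assignment = {}
for key, cmd in messages:
    assignment.setdefault(key, []).append((route(key), cmd))

# Per-key ordering guarantee: every message for a given key was
# routed to exactly one consumer, in publish order.
for key, deliveries in assignment.items():
    assert len({worker for worker, _ in deliveries}) == 1
```

Different keys may hash to different workers, so throughput scales with the consumer pool – but within one key (one robot, one user, one device), ordering is preserved.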

What this means is that Pulsar delivers true queue and stream capabilities in one system. You can treat a Pulsar topic like a Kafka stream and/or like a distributed queue depending on the subscription. Under the hood, it’s the same topic, but the consumption pattern adapts to your needs. For example, a Pulsar topic with a Shared subscription is analogous to a RabbitMQ queue – multiple consumers each get a subset of messages – whereas the same topic could have another subscription that behaves like a Kafka stream (with a dedicated consumer reading the full ordered log). Indeed, Pulsar’s heritage at Yahoo was as a unified messaging platform intended to replace both their Kafka (stream) and RabbitMQ (queue) use cases. As one case study noted, Kafka was excellent for ordered event ingestion, but Yahoo’s team “used RabbitMQ for other use cases since Kafka lacked the necessary work-queue semantics”. Pulsar was adopted because it could cover all Kafka-like and RabbitMQ-like scenarios in a single, scalable system.

Practical Example: Sensor Streams + Task Queues

Let’s revisit the example of an AI-powered robot for a concrete scenario:

  • Sensor input (streaming): The robot’s vision or telemetry sensors publish a constant stream of events (images, lidar scans, etc.). These need to be processed in order and possibly replayed for debugging. Using Pulsar, the sensor topics could be consumed with an exclusive subscription (strict order) by a stream processing component. Kafka could also handle this part well, as it’s a straight event log.

  • Task execution (queuing): When the robot’s AI decides on actions (e.g. “pick up object” or “navigate to location”), those tasks are added to a work queue. Here Pulsar shines: the tasks can be sent to a Pulsar topic with a shared subscription, so multiple executor modules (consumers) will divide the tasks. Each task message goes to one executor, which acknowledges it upon completion. If a task fails, the executor can negative-acknowledge it (signal a failure) and another instance can retry (we’ll explain this mechanism later). In Kafka, implementing this queue would be clumsier – you might create a single-partition topic (to ensure one consumer at a time) or a partition per consumer, but then you lose parallelism or have to predefine partitions. And without per-message ack, error handling would require manual intervention (like writing failed tasks to a new “retry” topic).
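The task-queue half of this scenario can be sketched as a toy simulation. This is not Pulsar client code (real code would use a Shared subscription and `negativeAcknowledge`); the executor names and the simulated failure are invented for illustration. Tasks are dealt round-robin to executors, and a failed task is requeued and retried – by a different executor – without blocking the other tasks.

```python
from collections import deque
import itertools

# Toy simulation of shared-subscription task dispatch (hypothetical
# names; not the Pulsar API): round-robin delivery, nack-and-retry.

tasks = deque(["navigate to dock", "pick up object", "charge battery"])
executors = itertools.cycle(["executor-1", "executor-2"])

delivered = []
failed_once = False
while tasks:
    task = tasks.popleft()
    worker = next(executors)
    if task == "navigate to dock" and not failed_once:
        failed_once = True     # simulated failure on the first attempt
        tasks.append(task)     # negative-ack: requeue for retry
        continue
    delivered.append((worker, task))
```

Note what did not happen: the failure of "navigate to dock" on its first attempt never stalled "pick up object" or "charge battery" – they were delivered and completed while the failed task waited for redelivery.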

By using Pulsar for both patterns, our hypothetical robot agents get the best of both worlds seamlessly. There’s no need to run separate systems (Kafka for streams and a RabbitMQ or SQS for queues) and then glue them together. Pulsar can ingest the high-rate sensor streams and dispatch tasks with queue semantics in one unified platform. This simplicity translates to a more cohesive architecture for AI agents, where every kind of message – whether an ordered event or a one-off task – can flow through the same Pulsar cluster. It reduces operational overhead and eliminates the impedance mismatch when bridging different messaging systems. As developers, we can focus on our agent logic rather than on plumbing data between Kafka topics and a separate queue service.

Key Takeaways:

  • Streams vs Queues: Streaming systems preserve an ordered log of events for replay or sequential processing, while queues distribute individual messages to consumers for parallel task execution. AI agents commonly require both patterns (e.g. process sensor events in order, handle commands/tasks concurrently).

  • Kafka’s limitation: Kafka natively provides streams, not work-queues. You can simulate queues on Kafka but face complications due to strict ordering and offset-based acknowledgments. A slow or failed message can block a partition, and handling retries means extra topics and custom logic.

  • Pulsar’s advantage: Pulsar supports both messaging semantics natively. Its flexible subscription modes (exclusive, shared, failover, key_shared) let you pick the right tool for the job on the same platform. You get Kafka-like high-throughput streams and RabbitMQ-like distributed queues in one system. This means less system sprawl and easier integration between components – a big win for complex AI agent architectures.

In the next post, we’ll delve into reliability and resilience – exploring how Pulsar’s acknowledgment and retry mechanisms keep AI agent pipelines robust where Kafka’s model can struggle.

Try out Pulsar on StreamNative Cloud!

Matteo Merli
Matteo is the CTO at StreamNative, where he brings rich experience in distributed pub-sub messaging platforms. He co-created Apache Pulsar during his time at Yahoo!, working on the global, distributed messaging system that would later become the project. Matteo is the PMC Chair of Apache Pulsar, where he helps guide the community and ensure the success of the project, and he is also a PMC member for Apache BookKeeper. Matteo lives in Menlo Park, California.

Agentic AI
GenAI
Pulsar