August 13, 2025
8 min read

At-Least-Once, Exactly-Once, and Acks in Pulsar (Pulsar Guide for RabbitMQ/JMS Engineers 3/10)

Penghui Li
Director of Streaming, StreamNative & Apache Pulsar PMC Member
Hang Chen
Director of Storage, StreamNative & Apache Pulsar PMC Member
Neng Lu
Director of Platform, StreamNative

TL;DR:

Pulsar ensures at-least-once delivery by persisting messages until consumers acknowledge them. In practice, this is similar to RabbitMQ’s and JMS’s default behavior – you won’t lose messages, but you could see duplicates if something fails and a message is re-delivered. Pulsar also offers features for effectively-once processing: it has automatic message deduplication on the broker side and introduced transactions for true end-to-end exactly-once semantics in complex workflows. In this post, we’ll explain how acknowledgments work in Pulsar (individual vs cumulative acks), how Pulsar handles redeliveries and duplicates, and how you can achieve “exactly-once” delivery guarantees using Pulsar’s features (which is something neither RabbitMQ nor JMS natively provide without external coordination).

Understanding At-Least-Once Delivery in Pulsar

By default, Pulsar follows an at-least-once delivery model. This means every message sent to Pulsar will be delivered to consumers at least once. It might be delivered more than once in some failure scenarios, but Pulsar will never intentionally drop a message that hasn’t been acknowledged. Let’s break down what that means:

  • When a message is published to a Pulsar topic, it’s stored durably (on disk via BookKeeper) and added to each subscription’s backlog.
  • A consumer receives the message and processes it. Until the consumer sends an acknowledgment back to the broker, that message remains in the backlog marked as “unacknowledged”.
  • If the consumer fails to ack (e.g., it crashes or times out), the broker will re-deliver that message (either to the same consumer once it reconnects, or to another consumer if using a shared subscription).
  • Because of this re-delivery on no-ack, the consumer might end up seeing the same message again – hence “at least once”.

This is analogous to how RabbitMQ works when you use manual acknowledgments (basic.ack). If a RabbitMQ consumer dies before acking, RabbitMQ will requeue and redeliver the message to another consumer (or the same one when it comes back), resulting in a potential duplicate delivery to the application. JMS behaves similarly: in CLIENT_ACKNOWLEDGE mode or transacted sessions, unacknowledged messages are redelivered on restart or rollback.

Pulsar’s ack mechanism: In Pulsar, an acknowledgment is an explicit signal. The default mode in the client API is manual ack – your consumer code calls consumer.acknowledge(messageId) (or acknowledges cumulatively, which we’ll discuss shortly). Pulsar then knows it can mark that message as processed. Only after acking does Pulsar consider the message permanently done for that subscription. Until then, it’s retained.
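
To make this concrete, here is a minimal sketch of manual acknowledgment with the Pulsar Java client. The service URL, topic, and subscription names are illustrative, and process() stands in for your own business logic:

    import org.apache.pulsar.client.api.*;

    public class AckExample {
        public static void main(String[] args) throws Exception {
            PulsarClient client = PulsarClient.builder()
                    .serviceUrl("pulsar://localhost:6650")    // illustrative broker address
                    .build();

            Consumer<byte[]> consumer = client.newConsumer()
                    .topic("orders")                          // illustrative topic
                    .subscriptionName("order-processor")      // illustrative subscription
                    .subscriptionType(SubscriptionType.Shared)
                    .subscribe();

            while (true) {
                Message<byte[]> msg = consumer.receive();     // blocks until a message arrives
                try {
                    process(msg);                             // your processing logic
                    consumer.acknowledge(msg);                // only now is the message "done" for this subscription
                } catch (Exception e) {
                    consumer.negativeAcknowledge(msg);        // failed: ask the broker to redeliver sooner
                }
            }
        }

        static void process(Message<byte[]> msg) { /* ... */ }
    }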

Now, how long will Pulsar wait to redeliver if a consumer doesn’t ack? Pulsar has a concept of an acknowledgment timeout. If you set an ack timeout on the consumer (say 30 seconds) and the consumer hasn’t acked a given message within 30 seconds of receiving it, the message becomes eligible for redelivery – to another consumer in a shared subscription, or to the same consumer again. The message is not removed until acknowledged. If no ack timeout is set, Pulsar will only redeliver on certain events, such as the client disconnecting. Additionally, a consumer can explicitly negative-ack a message (tell the broker “I failed to process this, please redeliver it sooner”) to speed up redelivery without waiting for a timeout.
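
As a rough sketch of those knobs, reusing the client from the snippet above (TimeUnit is java.util.concurrent.TimeUnit; the 30-second timeout and one-minute nack delay are just example values):

    Consumer<byte[]> consumer = client.newConsumer()
            .topic("orders")
            .subscriptionName("order-processor")
            .ackTimeout(30, TimeUnit.SECONDS)                // unacked messages become eligible for redelivery after 30s
            .negativeAckRedeliveryDelay(1, TimeUnit.MINUTES) // how long after a nack before the message is redelivered
            .subscribe();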

Cumulative vs Individual Acknowledgments

Pulsar supports cumulative acknowledgment, which is a bit like saying “I acknowledge everything up to message X” (conceptually similar to RabbitMQ’s basic.ack with the multiple flag, or the cumulative behavior of JMS CLIENT_ACKNOWLEDGE). It’s useful for high throughput when you process messages in order and want to reduce ack traffic. For example, if a consumer receives messages 1 through 10 sequentially, instead of sending ten separate acks it can send one cumulative ack for message 10, which tells the broker “messages 1 through 10 are all acknowledged”. This only works in subscription modes where ordering is guaranteed (Exclusive or Failover subscriptions), not in Shared mode (because Shared mode may spread messages out of order across consumers). It’s an optimization detail, but good to know Pulsar supports it.
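
A small sketch of the cumulative pattern, assuming a consumer on an Exclusive or Failover subscription (cumulative ack is not allowed on Shared subscriptions) and the process() helper from earlier:

    Message<byte[]> last = null;
    for (int i = 0; i < 10; i++) {
        Message<byte[]> msg = consumer.receive();
        process(msg);              // handle messages strictly in order
        last = msg;
    }
    // One cumulative ack covers this message and every earlier one on the subscription.
    consumer.acknowledgeCumulative(last);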

Individual ack is the normal mode: ack each message as you process it, which you’d do in a shared subscription or any scenario where you might skip around.

Negative ack (nack): Pulsar consumers can send a negative acknowledgment for a message they failed to process, prompting immediate re-delivery (instead of waiting for a timeout). RabbitMQ has a similar concept: basic.nack/basic.reject to requeue or dead-letter a message. JMS typically would rely on not acknowledging or rolling back to signal failure.

At-Most-Once Mode?

At-most-once would mean a message is either delivered once or not at all (no duplicates, but possibly dropped on failure). Pulsar by design doesn’t drop messages without an ack. However, if you want at-most-once behavior (for example, you don’t care if a message is lost on failure, but you want to avoid duplicates at all costs), the way to approximate it is to acknowledge each message as soon as it’s received, before processing. In that case, if the app crashes during processing, the message was already acked and will not be redelivered – you may have lost it (it never finished processing), which is at-most-once. That’s usually not what you want for reliable systems; Pulsar defaults to at-least-once to favor reliability.
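
If you really want that trade-off, the sketch is simply to flip the order of ack and processing (reusing the consumer from earlier):

    Message<byte[]> msg = consumer.receive();
    consumer.acknowledge(msg);   // ack before processing: the broker will never redeliver this message
    process(msg);                // a crash here means the message is lost – at-most-once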

Dealing with Duplicates: Effective “Exactly-Once” Processing

Pure exactly-once delivery (where the messaging system guarantees a message is never delivered more than once to any consumer) is a hard problem in distributed systems, especially without heavy transactional coordination. Neither RabbitMQ nor JMS brokers guarantee exactly-once delivery to consumers out of the box – they guarantee at-least-once, and it’s up to the consumer to handle deduplication if needed.

Pulsar’s improvements: Pulsar provides a couple of features to minimize duplicates:

  1. Message Deduplication on the broker: Pulsar brokers can detect and eliminate duplicates that occur due to producer retries. For example, if a producer sends the same message again (maybe it didn’t get an ack and retried), Pulsar can discard the duplicate because it carries the same sequence ID from that producer. This is server-side dedup, so the duplicate is never even persisted to the topic. To use it, you enable broker deduplication (and the producer must either provide sequence IDs or let Pulsar auto-assign them). This is great for preventing the classic duplicate that happens when a producer retry goes through (e.g., a network glitch makes the producer think a message wasn’t sent, so it sends it again). The broker tracks the highest sequence ID it has seen from each producer and drops anything it has already stored (a producer-side sketch follows this list). RabbitMQ doesn’t have an equivalent feature – if a producer re-sends, Rabbit will just queue it again, so consumers may see duplicates if the producer logic doesn’t handle it. JMS doesn’t standardize this either (some JMS brokers have “duplicate delivery check” features, but they aren’t universal).
  2. Transactions and Exactly-Once Semantics: Pulsar introduced a transaction mechanism that allows a producer and consumer to participate in an atomic operation. Essentially, a consumer can consume messages and produce results to another topic within a transaction, and commit it such that either both the ack and the new message publish happen or neither do. With this, Pulsar can achieve end-to-end exactly-once in a pipeline (e.g., when using Flink or Pulsar Functions). If the transaction is aborted, Pulsar will roll back (meaning it will not ack the inputs, so they’ll be redelivered, and it will discard any outputs). If committed, it will make sure the ack is persisted and outputs are visible, exactly once. This feature is powerful for streaming jobs that read from a topic and write to another – it prevents duplicates in the output even if the job restarts. Implementing that in RabbitMQ or JMS typically involves external transactions (like using a database as a fence or two-phase commit between the queue and the processing outcome). Pulsar has it built-in for its ecosystem (since 2.8.0).
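
For the deduplication side, here is a hedged producer sketch. It assumes deduplication is already enabled for the namespace (for example with pulsar-admin namespaces set-deduplication <tenant>/<namespace> --enable), reuses the client from earlier, and uses an illustrative producer name and topic:

    Producer<byte[]> producer = client.newProducer()
            .topic("orders")
            .producerName("order-ingest-1")    // stable name so the broker can track this producer's sequence IDs across reconnects
            .sendTimeout(0, TimeUnit.SECONDS)  // recommended with dedup: never time out, keep retrying with the same sequence ID
            .create();

    // The client assigns increasing sequence IDs automatically. If a send is retried
    // after a network glitch, the broker recognizes the (producer name, sequence ID)
    // pair it has already stored and silently drops the duplicate.
    producer.send("order-123".getBytes());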

It’s worth noting that “exactly-once” in messaging is often achieved at the processing level rather than literally one and only one delivery. Pulsar’s documentation talks about “effectively-once” processing, meaning through deduplication + proper design you can ensure each effect (like a database update or a downstream event) happens once. The broker may deliver something twice, but your application or the system deduplicates such that the end result doesn’t double-count.

Where JMS stands: JMS doesn’t guarantee exactly-once delivery either. The closest you get is a transacted session, which gives you exactly-once processing within that transaction – either you consume and commit (so you won’t see the message again) or roll back (so it’s as if you never got it). But that’s still at-least-once at the system level; exactly-once globally requires coordination outside JMS (such as a two-phase commit with XA when integrating with a database).

Handling Acknowledgments in Practice

RabbitMQ users: Think of Pulsar’s ack like basic.ack. You should ack after you’ve processed the message. If processing fails, you can either not ack (and allow redelivery) or negative-ack (to expedite requeue). There’s no direct equivalent of RabbitMQ’s basic.reject with requeue=false (which dead-letters or drops a message) except to configure a Dead Letter Topic policy or simply ack and drop. We’ll cover Dead Letter Topics in the next post, but in short, Pulsar can automatically route messages that keep failing to a special “DLQ” topic after a maximum redelivery count.
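
As a preview, the dead letter policy is configured on the consumer. A hedged sketch (the max redelivery count and DLQ topic name are illustrative, and redelivery must be driven by nacks or an ack timeout for the counter to grow):

    Consumer<byte[]> consumer = client.newConsumer()
            .topic("orders")
            .subscriptionName("order-processor")
            .subscriptionType(SubscriptionType.Shared)
            .ackTimeout(30, TimeUnit.SECONDS)            // something must trigger redelivery for the count to increase
            .deadLetterPolicy(DeadLetterPolicy.builder()
                    .maxRedeliverCount(5)                // after 5 failed deliveries...
                    .deadLetterTopic("orders-DLQ")       // ...route the message to this topic
                    .build())
            .subscribe();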

JMS users: Pulsar’s manual ack is like CLIENT_ACKNOWLEDGE mode (where you call message.acknowledge()). If you used AUTO_ACKNOWLEDGE in JMS, the closest equivalent is to ack as soon as you receive the message (for example, at the top of your listener callback). Pulsar doesn’t have the concept of DUPS_OK_ACKNOWLEDGE (which JMS has for potentially lazy acks). For JMS transacted sessions, the analogy is Pulsar transactions, if you truly need an atomic consume+produce. But in most cases, you commit processing simply by acking the message.

A nice thing about Pulsar: acknowledgments can be asynchronous (non-blocking). When you call consumer.acknowledgeAsync(msgId), the client will send the ack to broker in the background while your code can move on. This helps keep throughput high (you don’t wait for an ack round-trip each time).
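
A tiny sketch of the asynchronous variant, building on the earlier consumer loop:

    consumer.acknowledgeAsync(msg.getMessageId())   // returns a CompletableFuture<Void>; your code continues immediately
            .exceptionally(ex -> {
                // If the ack is lost, the worst case is a redelivery later – at-least-once still holds.
                return null;
            });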

Exactly-Once Processing with Pulsar: A Quick Example

To illustrate how Pulsar can do what others can’t, let’s outline a scenario:

Suppose we have a system that reads messages from an “input” topic, does some transformation, and writes to an “output” topic. We want to ensure that each input message results in exactly one output message, even if crashes happen.

Using plain at-least-once, if our consumer processes a message and publishes the result, but crashes before acking, Pulsar will redeliver that input message and the consumer will process it again, producing a duplicate output. How to avoid that?

  • With Pulsar Transactions: We can start a Pulsar transaction, consume the message, produce the output message within the transaction, then commit the transaction. Pulsar ensures the ack for the input and the publish for the output are atomic. If a crash happens before the commit, none of it is visible (no ack, so the input will replay, but also no output is published). If the commit succeeds, the input is acked and the output is published exactly once. This way, the output topic won’t have duplicates, and the input won’t be reprocessed erroneously. A code sketch follows below.
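
Here is a hedged sketch of that flow with Pulsar’s transaction API. It assumes transactions are enabled on the brokers (transactionCoordinatorEnabled=true) and on the client, the usual org.apache.pulsar.client.api imports plus org.apache.pulsar.client.api.transaction.Transaction, and illustrative topic and subscription names; transform() is a placeholder for your logic:

    PulsarClient client = PulsarClient.builder()
            .serviceUrl("pulsar://localhost:6650")
            .enableTransaction(true)                          // client-side opt-in for transactions
            .build();

    Consumer<String> input = client.newConsumer(Schema.STRING)
            .topic("input")
            .subscriptionName("transformer")
            .subscribe();

    Producer<String> output = client.newProducer(Schema.STRING)
            .topic("output")
            .sendTimeout(0, TimeUnit.SECONDS)                 // transactional sends require the send timeout to be disabled
            .create();

    while (true) {
        Message<String> msg = input.receive();
        Transaction txn = client.newTransaction()
                .withTransactionTimeout(5, TimeUnit.MINUTES)
                .build()
                .get();
        try {
            String result = transform(msg.getValue());        // placeholder for your transformation
            output.newMessage(txn).value(result).sendAsync(); // publish inside the transaction
            input.acknowledgeAsync(msg.getMessageId(), txn);  // ack inside the same transaction
            txn.commit().get();                               // ack and publish become visible together...
        } catch (Exception e) {
            txn.abort().get();                                // ...or neither does: input replays, output is discarded
        }
    }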

Without transactions, one could still achieve idempotency by including a unique identifier from the input in the output and having consumers or downstream deduplicate, but that’s more work on the user side. Pulsar’s transactions aim to handle it in the messaging layer.

This is an advanced feature, today used mostly with frameworks like Flink for exactly-once streaming jobs. For many use cases, enabling broker deduplication is enough to avoid producer-side duplicates, and writing consumer logic that can tolerate the rare duplicate (by recognizing and ignoring it) gets you to effectively-once processing.

Acknowledgment API Summary

Here’s a quick summary of Pulsar acknowledgment-related APIs and features:

  • consumer.acknowledge(msgId) – ack a single message.
  • consumer.acknowledgeCumulative(msgId) – ack this and all earlier messages in the subscription (only for ordered subs) in one go.
  • consumer.negativeAcknowledge(msgId) – signal a failure on this message; broker will redeliver it after a short delay (by default).
  • Ack timeout (set via ConsumerBuilder.ackTimeout(duration)) – if set, broker will automatically treat unacked messages as needing redelivery after this timeout.
  • By default, no ack timeout is set, so the broker waits indefinitely until the consumer disconnects or negatively acks.
  • Pulsar will mark acknowledged messages as deletable. If all subscriptions ack a message, it’s removed from storage (unless retention is keeping it for some time).
  • Unacknowledged messages live in the backlog. If a consumer reconnects, it’ll receive those messages.
  • Exactly-once via transactions: Use the transactional API (PulsarClient.newTransaction) to encompass consume and produce operations. This is a more complex API, not used unless you specifically need it.

What About Ordering and Redelivery Ordering?

One nuance: there is no ordering guarantee for Pulsar’s Shared subscription. If ordering is crucial, you would typically use a Failover subscription (one active consumer) or Key_Shared (to maintain per-key order). In those cases, if a message is not acked, you usually stop processing subsequent ones (or use cumulative ack) to maintain order.

Using negative ack on an Exclusive or Failover subscription can break ordering if you continue with later messages. So the recommended pattern is: if you care about order, don’t ack out of order. Handle the failure out-of-band (for example, send it to a DLQ) or pause consumption until you can ack.

Key Takeaways

  • At-least-once is the default: Pulsar, like RabbitMQ and JMS, will do everything to ensure a message is not lost – storing it until acknowledged. This means duplicates are possible on failures. You should design consumers to handle the occasional duplicate message.
  • Acks are explicit and crucial: Your Pulsar consumers must acknowledge messages after processing. Until you ack, the broker assumes you haven’t finished and will resend if needed. Pulsar gives you tools like cumulative ack and ack timeouts to manage this efficiently.
  • No auto-drop: Pulsar won’t drop messages that aren’t acked (unless you explicitly configure a TTL). There’s no equivalent of JMS’s Session.AUTO_ACKNOWLEDGE, where messages are implicitly acked upon receipt – in Pulsar, a message is acked when you call acknowledge (higher-level frameworks may do this for you after your handler returns, but the core client does not).
  • Duplicate mitigation: when enabled, the Pulsar broker can deduplicate messages, eliminating duplicates caused by producer retries. This is something RabbitMQ doesn’t do internally.
  • Exactly-once capabilities: Pulsar is one of the few messaging systems in its class that provides a transactional mechanism for true exactly-once delivery in complex workflows. This is advanced and typically used with stream processing frameworks, but it’s there. For simpler cases, you can often reach “effectively-once” by using deduplication and careful consumer design.
  • Comparison to RabbitMQ/JMS transactions: RabbitMQ’s handling of acknowledgments is simpler (it has no multi-message transactions beyond acknowledging multiple deliveries in one go). JMS has the notion of sessions and transactions, but coordinating an exactly-once outcome often requires XA transactions with an external resource. Pulsar’s built-in transaction support and end-to-end exactly-once for consume-process-produce scenarios are a step beyond what traditional brokers offer, giving Pulsar an edge for building reliable data pipelines.
  • Negative acks and redelivery: You can signal failures explicitly with negative acks, and Pulsar will requeue the message for redelivery quickly, helping you implement retry logic. This is similar to basic.nack in RabbitMQ.

In summary, Pulsar’s acknowledgment and delivery semantics are robust and similar to what queue veterans expect, with some extra goodies (like dedup and transactions) for those who need that extra level of guarantee. In the next post, we’ll look at how Pulsar’s concept of subscriptions can be used to mimic various queueing patterns, specifically focusing on how Shared and Failover subscription modes work – essentially, how Pulsar “queues” actually operate under the hood.

Stay tuned to understand how “Queues are just subscriptions” in Pulsar and how that simplifies scaling and failover.

Penghui Li
Penghui Li is passionate about helping organizations to architect and implement messaging services. Prior to StreamNative, Penghui was a Software Engineer at Zhaopin.com, where he was the leading Pulsar advocate and helped the company adopt and implement the technology. He is an Apache Pulsar Committer and PMC member.
Hang Chen
Hang Chen, an Apache Pulsar and BookKeeper PMC member, is Director of Storage at StreamNative, where he leads the design of next-generation storage architectures and Lakehouse integrations. His work delivers scalable, high-performance infrastructure powering modern cloud-native event streaming platforms.
Neng Lu
Neng Lu is currently the Director of Platform at StreamNative, where he leads the engineering team in developing the StreamNative ONE Platform and the next-generation Ursa engine. As an Apache Pulsar Committer, he specializes in advancing Pulsar Functions and Pulsar IO Connectors, contributing to the evolution of real-time data streaming technologies. Prior to joining StreamNative, Neng was a Senior Software Engineer at Twitter, where he focused on the Heron project, a cutting-edge real-time computing framework. He holds a Master's degree in Computer Science from the University of California, Los Angeles (UCLA) and a Bachelor's degree from Zhejiang University.
