Exactly Once is a Lie

Imagine you're placing an order in an online shop. You click the "Submit Order" button. Nothing happens. You wait a few seconds. Still nothing. So you click again. And maybe once more, just to be sure. Finally, a confirmation page appears. You've successfully placed your order – or have you? Did you place one order, or three?

This scenario plays out millions of times every day across the internet, and it reveals one of the most persistent myths in distributed systems: the promise of exactly-once delivery. Message queues advertise it. Streaming platforms claim it. Enterprise architectures depend on it. But here's the uncomfortable truth: exactly-once delivery is impossible in distributed systems. The good news? That's perfectly okay, and there are practical ways to handle it.

The desire for exactly-once semantics is deeply intuitive. You transfer money – it should leave your account exactly once. You place an order – one order should go through, not three. Duplicate processing leads to double charges and customer frustration. Lost messages mean orders that vanish and payments that never get recorded. Both scenarios are unacceptable in production systems, which is why the promise of "exactly once" sounds so appealing. The only problem is that it contradicts fundamental realities of distributed computing.

The Three Actual Guarantees

Before we explore why exactly-once is impossible, let's clarify what distributed systems can actually guarantee. There are three real delivery semantics:

At-most-once delivery means a message will be delivered zero or one time, but never more than once. This is a "fire and forget" approach appropriate for non-critical operations like logging or metrics, where losing an occasional data point is acceptable and duplicates would cause more trouble than the gap.

At-least-once delivery guarantees that a message will be delivered one or more times. It will definitely arrive, but it might arrive multiple times. This is the standard choice for business-critical operations like payments or order placement where losing a message is unacceptable, even if it means dealing with potential duplicates.

Exactly-once delivery promises that a message will be delivered exactly one time – not zero, not two, but precisely one. This sounds perfect. It's also theoretically impossible to guarantee in asynchronous distributed systems.

The Two Generals Problem

To understand why exactly-once delivery is impossible, we need to go back to a classic problem in computer science: the Two Generals Problem.

Imagine two generals commanding separate armies on opposite sides of an enemy city. They want to attack together, but they'll only succeed if they attack simultaneously. They can communicate by sending messengers through enemy territory, but messengers might be captured.

General A sends a messenger to General B: "Attack at dawn." The messenger reaches General B. But does General A know that General B received the message? Not yet. So General B sends a messenger back to confirm. But now General B doesn't know if that confirmation reached General A. So maybe General A should send another confirmation. But then General A doesn't know if that reached General B. This continues infinitely.

No matter how many confirmations are sent, neither general can be certain that both sides have the same understanding. There's always one more acknowledgment needed, and that last acknowledgment itself needs to be acknowledged. The problem is unsolvable.

This is exactly the situation we face in distributed messaging. When you send a message and wait for an acknowledgment, if you don't receive one, you don't know whether the message was lost or the acknowledgment was lost. This is why exactly-once delivery is impossible. You can get arbitrarily close with enough confirmations and retries, but you can never be absolutely certain.

The theoretical impossibility becomes painfully practical when networks time out. You send an HTTP request to write an event. Seconds tick by. Your client times out. Did the request succeed and the response get lost? Or did the request never arrive? There's no way to know. If you retry, you risk creating a duplicate. If you don't retry, you risk losing the event. The timeout doesn't tell you what happened; it tells you that you don't know what happened.

The One Exception: Transactional Processing

There's an important caveat: exactly-once processing is possible under very specific conditions. If you can make the effect of processing a message and the tracking of that processing happen in a single atomic transaction, then you can guarantee exactly-once processing.

For example, when building a read model in a PostgreSQL database, you can perform both the read model update and the recording of the processed event ID in a single database transaction. Either both succeed or neither does. If the transaction fails, when you retry, you'll see that the event hasn't been processed yet. If it succeeds, the event ID is stored, and duplicates will be detected.
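The pattern can be sketched in a few lines. This is a minimal illustration using SQLite instead of PostgreSQL so it runs anywhere; the table and event names are assumptions for the example, but the mechanism is the same: a unique constraint on the processed event ID, checked and written in the same transaction as the read model update.

```python
import sqlite3

# Toy read model and processing log; table names are illustrative
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE read_model (book_id TEXT PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO read_model VALUES ('book-1', 'available')")
conn.commit()

def handle_book_borrowed(event_id: str, book_id: str) -> bool:
    """Apply the event and record its ID atomically.
    Returns False if this event ID was already processed."""
    try:
        with conn:  # one transaction: commits on success, rolls back on error
            # The PRIMARY KEY on event_id makes the duplicate check atomic
            conn.execute("INSERT INTO processed_events VALUES (?)", (event_id,))
            conn.execute(
                "UPDATE read_model SET status = 'borrowed' WHERE book_id = ?",
                (book_id,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: the effect already happened

first = handle_book_borrowed("evt-42", "book-1")
second = handle_book_borrowed("evt-42", "book-1")  # retry of the same event
```

Because the effect and the bookkeeping either both commit or both roll back, retrying after a failure is always safe.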

This is rarely applicable in practice. Most interesting event handlers have side effects outside a single database transaction: publishing to message queues, sending emails, calling external APIs, or updating multiple different databases. For all these cases, transactional exactly-once is impossible, and we're back to choosing between at-most-once and at-least-once.

Given this reality, what do vendors mean when they advertise "exactly-once semantics"? Usually, they're talking about effectively-once semantics through a combination of at-least-once delivery and idempotent processing. The message itself might be delivered multiple times, but the effect of processing it happens only once. The message arrives at least once, guaranteeing it won't be lost. The processing is designed to be idempotent, meaning that applying the same message multiple times produces the same result as applying it once. Together, these give you the outcome you actually care about: the business effect happens exactly once, even if the technical delivery happens multiple times.

This shifts the problem from an impossible guarantee about delivery to a solvable problem about design. You can design your event handlers to recognize when they've already processed an event and skip redundant work. This is achievable, practical, and honest.

Idempotence: The Solution

The concept that makes effectively-once semantics possible is idempotence. An operation is idempotent if performing it multiple times has the same effect as performing it once.

Some operations are naturally idempotent. Setting a value to a specific state is idempotent. If you set a light switch to "on," it doesn't matter how many times you do it – the light is on. Operations like SET, PUT, and DELETE in databases and APIs are often idempotent by design.

Other operations are not naturally idempotent. Incrementing a counter is not idempotent. If you increment a counter from 5 to 6, doing it twice gives you 7, not 6. Similarly, sending an email twice means the recipient gets two emails.
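The contrast is easy to see in code. A small sketch with a toy state dictionary: setting a value converges to the same state no matter how often you repeat it, while incrementing drifts further with every duplicate.

```python
# Toy state: setting a value is idempotent, incrementing is not
state = {"light": "off", "counter": 5}

def turn_light_on(s: dict) -> None:
    s["light"] = "on"       # idempotent: repeated calls leave the same state

def increment_counter(s: dict) -> None:
    s["counter"] += 1       # not idempotent: every call changes the result

turn_light_on(state)
turn_light_on(state)        # still "on": the duplicate is harmless
increment_counter(state)
increment_counter(state)    # 5 -> 7: the duplicate changed the outcome
```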

The key to building reliable distributed systems is recognizing when an operation is not naturally idempotent and making it idempotent through careful design. This usually means checking whether the operation has already been performed before performing it again, or designing the operation so that repeating it naturally produces the same result.

Event sourcing and idempotence are natural partners. Events represent immutable facts about what happened in the past. They can be stored, replayed, and processed multiple times, but the underlying fact doesn't change. However, the effect of processing that event should be idempotent. If a "BookBorrowed" event arrives twice because of a network retry, your system should recognize that the book has already been marked as borrowed and not attempt to borrow it again.

This is where event IDs become crucial. Every event has a unique identifier. When you process an event, you can check: have I already processed an event with this ID? If yes, skip it. If no, process it and record that you've seen this ID. This simple pattern makes event processing idempotent.
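As a sketch, the pattern is just a membership check around the side effect. The function and event names here are made up for illustration, and the in-memory set would need to be persisted in a real system; note also that without the transactional trick above, a crash between the side effect and the recording still leaves you with at-least-once behavior for that one event.

```python
processed_ids: set[str] = set()  # in a real system this must be persisted
sent_emails: list[str] = []      # stand-in for a non-idempotent side effect

def on_welcome_event(event_id: str, recipient: str) -> None:
    if event_id in processed_ids:   # have I already processed this event ID?
        return                       # yes: skip the redundant work
    sent_emails.append(f"welcome email to {recipient}")  # the side effect
    processed_ids.add(event_id)      # no: do it and record that it was seen

on_welcome_event("evt-1", "alice@example.com")
on_welcome_event("evt-1", "alice@example.com")  # duplicate delivery
```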

Event sourcing makes this pattern easier to implement because events are already structured with unique identifiers, clear timestamps, and explicit ordering. You're processing well-defined, immutable facts, and the question "have I seen this before?" has a clear answer.

EventSourcingDB: Two Perspectives on the Problem

When you're building event-sourced systems with EventSourcingDB, the exactly-once challenge appears in two distinct scenarios, and EventSourcingDB provides tools for both.

The first scenario is writing events. You're a client trying to write an event to the database. The network times out. Did the event get stored? If you retry and the event was already stored, you'll create a duplicate in the event stream. If you don't retry and it wasn't stored, you've lost the event.

EventSourcingDB solves this with preconditions. Preconditions allow you to perform atomic check-and-write operations. For example, you can say "write this event, but only if the subject is currently at revision 42." If another request has already written an event, the revision will have advanced to 43, and your retry will fail with a 409 Conflict response. This tells you explicitly that the event has already been written. You can retry write operations as many times as needed without fear of creating duplicates. You can learn more about how this works in the preconditions documentation.
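The behavior can be modeled with a toy in-memory store. This is not the real EventSourcingDB API – the function names and revision numbering are assumptions for illustration – but it shows why a revision precondition makes a blind retry safe: the duplicate write fails loudly instead of silently appending a second event.

```python
class ConflictError(Exception):
    """Stands in for the 409 Conflict returned when a precondition fails."""

# Toy in-memory "event store": subject -> list of events
store: dict[str, list[dict]] = {}

def write_event(subject: str, event: dict, expected_revision: int) -> int:
    """Append an event only if the subject is at the expected revision."""
    events = store.setdefault(subject, [])
    current = len(events) - 1  # revision of the latest event, -1 if none
    if current != expected_revision:
        raise ConflictError(f"at revision {current}, expected {expected_revision}")
    events.append(event)
    return current + 1         # revision of the newly written event

rev = write_event("/books/42", {"type": "BookBorrowed"}, expected_revision=-1)
try:
    # A blind retry of the same write now fails instead of duplicating the event
    write_event("/books/42", {"type": "BookBorrowed"}, expected_revision=-1)
    duplicated = True
except ConflictError:
    duplicated = False
```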

The second scenario is observing events. You're consuming an event stream, processing each event as it arrives. Your connection drops. When you reconnect, where do you resume?

EventSourcingDB lets you choose by providing the lowerBound parameter with inclusivity control. If you reconnect with lowerBound set to the last event ID you received and type: "exclusive", you'll get at-most-once semantics – you might miss the boundary event if you hadn't fully processed it before disconnecting. If you reconnect with type: "inclusive", you'll get at-least-once semantics – you might reprocess the boundary event, but you won't miss it.

You choose whether you prefer the risk of missing an event or the risk of processing it twice. For most business-critical applications, you'll choose at-least-once by setting type: "inclusive", and then you'll design your event handlers to be idempotent so that reprocessing is safe. You can learn more about this in the observing events documentation.
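The difference between the two reconnection modes comes down to whether the boundary event is replayed. A toy model of the semantics (the real parameters are an event ID for lowerBound plus a type of "inclusive" or "exclusive"; integer IDs here are a simplification):

```python
events = [{"id": i} for i in range(5)]  # a toy ordered stream

def observe(stream: list[dict], lower_bound: int, inclusive: bool) -> list[dict]:
    """Resume a stream at a boundary event, including or excluding it."""
    start = lower_bound if inclusive else lower_bound + 1
    return [e for e in stream if e["id"] >= start]

# We disconnected after receiving event 2, unsure whether we fully processed it.
at_least_once = observe(events, lower_bound=2, inclusive=True)   # replays event 2
at_most_once = observe(events, lower_bound=2, inclusive=False)   # skips event 2
```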

Together, these two features – preconditions for writes and lowerBound control for reads – give you the tools to build reliable event-sourced systems without pretending that exactly-once delivery is possible.

Practical Patterns for Idempotence

Three practical patterns work well in event-sourced systems:

Event ID tracking – Maintain a record of every event ID you've processed. Before processing an event, check if its ID is in your processed set. For space-efficient tracking, you can use Bloom filters for probabilistic membership testing or capped tables that automatically remove old entries. This works especially well when combined with transactional processing, where you store the processed event ID in the same transaction as the business effect.

Natural business keys – Often, the domain itself tells you whether an event has already been applied. If you receive a "BookBorrowed" event for a book that's already marked as borrowed, you know this is a duplicate. This is elegant and efficient, but requires careful thinking about your domain model and state transitions.

State machine design – Structure your domain entities as state machines where transitions are explicitly defined and inherently idempotent. If you receive the same event twice, the state machine simply recognizes that the transition has already occurred and remains in the current state. The transition "confirmed → confirmed" is valid and safe.
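The state machine pattern can be sketched as a transition table. The states and event types below are invented for the example; the point is that a duplicate event maps to a self-transition, so reprocessing it is a safe no-op.

```python
# Transitions are defined explicitly; re-applying an event whose transition
# has already happened leaves the entity in its current state.
TRANSITIONS = {
    ("placed", "OrderConfirmed"): "confirmed",
    ("confirmed", "OrderConfirmed"): "confirmed",  # duplicate: stays confirmed
    ("confirmed", "OrderShipped"): "shipped",
}

def apply(state: str, event_type: str) -> str:
    """Return the next state; unknown transitions keep the current state."""
    return TRANSITIONS.get((state, event_type), state)

state = "placed"
for event_type in ["OrderConfirmed", "OrderConfirmed", "OrderShipped"]:
    state = apply(state, event_type)  # the duplicate confirmation is harmless
```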

In practice, you'll often combine these patterns. The key is to design for idempotence from the beginning, not to retrofit it later when duplicates cause production incidents.

Embrace the Truth

Exactly-once delivery is impossible in distributed systems. This isn't a limitation of current technology – it's a fundamental property of asynchronous communication between independent components that can fail independently. The Two Generals Problem, shown to be unsolvable in 1975, formalizes exactly this, and the proof still holds.

Exactly-once processing is possible, but only within narrow transactional boundaries. For the vast majority of distributed systems, the honest solution is at-least-once delivery combined with idempotent processing, giving you effectively-once semantics. Messages are delivered reliably, retried if necessary, and your handlers are designed to recognize and skip duplicates.

Event sourcing makes this pattern natural. Events are immutable facts with unique identifiers. Event handlers check whether they've already processed a given event ID. EventSourcingDB provides the tools you need: preconditions to make writes safe to retry, lowerBound controls to let you choose at-most-once or at-least-once semantics when observing, and a foundation that makes idempotent processing straightforward to implement. You can read more about the overall approach in the consistency guarantees documentation.

The next time someone promises you exactly-once delivery, ask them what they really mean. Chances are, they mean effectively-once through idempotence, which is honest and achievable. And if they truly mean exactly-once delivery across distributed systems, they're either redefining terms, making assumptions that won't hold in production, or selling something that doesn't exist.

Build for reality, not for marketing slogans. Design your systems to handle the messiness of networks, the inevitability of retries, and the reality of duplicates. Make your operations idempotent. Test with chaos. Be transparent about trade-offs. The result will be more robust, more honest, and more reliable than any system built on the myth of exactly-once delivery.