Email Uniqueness in Event Sourcing

Recently, a reader asked us how to enforce email uniqueness when registering users in an event-sourced system. It is a fair question. In a relational database, you add a column, slap on UNIQUE, and the problem is solved. The database does the heavy lifting, and you move on with your day. In Event Sourcing, that easy answer simply does not exist.

The good news is there are several reasonable ways to solve it. The bad news is that none of them is free. Each one pays for uniqueness somewhere, whether in scale, in latency, in user experience, or in operational complexity. Choosing among them is not a technical question. It is a question about what your business actually means by "unique."

What `UNIQUE` Actually Buys You

Before we look at solutions, it pays to look at the original problem. A UNIQUE constraint in a relational database is a statement about the current state of a table. Whenever a row is inserted or updated, the database verifies, atomically and synchronously, that no other row violates the constraint. The mechanism is invisible to the application, and the guarantee is enforced inside a single transaction. Two competing inserts cannot both succeed.
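As a point of reference, the relational behavior can be reproduced in a few lines. This is a minimal sketch using Python's built-in `sqlite3` module; the table name `users` and the helper `register` are made up for illustration.

```python
import sqlite3


def register(conn: sqlite3.Connection, email: str) -> bool:
    """Insert the email; return False if the UNIQUE constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")

first = register(conn, "jane@example.com")   # accepted
second = register(conn, "jane@example.com")  # rejected by the database
```

The application never checks anything itself. It attempts the insert and lets the database decide, which is exactly the convenience that disappears in an event-sourced system.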

Event Sourcing has no such table. The truth lives in the event stream, a chronological log of what has happened, not a snapshot of what currently is. Asking "is this email unique?" is no longer a question the storage layer answers for free. It becomes a question about the current state of the system, derived from history. And current state, in a distributed system, is always a matter of agreement and timing.

This is why uniqueness in Event Sourcing is not a storage detail. It is a system-wide consistency rule, and consistency is a business decision, not a default. Once you accept that, the question stops being "how do I add UNIQUE?" and becomes "where do I want to pay for it?" Six options follow, each paying in a different currency.

Option 1: One Aggregate for All Emails

The most direct translation of the relational mental model is to introduce a single aggregate that owns the list of all known email addresses. Every registration goes through it. Before UserRegistered is emitted, the aggregate checks whether the email is already in its list. If it is, the registration fails. If not, the new email is appended along with the new user.

The mechanics are clean. Two parallel registrations end up serialized through the aggregate's optimistic concurrency control. One write succeeds. The other is rejected with a 409. From the outside, this looks exactly like the database behavior the developer was hoping for.

The price shows up at scale. Every single registration in your entire system goes through one aggregate. That aggregate grows unbounded, holding every email ever registered. Every write needs to read the full state, validate, and append. The aggregate becomes a serialization point that the entire registration path depends on. For hundreds of users, this is fine. For tens of thousands, it starts to feel slow. For millions, it stops working entirely. Your aggregate is not a table, and an aggregate of every email is exactly the kind of design that pretends otherwise.
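To make the mechanics and the cost concrete, here is a small in-memory sketch. The class `EmailRegistry` and its methods are hypothetical and do not correspond to any real client API; the point is only that every registration replays the whole aggregate and that a stale version triggers a conflict.

```python
class ConcurrencyConflict(Exception):
    """Raised when the stream changed since it was read (think HTTP 409)."""


class EmailAlreadyRegistered(Exception):
    """Raised when the aggregate already knows the email."""


class EmailRegistry:
    """Hypothetical single aggregate that owns every registered email."""

    def __init__(self):
        self.events = []

    def load(self):
        # Rebuilding state means replaying every event ever written.
        emails = {event["email"] for event in self.events}
        return emails, len(self.events)

    def append(self, event, expected_version):
        # Optimistic concurrency: only append if nobody wrote in between.
        if expected_version != len(self.events):
            raise ConcurrencyConflict(expected_version)
        self.events.append(event)


def register(registry, email):
    emails, version = registry.load()
    if email in emails:
        raise EmailAlreadyRegistered(email)
    registry.append({"type": "UserRegistered", "email": email}, version)


registry = EmailRegistry()

# Two parallel attempts both read the aggregate at version 0 ...
_, version_a = registry.load()
_, version_b = registry.load()

# ... the first append wins, the second one hits a conflict.
registry.append({"type": "UserRegistered", "email": "jane@example.com"}, version_a)
try:
    registry.append({"type": "UserRegistered", "email": "jane@example.com"}, version_b)
    outcome = "accepted"
except ConcurrencyConflict:
    outcome = "conflict"

# A retry reloads the state and now sees the duplicate.
try:
    register(registry, "jane@example.com")
    retry_outcome = "accepted"
except EmailAlreadyRegistered:
    retry_outcome = "duplicate"
```

Note that `load` touches every event on every call. In this sketch, the conflict would also hit a concurrent registration for a completely different email, which is the serialization point described above.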

Option 2: Ask EventQL on Every Write

A more elegant-looking variation skips the dedicated aggregate altogether. EventSourcingDB supports preconditions, small declarative checks evaluated atomically with each write. One of them, isEventQlQueryTrue, lets you express any condition you can phrase as an EventQL query. So why not ask, on each write, whether a UserRegistered event with that email already exists? If the count is zero, write the event. If not, reject it.

The flow is appealing. A single write, a declarative precondition, and the event store guarantees atomicity. No second aggregate, no extra workflow. Two parallel registrations are serialized, the duplicate is rejected with a 409. It looks like exactly what we want.

We strongly advise against this approach. The reason is not correctness; it is scale. The query has to scan the event stream looking for prior UserRegistered events with the same email, and the cost of that scan grows linearly with the number of registrations you have already processed. With a thousand users, it is invisible. With a hundred thousand, you start to notice. With a million, every registration is paying for the size of your user base. The hot path of registration is precisely the wrong place to introduce a cost that grows over time. What looks elegant on day one becomes more expensive every single day.

Preconditions are an excellent tool for guarding local invariants, within a stream, within a subject. They are the wrong tool for a global invariant that requires looking at every event ever written.
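The growth is easy to see in a toy model. The sketch below does not use EventSourcingDB or EventQL; `ToyEventStore` and `no_prior_registration` are hypothetical stand-ins that simulate a precondition by scanning a list, and they count how many events get inspected along the way.

```python
class ToyEventStore:
    """In-memory stand-in for an event store with write preconditions."""

    def __init__(self):
        self.events = []
        self.scanned = 0  # total number of events inspected by preconditions

    def write(self, event, precondition=None):
        # A real store would evaluate the precondition atomically with the
        # write; single-threaded code gives us that for free here.
        if precondition is not None and not precondition(self):
            return False
        self.events.append(event)
        return True


def no_prior_registration(email):
    """Build a precondition that scans for an earlier UserRegistered event."""

    def check(store):
        for event in store.events:
            store.scanned += 1
            if event["type"] == "UserRegistered" and event["email"] == email:
                return False
        return True

    return check


store = ToyEventStore()
for i in range(1000):
    email = f"user{i}@example.com"
    store.write({"type": "UserRegistered", "email": email}, no_prior_registration(email))

scans_for_1000 = store.scanned  # 0 + 1 + ... + 999 = 499500

duplicate_accepted = store.write(
    {"type": "UserRegistered", "email": "user0@example.com"},
    no_prior_registration("user0@example.com"),
)
```

The duplicate is rejected, so correctness holds. Under this naive scanning assumption, however, a thousand registrations already inspect roughly half a million events in total, and each new registration is more expensive than the one before.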

Option 3: Write First, Reconcile Later

If strong consistency is too expensive, the opposite extreme is also an option: do not check at all on the write path. Just write the UserRegistered event. A read model, running asynchronously, eventually notices the duplicate and resolves it deterministically. For example, the earlier event ID wins, the later account is invalidated, and the affected user is notified.

This is the cheapest write path you can have. Maximum throughput, zero coordination, no contention. Two registrations with the same email simply both succeed, and the housekeeping process cleans up the mess afterwards. The implementation is shockingly simple.

You pay in user experience and operational discipline. The system is eventually consistent, and there is a window, usually short but always non-zero, in which both accounts appear valid. The user who registered second has a working account, until they do not. They get a confirmation email, they set up their profile, and only later do they discover that their account has been invalidated. That is a bad story to be on the receiving end of. And the housekeeping process needs to actually run, with monitoring, alerts, and clear conflict-resolution rules. If it falls behind, your problem grows.
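The deterministic resolution rule can be sketched as a pure function over the events. The event shape used here (a numeric `id`, `email`, and `userId`) is an assumption made for illustration, and the notification step is left out.

```python
def reconcile(events):
    """Return (winners, invalidated): the earliest event ID per email wins."""
    winners = {}      # email -> user ID that keeps the address
    invalidated = []  # user IDs whose accounts must be invalidated
    for event in sorted(events, key=lambda e: e["id"]):
        if event["type"] != "UserRegistered":
            continue
        if event["email"] in winners:
            invalidated.append(event["userId"])
        else:
            winners[event["email"]] = event["userId"]
    return winners, invalidated


# Both registrations for jane@example.com were accepted on the write path.
events = [
    {"id": 2, "type": "UserRegistered", "email": "jane@example.com", "userId": "u-2"},
    {"id": 1, "type": "UserRegistered", "email": "jane@example.com", "userId": "u-1"},
    {"id": 3, "type": "UserRegistered", "email": "joe@example.com", "userId": "u-3"},
]

winners, invalidated = reconcile(events)
```

Sorting by event ID makes the outcome independent of the order in which the read model happens to receive the events, so every run reaches the same verdict about which account survives.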

Option 4: Make the Email the Subject

Another tempting answer is to use the email itself as the stream identifier. The subject becomes /users/<email>, or, if you do not want the address visible in the path, /users/<sha256-of-email>. With isSubjectPristine as a precondition, only one stream per email can ever come into existence. Two parallel registrations with the same email collide on the same subject path, and the second one fails with a 409.
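A minimal sketch of the idea, again against a toy in-memory store rather than the real database: `subject_for` derives the hashed subject with Python's `hashlib`, and the hypothetical `write_if_pristine` plays the role of a write guarded by isSubjectPristine.

```python
import hashlib


class SubjectNotPristine(Exception):
    """Raised when the subject already has events (think HTTP 409)."""


class ToyEventStore:
    """In-memory stand-in: one event list per subject."""

    def __init__(self):
        self.streams = {}

    def write_if_pristine(self, subject, event):
        # Only the very first write to a subject is allowed through.
        if subject in self.streams:
            raise SubjectNotPristine(subject)
        self.streams[subject] = [event]


def subject_for(email):
    """Derive an opaque subject from the email address."""
    digest = hashlib.sha256(email.encode("utf-8")).hexdigest()
    return f"/users/{digest}"


store = ToyEventStore()
subject = subject_for("jane@example.com")
store.write_if_pristine(subject, {"type": "UserRegistered"})

try:
    store.write_if_pristine(subject_for("jane@example.com"), {"type": "UserRegistered"})
    second = "accepted"
except SubjectNotPristine:
    second = "rejected"
```

The check only ever looks at one subject, so its cost does not depend on how many other users exist.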

This option is genuinely attractive. It is strongly consistent without a global aggregate, without scanning the entire event log, and without a second entity. A single write does the entire job. Compared to Option 1, it scales beautifully. Each email gets its own tiny stream, and there is no contention between unrelated registrations.

The trade-off shows up the moment your business decides email addresses are allowed to change. Now the subject, your identity, is also your uniqueness mechanism. You cannot move the user to a new email without either migrating the stream or losing the link between identity and locking. Hash-based variants make the subject opaque, which is great for privacy and terrible for debugging. The subject ends up doing two jobs at once, and the moment those jobs disagree, the model breaks down.

Option 5: Reserve First, Register Second

The Reservation Pattern separates the two responsibilities Option 4 conflated. A registration becomes a two-step process. First, an EmailReservationRequested event is written to a dedicated reservation aggregate, identified by the email, for example the subject /email-reservations/<email> with isSubjectPristine as a guard. Only if the reservation succeeds is the actual UserRegistered event written, with its own user ID, on its own stream.

The flow handles conflicts cleanly. Two parallel attempts with the same email race for the reservation. One wins, one is rejected immediately. The successful path then proceeds to register the user. If the user abandons the registration, an EmailReservationReleased event frees the email up. From the user's perspective, uniqueness is visible at all times. There is no window in which both registrations appear successful, unlike Option 3. Identity and locking are clearly separated, unlike Option 4. Confirmation emails, expiration timers, and even cleanup logic all fit naturally into the workflow.

The price is complexity. The system is still eventually consistent between the reservation and the actual account creation, just on a much shorter scale. You need TTL logic for reservations that never get confirmed. You need compensation for accounts that fail mid-registration. You write more events, you operate more workflows, and you have to reason about more failure modes. This is the cost of a clean separation between identity and uniqueness, and it is the option we usually recommend when the consistency boundary genuinely is the registration workflow itself. Dynamic consistency boundaries are exactly the kind of mental model that makes this design feel natural rather than convoluted.
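The two steps and the TTL logic might look roughly like the following sketch. It models reservations as plain in-memory state instead of events and preconditions, passes the current time in explicitly, and ignores who owns a reservation; `ReservationWorkflow`, its methods, and the 15-minute TTL are all assumptions made for illustration.

```python
import math
import uuid

RESERVATION_TTL = 15 * 60  # seconds; an assumed value, pick what fits your workflow


class EmailTaken(Exception):
    """The email is reserved or already belongs to a registered user."""


class NoActiveReservation(Exception):
    """There is no pending, unexpired reservation to confirm."""


class ReservationWorkflow:
    def __init__(self):
        self.expiry = {}  # reservation subject -> expiry time (math.inf once confirmed)
        self.users = {}   # user ID -> email

    @staticmethod
    def subject(email):
        return f"/email-reservations/{email}"

    def reserve(self, email, now):
        # Step 1: claim the email unless a live reservation already holds it.
        if self.expiry.get(self.subject(email), 0.0) > now:
            raise EmailTaken(email)
        self.expiry[self.subject(email)] = now + RESERVATION_TTL

    def register(self, email, now):
        # Step 2: only a pending, unexpired reservation may become an account.
        expires_at = self.expiry.get(self.subject(email), 0.0)
        if expires_at <= now or expires_at == math.inf:
            raise NoActiveReservation(email)
        user_id = str(uuid.uuid4())  # identity is independent of the email
        self.users[user_id] = email
        self.expiry[self.subject(email)] = math.inf
        return user_id

    def release(self, email):
        # Abandoned registration: free the email, but never a confirmed one.
        if self.expiry.get(self.subject(email)) != math.inf:
            self.expiry.pop(self.subject(email), None)


workflow = ReservationWorkflow()
workflow.reserve("jane@example.com", now=0)
try:
    workflow.reserve("jane@example.com", now=10)
    second_attempt = "reserved"
except EmailTaken:
    second_attempt = "rejected"
user_id = workflow.register("jane@example.com", now=20)

# A reservation that is never confirmed expires and can be claimed again.
workflow.reserve("joe@example.com", now=0)
workflow.reserve("joe@example.com", now=RESERVATION_TTL + 1)
```

Even this stripped-down version needs three operations and two failure types, which gives a feel for the additional workflow the pattern asks you to operate.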

Option 6: Borrow a Unique Index From Elsewhere

Finally, the pragmatist's answer. Why solve the problem inside Event Sourcing at all? Spin up a small relational table with one column and a UNIQUE index, and use it as a lock outside the event store. Every registration first inserts the email into that table. If it succeeds, the UserRegistered event is written. If it fails, the registration is rejected.

The pro side is concrete. You get hard, atomic uniqueness on the email, with the exact semantics relational developers have used for decades. Operations teams understand it without explanation. And if your environment already runs SQL for identity, sessions, or other master data, this is not a new system; it is reuse of infrastructure you already operate.

The con side is equally concrete. There is no distributed transaction between the lock table and the event store. You have two separate systems with separate lifecycles, and the order of operations becomes a source of subtle bugs. Lock first, then write the event? If the event write fails, the email is now locked, but no account exists, and it stays locked until something explicitly releases it. Write the event first, then lock? You can end up with a registered user whose email is not actually reserved, which defeats the purpose. You traded one consistency problem for another, and now it lives at the boundary between two systems instead of inside one. In a green-field setup, this means a second storage system, additional operational burden, and new failure modes that did not exist before.
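The lock-first ordering, including a best-effort compensation step, can be sketched with `sqlite3` as the lock table. The table name `email_locks`, the `register` helper, and the `write_event` callback standing in for the event store are assumptions made for illustration.

```python
import sqlite3


class EventStoreUnavailable(Exception):
    """Simulated failure of the event store."""


def register(conn, write_event, email):
    """Lock the email first, then write the event; undo the lock on failure."""
    try:
        with conn:
            conn.execute("INSERT INTO email_locks (email) VALUES (?)", (email,))
    except sqlite3.IntegrityError:
        return False  # someone else already holds this email

    try:
        write_event({"type": "UserRegistered", "email": email})
    except Exception:
        # Compensation: without this, the email stays locked with no account.
        with conn:
            conn.execute("DELETE FROM email_locks WHERE email = ?", (email,))
        raise
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_locks (email TEXT NOT NULL UNIQUE)")
events = []


def failing_writer(event):
    raise EventStoreUnavailable("simulated outage")


try:
    register(conn, failing_writer, "jane@example.com")
except EventStoreUnavailable:
    pass
locks_after_failure = conn.execute("SELECT COUNT(*) FROM email_locks").fetchone()[0]

accepted = register(conn, events.append, "jane@example.com")
duplicate = register(conn, events.append, "jane@example.com")
```

The compensation narrows the gap but does not close it. If the process crashes between the insert and the event write, nothing runs the `DELETE`, and the orphaned lock has to be found and cleaned up by some other means.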

There Is No Free Lunch

Six options, all of which work, none of which is free. Strong consistency in a single aggregate buys you simplicity at the cost of scale. EventQL preconditions buy you elegance at the cost of query time that grows linearly with your user base. Write-first buys you throughput at the cost of user experience and reconciliation work. Subject-as-identity buys you a clean stream-per-email at the cost of flexibility when emails change. Reservation buys you a clean separation at the cost of workflow complexity. External locks buy you familiarity at the cost of a second system to keep in sync.

Notice what is happening. Each option pays in a different currency, and which currency is cheap for you depends entirely on your context. A startup with a few thousand users and an existing SQL database might happily borrow a unique index. A large platform with millions of users and a heavy registration workflow will probably reach for the Reservation Pattern. A system where email changes are common will avoid Option 4 instinctively. A team trying out Event Sourcing on a small scale might be perfectly fine with Option 1 for years.

This is the part that surprises developers coming from a UNIQUE-constraint background. Uniqueness is not a technical detail you tick off and forget about. It is a modeling decision that touches identity, scale, user experience, and operations. The relational database hid the trade-off behind a keyword. Event Sourcing makes the trade-off visible. That is a feature, not a bug. Once you see the choices clearly, you can pick the one that fits your business and explain to the next developer why.

If you would like to go deeper on the building blocks behind these patterns, the documentation on preconditions is the natural next stop. And if you would like to talk through which option fits your specific situation, we would love to hear from you at hello@thenativeweb.io.