Designing Read Models¶
This guide explores how to design effective, purpose-driven read models when working with EventSourcingDB. It covers the role of read models in an event-sourced architecture, discusses modeling strategies and trade-offs, and explains how to align projections with use cases. Read models are not simply a reflection of stored events — they are tailored, context-specific representations of current state.
Event-sourced systems distinguish between two concerns: recording the facts of what happened (writes) and providing efficient access to current or derived information (reads). Read models are the bridge between these worlds. They are derived from historical events, but shaped by the needs of the present. Their design determines how easily users, services, and tools can interact with the system — and how well the system performs under real-world conditions.
Read Models Serve a Specific Purpose¶
Every read model should exist to answer a specific set of questions. These may come from user interfaces, APIs, reports, automated decisions, or external integrations. The structure, update strategy, and storage of a read model should be guided by these needs — not by the structure of the event data itself.
Before designing a read model, ask what it is for:
- Who or what consumes it?
- What information is needed?
- What queries must it support?
- How fresh does it need to be?
- What volume and access patterns are expected?
Answering these questions helps determine the optimal shape of the model — including whether it should be normalized or denormalized, batched or updated in real time, centralized or distributed.
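For example, the answers for a hypothetical book search interface might translate into a shape like the following TypeScript sketch. The interface and its fields are illustrative assumptions, not part of EventSourcingDB:

```typescript
// A minimal sketch: the read model entry for a hypothetical book search UI.
// Its shape follows the answers to the questions above, not the event schema.
interface BookSearchEntry {
  // Who consumes it? The search results page, which renders entries as-is.
  bookId: string;
  title: string;

  // What information is needed? Author names are embedded so the UI
  // never has to resolve them separately.
  authorNames: string[];

  // What queries must it support? Prefix search on title, so the model
  // keeps a pre-normalized variant alongside the display value.
  titleNormalized: string;

  // How fresh does it need to be? Seconds-level staleness is acceptable
  // here, so the model can be updated asynchronously from the event stream.
  updatedAt: string; // ISO 8601 timestamp of the last applied event
}
```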
Denormalization Is Often the Right Choice¶
In traditional databases, normalization is favored to reduce duplication and improve consistency. In read models, the opposite is often true. Since read models are derived and disposable, duplication is not only acceptable — it is sometimes essential.
Denormalized read models can:
- Simplify queries and API responses
- Eliminate the need for runtime joins
- Improve cacheability and performance
- Reduce coupling between components
For example, a read model for a book search interface might embed author names directly in the book entries, even if they come from separate events. This avoids runtime lookups and allows the interface to render quickly and simply.
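As a minimal sketch, such a projection might look like this in TypeScript. The two event types and their data fields are hypothetical:

```typescript
// A denormalizing projection sketch, assuming two hypothetical event types.
// It embeds the author's name into each book entry as events arrive.
const authorNamesById = new Map<string, string>();
const booksById = new Map<string, { title: string; authorName: string }>();

function apply(event: { type: string; data: Record<string, unknown> }): void {
  switch (event.type) {
    case 'io.example.author-registered':
      // Remember the author's name so later book events can embed it.
      authorNamesById.set(event.data.authorId as string, event.data.name as string);
      break;
    case 'io.example.book-added':
      // Embed the name directly: the search UI never joins at read time.
      booksById.set(event.data.bookId as string, {
        title: event.data.title as string,
        authorName: authorNamesById.get(event.data.authorId as string) ?? 'unknown',
      });
      break;
  }
}
```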
The key is to treat read models as products, not databases. Their job is to serve specific consumers efficiently, even if that means duplicating or reshaping data.
Projection Strategies¶
There are several ways to build and update read models from events:
- Stream projections subscribe to one or more event streams and react to new events by updating the model incrementally.
- Batch projections replay all relevant events and rebuild the model from scratch.
- Hybrid approaches combine replay for initialization and observation for live updates.
Which strategy you choose depends on the size of the model, the volume of events, your consistency requirements, and the overall system architecture. For small or medium-sized models, full replays are often fast and simple. For larger systems, incremental updates via observation scale better.
EventSourcingDB supports both approaches through its reading and observing APIs. Combined with event filters, type constraints, and subject hierarchies, these APIs let you tailor your projections to process only the events that matter.
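The hybrid pattern can be sketched as follows. The client interface here is an assumption standing in for an SDK with async-iterable readEvents and observeEvents methods; the actual EventSourcingDB client API may differ in names and options:

```typescript
// A hybrid projection sketch: replay history first, then observe live events.
interface StoredEvent {
  id: string;
  type: string;
  subject: string;
  data: Record<string, unknown>;
}

async function runProjection(
  client: {
    readEvents(subject: string): AsyncIterable<StoredEvent>;
    observeEvents(subject: string, opts: { fromId?: string }): AsyncIterable<StoredEvent>;
  },
  apply: (event: StoredEvent) => void,
): Promise<void> {
  let lastId: string | undefined;

  // Phase 1: replay all historical events to initialize the model.
  for await (const event of client.readEvents('/books')) {
    apply(event);
    lastId = event.id;
  }

  // Phase 2: switch to observation for incremental live updates,
  // resuming after the last event seen during replay.
  for await (const event of client.observeEvents('/books', { fromId: lastId })) {
    apply(event);
  }
}
```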
Read Models Are Eventually Consistent¶
Because read models are derived from an append-only event log, they are inherently eventually consistent. This means that after an event is written, there is a short delay before the corresponding change appears in the read model. This is not a bug — it is a fundamental characteristic of the architecture.
You can mitigate this delay, but not eliminate it entirely. Therefore, downstream consumers must be designed to tolerate temporary staleness. For example:
- UI elements can show loading indicators or optimistic updates.
- Business logic can retry or defer actions if data is incomplete (see the sketch after this list).
- Monitoring systems can account for lag in derived metrics.
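The retry case can be sketched as a bounded polling loop that checks whether the read model has caught up with a known event. The readModelIncludes check is a hypothetical, application-specific predicate:

```typescript
// Poll the read model until it reflects a given event, up to a fixed
// number of attempts. Returns false if the model is still stale.
async function waitUntilFresh(
  readModelIncludes: (eventId: string) => Promise<boolean>,
  eventId: string,
  { attempts = 10, delayMs = 100 } = {},
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if (await readModelIncludes(eventId)) {
      return true; // The projection has caught up.
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false; // Still stale: defer the action or fall back to the write path.
}
```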
If strong consistency is required for a specific interaction, consider handling it within the write path — for example, by including the expected result directly in the event or command response.
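A minimal sketch of this write-path approach: the command handler computes the resulting state itself and returns it, so the caller never waits for a projection. Here, appendEvent is a hypothetical stand-in for the actual write to the event store:

```typescript
// Return the expected result from the command handler directly,
// instead of querying an eventually consistent read model.
interface OrderState {
  orderId: string;
  status: 'open' | 'cancelled';
}

async function cancelOrder(
  appendEvent: (event: { type: string; data: unknown }) => Promise<void>,
  current: OrderState,
): Promise<OrderState> {
  const next: OrderState = { ...current, status: 'cancelled' };
  await appendEvent({
    type: 'io.example.order-cancelled',
    data: { orderId: current.orderId },
  });
  return next; // The caller gets the expected state immediately.
}
```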
Rebuilding and Resilience¶
One of the key strengths of event-sourced systems is the ability to rebuild read models from scratch. If a model becomes corrupted, outdated, or poorly structured, it can simply be discarded and recreated from the event history. This requires projections to be deterministic and idempotent: given the same events in the same order, they must produce the same result, and reprocessing an event must not change the outcome.
To make rebuilding fast and reliable:
- Keep projection logic deterministic and side-effect-free.
- Avoid dependencies on external state or systems.
- Use snapshots if the number of events becomes large.
Snapshots allow you to store intermediate states and resume from a later point in the stream. In EventSourcingDB, snapshots are treated as regular events, which means they fit naturally into the event flow and do not require separate infrastructure.
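Because snapshots arrive in the stream like any other event, a projection can handle them as a full state replacement. In the sketch below, both event types are hypothetical:

```typescript
// A snapshot-aware projection step: a pure function of (state, event).
interface CounterState {
  count: number;
}

function applyWithSnapshots(
  state: CounterState,
  event: { type: string; data: Record<string, unknown> },
): CounterState {
  if (event.type === 'io.example.counter-snapshot') {
    // Replace the state wholesale; no need to replay earlier events.
    return { count: event.data.count as number };
  }
  if (event.type === 'io.example.counter-incremented') {
    // Incremental update: deterministic and side-effect-free.
    return { count: state.count + 1 };
  }
  return state; // Ignore events this projection does not care about.
}
```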
Choosing the Right Storage¶
Read models are stored separately from the event store. This allows them to be optimized for access, not durability. Depending on your use case, you might store them in:
- A relational database (e.g. PostgreSQL) for structured data and complex queries
- A document store (e.g. MongoDB) for nested or flexible data models
- A key-value store (e.g. Redis) for fast lookups
- A time-series database (e.g. InfluxDB) for metrics and event trends
The choice depends entirely on the shape and purpose of the read model. You are free to use different stores for different models, or even to maintain several models over the same events, each serving a different consumer. What matters is that the structure fits the task.
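For instance, a projection writing the denormalized book entries from earlier into PostgreSQL could use an upsert, which also keeps the projection idempotent under replay. Here, query is a stand-in for your database client (for example, node-postgres):

```typescript
// Idempotent upsert of a denormalized book entry into a relational store.
// Replaying the same event leaves the row unchanged.
async function upsertBook(
  query: (text: string, values: unknown[]) => Promise<unknown>,
  book: { bookId: string; title: string; authorName: string },
): Promise<void> {
  await query(
    `INSERT INTO book_search (book_id, title, author_name)
     VALUES ($1, $2, $3)
     ON CONFLICT (book_id)
     DO UPDATE SET title = EXCLUDED.title, author_name = EXCLUDED.author_name`,
    [book.bookId, book.title, book.authorName],
  );
}
```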
Read Models Are Not the Source of Truth¶
It is tempting to treat read models as the authoritative view of system state — especially when they are fast, easy to query, and often up to date. But in event-sourced systems, the only source of truth is the event log. Read models are derivations, and they may be incomplete, stale, or incorrect at any given moment.
This distinction is essential for system integrity. Never update read models directly. Never use them as the basis for critical decisions unless you have validated their freshness. And always design your system so that read models can be discarded and rebuilt without data loss.
Read models are powerful — but they are only as reliable as the events they are based on and the projections that create them. Treat them as tools, not as truth.