
The Read Model Zoo: Projections Beyond Tables

Say "projection" to most developers and they reach, almost reflexively, for a SQL table. Denormalized, perhaps materialized, but ultimately rows in a relational database. It happens so quickly that it doesn't feel like a decision. It feels like the definition of the word.

It isn't. A read model is just a query-optimized view of the event history, and the shape it takes should follow the query, not the convention. There are at least four other shapes worth knowing about, each one fitting a class of queries that a SQL table either can't handle well or has no business handling at all. Once you've seen the menu, the reflex gets harder to justify.

The Reflex: A SQL Table

Let's start by being fair to the default. There's a reason it's the default.

Imagine a library system. Events like BookBorrowed, BookReturned, and BookReserved flow into the event store. A common query: "which books does reader 42 currently have borrowed?" This is a transactional lookup. It needs a current snapshot, indexed by reader ID, with a predictable shape. A SQL table is exactly right for this.

The projection that builds it is straightforward. On BookBorrowed, INSERT a row. On BookReturned, DELETE it. PostgreSQL or MySQL handles the storage, an index on the reader ID handles the lookup, and the query typically responds in single-digit milliseconds. There's nothing wrong with this picture, and nothing exotic about it. For lookups of current state by a known key, the SQL table earns its place.
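As a minimal sketch of that projection, the function below maps each event to the parameterized statement that keeps the table current. The table name borrowings, its columns, the event payload fields, and the book-returned type name (formed by analogy with the type names used later in this post) are assumptions for illustration, not part of any real schema. The ON CONFLICT clause is PostgreSQL syntax and relies on a unique constraint over both columns.

```typescript
// Hypothetical event shape and table layout; neither is defined in the article.
type LibraryEvent = {
  type: string;
  data: { readerId: string; bookId: string };
};

type SqlStatement = { text: string; values: string[] };

// Maps one event to the statement that keeps the `borrowings` table current.
// Returns null for events this projection does not care about.
function toStatement(event: LibraryEvent): SqlStatement | null {
  const { readerId, bookId } = event.data;

  switch (event.type) {
    case 'io.eventsourcingdb.library.book-borrowed':
      return {
        // ON CONFLICT DO NOTHING turns a redelivered event into a no-op.
        // It requires a unique constraint on (reader_id, book_id).
        text: `INSERT INTO borrowings (reader_id, book_id)
               VALUES ($1, $2)
               ON CONFLICT (reader_id, book_id) DO NOTHING`,
        values: [readerId, bookId],
      };
    case 'io.eventsourcingdb.library.book-returned':
      return {
        // Deleting a row that is already gone is harmless as well.
        text: 'DELETE FROM borrowings WHERE reader_id = $1 AND book_id = $2',
        values: [readerId, bookId],
      };
    default:
      return null;
  }
}
```

The query side is then a single indexed lookup, something like SELECT book_id FROM borrowings WHERE reader_id = $1.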

The trouble starts when the same shape gets pressed into service for queries it wasn't designed for. That's when projections start to creak.

The Same Events, Different Questions

Step back for a moment. Where did the SQL table come from? Not from the events themselves. BookBorrowed doesn't care whether it lives in a row, a graph node, or a JSON file. The table came from the query you anticipated. You expected someone to ask "which books does this reader have?", you reasoned backward to a shape that answers it efficiently, and you wrote a projection that maintains that shape.

That's the entire game. The events are the source of truth; everything else is shaped to fit a question. If you're writing CQRS-style systems, this framing should be familiar from our post on CQRS Without the Complexity. The read side serves queries. The write side records facts. They have different jobs and, importantly, different optimal data structures.

The interesting consequence is that different queries want different shapes. A full-text search wants an inverted index. A "who knows whom" question wants a graph. A "find things like this" question wants a vector space. A monthly summary wants almost nothing at all. None of these is well served by a row in a table, and forcing them into one is how teams end up with stored procedures that nobody can debug and LIKE '%foo%' queries that scan entire tables because a leading wildcard defeats an ordinary index.

Let's walk through four other shapes the same library events can take.

A Search Index for Full-Text Queries

Suppose readers want to search book descriptions and reviews. "Find me books about consciousness." A SQL LIKE '%consciousness%' works as a tech demo and falls apart under any real load. Worse, it can't do stemming, synonyms, multi-language tokenization, or relevance ranking.

A search engine like Elasticsearch, OpenSearch, Meilisearch, or Tantivy is built for exactly this. The projection feeds it: on BookDescribed, upsert a document; on BookReviewed, append the review text to the document; on BookDeleted, remove the document. Hooked into the SDK's observeEvents iterator, the projection looks like this:

for await (const event of client.observeEvents('/books', { recursive: true })) {
  if (event.type !== 'io.eventsourcingdb.library.book-described') {
    continue;
  }

  await searchIndex.upsert({
    id: event.data.bookId,
    title: event.data.title,
    description: event.data.description,
    updatedAt: event.time.toISOString(),
  });
}
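The loop above only covers BookDescribed. A sketch of all three handlers might look like the following, with a plain Map standing in for the search engine so the example stays self-contained. The document shape, the book-reviewed and book-deleted type names, and the reviewId field are assumptions. Keying reviews by ID is one way to keep a redelivered review from being appended twice.

```typescript
// Hypothetical document and event shapes, for illustration only.
type BookDocument = {
  id: string;
  title: string;
  description: string;
  reviews: Record<string, string>; // review text keyed by review ID
};

type BookEvent =
  | { type: 'io.eventsourcingdb.library.book-described';
      data: { bookId: string; title: string; description: string } }
  | { type: 'io.eventsourcingdb.library.book-reviewed';
      data: { bookId: string; reviewId: string; text: string } }
  | { type: 'io.eventsourcingdb.library.book-deleted';
      data: { bookId: string } };

// The Map is an in-memory stand-in for the search engine's document store.
function applyToIndex(index: Map<string, BookDocument>, event: BookEvent): void {
  switch (event.type) {
    case 'io.eventsourcingdb.library.book-described': {
      // Upsert: replace title and description, keep reviews already collected.
      const existing = index.get(event.data.bookId);
      index.set(event.data.bookId, {
        id: event.data.bookId,
        title: event.data.title,
        description: event.data.description,
        reviews: existing ? existing.reviews : {},
      });
      break;
    }
    case 'io.eventsourcingdb.library.book-reviewed': {
      const doc = index.get(event.data.bookId);
      if (doc === undefined) {
        break; // Review for an unknown book: ignored in this sketch.
      }
      // Writing by review ID means a duplicate delivery overwrites itself.
      doc.reviews[event.data.reviewId] = event.data.text;
      break;
    }
    case 'io.eventsourcingdb.library.book-deleted':
      index.delete(event.data.bookId);
      break;
  }
}
```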

What you gain is a query path that actually fits the question: phrase matching, relevance scoring, faceting by genre, suggestions for misspellings. What you give up is transactional consistency with the rest of your system, because the index updates asynchronously, like every projection. That's not a regression. That's just the read side being honest about how it works, as we discussed in Read-Model Consistency and Lag.

A Graph for Relationships

Now imagine a different query: "what other books did people who borrowed Crime and Punishment also borrow?" Or "find a chain of co-borrowers between Reader 42 and Reader 117." These are relationship queries, and a graph database makes them straightforward where SQL has to resort to recursive CTEs that few teams want to read on a Friday afternoon.

The projection writes edges. On BookBorrowed, upsert a (Reader)-[:BORROWED]->(Book) relationship in Neo4j, ArangoDB, or Memgraph. On BookReturned, you might set a property on the edge with the return date rather than deleting it, because the historical fact of "this reader once borrowed this book" is exactly what your recommendation query depends on.

for await (const event of client.observeEvents('/books', { recursive: true })) {
  if (event.type !== 'io.eventsourcingdb.library.book-borrowed') {
    continue;
  }

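  // MERGE creates each node and edge only if it is missing, so a redelivered
  // event leaves the graph unchanged. The query parameters come from
  // event.data, which is assumed to carry readerId, bookId, and borrowedAt.
  await graph.run(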
    `MERGE (r:Reader {id: $readerId})
     MERGE (b:Book {id: $bookId})
     MERGE (r)-[rel:BORROWED]->(b)
     SET rel.borrowedAt = $borrowedAt`,
    event.data,
  );
}

Graph queries that would be brutal in SQL ("five hops out, weighted by frequency") become single Cypher statements. You wouldn't store all your data in a graph, but for relationship-shaped questions, the graph projection is the right tool, used where it shines.
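To make the "people who borrowed this also borrowed" question concrete, here is a sketch of how it might read in Cypher, assuming the Reader and Book labels and the id property used by the projection above. The function underneath is an in-memory equivalent over a flat edge list, included only so the traversal logic can be run without a graph database; its names are hypothetical.

```typescript
// One possible Cypher formulation, assuming the labels from the projection.
const ALSO_BORROWED_QUERY = `
  MATCH (:Book {id: $bookId})<-[:BORROWED]-(:Reader)-[:BORROWED]->(other:Book)
  WHERE other.id <> $bookId
  RETURN other.id AS bookId, count(*) AS borrowers
  ORDER BY borrowers DESC`;

// In-memory equivalent of the same two-hop traversal.
type BorrowedEdge = { readerId: string; bookId: string };
type Recommendation = { bookId: string; borrowers: number };

function alsoBorrowed(edges: BorrowedEdge[], bookId: string): Recommendation[] {
  // Hop 1: every reader who borrowed the given book.
  const readers = new Set<string>();
  edges.forEach(edge => {
    if (edge.bookId === bookId) {
      readers.add(edge.readerId);
    }
  });

  // Hop 2: every other book those readers borrowed, with distinct readers per book.
  const readersPerBook = new Map<string, Set<string>>();
  edges.forEach(edge => {
    if (edge.bookId === bookId || !readers.has(edge.readerId)) {
      return;
    }
    const set = readersPerBook.get(edge.bookId) ?? new Set<string>();
    set.add(edge.readerId);
    readersPerBook.set(edge.bookId, set);
  });

  // Most frequently co-borrowed books first.
  return Array.from(readersPerBook.entries())
    .map(([id, set]) => ({ bookId: id, borrowers: set.size }))
    .sort((a, b) => b.borrowers - a.borrowers);
}
```

The in-memory version needs two passes and two intermediate collections for a two-hop question. In the graph, each extra hop is one more segment in the MATCH pattern.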

A Vector Store for Semantic Similarity

Now a query that wasn't practical for most teams ten years ago: "find me books that are conceptually similar to The Brothers Karamazov." Not similar in title, not similar in author, but similar in what they're actually about.

This is the home turf of vector embeddings. On BookDescribed, compute an embedding from the description and metadata using whatever model fits your domain, then upsert the vector into Qdrant, Pinecone, Weaviate, or pgvector if you want to keep it inside PostgreSQL. The query is then a nearest-neighbor lookup against that vector space.

The projection update is mostly orchestration: fetch the description text from the event, call the embedding model, write the resulting vector with its book ID. The vector store handles the similarity search, and the result is a list of books your readers might actually want to discover. This particular kind of projection is one of the reasons we spent time on Event-Driven Data Science: EventSourcingDB Meets Python and Pandas. Events feed naturally into analytical and ML-shaped read models, not just transactional ones.

Worth noting: vector projections are typically slower to update because the embedding step itself is non-trivial. That's fine. You batch, debounce, or run them on a slower lane. Different read models have different lag tolerances, and the system architecture should reflect that.
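The orchestration described above can be sketched in a few lines. Everything here is a stand-in: embed is a toy keyword counter rather than a real embedding model, and a Map with a brute-force cosine scan replaces the vector store. Only the overall flow (embed on BookDescribed, upsert by book ID, query by nearest neighbor) reflects the text.

```typescript
// Toy stand-in for a real embedding model: counts a few hand-picked keywords.
// A real projection would call an embedding model or service here instead.
const VOCABULARY = ['faith', 'guilt', 'family', 'murder', 'space', 'robot'];

function embed(text: string): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return VOCABULARY.map(term => words.filter(word => word === term).length);
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// In-memory stand-in for the vector store (Qdrant, pgvector, and so on).
const vectors = new Map<string, number[]>();

// Projection step: embed the description and upsert the vector by book ID.
// Setting by ID means a redelivered event simply overwrites the same entry.
function onBookDescribed(data: { bookId: string; description: string }): void {
  vectors.set(data.bookId, embed(data.description));
}

// Query side: brute-force nearest neighbors, most similar first.
function similarTo(bookId: string, limit: number): string[] {
  const target = vectors.get(bookId);
  if (target === undefined) {
    return [];
  }
  return Array.from(vectors.entries())
    .filter(([id]) => id !== bookId)
    .map(([id, vector]) => ({ id, score: cosine(target, vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(entry => entry.id);
}
```

A real vector store replaces the linear scan with an approximate nearest-neighbor index, which is what keeps the query fast once the catalog grows.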

A Plain JSON File for Low-Traffic Dashboards

Now consider the opposite end of the spectrum: a monthly statistics page in your library's admin UI. Viewed maybe ten times a day, by three administrators. Shows aggregate counts, top-borrowed books, average loan duration.

Reach for PostgreSQL? Redis? Neither. Reach for a JSON file. Fold the relevant events into a document, write it atomically to disk, and serve it as a static file behind a CDN if you have one, or directly from the web server if you don't.

let stats = initialStats();

for await (const event of client.observeEvents('/books', { recursive: true })) {
  stats = recompute(stats, event);
  await writeJsonAtomic('/var/www/stats/monthly.json', stats);
}

The projection is trivial. The query is fetch("/stats/monthly.json"). There's no connection pool to manage, no migration to plan, no backup to schedule beyond the events themselves. If you need to change the shape, you drop the file, replay, and you have a new one in seconds.
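The helpers in that loop are not shown, so here is one way they might look. The Stats shape and the event payload are hypothetical. writeJsonAtomic uses the common write-then-rename approach with Node's fs/promises API, which assumes the temporary file and the target live on the same file system.

```typescript
import { rename, writeFile } from 'node:fs/promises';

// Hypothetical stats and event shapes, for illustration only.
type Stats = { totalBorrowings: number; borrowingsByBook: Record<string, number> };
type StatsEvent = { type: string; data: { bookId: string } };

function initialStats(): Stats {
  return { totalBorrowings: 0, borrowingsByBook: {} };
}

// Pure fold step: returns a new Stats value and leaves the input untouched.
function recompute(stats: Stats, event: StatsEvent): Stats {
  if (event.type !== 'io.eventsourcingdb.library.book-borrowed') {
    return stats;
  }
  const { bookId } = event.data;
  return {
    totalBorrowings: stats.totalBorrowings + 1,
    borrowingsByBook: {
      ...stats.borrowingsByBook,
      [bookId]: (stats.borrowingsByBook[bookId] ?? 0) + 1,
    },
  };
}

// Write to a temporary file next to the target, then rename it over the target.
// On POSIX systems a rename within one file system replaces the target in a
// single step, so readers see the old file or the new one, never a partial write.
async function writeJsonAtomic(path: string, value: unknown): Promise<void> {
  const tempPath = `${path}.${process.pid}.tmp`;
  await writeFile(tempPath, JSON.stringify(value, null, 2), 'utf8');
  await rename(tempPath, path);
}
```

Note that a counter fold like this one would count a redelivered event twice, so it needs the duplicate handling that the idempotency rule later in this post calls for.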

This shape gets dismissed because it feels unserious. It is serious. For low-traffic, read-mostly, write-rare projections, it's often the simplest thing that could possibly work, and the simplest thing that could possibly work tends to be the right answer.

And Yes, In-Memory Counts Too

A sixth shape, which we've covered in depth elsewhere: the projection that doesn't get persisted at all. For workloads where the data fits in memory and the process can rebuild from the event store on startup, no database is needed.

We made the full case in Your Read Model Doesn't Always Need a Database, so we won't repeat it here. The relevant point for this post is just that the in-memory map is yet another shape on the menu, and it sits comfortably next to the five above. The right answer is "whichever shape the query wants," and sometimes the query wants a Map<string, Borrowing>.
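For a sense of scale, the whole projection can be as small as the following sketch. The Borrowing fields, the composite key, and the book-returned type name are assumptions. In a real service the map would be filled by replaying the event store on startup.

```typescript
// Hypothetical shapes; the article only names the Map<string, Borrowing> type.
type Borrowing = { readerId: string; bookId: string; borrowedAt: string };

type BorrowingEvent =
  | { type: 'io.eventsourcingdb.library.book-borrowed'; data: Borrowing }
  | { type: 'io.eventsourcingdb.library.book-returned';
      data: { readerId: string; bookId: string } };

// Keyed by reader and book, so applying the same event twice changes nothing.
const borrowings = new Map<string, Borrowing>();

function keyOf(readerId: string, bookId: string): string {
  return `${readerId}:${bookId}`;
}

function applyBorrowing(event: BorrowingEvent): void {
  const key = keyOf(event.data.readerId, event.data.bookId);
  if (event.type === 'io.eventsourcingdb.library.book-borrowed') {
    borrowings.set(key, event.data);
  } else {
    borrowings.delete(key);
  }
}

// The query: which books does this reader currently have borrowed?
function booksBorrowedBy(readerId: string): string[] {
  return Array.from(borrowings.values())
    .filter(borrowing => borrowing.readerId === readerId)
    .map(borrowing => borrowing.bookId);
}
```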

What Makes This Possible

A reasonable reaction to all of the above is: "fine, but maintaining six projections sounds exhausting." It would be, in a system where projections were expensive. In an event-sourced system, they aren't.

The reason is that the events are the source of truth, and everything else is derived. If you decide tomorrow that your search index has the wrong analyzer chain, you drop it, change the projection code, replay the events, and you have a new index. If a new product team needs a graph projection that didn't exist last week, they write a handler, point it at the event store, and replay. No data migration. No coordination with the write side. No risk to the existing read models.
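Stripped of infrastructure, "drop it and replay" is a fold over the event history. The sketch below treats a projection as an initial state plus a step function and rebuilds two unrelated read models from the same events. The event shape, the book-returned type name, and both projections are hypothetical, and the array stands in for reading the history from the event store.

```typescript
// Hypothetical event shape; the array below stands in for the event store.
type StoredEvent = { type: string; data: { readerId: string; bookId: string } };

// A projection is an initial state plus a step that folds in one event.
type Projection<S> = {
  initial: () => S;
  apply: (state: S, event: StoredEvent) => S;
};

// Rebuilding means starting from the initial state and replaying every event.
function rebuild<S>(events: StoredEvent[], projection: Projection<S>): S {
  return events.reduce(
    (state, event) => projection.apply(state, event),
    projection.initial(),
  );
}

const BORROWED = 'io.eventsourcingdb.library.book-borrowed';
const RETURNED = 'io.eventsourcingdb.library.book-returned';

// Read model 1: how often each book has ever been borrowed.
const loansPerBook: Projection<Record<string, number>> = {
  initial: () => ({}),
  apply: (state, event) =>
    event.type === BORROWED
      ? { ...state, [event.data.bookId]: (state[event.data.bookId] ?? 0) + 1 }
      : state,
};

// Read model 2: which books each reader currently holds.
const booksPerReader: Projection<Record<string, string[]>> = {
  initial: () => ({}),
  apply: (state, event) => {
    const { readerId, bookId } = event.data;
    const current = state[readerId] ?? [];
    if (event.type === BORROWED) {
      return { ...state, [readerId]: [...current, bookId] };
    }
    if (event.type === RETURNED) {
      return { ...state, [readerId]: current.filter(id => id !== bookId) };
    }
    return state;
  },
};
```

Adding a third read model means adding a third Projection value. Neither the stored events nor the two existing projections need to change.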

This is what we mean when we say projections are cheap. The cost of a projection isn't the storage or the compute; it's the cost of building and maintaining the code that produces it. And because each projection is independent, the cost is proportional to how many you actually need, not to how many possible shapes exist in your system.

There's a deeper consequence here. Once projections are cheap, you stop thinking of them as expensive things to be carefully designed up front and start thinking of them as disposable artifacts that follow whatever query the business needs next. The right shape can win. The wrong shape can be retired without ceremony. And the SQL table, when you do reach for it, is a choice rather than a reflex.

The Practical Rules

A few patterns we've learned the hard way.

Pick the shape per query, not per system. There is no rule that says "this system uses PostgreSQL for read models." There's only "this query wants rows, this query wants a graph, this query wants a vector space." Multiple shapes coexisting is the expected state, not a code smell.

Don't be afraid to have many projections. If a new screen needs a new shape, build a new projection. Two projections updating from the same events is not duplication. It's the read side doing its job.

Treat them as disposable. The minute a projection feels precious, something has gone wrong. Replay rebuilds them. If replay is too slow to be practical, that's a different problem worth solving, and we've written about it in Optimizing Event Replays.

The one non-negotiable: idempotency on the projection update. Because subscribers consume the event stream with at-least-once delivery, the same event can arrive twice. The projection handler must produce the same result either way. This is the same constraint that applies to any subscriber, which is why we kept harping on it in You Don't Need an Outbox: the absence of a broker doesn't remove the idempotency requirement; it just clarifies where it belongs.
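Upserts and deletes by key are idempotent by nature, but counters are not. One common way to protect them is to remember which events have already been applied and skip repeats. The sketch below assumes each event carries a unique id; the class and its fields are hypothetical.

```typescript
// Assumption: every event carries a unique `id` that is stable across redeliveries.
type IncomingEvent = { id: string; type: string; data: { bookId: string } };

class BorrowCounter {
  // IDs of events that have already been applied to this projection.
  private readonly seen = new Set<string>();
  private count = 0;

  handle(event: IncomingEvent): void {
    if (this.seen.has(event.id)) {
      return; // Duplicate delivery: applying it again would double-count.
    }
    this.seen.add(event.id);

    if (event.type === 'io.eventsourcingdb.library.book-borrowed') {
      this.count += 1;
    }
  }

  total(): number {
    return this.count;
  }
}
```

With a persistent read model, the marker and the update would need to be written in the same transaction; otherwise a crash between the two reintroduces the problem. If event IDs are ordered, a single "last applied" checkpoint can replace the growing set.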

Many Shapes, One Source of Truth

The SQL table isn't wrong. It's just the default that crowds out everything else, and the crowding tends to be invisible until someone asks a question the table can't answer well. At that point, teams pile workarounds onto the table instead of reaching for a different shape. Stored procedures. LIKE '%...%' queries. Recursive CTEs that nobody wants to debug. Each of these is a sign that the projection has been asked to do a job it wasn't designed for.

When events are the source of truth and projections are cheap, the design conversation changes. You stop arguing about which database to standardize on for read models. You start asking what shape each query wants, and you pick the answer per query. Some of them want a PostgreSQL table. Some want an Elasticsearch index. Some want a graph, a vector store, a JSON file, or nothing more than a map in memory. The events don't care. The events are the same.

If you'd like to go deeper on how to think about read-model design across these shapes, the Designing Read Models guide in our documentation works through the trade-offs in more detail.

And if you'd like to talk through which shapes your current read models are quietly being forced into, we'd love to hear from you at hello@thenativeweb.io.