# Materialized Views — Precomputed Query Results for Fast Reads at Scale
## What is a materialized view?
A materialized view is a precomputed query result stored as a physical table. Unlike a regular (virtual) view that re-executes the query on every access, a materialized view holds the data on disk and serves reads directly — no computation required at query time.
The tradeoff: you pay storage and maintenance cost in exchange for dramatically faster reads.
## Materialized views vs virtual views
| Aspect | Virtual view | Materialized view |
|---|---|---|
| Storage | None — query runs each time | Physical table on disk |
| Read latency | Depends on query complexity | Near-instant (table scan) |
| Staleness | Always fresh | Can be stale until refreshed |
| Write overhead | None | Maintenance cost on source changes |
| Use case | Abstraction, access control | Dashboards, analytics, read-heavy APIs |
Virtual views are aliases for queries. Materialized views are caches with structure.
## Full refresh vs incremental maintenance

### Full refresh

The simplest approach: discard the materialized view's contents and recompute everything from scratch.

```sql
REFRESH MATERIALIZED VIEW monthly_revenue;
```

This works for small datasets or infrequent updates. For a 100-million-row aggregation, a full refresh can take minutes, and in PostgreSQL the plain form locks the view and blocks reads for the duration; `REFRESH MATERIALIZED VIEW CONCURRENTLY` avoids blocking readers, but requires a unique index on the view and does more work per refresh.
### Incremental maintenance
Instead of recomputing everything, incremental maintenance applies only the delta — the rows that changed since the last refresh.
The database tracks insertions, updates, and deletions on source tables. When a refresh is triggered, it:
- Identifies changed rows using change tracking or triggers
- Computes the effect of those changes on the materialized result
- Applies the delta to the existing materialized data
Incremental maintenance is orders of magnitude faster for large datasets with small change rates. Oracle and SQL Server have supported it for decades. PostgreSQL does not natively support incremental refresh — you need extensions like pg_ivm or application-level logic.
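The delta-application step above can be sketched in a few lines. This is an illustrative in-memory model of incrementally maintaining a `SUM(amount) GROUP BY customer_id` aggregate, not any particular database's implementation; the `apply_delta` function and the change-event shapes are assumptions for the example.

```python
# Sketch: incrementally maintaining SUM(amount) GROUP BY customer_id.
# The change-event format here is illustrative, not a real engine's API.

def apply_delta(view: dict, change: dict) -> None:
    """Apply one source-table change to the materialized sums in place."""
    op = change["op"]
    if op == "insert":
        row = change["row"]
        view[row["customer_id"]] = view.get(row["customer_id"], 0) + row["amount"]
    elif op == "delete":
        row = change["row"]
        view[row["customer_id"]] -= row["amount"]
    elif op == "update":
        # An update is a delete of the old row plus an insert of the new one.
        old, new = change["old"], change["new"]
        view[old["customer_id"]] -= old["amount"]
        view[new["customer_id"]] = view.get(new["customer_id"], 0) + new["amount"]

view = {}
changes = [
    {"op": "insert", "row": {"customer_id": "c1", "amount": 100}},
    {"op": "insert", "row": {"customer_id": "c1", "amount": 50}},
    {"op": "update", "old": {"customer_id": "c1", "amount": 50},
                     "new": {"customer_id": "c1", "amount": 75}},
]
for change in changes:
    apply_delta(view, change)

print(view)  # {'c1': 175}
```

Three changes touch the view three times; a full refresh would rescan every source row. That gap is where the "orders of magnitude" savings come from.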
## CQRS and read models
Command Query Responsibility Segregation (CQRS) separates write operations from read operations. The write side processes commands and emits events. The read side consumes those events and builds optimized read models — which are, conceptually, materialized views.
### How CQRS read models work
- A command modifies the write model (normalized, optimized for writes)
- An event is published describing the change
- A read-model projector consumes the event and updates a denormalized read store
- Queries hit the read store directly — no joins, no aggregations at query time
Each read model is tailored to a specific query pattern. An e-commerce system might maintain separate read models for product listings, order histories, and inventory dashboards — each shaped differently from the same source events.
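The projector step can be sketched as a function that folds events into a denormalized store. The event types (`OrderPlaced`, `OrderShipped`) and field names below are hypothetical, chosen only to illustrate the shape of a projector.

```python
# Sketch: a read-model projector folding order events into a
# denormalized per-customer order history. Event names are illustrative.

read_model = {}  # customer_id -> list of order summaries

def project(event: dict) -> None:
    if event["type"] == "OrderPlaced":
        read_model.setdefault(event["customer_id"], []).append(
            {"order_id": event["order_id"], "status": "placed",
             "total": event["total"]}
        )
    elif event["type"] == "OrderShipped":
        for order in read_model.get(event["customer_id"], []):
            if order["order_id"] == event["order_id"]:
                order["status"] = "shipped"

events = [
    {"type": "OrderPlaced", "customer_id": "c1", "order_id": "o1", "total": 42},
    {"type": "OrderShipped", "customer_id": "c1", "order_id": "o1"},
]
for e in events:
    project(e)

# Queries hit the denormalized model directly: no joins at query time.
print(read_model["c1"])
```

A real projector would persist the model (e.g., to Redis or Elasticsearch) and track its position in the event stream so it can resume after a restart.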
### Benefits of CQRS read models
- Independent scaling — read and write stores can scale separately
- Query-optimized shapes — each read model matches its access pattern exactly
- Technology freedom — the read store can be Elasticsearch, Redis, DynamoDB, or anything else
- Eventual consistency — the tradeoff; read models lag behind writes by the event processing delay
## Denormalized views
Denormalization precomputes joins by embedding related data directly into the materialized result. Instead of joining orders, customers, and products at query time, the denormalized view stores a flat row with all fields.
Common denormalization patterns:
- Embedded references — store the customer name and email alongside each order row
- Pre-aggregated summaries — store daily totals instead of computing from individual transactions
- Nested documents — in document stores, embed related arrays (e.g., order items inside the order)
The cost is data duplication and increased write amplification. Every source change may require updating multiple denormalized rows.
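The write-amplification cost is easy to see in a toy model. This sketch (all table and field names hypothetical) shows the embedded-references pattern: each order row copies the customer's name, so one customer rename fans out to every copy.

```python
# Sketch: embedded references trade query-time joins for write amplification.
# Each denormalized order row copies the customer's name at write time.

customers = {"c1": {"name": "Ada"}}
orders = [
    {"order_id": "o1", "customer_id": "c1", "customer_name": "Ada", "total": 10},
    {"order_id": "o2", "customer_id": "c1", "customer_name": "Ada", "total": 20},
]

def rename_customer(customer_id: str, new_name: str) -> int:
    """One logical change fans out to every denormalized row."""
    customers[customer_id]["name"] = new_name
    touched = 0
    for row in orders:
        if row["customer_id"] == customer_id:
            row["customer_name"] = new_name
            touched += 1
    return touched

updated = rename_customer("c1", "Ada L.")
print(updated)  # 2 denormalized rows rewritten for one source change
```

With two orders the fan-out is harmless; with millions of rows per customer, the same rename becomes a bulk rewrite, which is why hot, frequently updated fields are poor candidates for embedding.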
## Streaming materialized views

Traditional materialized views are refreshed on a schedule or on demand. Streaming materialized views update continuously as source data changes, so staleness is bounded by the pipeline's propagation delay rather than by a refresh interval.
### ksqlDB

ksqlDB creates materialized views over Kafka topics using SQL syntax:

```sql
CREATE TABLE order_totals AS
  SELECT customer_id, SUM(amount) AS total_spent
  FROM orders_stream
  GROUP BY customer_id
  EMIT CHANGES;
```
The order_totals table updates in real time as new events arrive on the orders_stream topic. Applications query it via pull queries for point lookups or push queries for streaming results.
### Materialize
Materialize is a streaming database that maintains SQL materialized views over change streams (Kafka, PostgreSQL CDC, S3). It uses differential dataflow to incrementally maintain complex queries including joins, aggregations, and temporal windows.
```sql
CREATE MATERIALIZED VIEW active_subscriptions AS
  SELECT customer_id, plan, started_at
  FROM subscriptions
  WHERE canceled_at IS NULL;
```
Unlike batch systems, Materialize updates the view within milliseconds of the source change. This makes it suitable for operational dashboards, real-time analytics, and serving live API responses.
### Apache Flink materialized views

Flink SQL supports defining dynamic tables over streaming sources. Combined with a state backend such as RocksDB, Flink maintains materialized aggregations continuously; applications typically consume them by sinking the results to an external store that serves queries.
## Cache as a materialized view
Martin Kleppmann argues that a cache is just an informal materialized view. Both store precomputed results derived from source data. The difference is that caches are typically managed by application code with ad-hoc invalidation, while materialized views are maintained by a system with well-defined consistency guarantees.
Treating your cache as a materialized view means:
- Derive it from a changelog — subscribe to database change events (CDC) and update the cache deterministically
- Make it rebuildable — the cache can be reconstructed from the event log at any time
- Eliminate manual invalidation — the event stream tells you exactly what changed and when
- Version it — include a sequence number so consumers know how fresh the cache is
This pattern eliminates the classic cache invalidation problem by turning it into a stream processing problem.
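The four properties above can be demonstrated with a minimal sketch: a cache built as a pure function of a changelog. The entry shape (`seq`, `op`, `key`, `value`) is an assumption for illustration, but the key idea holds: replaying the same log always rebuilds the same cache, so there is nothing to invalidate by hand.

```python
# Sketch: a cache maintained as a materialized view over a changelog.
# Replaying the same log deterministically rebuilds the same cache.
# The changelog entry shape is illustrative.

def build_cache(changelog: list) -> dict:
    cache = {"_version": 0}
    for entry in changelog:
        cache["_version"] = entry["seq"]  # consumers can check freshness
        if entry["op"] == "upsert":
            cache[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            cache.pop(entry["key"], None)
    return cache

changelog = [
    {"seq": 1, "op": "upsert", "key": "user:1", "value": {"name": "Ada"}},
    {"seq": 2, "op": "upsert", "key": "user:2", "value": {"name": "Lin"}},
    {"seq": 3, "op": "delete", "key": "user:2"},
]

cache = build_cache(changelog)
assert cache == build_cache(changelog)  # rebuildable: same log, same cache
print(cache)  # {'_version': 3, 'user:1': {'name': 'Ada'}}
```

In production the changelog would be a CDC stream or Kafka topic and the cache a store like Redis, but the determinism and rebuildability properties are the same.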
## Event-driven refresh strategies
### Change Data Capture (CDC)
CDC captures row-level changes from the database transaction log and publishes them as events. Tools like Debezium stream these changes to Kafka, where downstream consumers update materialized views.
This is the most reliable refresh mechanism because it captures every change, including those made by direct SQL or background jobs that bypass the application layer.
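A consumer of these change events is a small state machine. Debezium envelopes carry an op code — `c` (create), `u` (update), `d` (delete), `r` (snapshot read) — plus `before`/`after` row images; the sketch below applies simplified events of that shape to a keyed in-memory view (the `id` key field and event payloads are assumptions for the example).

```python
# Sketch: applying Debezium-style change events to a keyed view.
# Debezium op codes: "c" create, "u" update, "d" delete, "r" snapshot read;
# envelopes carry "before"/"after" row images. Payloads here are simplified.

def apply_cdc(view: dict, event: dict) -> None:
    op, before, after = event["op"], event.get("before"), event.get("after")
    if op in ("c", "r", "u"):   # create, snapshot read, update: take "after"
        view[after["id"]] = after
    elif op == "d":             # delete: drop the row by its old key
        del view[before["id"]]

view = {}
events = [
    {"op": "c", "after": {"id": 1, "email": "a@example.com"}},
    {"op": "u", "before": {"id": 1, "email": "a@example.com"},
                "after": {"id": 1, "email": "ada@example.com"}},
    {"op": "d", "before": {"id": 1, "email": "ada@example.com"}},
]
for e in events:
    apply_cdc(view, e)

print(view)  # {} -- the row was created, updated, then deleted
```

Because the transaction log captures every change regardless of origin, a consumer like this sees updates made by direct SQL and background jobs, not just application writes.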
### Trigger-based refresh
Database triggers fire on INSERT, UPDATE, or DELETE and update the materialized view synchronously. This guarantees zero lag but adds latency to every write operation.
### Scheduled refresh
A cron job or scheduler refreshes the view at fixed intervals. Simple and predictable, but the view can be stale between refreshes. Suitable for dashboards and reports where minute-level freshness is acceptable.
### Hybrid approach
Many production systems combine strategies:
- CDC for near-real-time updates to critical read models
- Scheduled full refresh as a consistency safety net (catches any missed events)
- On-demand refresh for ad-hoc analytical queries
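The "consistency safety net" can be sketched as a reconciliation pass: recompute the view from source rows, diff it against the incrementally maintained copy, and overwrite any drift. The function names and row shapes are illustrative, not a specific tool's API.

```python
# Sketch: scheduled reconciliation as a safety net for an incrementally
# maintained per-customer sum. Names and row shapes are illustrative.

def recompute(source_rows: list) -> dict:
    """Ground truth: full recomputation from the source table."""
    totals = {}
    for row in source_rows:
        cid = row["customer_id"]
        totals[cid] = totals.get(cid, 0) + row["amount"]
    return totals

def reconcile(view: dict, source_rows: list) -> list:
    """Overwrite stale entries; return which keys had drifted."""
    truth = recompute(source_rows)
    drifted = [k for k in set(view) | set(truth) if view.get(k) != truth.get(k)]
    view.clear()
    view.update(truth)
    return sorted(drifted)

source = [{"customer_id": "c1", "amount": 100},
          {"customer_id": "c2", "amount": 5}]
view = {"c1": 100, "c2": 7}  # c2 drifted, e.g. from a missed event
fixed = reconcile(view, source)
print(fixed, view)  # ['c2'] {'c1': 100, 'c2': 5}
```

Logging the drifted keys is useful in practice: frequent drift usually points at a bug in the incremental path, not just bad luck.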
## When materialized views break down
- High write volume with complex joins — maintenance cost can exceed the read savings
- Frequent schema changes — every source schema change requires updating the view definition and backfilling
- Strong consistency requirements — materialized views are inherently eventually consistent (except synchronous trigger-based approaches)
- Storage constraints — denormalized views can be significantly larger than the source data
## Visualize materialized views in your architecture
On Codelit, generate a CQRS system or a streaming pipeline with ksqlDB to see how materialized views connect to source data, event streams, and query endpoints. Click on any view to explore its refresh strategy and data flow.
This is article #232 in the Codelit engineering blog series.