Caching Strategies for System Design: Patterns, Layers & Invalidation
Every high-traffic system relies on caching. Without it, databases buckle under repeated identical queries, APIs slow to a crawl, and users leave. In this guide we cover the patterns, layers, invalidation strategies, and tools you need to design a robust caching architecture.
Why Cache?#
Caching stores computed or fetched results closer to the consumer so subsequent requests skip expensive work. The benefits are straightforward:
- Latency reduction — serving from memory is orders of magnitude faster than disk or network I/O.
- Throughput increase — fewer requests reach origin servers, freeing capacity.
- Cost savings — less compute, fewer database connections, lower cloud bills.
- Resilience — stale cache can serve traffic during origin outages.
Cache Patterns#
Cache-Aside (Lazy Loading)#
The application checks the cache first. On a miss it reads from the database, then populates the cache.
```python
def get_user(user_id):
    cached = redis.get(f"user:{user_id}")
    if cached:
        return deserialize(cached)
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    redis.setex(f"user:{user_id}", TTL_SECONDS, serialize(user))
    return user
```
Pros: Only requested data is cached. Cons: First request is always slow; risk of stale data.
Read-Through#
The cache sits in front of the database. On a miss the cache itself fetches from the origin and stores the result — the application only talks to the cache.
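A minimal in-process sketch of the read-through contract, with a plain dict standing in for the cache store and an illustrative `loader` callback standing in for the database fetch (names are ours, not any library's API):

```python
import time

class ReadThroughCache:
    """Read-through sketch: the cache itself owns origin access,
    so callers only ever talk to the cache."""

    def __init__(self, loader, ttl_seconds=60):
        self._loader = loader
        self._ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                          # hit: serve from memory
        value = self._loader(key)                    # miss: cache fetches from origin
        self._store[key] = (value, time.time() + self._ttl)
        return value

cache = ReadThroughCache(loader=lambda key: f"row-for-{key}")
cache.get("user:1")  # miss: loader runs, result cached
cache.get("user:1")  # hit: loader is not called again
```

The key difference from cache-aside is who does the fetching: here the application never sees the database, which centralizes caching logic but couples you to the cache's loading behavior.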
Write-Through#
Every write goes to the cache and the database in the same operation. Data is always fresh in cache but writes are slower because both stores must acknowledge.
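The write path can be sketched in a few lines; a dict stands in for the database, and both stores are updated synchronously in the same call (the class and method names are illustrative):

```python
class WriteThroughCache:
    """Write-through sketch: every write hits the backing store and
    the cache in the same operation, so reads never see stale data."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def put(self, key, value):
        self.db[key] = value      # synchronous write to the source of truth
        self.cache[key] = value   # cache updated in the same operation

    def get(self, key):
        return self.cache.get(key, self.db.get(key))
```

In a real system the two writes should be ordered carefully (database first, then cache) so a cache write never outlives a failed database write.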
Write-Behind (Write-Back)#
Writes go to the cache immediately; the cache asynchronously flushes to the database. This lowers write latency but introduces a durability risk — if the cache node dies before flushing, data is lost.
┌────────┐ write ┌───────┐ async flush ┌──────────┐
│ App │───────▶│ Cache │─────────────▶│ Database │
└────────┘ └───────┘ └──────────┘
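The flow above can be sketched with a dict for each store and a queue of pending writes; in production the `flush` step would run on a background thread or worker rather than being called by hand:

```python
from collections import deque

class WriteBehindCache:
    """Write-behind sketch: writes land in the cache immediately and
    are flushed to the database later. Anything still in the queue
    when the process dies is lost -- the durability risk noted above."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self._dirty = deque()     # pending (key, value) writes

    def put(self, key, value):
        self.cache[key] = value   # fast path: memory only
        self._dirty.append((key, value))

    def flush(self):
        while self._dirty:        # drain pending writes to the database
            key, value = self._dirty.popleft()
            self.db[key] = value
```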
Cache Layers#
A production system typically stacks several cache layers:
Browser Cache ──▶ CDN (Edge) ──▶ App-Level Cache ──▶ DB Query Cache
(local) (CloudFront) (Redis/Memcached) (MySQL QC)
| Layer | Controlled By | Typical TTL | Best For |
|---|---|---|---|
| Browser | Cache-Control headers | Minutes–days | Static assets, API responses |
| CDN | Edge rules (CloudFront, Cloudflare) | Seconds–hours | Images, HTML, API at edge |
| Application | Code (Redis, Memcached) | Seconds–hours | DB query results, computed data |
| Database | DB config | Automatic | Repeated identical queries |
CDN Caching#
CDN caching is the first line of defense. Set proper Cache-Control and Surrogate-Control headers:
```nginx
location /api/public/ {
    add_header Cache-Control "public, max-age=60, stale-while-revalidate=30";
    add_header Surrogate-Control "max-age=300";
}
```
Redis Caching#
Redis is the most popular application-level cache. Key features that matter for system design:
- Data structures — strings, hashes, sorted sets, streams.
- Eviction policies — `allkeys-lru`, `volatile-ttl`, `noeviction`.
- Cluster mode — horizontal sharding across nodes.
- Pub/Sub — useful for event-driven invalidation.
```python
# Redis hash for structured cache entries
redis.hset("product:42", mapping={
    "name": "Widget",
    "price": "29.99",
    "stock": "150",
})
redis.expire("product:42", 3600)
```
Cache Invalidation Strategies#
Phil Karlton famously said there are only two hard problems in computer science: cache invalidation and naming things. Here are three proven approaches.
TTL-Based Expiry#
Set a time-to-live on every key. Simple and predictable, but data can be stale up to the full TTL window.
Event-Driven Invalidation#
When the source of truth changes, publish an event that triggers cache deletion.
```python
def update_product(product_id, data):
    db.update("products", product_id, data)
    redis.delete(f"product:{product_id}")             # immediate invalidation
    event_bus.publish("product.updated", product_id)  # notify other services
```
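On the consuming side, each service holding a copy of the data subscribes to the event and drops its own key. The sketch below uses a tiny in-process bus as a stand-in for a real broker (Redis Pub/Sub, Kafka, etc.); the class and topic names are illustrative:

```python
from collections import defaultdict

class EventBus:
    """In-process stand-in for a message broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in self._handlers[topic]:
            handler(payload)

cache = {"product:42": {"name": "Widget"}}
bus = EventBus()

# Each subscriber invalidates its own copy when the source of truth changes.
bus.subscribe("product.updated", lambda pid: cache.pop(f"product:{pid}", None))

bus.publish("product.updated", 42)  # cache entry for product:42 is now gone
```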
Versioned Keys#
Embed a version or hash in the cache key. When data changes, increment the version — old keys simply expire naturally.
```python
version = db.get_version("config")
cache_key = f"config:v{version}"
```
Cache Stampede Prevention#
When a popular key expires, hundreds of concurrent requests may all miss and hit the database simultaneously — a cache stampede (or thundering herd).
Locking#
Only one request fetches from the origin; the rest wait or serve stale data.
```python
import time

def get_with_lock(key, fetch_fn, ttl=60):
    value = redis.get(key)
    if value:
        return deserialize(value)
    lock_key = f"lock:{key}"
    if redis.set(lock_key, "1", nx=True, ex=5):  # acquire lock, expires in 5 s
        try:
            value = fetch_fn()
            redis.setex(key, ttl, serialize(value))
        finally:
            redis.delete(lock_key)               # release even if fetch fails
        return value
    time.sleep(0.05)                             # brief back-off
    return get_with_lock(key, fetch_fn, ttl)     # retry
```
Probabilistic Early Expiry#
Each reader has a small random chance of refreshing the cache before TTL expires, spreading recomputation over time instead of concentrating it at expiry.
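One well-known formulation is XFetch (probabilistic early recomputation): the closer a key is to expiry, the likelier a reader refreshes it ahead of time. A sketch, where `delta` is how long recomputation takes and `beta` tunes eagerness (the function name is ours):

```python
import math
import random
import time

def should_refresh_early(expires_at, delta, beta=1.0):
    """XFetch-style check: refresh early with a probability that grows
    as the key approaches expiry. -log(random()) adds a positive jitter
    scaled by the cost of recomputation."""
    return time.time() - delta * beta * math.log(random.random()) >= expires_at
```

A reader that gets `True` recomputes the value and resets the TTL; everyone else keeps serving the cached copy, so recomputation is spread across the TTL window instead of piling up at expiry.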
Stale-While-Revalidate#
Serve the stale value immediately while refreshing in the background. This works at the HTTP layer (stale-while-revalidate directive) and can be implemented in application code.
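An application-level sketch, assuming a background thread is acceptable for the refresh and using an illustrative `fetch_fn` in place of the origin call:

```python
import threading
import time

class SWRCache:
    """Stale-while-revalidate sketch: a stale hit is returned immediately
    while a daemon thread refreshes the entry behind the scenes."""

    def __init__(self, fetch_fn, ttl_seconds):
        self._fetch = fetch_fn
        self._ttl = ttl_seconds
        self._store = {}          # key -> (value, fresh_until)
        self._lock = threading.Lock()

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:                             # cold miss: fetch inline
            return self._refresh(key)
        value, fresh_until = entry
        if time.time() >= fresh_until:                # stale: refresh in background
            threading.Thread(target=self._refresh, args=(key,), daemon=True).start()
        return value                                  # always answer immediately

    def _refresh(self, key):
        value = self._fetch(key)
        with self._lock:
            self._store[key] = (value, time.time() + self._ttl)
        return value
```

Only the very first request for a key ever waits on the origin; every later request gets an immediate answer, at the cost of briefly serving stale data.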
Consistency Patterns#
- Strong consistency — write-through with synchronous invalidation. Slower but safe for financial data.
- Eventual consistency — write-behind with TTL or event-driven invalidation. Good for product catalogs, feeds, analytics.
- Read-your-writes — after a user writes, route their reads to the primary or bypass cache briefly to avoid showing stale self-data.
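The read-your-writes pattern can be sketched by tracking each user's recent writes and bypassing the cache inside a short window; the window length, function names, and dict-based stores below are all illustrative:

```python
import time

RECENT_WRITE_WINDOW = 5  # seconds to bypass cache after a user's own write
_recent_writes = {}      # (user_id, key) -> timestamp of last write

def record_write(user_id, key):
    _recent_writes[(user_id, key)] = time.time()

def get_profile(user_id, key, cache, db):
    """A user who just wrote `key` reads from the primary for a short
    window, so they never see their own stale data; everyone else
    still gets the fast cached path."""
    wrote_at = _recent_writes.get((user_id, key))
    if wrote_at and time.time() - wrote_at < RECENT_WRITE_WINDOW:
        return db[key]                     # bypass cache: source of truth
    return cache.get(key, db.get(key))     # normal cached read
```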
Tools at a Glance#
| Tool | Type | Use Case |
|---|---|---|
| Redis | In-memory store | App cache, sessions, rate limiting |
| Memcached | In-memory store | Simple key-value caching at scale |
| Varnish | HTTP accelerator | Reverse-proxy cache for web traffic |
| CloudFront | CDN | Edge caching for global distribution |
| Cloudflare | CDN + WAF | Edge cache with DDoS protection |
Key Takeaways#
- Layer your caches — browser, CDN, application, database.
- Pick the right pattern — cache-aside for reads, write-behind for write-heavy workloads.
- Plan invalidation from day one — TTL as a baseline, events for precision.
- Guard against stampedes — locking or probabilistic early refresh.
- Match consistency to the domain — strong for money, eventual for content.