Caching Strategies for System Design: Patterns, Layers & Invalidation
Every high-traffic system relies on caching. Without it, databases buckle under repeated identical queries, APIs slow to a crawl, and users leave. In this guide we cover the patterns, layers, invalidation strategies, and tools you need to design a robust caching architecture.
Why Cache?#
Caching stores computed or fetched results closer to the consumer so subsequent requests skip expensive work. The benefits are straightforward:
- Latency reduction — serving from memory is orders of magnitude faster than disk or network I/O.
- Throughput increase — fewer requests reach origin servers, freeing capacity.
- Cost savings — less compute, fewer database connections, lower cloud bills.
- Resilience — stale cache can serve traffic during origin outages.
Cache Patterns#
Cache-Aside (Lazy Loading)#
The application checks the cache first. On a miss it reads from the database, then populates the cache.
```python
def get_user(user_id):
    cached = redis.get(f"user:{user_id}")
    if cached:
        return deserialize(cached)
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    redis.setex(f"user:{user_id}", TTL_SECONDS, serialize(user))
    return user
```
Pros: Only requested data is cached. Cons: First request is always slow; risk of stale data.
Read-Through#
The cache sits in front of the database. On a miss the cache itself fetches from the origin and stores the result — the application only talks to the cache.
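A minimal in-process sketch of the read-through contract, with a plain dict standing in for the cache store and an illustrative `loader` callback standing in for the database fetch (names are ours, not any library's API):

```python
import time

class ReadThroughCache:
    """Read-through sketch: the cache itself owns origin access,
    so callers only ever talk to the cache."""

    def __init__(self, loader, ttl_seconds=60):
        self._loader = loader
        self._ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                          # hit: serve from memory
        value = self._loader(key)                    # miss: cache fetches from origin
        self._store[key] = (value, time.time() + self._ttl)
        return value

cache = ReadThroughCache(loader=lambda key: f"row-for-{key}")
cache.get("user:1")  # miss: loader runs, result cached
cache.get("user:1")  # hit: loader is not called again
```

The key difference from cache-aside is who does the fetching: here the application never sees the database, which centralizes caching logic but couples you to the cache's loading behavior.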
Write-Through#
Every write goes to the cache and the database in the same operation. Data is always fresh in cache but writes are slower because both stores must acknowledge.
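The write path can be sketched in a few lines; a dict stands in for the database, and both stores are updated synchronously in the same call (the class and method names are illustrative):

```python
class WriteThroughCache:
    """Write-through sketch: every write hits the backing store and
    the cache in the same operation, so reads never see stale data."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def put(self, key, value):
        self.db[key] = value      # synchronous write to the source of truth
        self.cache[key] = value   # cache updated in the same operation

    def get(self, key):
        return self.cache.get(key, self.db.get(key))
```

In a real system the two writes should be ordered carefully (database first, then cache) so a cache write never outlives a failed database write.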
Write-Behind (Write-Back)#
Writes go to the cache immediately; the cache asynchronously flushes to the database. This lowers write latency but introduces a durability risk — if the cache node dies before flushing, data is lost.
┌────────┐ write ┌───────┐ async flush ┌──────────┐
│ App │───────▶│ Cache │─────────────▶│ Database │
└────────┘ └───────┘ └──────────┘
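The flow above can be sketched with a dict for each store and a queue of pending writes; in production the `flush` step would run on a background thread or worker rather than being called by hand:

```python
from collections import deque

class WriteBehindCache:
    """Write-behind sketch: writes land in the cache immediately and
    are flushed to the database later. Anything still in the queue
    when the process dies is lost -- the durability risk noted above."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self._dirty = deque()     # pending (key, value) writes

    def put(self, key, value):
        self.cache[key] = value   # fast path: memory only
        self._dirty.append((key, value))

    def flush(self):
        while self._dirty:        # drain pending writes to the database
            key, value = self._dirty.popleft()
            self.db[key] = value
```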
Cache Layers#
A production system typically stacks several cache layers:
Browser Cache ──▶ CDN (Edge) ──▶ App-Level Cache ──▶ DB Query Cache
(local) (CloudFront) (Redis/Memcached) (MySQL QC)
| Layer | Controlled By | Typical TTL | Best For |
|---|---|---|---|
| Browser | Cache-Control headers | Minutes–days | Static assets, API responses |
| CDN | Edge rules (CloudFront, Cloudflare) | Seconds–hours | Images, HTML, API at edge |
| Application | Code (Redis, Memcached) | Seconds–hours | DB query results, computed data |
| Database | DB config | Automatic | Repeated identical queries |
CDN Caching#
CDN caching is the first line of defense. Set proper Cache-Control and Surrogate-Control headers:
```nginx
location /api/public/ {
    add_header Cache-Control "public, max-age=60, stale-while-revalidate=30";
    add_header Surrogate-Control "max-age=300";
}
```
Redis Caching#
Redis is the most popular application-level cache. Key features that matter for system design:
- Data structures — strings, hashes, sorted sets, streams.
- Eviction policies — `allkeys-lru`, `volatile-ttl`, `noeviction`.
- Cluster mode — horizontal sharding across nodes.
- Pub/Sub — useful for event-driven invalidation.
```python
# Redis hash for structured cache entries
redis.hset("product:42", mapping={
    "name": "Widget",
    "price": "29.99",
    "stock": "150",
})
redis.expire("product:42", 3600)
```
Cache Invalidation Strategies#
Phil Karlton famously said there are only two hard problems in computer science: cache invalidation and naming things. Here are three proven approaches.
TTL-Based Expiry#
Set a time-to-live on every key. Simple and predictable, but data can be stale up to the full TTL window.
Event-Driven Invalidation#
When the source of truth changes, publish an event that triggers cache deletion.
```python
def update_product(product_id, data):
    db.update("products", product_id, data)
    redis.delete(f"product:{product_id}")             # immediate invalidation
    event_bus.publish("product.updated", product_id)  # notify other services
```
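On the consuming side, each service holding a copy of the data subscribes to the event and drops its own key. The sketch below uses a tiny in-process bus as a stand-in for a real broker (Redis Pub/Sub, Kafka, etc.); the class and topic names are illustrative:

```python
from collections import defaultdict

class EventBus:
    """In-process stand-in for a message broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in self._handlers[topic]:
            handler(payload)

cache = {"product:42": {"name": "Widget"}}
bus = EventBus()

# Each subscriber invalidates its own copy when the source of truth changes.
bus.subscribe("product.updated", lambda pid: cache.pop(f"product:{pid}", None))

bus.publish("product.updated", 42)  # cache entry for product:42 is now gone
```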
Versioned Keys#
Embed a version or hash in the cache key. When data changes, increment the version — old keys simply expire naturally.
```python
version = db.get_version("config")
cache_key = f"config:v{version}"
```
Cache Stampede Prevention#
When a popular key expires, hundreds of concurrent requests may all miss and hit the database simultaneously — a cache stampede (or thundering herd).
Locking#
Only one request fetches from the origin; the rest wait or serve stale data.
```python
import time

def get_with_lock(key, fetch_fn, ttl=60):
    value = redis.get(key)
    if value:
        return deserialize(value)
    lock_key = f"lock:{key}"
    if redis.set(lock_key, "1", nx=True, ex=5):  # acquire lock, expires in 5 s
        try:
            value = fetch_fn()
            redis.setex(key, ttl, serialize(value))
        finally:
            redis.delete(lock_key)               # release even if fetch fails
        return value
    time.sleep(0.05)                             # brief back-off
    return get_with_lock(key, fetch_fn, ttl)     # retry
```
Probabilistic Early Expiry#
Each reader has a small random chance of refreshing the cache before TTL expires, spreading recomputation over time instead of concentrating it at expiry.
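One well-known formulation is XFetch (probabilistic early recomputation): the closer a key is to expiry, the likelier a reader refreshes it ahead of time. A sketch, where `delta` is how long recomputation takes and `beta` tunes eagerness (the function name is ours):

```python
import math
import random
import time

def should_refresh_early(expires_at, delta, beta=1.0):
    """XFetch-style check: refresh early with a probability that grows
    as the key approaches expiry. -log(random()) adds a positive jitter
    scaled by the cost of recomputation."""
    return time.time() - delta * beta * math.log(random.random()) >= expires_at
```

A reader that gets `True` recomputes the value and resets the TTL; everyone else keeps serving the cached copy, so recomputation is spread across the TTL window instead of piling up at expiry.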
Stale-While-Revalidate#
Serve the stale value immediately while refreshing in the background. This works at the HTTP layer (stale-while-revalidate directive) and can be implemented in application code.
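An application-level sketch, assuming a background thread is acceptable for the refresh and using an illustrative `fetch_fn` in place of the origin call:

```python
import threading
import time

class SWRCache:
    """Stale-while-revalidate sketch: a stale hit is returned immediately
    while a daemon thread refreshes the entry behind the scenes."""

    def __init__(self, fetch_fn, ttl_seconds):
        self._fetch = fetch_fn
        self._ttl = ttl_seconds
        self._store = {}          # key -> (value, fresh_until)
        self._lock = threading.Lock()

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:                             # cold miss: fetch inline
            return self._refresh(key)
        value, fresh_until = entry
        if time.time() >= fresh_until:                # stale: refresh in background
            threading.Thread(target=self._refresh, args=(key,), daemon=True).start()
        return value                                  # always answer immediately

    def _refresh(self, key):
        value = self._fetch(key)
        with self._lock:
            self._store[key] = (value, time.time() + self._ttl)
        return value
```

Only the very first request for a key ever waits on the origin; every later request gets an immediate answer, at the cost of briefly serving stale data.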
Consistency Patterns#
- Strong consistency — write-through with synchronous invalidation. Slower but safe for financial data.
- Eventual consistency — write-behind with TTL or event-driven invalidation. Good for product catalogs, feeds, analytics.
- Read-your-writes — after a user writes, route their reads to the primary or bypass cache briefly to avoid showing stale self-data.
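The read-your-writes pattern can be sketched by tracking each user's recent writes and bypassing the cache inside a short window; the window length, function names, and dict-based stores below are all illustrative:

```python
import time

RECENT_WRITE_WINDOW = 5  # seconds to bypass cache after a user's own write
_recent_writes = {}      # (user_id, key) -> timestamp of last write

def record_write(user_id, key):
    _recent_writes[(user_id, key)] = time.time()

def get_profile(user_id, key, cache, db):
    """A user who just wrote `key` reads from the primary for a short
    window, so they never see their own stale data; everyone else
    still gets the fast cached path."""
    wrote_at = _recent_writes.get((user_id, key))
    if wrote_at and time.time() - wrote_at < RECENT_WRITE_WINDOW:
        return db[key]                     # bypass cache: source of truth
    return cache.get(key, db.get(key))     # normal cached read
```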
Tools at a Glance#
| Tool | Type | Use Case |
|---|---|---|
| Redis | In-memory store | App cache, sessions, rate limiting |
| Memcached | In-memory store | Simple key-value caching at scale |
| Varnish | HTTP accelerator | Reverse-proxy cache for web traffic |
| CloudFront | CDN | Edge caching for global distribution |
| Cloudflare | CDN + WAF | Edge cache with DDoS protection |
Key Takeaways#
- Layer your caches — browser, CDN, application, database.
- Pick the right pattern — cache-aside for reads, write-behind for write-heavy workloads.
- Plan invalidation from day one — TTL as a baseline, events for precision.
- Guard against stampedes — locking or probabilistic early refresh.
- Match consistency to the domain — strong for money, eventual for content.