# System Design Cheat Sheet: Quick Reference for Interviews and Beyond
System design interviews test your ability to think through trade-offs at scale. This cheat sheet gives you the key numbers, components, patterns, and checklists you need — all in one place.
## Numbers Every Engineer Should Know
These latency and throughput numbers help you make quick back-of-envelope estimates.
| Operation | Latency |
|---|---|
| L1 cache reference | 0.5 ns |
| L2 cache reference | 7 ns |
| Main memory reference | 100 ns |
| SSD random read | 150 µs |
| HDD random read | 10 ms |
| Round trip within same datacenter | 0.5 ms |
| Round trip CA to Netherlands | 150 ms |
Throughput rules of thumb:
- A single server can handle ~10K–50K concurrent connections (with async I/O)
- A single PostgreSQL instance handles ~5K–10K transactions/second
- Redis handles ~100K operations/second on a single node
- A single SSD delivers ~100K–200K random IOPS
- 1 Gbps network = ~125 MB/s throughput
Data size estimates:
- 1 million users with 1 KB profile = 1 GB
- 1 billion rows at 100 bytes each = 100 GB
- 10 million images at 200 KB each = 2 TB
- 1 million daily active users generating 10 requests each = 10M requests/day = ~115 requests/second
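The estimates above reduce to a few lines of arithmetic. A minimal sketch, using the same illustrative figures as the list:

```python
# Back-of-envelope helpers: turn user counts and object sizes into rates and totals.

def requests_per_second(daily_active_users: int, requests_per_user: int) -> float:
    """Average request rate implied by a daily load."""
    seconds_per_day = 24 * 60 * 60  # 86,400
    return daily_active_users * requests_per_user / seconds_per_day

# 1M DAU x 10 requests each = 10M requests/day ~ 115.7 requests/second
avg_rps = requests_per_second(1_000_000, 10)
print(f"{avg_rps:.1f} req/s average")

# 10 million images at 200 KB each = 2 TB (decimal units)
storage_tb = 10_000_000 * 200_000 / 1e12
print(f"{storage_tb:.1f} TB of image storage")
```

In an interview the point is the method, not the precision: round aggressively (86,400 seconds/day becomes 100,000) and keep the arithmetic in powers of ten.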
## Common Components
### Load Balancer (LB)
Distributes traffic across multiple servers. Enables horizontal scaling and eliminates single points of failure.
- L4 (transport): Routes based on IP and port. Fast, no inspection.
- L7 (application): Routes based on HTTP headers, URL, cookies. More flexible.
- Algorithms: Round robin, least connections, weighted, IP hash, consistent hashing.
- Tools: NGINX, HAProxy, AWS ALB/NLB, Envoy.
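Two of the algorithms above can be sketched in a few lines; the server names and connection counts here are made up for illustration:

```python
import itertools

# Toy round-robin and least-connections pickers over a static server list.
servers = ["app-1", "app-2", "app-3"]

rr = itertools.cycle(servers)  # round robin: rotate through servers in order

# Hypothetical count of open connections per server, as an LB might track it.
active = {"app-1": 12, "app-2": 3, "app-3": 7}

def least_connections(active_counts: dict) -> str:
    """Pick the server currently holding the fewest open connections."""
    return min(active_counts, key=active_counts.get)

print(next(rr), next(rr))         # app-1 app-2
print(least_connections(active))  # app-2
```

Round robin assumes servers are interchangeable; least connections adapts when request costs vary, which is why it is a common default for long-lived connections.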
### Cache
Reduces latency and database load by storing frequently accessed data in memory.
- Cache-aside: App checks cache first, loads from DB on miss, writes to cache.
- Write-through: App writes to cache and DB simultaneously.
- Write-behind: App writes to cache; cache async-writes to DB.
- Eviction: LRU (most common), LFU, TTL-based.
- Tools: Redis, Memcached, Varnish (HTTP cache), CDN edge cache.
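Cache-aside, the most common of the strategies above, fits in a dozen lines. Plain dicts stand in for Redis and the database in this sketch:

```python
# Cache-aside: check the cache, fall back to the database on a miss,
# then populate the cache so the next read is a hit.

cache: dict = {}
database = {"user:1": {"name": "Ada"}}  # hypothetical backing store

def get_user(key: str):
    value = cache.get(key)
    if value is not None:        # cache hit: skip the database entirely
        return value
    value = database.get(key)    # cache miss: read the source of truth
    if value is not None:
        cache[key] = value       # populate for subsequent reads
    return value

get_user("user:1")               # miss: loads from DB, fills cache
assert "user:1" in cache         # a second read would now be a hit
```

A real deployment would also set a TTL on the cache write and decide how to invalidate on updates; stale data is the classic cache-aside failure mode.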
### Database (DB)
Choose based on access patterns, consistency needs, and scale requirements.
- Relational (SQL): Strong consistency, ACID, complex queries. PostgreSQL, MySQL.
- Document (NoSQL): Flexible schema, horizontal scale. MongoDB, DynamoDB.
- Wide-column: High write throughput, time-series. Cassandra, ScyllaDB.
- Graph: Relationship-heavy queries. Neo4j, Amazon Neptune.
- Key-value: Simple lookups, extreme speed. Redis, DynamoDB.
### Message Queue
Decouples producers from consumers. Enables async processing and absorbs traffic spikes.
- At-most-once: Fire and forget. Fast but may lose messages.
- At-least-once: Retry until acknowledged. May duplicate.
- Exactly-once: Hardest to achieve. Usually involves idempotency.
- Tools: Kafka (high throughput, log-based), RabbitMQ (flexible routing), SQS (managed), NATS (lightweight).
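At-least-once delivery means the same message can arrive twice, and the standard fix is an idempotent consumer that remembers processed message IDs. A minimal sketch (the message shape is illustrative, not any specific broker's format):

```python
# Idempotent consumer: dedupe on message ID so redeliveries are harmless.
# In production the "seen" set lives in Redis or the database, with a TTL.

processed_ids: set = set()
side_effects: list = []  # stands in for the real work (charges, emails, ...)

def handle(message: dict) -> None:
    msg_id = message["id"]
    if msg_id in processed_ids:           # duplicate delivery: safely ignore
        return
    side_effects.append(message["body"])  # the real work happens once
    processed_ids.add(msg_id)

handle({"id": "m1", "body": "charge $10"})
handle({"id": "m1", "body": "charge $10"})  # redelivered by the broker
assert side_effects == ["charge $10"]       # effect applied exactly once
```

This is why "exactly-once" in practice usually means at-least-once delivery plus idempotent processing, rather than a guarantee from the broker alone.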
### CDN (Content Delivery Network)
Caches static assets at edge locations close to users. Reduces latency for global audiences.
- Cache HTML, CSS, JS, images, videos at edge PoPs
- Typical hit ratio: 90–99% for static content
- Origin shield reduces load on your origin server
- Tools: CloudFront, Cloudflare, Fastly, Akamai
## Step-by-Step Framework
Use this framework for every system design question.
### Step 1: Clarify Requirements (3–5 minutes)
- What are the core features? (functional requirements)
- What are the scale expectations? (users, requests/sec, data volume)
- What are the non-functional requirements? (latency, availability, consistency)
- What is NOT in scope?
### Step 2: Back-of-Envelope Estimation (3–5 minutes)
- Daily active users and requests per user
- Read-to-write ratio
- Storage needs over 5 years
- Bandwidth requirements
- Peak vs average traffic (typically 3–5x average)
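A worked example of this step, with all inputs illustrative (1M DAU, 10 requests per user per day, a 3:1 read-to-write ratio, 1 KB per write, a 4x peak factor):

```python
# Worked Step 2 estimate. Every input below is an assumption you would
# state out loud in the interview, not a fact about any real system.

dau = 1_000_000
req_per_user = 10
write_fraction = 0.25     # 3:1 read-to-write ratio
bytes_per_write = 1_000   # 1 KB per write
peak_factor = 4

avg_rps = dau * req_per_user / 86_400
peak_rps = avg_rps * peak_factor
writes_per_day = dau * req_per_user * write_fraction
storage_5y_gb = writes_per_day * bytes_per_write * 365 * 5 / 1e9

print(f"avg {avg_rps:.0f} req/s, peak {peak_rps:.0f} req/s")
print(f"5-year storage ~ {storage_5y_gb:.0f} GB")
```

Note how small the result is: a few hundred requests per second and a few terabytes over five years fit comfortably on modest hardware, which is itself a useful design conclusion.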
### Step 3: High-Level Design (10–15 minutes)
- Draw the major components: clients, LB, app servers, cache, DB, queue, CDN
- Show the data flow for core operations
- Identify the primary data model and storage choices
- Call out the API endpoints
### Step 4: Deep Dive (10–15 minutes)
- Pick 2–3 components to detail based on the interviewer's interest
- Discuss trade-offs for each decision
- Address bottlenecks and how to mitigate them
- Show how the design handles failure
### Step 5: Wrap Up (3–5 minutes)
- Summarize trade-offs made
- Discuss monitoring and alerting
- Mention future improvements
## Common Patterns Checklist
Use these patterns as building blocks. Check which ones apply to your design.
- Read replicas — scale read-heavy workloads by replicating the database
- Sharding — partition data across multiple databases by a shard key
- CQRS — separate read and write models so each can be optimized independently
- Event sourcing — store state as a sequence of events, not current state
- Saga pattern — manage distributed transactions across services
- Circuit breaker — prevent cascading failures by failing fast
- Bulkhead — isolate components so one failure does not sink the ship
- Consistent hashing — distribute data across nodes with minimal redistribution on changes
- Fan-out on write — precompute feeds/timelines at write time (Twitter model)
- Fan-out on read — compute feeds at read time; avoids wasted precomputation for users who rarely read
- Rate limiting — protect services from abuse (token bucket, sliding window)
- Idempotency — make operations safe to retry without side effects
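As one concrete example from the list, a token-bucket rate limiter fits in a short class. Time is passed in explicitly so the sketch stays deterministic:

```python
# Token bucket: tokens refill at a fixed rate up to a burst capacity;
# each request spends one token or is rejected.

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full
        self.last = 0.0           # timestamp of the last refill

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=2)  # 1 req/s, bursts of 2
print([bucket.allow(t) for t in (0.0, 0.1, 0.2, 1.5)])
# -> [True, True, False, True]: burst of 2, third rejected, refill admits the fourth
```

In a distributed setting the bucket state lives in Redis (often as a Lua script) so all app servers share one limit per client.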
## Scaling Checklist
When the interviewer asks "how would you scale this?", walk through these layers:
Vertical scaling (scale up):
- Bigger machines, more RAM, faster SSDs
- Simple but has a ceiling
Horizontal scaling (scale out):
- Stateless app servers behind a load balancer
- Database read replicas for read-heavy workloads
- Sharding for write-heavy or large datasets
- Cache layer (Redis/Memcached) to reduce DB load
- CDN for static content
- Message queues to decouple and buffer writes
- Async processing for non-critical paths
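The sharding bullet above usually starts with hash-based routing on a shard key. A minimal sketch (the shard names are placeholders), which also shows the drawback that motivates consistent hashing: changing the shard count remaps most keys.

```python
import hashlib

# Route each record to a shard by hashing its shard key modulo the shard count.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(user_id: str) -> str:
    """Deterministically map a user ID to one shard."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The same key always lands on the same shard...
assert shard_for("user-42") == shard_for("user-42")
# ...but adding a fifth shard would change `len(SHARDS)` and move most keys,
# which is exactly the problem consistent hashing solves.
```

Picking the shard key is the real design decision: it should spread load evenly and keep the queries you run most inside a single shard.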
Data layer scaling:
- Connection pooling (PgBouncer, ProxySQL)
- Query optimization and proper indexing
- Denormalization for read performance
- Partitioning large tables by date or range
- Archive cold data to object storage
Infrastructure:
- Multi-region deployment for availability and latency
- Auto-scaling groups based on CPU/memory/request metrics
- Blue-green or canary deployments for safe rollouts
## Monitoring Checklist
Every system design should address observability. Cover these areas:
The Four Golden Signals (Google SRE):
- Latency — response time for successful and failed requests
- Traffic — requests per second, concurrent connections
- Errors — 5xx rate, failed health checks, timeout rate
- Saturation — CPU, memory, disk, connection pool utilization
What to monitor:
- Application metrics: request rate, error rate, p50/p95/p99 latency
- Infrastructure metrics: CPU, memory, disk I/O, network
- Database metrics: query time, connection count, replication lag, cache hit ratio
- Queue metrics: depth, consumer lag, processing time
- Business metrics: signups, orders, revenue per minute
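The p50/p95/p99 latencies above are percentiles over raw samples; a nearest-rank sketch makes the definition concrete (production systems use streaming sketches such as HDRHistogram or t-digest rather than sorting every sample):

```python
# Nearest-rank percentile over a list of latency samples.

def percentile(samples: list, pct: float) -> float:
    ordered = sorted(samples)
    rank = max(0, int(len(ordered) * pct / 100) - 1)
    return ordered[rank]

latencies_ms = list(range(1, 101))   # synthetic samples: 1..100 ms
print(percentile(latencies_ms, 50))  # 50
print(percentile(latencies_ms, 95))  # 95
print(percentile(latencies_ms, 99))  # 99
```

Averages hide tail pain: a 10 ms mean is consistent with a 2-second p99, and p99 is what your unluckiest one-in-a-hundred requests actually experience.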
Alerting principles:
- Alert on symptoms (high error rate), not causes (high CPU)
- Use severity levels: page for critical, ticket for warning
- Avoid alert fatigue — every alert should be actionable
Tools: Prometheus + Grafana, Datadog, New Relic, AWS CloudWatch, PagerDuty for on-call.
## Quick Reference Card
| Requirement | Component |
|---|---|
| Static content | CDN |
| Session/state | Redis |
| Async processing | Message queue (Kafka/SQS) |
| Search | Elasticsearch/OpenSearch |
| File storage | S3/GCS/Blob Storage |
| Real-time updates | WebSocket / SSE |
| Rate limiting | API Gateway / Redis |
| Auth | OAuth2 / JWT at gateway |
| Notifications | Queue + worker + push service |
| Analytics | Event stream + data warehouse |
That wraps up article #280 on Codelit.