Tags: system-design, interview-prep, learning-roadmap, software-architecture, distributed-systems, milestone

The Complete System Design Reference: 350-Article Library, Learning Roadmap & Interview Strategy

March 29, 2026 · 7 min read · By Codelit Team

This is article number 350. What started as a handful of system design explainers has grown into a comprehensive library covering every major area of distributed systems, software architecture, and infrastructure. This capstone article organizes the entire collection into a structured learning roadmap, interview strategy, and practice plan.

The Library at a Glance

350 articles across 10 major categories. Below is a guided tour of the library with top picks from each category.

1. Fundamentals of Distributed Systems

The building blocks: CAP theorem, consistency models, failure modes, and distributed consensus.

CAP Theorem Explained — Consistency, Availability, Partition Tolerance
Consensus Algorithms — Raft, Paxos, and Practical BFT
Distributed Clocks — Lamport Timestamps and Vector Clocks
Failure Detection — Heartbeats, Phi Accrual, and SWIM
Consistency Models — Eventual, Causal, Strong, and Linearizability
The Two Generals Problem and Its Real-World Implications
Split-Brain Resolution Strategies

Start here if you are new to distributed systems. These concepts come up in nearly every system design interview.

2. Databases and Storage

From relational databases to distributed key-value stores, time-series databases, and data lakes.

SQL vs NoSQL — When to Choose What
Database Sharding Strategies — Hash, Range, and Directory-Based
Write-Ahead Logging and Crash Recovery
LSM Trees vs B-Trees — Storage Engine Internals
Distributed Transactions — Two-Phase Commit and Saga Pattern
Time-Series Databases — InfluxDB, TimescaleDB, and Prometheus
Data Lake Architecture — Medallion Pattern and Lakehouse
Change Data Capture with Debezium

Key insight: most system design questions revolve around data modeling and storage tradeoffs. Master this category thoroughly.

3. Caching and Performance

Caching layers, eviction policies, cache invalidation, and CDN architecture.

Caching Strategies — Cache-Aside, Write-Through, Write-Behind
Redis Deep Dive — Data Structures, Persistence, and Clustering
Cache Invalidation — The Hardest Problem in Computer Science
CDN Architecture — Edge Caching, Origin Shielding, and Purging
Bloom Filters for Cache Optimization
Connection Pooling and Keep-Alive Strategies

Interview tip: always discuss caching in your design. It shows you think about latency and cost.
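As a concrete talking point, here is a minimal sketch of the cache-aside strategy named in the list above. It assumes an in-process dictionary as the cache and a hypothetical `fake_db` loader standing in for the database; a real design would more likely put Redis or a similar store in that role.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Cache-aside: the application checks the cache first and loads on a miss."""

    def __init__(self, load_from_db: Callable[[str], str], ttl_seconds: float = 60.0) -> None:
        self._load = load_from_db          # hypothetical loader, stands in for a DB query
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry time)

    def get(self, key: str) -> str:
        entry = self._entries.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]                # cache hit: no database read
        value = self._load(key)            # cache miss: read from the source of truth
        self._entries[key] = (value, time.monotonic() + self._ttl)
        return value

    def invalidate(self, key: str) -> None:
        """Call after a write so the next read reloads fresh data."""
        self._entries.pop(key, None)


db_reads = []


def fake_db(key: str) -> str:
    # Assumed stand-in for a real database call; records each read for inspection.
    db_reads.append(key)
    return f"row-for-{key}"


cache = CacheAside(fake_db, ttl_seconds=60.0)
first = cache.get("user:1")    # miss, reads the database
second = cache.get("user:1")   # hit, served from the cache
cache.invalidate("user:1")
third = cache.get("user:1")    # miss again after invalidation
```

The TTL bounds how stale an entry can get, and the explicit `invalidate` call is where the latency-versus-consistency tradeoff shows up in an interview discussion.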

4. Networking and Protocols

HTTP, gRPC, WebSockets, DNS, load balancing, and service mesh.

HTTP/2 and HTTP/3 — Multiplexing, Header Compression, and QUIC
gRPC — Protocol Buffers, Streaming, and Deadlines
WebSocket Architecture for Real-Time Systems
DNS Architecture — Resolution, Caching, and GeoDNS
Load Balancing — L4 vs L7, Consistent Hashing, and Health Checks
Service Mesh — Istio, Linkerd, and Sidecar Proxy Pattern
API Gateway Patterns — Rate Limiting, Auth, and Routing

5. Messaging and Event-Driven Architecture

Queues, streams, event sourcing, and pub/sub at scale.

Apache Kafka — Partitions, Consumer Groups, and Exactly-Once
Event Sourcing and CQRS — When and How
Message Queue Comparison — RabbitMQ, SQS, Kafka, and Pulsar
Dead Letter Queues and Retry Strategies
Saga Pattern for Distributed Transactions
Schema Registry and Schema Evolution
Idempotency in Event-Driven Systems

Key insight: event-driven architecture is the backbone of modern microservices. Every senior engineer should be fluent here.
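One idea from this category that is easy to show in a few lines is idempotent event handling. The sketch below assumes each event carries a unique `id` field and keeps the set of processed ids in memory; both are illustrative assumptions, and a production consumer would need to persist the id and the state change atomically.

```python
from typing import Dict, List, Set


class IdempotentConsumer:
    """Applies each event at most once, even if the broker redelivers it."""

    def __init__(self) -> None:
        self.processed_ids: Set[str] = set()  # assumed in-memory dedup store
        self.balance = 0

    def handle(self, event: Dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        if event["id"] in self.processed_ids:
            return False
        self.balance += event["amount"]
        self.processed_ids.add(event["id"])
        return True


consumer = IdempotentConsumer()
events: List[Dict] = [
    {"id": "evt-1", "amount": 50},
    {"id": "evt-2", "amount": 25},
    {"id": "evt-1", "amount": 50},  # redelivery of the first event
]
results = [consumer.handle(e) for e in events]
```

With at-least-once delivery, the duplicate `evt-1` is ignored, so the balance reflects each event exactly once.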

6. System Design Interviews

Concrete system designs, from a URL shortener to global-scale social networks.

Designing a URL Shortener — End to End
Designing a Chat System — WhatsApp Scale
Designing a News Feed — Facebook and Twitter
Designing a Rate Limiter — Token Bucket to Sliding Window
Designing a Notification System — Push, Email, and SMS
Designing a Search Autocomplete System
Designing a Video Streaming Platform — Netflix Architecture
Designing a Ride-Sharing Platform — Uber and Lyft
Designing a Distributed File Storage — Google Drive
Designing a Payment System — Stripe Architecture

Practice plan: work through two designs per week. Sketch the architecture, identify bottlenecks, then read the article to compare.

7. Infrastructure and DevOps

Containers, orchestration, CI/CD, infrastructure as code, and cloud-native patterns.

Kubernetes Architecture — Pods, Services, and the Control Plane
Container Networking — CNI, Service Discovery, and DNS
CI/CD Pipeline Design — GitHub Actions, ArgoCD, and Flux
Infrastructure as Code — Terraform, Pulumi, and CDK
GitOps — Principles, Tools, and Production Patterns
Blue-Green and Canary Deployment Strategies
Feature Flags and Progressive Rollouts

8. Observability and Reliability

Monitoring, tracing, logging, SLOs, incident response, and chaos engineering.

The Three Pillars of Observability — Logs, Metrics, and Traces
OpenTelemetry Instrumentation Guide
SLOs, SLIs, and Error Budgets — A Practical Guide
Distributed Tracing — Jaeger, Zipkin, and Tempo
Chaos Engineering — Principles, Tools, and Game Days
On-Call Best Practices and Incident Response
Alerting Strategy — Signal vs Noise
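The arithmetic behind an error budget is worth having at your fingertips. The sketch below assumes a time-based availability SLO measured over a 30-day window; the function names and the sample numbers are illustrative, not taken from the articles above.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for a given SLO over the window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative means overspent)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


# A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
budget = error_budget_minutes(0.999)
# After 10.8 minutes of downtime, about three quarters of the budget is left.
remaining = budget_remaining(0.999, downtime_minutes=10.8)
```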

9. Security and Authentication

Auth protocols, encryption, zero trust, and secure system design.

OAuth 2.0 and OpenID Connect — The Complete Guide
JWT — Structure, Signing, Validation, and Common Pitfalls
Zero Trust Architecture — Beyond the Perimeter
API Security — OWASP Top 10 for APIs
Encryption at Rest and in Transit
Secrets Management — Vault, AWS Secrets Manager, and SOPS
Rate Limiting and DDoS Mitigation

10. Architecture Patterns and Principles

Microservices, monoliths, domain-driven design, and emerging patterns.

Microservices vs Monolith — A Practical Decision Framework
Domain-Driven Design — Bounded Contexts and Aggregates
Strangler Fig Pattern — Incremental Migration
CQRS — Command Query Responsibility Segregation
Hexagonal Architecture — Ports and Adapters
Cell-Based Architecture for Blast Radius Reduction
Multi-Tenancy Patterns — Shared vs Isolated

The Learning Roadmap

Phase 1: Foundations (Weeks 1-4)

Focus on categories 1, 2, and 3. Build a solid mental model of distributed systems, understand storage tradeoffs, and learn caching patterns. These three areas form the foundation for every system design discussion.

Phase 2: Communication and Events (Weeks 5-8)

Move to categories 4 and 5. Understand how services communicate — synchronous (HTTP, gRPC) vs asynchronous (Kafka, queues). This is where most architectural decisions diverge.
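The difference between the two styles can be sketched with the standard library alone. This example assumes a hypothetical `charge_card` function as the downstream service, and uses an in-process `queue.Queue` plus a worker thread as a stand-in for a broker such as Kafka or a message queue.

```python
import queue
import threading


def charge_card(order_id: str) -> str:
    # Hypothetical downstream service; a real one would sit behind HTTP or gRPC.
    return f"charged:{order_id}"


# Synchronous: the caller blocks until the service returns a result.
sync_result = charge_card("order-1")

# Asynchronous: the caller enqueues work and moves on; a consumer handles it later.
outbox = queue.Queue()
handled = []


def worker() -> None:
    while True:
        order_id = outbox.get()
        if order_id is None:  # sentinel value: stop consuming
            break
        handled.append(charge_card(order_id))


consumer = threading.Thread(target=worker)
consumer.start()
outbox.put("order-2")  # returns immediately, no result available yet
outbox.put("order-3")
outbox.put(None)
consumer.join()        # joined here only so the example finishes deterministically
```

The synchronous call gives the caller an immediate answer but couples it to the service's latency and availability. The queued version decouples the two at the cost of handling results, retries, and ordering separately.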

Phase 3: Real Designs (Weeks 9-14)

Work through category 6 systematically. For each design problem: spend 45 minutes sketching your own solution before reading the article. Compare your approach and note what you missed.

Phase 4: Production Engineering (Weeks 15-18)

Cover categories 7, 8, and 9. These topics separate junior from senior engineers. Understanding deployment, observability, and security shows production maturity.

Phase 5: Architecture Mastery (Weeks 19-20)

Finish with category 10. At this point you have enough context to appreciate the tradeoffs between architectural styles.

Interview Strategy

Before the Interview

Pick 10 system designs from category 6 and practice them end to end
Build a template: requirements, estimation, API design, data model, high-level architecture, deep dive, bottlenecks
Prepare tradeoff discussions — interviewers care more about why than what

During the Interview

Clarify requirements — spend the first 3-5 minutes asking questions
Start with the happy path — get a working design on the board before optimizing
Quantify — back-of-envelope calculations show engineering rigor
Discuss tradeoffs explicitly — "We could use X, which gives us Y but costs Z"
Address failure modes — what happens when a node goes down, a network partitions, or traffic spikes 10x
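The "Quantify" step above can be practiced as a small calculation. The sketch below is one possible way to structure it; every input (10 million daily users, 20 requests per user, 10% writes, 1 KB per write, a 2x peak factor) is an assumed example figure, not a number from any particular system.

```python
SECONDS_PER_DAY = 86_400


def estimate(daily_users: int, requests_per_user: int, write_fraction: float,
             bytes_per_write: int, peak_factor: float = 2.0) -> dict:
    """Back-of-envelope load and storage figures from a few assumed inputs."""
    daily_requests = daily_users * requests_per_user
    avg_qps = daily_requests / SECONDS_PER_DAY
    daily_writes = daily_requests * write_fraction
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,  # assumed multiplier for traffic spikes
        "storage_gb_per_day": daily_writes * bytes_per_write / 1e9,
    }


result = estimate(
    daily_users=10_000_000,
    requests_per_user=20,
    write_fraction=0.1,
    bytes_per_write=1_000,
)
# Roughly 2,315 average QPS, 4,630 peak QPS, and 20 GB of new data per day.
```

Rounding aggressively is fine in an interview; the point is to show which order of magnitude drives the design.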

Common Mistakes

Jumping into the solution without gathering requirements
Over-engineering for scale that does not exist
Ignoring data consistency requirements
Forgetting about operational concerns (monitoring, deployment, rollback)

Practice Plan

Week	Focus	Activity
1-2	Fundamentals	Read 5 articles daily from categories 1-3
3-4	Deep dives	Pick 3 topics and write your own summaries
5-8	Design practice	2 full system designs per week (45 min each)
9-10	Mock interviews	Practice with a partner using random problems
11-12	Weak areas	Revisit topics you struggled with

What Comes Next

350 articles is a milestone, not a finish line. Distributed systems continue to evolve — new consensus protocols, new database architectures, new infrastructure primitives. The library will keep growing.

If you have read even a fraction of these articles and practiced the designs, you are well-prepared — not just for interviews, but for building real systems at scale.

350 articles on system design at codelit.io/blog.


