URL Shortener System Design: The Complete Engineering Guide
The URL shortener is one of the most popular system design interview questions, and for good reason. It touches distributed ID generation, read-heavy optimization, caching, analytics, and scaling decisions that show up in real production systems every day.
This guide walks through the complete design from requirements to production-ready architecture.
Functional Requirements#
Before drawing boxes, pin down what the system actually does.
- Shorten: Given a long URL, generate a unique short URL
- Redirect: Given a short URL, redirect to the original long URL
- Custom aliases: Users can optionally choose a custom short code
- Expiration: Links can have a TTL (time-to-live)
- Analytics: Track click counts, referrers, geography, device type
Non-Functional Requirements#
- High availability: The redirect path cannot go down
- Low latency: Redirects must complete in under 50ms at p99
- Read-heavy: Reads outnumber writes by 100:1 or more
- Scalability: Handle billions of redirects per month
- Durability: A shortened link must never lose its mapping
Scale Estimation#
Start with rough numbers to drive design decisions.
- New URLs per day: 1 million
- Reads per day: 100 million (100:1 ratio)
- Reads per second: ~1,200 RPS (peak ~3,000)
- Storage per record: ~500 bytes
- Storage per year: ~180 GB
- Total URLs over 5 years: ~1.8 billion
These numbers tell you that storage is modest, but read throughput matters. This is a caching problem more than a storage problem.
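The estimates above are simple arithmetic, and it is worth sanity-checking them rather than taking them on faith. A quick back-of-the-envelope script using the numbers from the list:

```python
writes_per_day = 1_000_000
reads_per_day = 100 * writes_per_day            # 100:1 read-write ratio
bytes_per_record = 500

avg_rps = reads_per_day / 86_400                # ~1,157 reads per second
storage_per_year_gb = writes_per_day * 365 * bytes_per_record / 1e9  # ~182.5 GB
urls_over_5_years = writes_per_day * 365 * 5    # 1.825 billion
```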
Short Code Generation: Base62 Encoding#
A 7-character base62 string (a-z, A-Z, 0-9) gives you 62^7 ≈ 3.5 trillion unique codes. That is far more than enough for most systems.
Approach 1: Counter-Based#
Use a global auto-incrementing counter and convert the integer to base62.
```
Counter: 1000000
Base62:  "4c92"
```
Pros: Zero collisions, simple logic. Cons: Predictable URLs, single point of contention for the counter.
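The conversion above can be sketched in a few lines. The alphabet ordering (0–9, then a–z, then A–Z) is an assumption; any fixed ordering works as long as it is applied consistently:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base62(n: int) -> str:
    """Convert a non-negative integer to its base62 representation."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first
```

With this alphabet, counter value 1,000,000 encodes to "4c92", matching the example above.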
To distribute the counter, assign ID ranges to each application server. Server A gets 1–1,000,000, Server B gets 1,000,001–2,000,000, and so on. A coordination service like ZooKeeper manages range allocation.
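A toy in-process stand-in for the coordinator makes range allocation concrete. `RangeAllocator` and its API are illustrative names; in production the `next_start` value would live in ZooKeeper (or etcd) so every app server sees a consistent view:

```python
import threading

class RangeAllocator:
    """Hands out disjoint, monotonically increasing ID ranges.

    Single-process sketch of the coordination service described above.
    Each app server grabs a range, then assigns IDs from it locally
    with no further coordination until the range runs out.
    """
    def __init__(self, range_size=1_000_000):
        self.range_size = range_size
        self.next_start = 1
        self.lock = threading.Lock()

    def allocate(self):
        with self.lock:
            start = self.next_start
            self.next_start += self.range_size
            return (start, start + self.range_size - 1)
```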
Approach 2: Hash-Based#
Hash the long URL (MD5, SHA-256) and take the first 7 characters of the base62-encoded hash.
Pros: Deterministic — same URL always produces the same short code. Cons: Collisions are possible and must be handled.
Hash Collision Handling#
When a collision is detected on insert:
- Append and rehash: Concatenate a counter or timestamp to the input and hash again
- Bloom filter pre-check: Before hitting the database, check a Bloom filter to quickly rule out most collisions
- Retry with salt: Add a random salt, rehash, and retry (bounded retries)
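Putting the hash approach and bounded salted retries together, a minimal sketch looks like this. The `exists` callback stands in for the database uniqueness check (or Bloom filter pre-check), and the helper names are illustrative:

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def hash_code(long_url: str, salt: str = "") -> str:
    """Derive a 7-character base62 code from an MD5 hash of the URL."""
    digest = hashlib.md5((long_url + salt).encode()).digest()
    n = int.from_bytes(digest[:8], "big")
    chars = []
    for _ in range(7):
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(chars)

def shorten(long_url, exists, max_retries=5):
    """Salted-retry collision handling with a bounded retry budget."""
    for attempt in range(max_retries):
        salt = str(attempt) if attempt else ""  # no salt on first try
        code = hash_code(long_url, salt)
        if not exists(code):
            return code
    raise RuntimeError("could not find a collision-free code")
```

Note that the unsalted first attempt preserves the determinism advantage: the same URL maps to the same code unless a collision forces a retry.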
The Bloom filter approach is efficient at scale. A Bloom filter with 1 billion entries and a 1% false positive rate requires roughly 1.2 GB of memory — easily fits on a single node.
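The 1.2 GB figure follows from the standard formula for the optimal bit count of a Bloom filter, m = -n ln(p) / (ln 2)^2, assuming the optimal number of hash functions. A quick check:

```python
import math

def bloom_filter_size_bytes(n_items: int, fp_rate: float) -> float:
    """Optimal Bloom filter size: m = -n * ln(p) / (ln 2)^2 bits."""
    m_bits = -n_items * math.log(fp_rate) / (math.log(2) ** 2)
    return m_bits / 8

size = bloom_filter_size_bytes(1_000_000_000, 0.01)  # ~1.2e9 bytes
```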
Database Choice: SQL vs NoSQL#
SQL (PostgreSQL, MySQL)#
- Strong consistency guarantees
- Mature indexing on the short code column
- ACID transactions for collision-safe inserts
- Works well up to hundreds of millions of rows with proper indexing
NoSQL (DynamoDB, Cassandra)#
- Horizontal scaling out of the box
- Key-value access pattern is a natural fit (short_code → long_url)
- Higher write throughput at massive scale
- Eventual consistency is acceptable for URL mappings
Recommendation: For most URL shorteners, a SQL database with a unique index on short_code is simpler to operate and sufficient at this scale; with table partitioning it stretches into billions of rows. If you need multi-region writes or extreme write throughput, DynamoDB or Cassandra makes sense.
Schema#
```sql
CREATE TABLE urls (
    id BIGINT PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    user_id BIGINT,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    click_count BIGINT DEFAULT 0
);

-- The UNIQUE constraint on short_code already creates the lookup index,
-- so no separate index on that column is needed.
CREATE INDEX idx_expires_at ON urls(expires_at) WHERE expires_at IS NOT NULL;
```
Caching with Redis#
Since reads dominate, a cache layer is critical.
Client → Load Balancer → App Server → Redis Cache → Database
- Cache hit: Return the long URL directly from Redis (~1ms)
- Cache miss: Query the database, populate the cache, then return
- Eviction policy: LRU (Least Recently Used) works well — popular links stay cached, stale links get evicted
- TTL: Set cache TTL slightly shorter than link expiration to avoid serving expired links
With 20 GB of Redis memory and ~500 bytes per entry, you can cache roughly 40 million of the hottest URLs. Because click traffic is heavily skewed toward a small set of popular links, cache hit rates above 90% are achievable.
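The hit/miss flow above is the classic cache-aside pattern. A minimal sketch, using a plain dict as a stand-in for a Redis client (in production this would be redis-py with a SETEX on the miss path):

```python
def resolve(short_code, cache, db, ttl_seconds=3600):
    """Cache-aside lookup: try the cache first, fall back to the database.

    cache and db are dicts here for illustration; ttl_seconds is unused
    by the dict stand-in but would be passed to SETEX with real Redis.
    """
    long_url = cache.get(short_code)
    if long_url is not None:           # cache hit: ~1ms with Redis
        return long_url
    long_url = db.get(short_code)      # cache miss: query the database
    if long_url is not None:
        cache[short_code] = long_url   # populate the cache for next time
    return long_url
```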
Redirect Flow: 301 vs 302#
This decision matters more than most engineers realize.
- 301 (Permanent Redirect): The browser caches the redirect. Subsequent visits skip your server entirely. Good for bandwidth savings, bad for analytics.
- 302 (Temporary Redirect): The browser does not cache. Every click hits your server. Essential for accurate click tracking.
Use 302 if analytics matter (they almost always do). Use 301 only for static, permanent mappings where tracking is not needed.
Redirect Sequence#
1. Client requests GET /abc1234
2. App server looks up "abc1234" in Redis
3. Cache hit → return 302 with Location header
4. Cache miss → query database → populate cache → return 302
5. Log the click event asynchronously
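Stripped of the web framework, the sequence above reduces to a small handler. The function names and callback-style wiring are illustrative:

```python
def handle_redirect(short_code, resolve, enqueue_click):
    """Return (status, headers) for the redirect path.

    resolve(code) -> long_url or None  (cache + database lookup)
    enqueue_click(code) pushes the event onto the analytics queue
    and must not block the response.
    """
    long_url = resolve(short_code)
    if long_url is None:
        return 404, {}
    enqueue_click(short_code)              # step 5: async, fire-and-forget
    return 302, {"Location": long_url}     # 302, not 301, to keep analytics
```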
Analytics Tracking#
Never block the redirect path with analytics writes. Use an asynchronous pipeline.
Redirect Handler → Message Queue (Kafka) → Analytics Consumer → Analytics DB
Each click event includes:
- Short code
- Timestamp
- IP address (for geo lookup)
- User-Agent (for device/browser detection)
- Referer header
Store aggregated analytics in a time-series database (ClickHouse, TimescaleDB) or a columnar store for efficient querying.
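The event shape and the hand-off to the queue can be sketched with standard-library pieces; here `queue.Queue` stands in for a Kafka producer, and `drain_counts` is a toy consumer:

```python
from dataclasses import dataclass
import queue
import time

@dataclass
class ClickEvent:
    short_code: str
    timestamp: float
    ip: str          # for geo lookup
    user_agent: str  # for device/browser detection
    referer: str

click_queue = queue.Queue()  # stand-in for a Kafka producer

def record_click(short_code, ip, user_agent, referer):
    # The redirect handler calls this and returns immediately;
    # a separate consumer drains the queue into the analytics store.
    click_queue.put(ClickEvent(short_code, time.time(), ip, user_agent, referer))

def drain_counts(q):
    """Toy consumer: aggregate click counts per short code."""
    counts = {}
    while not q.empty():
        event = q.get()
        counts[event.short_code] = counts.get(event.short_code, 0) + 1
    return counts
```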
Link Expiration#
Two strategies for cleaning up expired links:
- Lazy deletion: Check expires_at on every read. If expired, return 404 and optionally delete the record. Simple but leaves dead rows in the database.
- Active cleanup: A background cron job periodically scans for expired rows and deletes them in batches. Keeps the database lean.
Use both together. Lazy deletion ensures correctness. Active cleanup prevents table bloat.
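The lazy half folds into the read path in a few lines. The record dict and the `delete` callback are illustrative stand-ins for the database row and delete query:

```python
import time

def get_if_live(record, delete, now=None):
    """Lazy expiration: treat an expired row as missing, queue its deletion."""
    now = time.time() if now is None else now
    if record is None:
        return None
    expires_at = record.get("expires_at")
    if expires_at is not None and expires_at <= now:
        delete(record["short_code"])   # optional eager cleanup on read
        return None                    # caller returns 404
    return record["long_url"]
```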
Rate Limiting#
Protect the creation endpoint from abuse.
- Per-user rate limit: Authenticated users get a higher quota (e.g., 100 URLs/hour)
- Per-IP rate limit: Anonymous users get a lower quota (e.g., 10 URLs/hour)
- Implementation: Token bucket or sliding window counter in Redis
```
Key:   rate_limit:{user_id}:{window}
Value: counter
TTL:   window duration
```
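A dict-backed sketch of that key scheme, using a fixed window for simplicity. In production the counter lives in Redis via INCR plus EXPIRE; this in-process version is for illustration only:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter mirroring the Redis INCR + EXPIRE pattern."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # key -> count; Redis TTL would evict old windows

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        window_id = int(now // self.window)
        key = f"rate_limit:{user_id}:{window_id}"  # same key shape as above
        count = self.counters.get(key, 0) + 1      # Redis: INCR key
        self.counters[key] = count                 # Redis: EXPIRE key window
        return count <= self.limit
```

A sliding-window or token-bucket variant smooths the burst at window boundaries; the key scheme stays the same.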
High-Level Architecture#
```
              ┌──────────────┐
              │  DNS / CDN   │
              └──────┬───────┘
                     │
              ┌──────▼───────┐
              │Load Balancer │
              └──────┬───────┘
                     │
       ┌─────────────┼─────────────┐
       │             │             │
 ┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐
 │App Server │ │App Server │ │App Server │
 └─────┬─────┘ └─────┬─────┘ └─────┬─────┘
       │             │             │
 ┌─────▼─────────────▼─────────────▼─────┐
 │              Redis Cache              │
 └───────────────────┬───────────────────┘
                     │
 ┌───────────────────▼───────────────────┐
 │          Database (Primary)           │
 │            + Read Replicas            │
 └───────────────────┬───────────────────┘
                     │
 ┌───────────────────▼───────────────────┐
 │      Kafka → Analytics Pipeline       │
 └───────────────────────────────────────┘
```
Key Design Decisions Summary#
| Decision | Recommendation | Reason |
|---|---|---|
| ID generation | Counter with range allocation | No collisions, horizontally distributable |
| Database | PostgreSQL (or DynamoDB at extreme scale) | Simple, reliable, sufficient for billions of rows |
| Cache | Redis with LRU eviction | Sub-millisecond reads, 90%+ hit rate |
| Redirect code | 302 | Enables accurate analytics tracking |
| Analytics | Async via Kafka | Never block the redirect path |
| Rate limiting | Token bucket in Redis | Protects write endpoints from abuse |
| Expiration | Lazy + active cleanup | Correctness and database hygiene |
What Interviewers Are Really Testing#
The URL shortener question is not about the URL shortener. It tests whether you can:
- Estimate scale and let numbers drive decisions
- Identify the read-write ratio and optimize for the dominant path
- Handle distributed ID generation without single points of failure
- Design a caching strategy that actually moves the needle
- Think about tradeoffs (301 vs 302, SQL vs NoSQL, consistency vs availability)
Nail these fundamentals, and the specific system almost does not matter.
This is article #179 in the Codelit engineering blog series.