URL Shortener System Design: The Complete Engineering Guide
The URL shortener is one of the most popular system design interview questions, and for good reason. It touches distributed ID generation, read-heavy optimization, caching, analytics, and scaling decisions that show up in real production systems every day.
This guide walks through the complete design from requirements to production-ready architecture.
Functional Requirements#
Before drawing boxes, pin down what the system actually does.
- Shorten: Given a long URL, generate a unique short URL
- Redirect: Given a short URL, redirect to the original long URL
- Custom aliases: Users can optionally choose a custom short code
- Expiration: Links can have a TTL (time-to-live)
- Analytics: Track click counts, referrers, geography, device type
Non-Functional Requirements#
- High availability: The redirect path cannot go down
- Low latency: Redirects must complete in under 50ms at p99
- Read-heavy: Reads outnumber writes by 100:1 or more
- Scalability: Handle billions of redirects per month
- Durability: A shortened link must never lose its mapping
Scale Estimation#
Start with rough numbers to drive design decisions.
- New URLs per day: 1 million
- Reads per day: 100 million (100:1 ratio)
- Reads per second: ~1,200 RPS (peak ~3,000)
- Storage per record: ~500 bytes
- Storage per year: ~180 GB
- Total URLs over 5 years: ~1.8 billion
These numbers tell you that storage is modest, but read throughput matters. This is a caching problem more than a storage problem.
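The estimates above are simple arithmetic, and it is worth sanity-checking them rather than taking them on faith. A quick back-of-the-envelope script using the numbers from the list:

```python
writes_per_day = 1_000_000
reads_per_day = 100 * writes_per_day            # 100:1 read-write ratio
bytes_per_record = 500

avg_rps = reads_per_day / 86_400                # ~1,157 reads per second
storage_per_year_gb = writes_per_day * 365 * bytes_per_record / 1e9  # ~182.5 GB
urls_over_5_years = writes_per_day * 365 * 5    # 1.825 billion
```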
Short Code Generation: Base62 Encoding#
A 7-character base62 string (a-z, A-Z, 0-9) gives you 62^7 ≈ 3.5 trillion unique codes. That is far more than enough for most systems.
Approach 1: Counter-Based#
Use a global auto-incrementing counter and convert the integer to base62.
```
Counter: 1000000
Base62:  "4c92"
```
Pros: Zero collisions, simple logic. Cons: Predictable URLs, single point of contention for the counter.
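The conversion above can be sketched in a few lines. The alphabet ordering (0–9, then a–z, then A–Z) is an assumption; any fixed ordering works as long as it is applied consistently:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base62(n: int) -> str:
    """Convert a non-negative integer to its base62 representation."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first
```

With this alphabet, counter value 1,000,000 encodes to "4c92", matching the example above.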
To distribute the counter, assign ID ranges to each application server. Server A gets 1–1,000,000, Server B gets 1,000,001–2,000,000, and so on. A coordination service like ZooKeeper manages range allocation.
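A toy in-process stand-in for the coordinator makes range allocation concrete. `RangeAllocator` and its API are illustrative names; in production the `next_start` value would live in ZooKeeper (or etcd) so every app server sees a consistent view:

```python
import threading

class RangeAllocator:
    """Hands out disjoint, monotonically increasing ID ranges.

    Single-process sketch of the coordination service described above.
    Each app server grabs a range, then assigns IDs from it locally
    with no further coordination until the range runs out.
    """
    def __init__(self, range_size=1_000_000):
        self.range_size = range_size
        self.next_start = 1
        self.lock = threading.Lock()

    def allocate(self):
        with self.lock:
            start = self.next_start
            self.next_start += self.range_size
            return (start, start + self.range_size - 1)
```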
Approach 2: Hash-Based#
Hash the long URL (MD5, SHA-256) and take the first 7 characters of the base62-encoded hash.
Pros: Deterministic — same URL always produces the same short code. Cons: Collisions are possible and must be handled.
Hash Collision Handling#
When a collision is detected on insert:
- Append and rehash: Concatenate a counter or timestamp to the input and hash again
- Bloom filter pre-check: Before hitting the database, check a Bloom filter to quickly rule out most collisions
- Retry with salt: Add a random salt, rehash, and retry (bounded retries)
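Putting the hash approach and bounded salted retries together, a minimal sketch looks like this. The `exists` callback stands in for the database uniqueness check (or Bloom filter pre-check), and the helper names are illustrative:

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def hash_code(long_url: str, salt: str = "") -> str:
    """Derive a 7-character base62 code from an MD5 hash of the URL."""
    digest = hashlib.md5((long_url + salt).encode()).digest()
    n = int.from_bytes(digest[:8], "big")
    chars = []
    for _ in range(7):
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(chars)

def shorten(long_url, exists, max_retries=5):
    """Salted-retry collision handling with a bounded retry budget."""
    for attempt in range(max_retries):
        salt = str(attempt) if attempt else ""  # no salt on first try
        code = hash_code(long_url, salt)
        if not exists(code):
            return code
    raise RuntimeError("could not find a collision-free code")
```

Note that the unsalted first attempt preserves the determinism advantage: the same URL maps to the same code unless a collision forces a retry.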
The Bloom filter approach is efficient at scale. A Bloom filter with 1 billion entries and a 1% false positive rate requires roughly 1.2 GB of memory — easily fits on a single node.
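The 1.2 GB figure follows from the standard formula for the optimal bit count of a Bloom filter, m = -n ln(p) / (ln 2)^2, assuming the optimal number of hash functions. A quick check:

```python
import math

def bloom_filter_size_bytes(n_items: int, fp_rate: float) -> float:
    """Optimal Bloom filter size: m = -n * ln(p) / (ln 2)^2 bits."""
    m_bits = -n_items * math.log(fp_rate) / (math.log(2) ** 2)
    return m_bits / 8

size = bloom_filter_size_bytes(1_000_000_000, 0.01)  # ~1.2e9 bytes
```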
Database Choice: SQL vs NoSQL#
SQL (PostgreSQL, MySQL)#
- Strong consistency guarantees
- Mature indexing on the short code column
- ACID transactions for collision-safe inserts
- Works well up to hundreds of millions of rows with proper indexing
NoSQL (DynamoDB, Cassandra)#
- Horizontal scaling out of the box
- Key-value access pattern is a natural fit (short_code → long_url)
- Higher write throughput at massive scale
- Eventual consistency is acceptable for URL mappings
Recommendation: For most URL shorteners, a SQL database with a unique index on short_code is simpler to operate and sufficient at this scale; with table partitioning it stretches into billions of rows. If you need multi-region writes or extreme write throughput, DynamoDB or Cassandra makes sense.
Schema#
```sql
CREATE TABLE urls (
    id BIGINT PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    user_id BIGINT,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    click_count BIGINT DEFAULT 0
);

-- The UNIQUE constraint on short_code already creates the lookup index,
-- so no separate index on that column is needed.
CREATE INDEX idx_expires_at ON urls(expires_at) WHERE expires_at IS NOT NULL;
```
Caching with Redis#
Since reads dominate, a cache layer is critical.
Client → Load Balancer → App Server → Redis Cache → Database
- Cache hit: Return the long URL directly from Redis (~1ms)
- Cache miss: Query the database, populate the cache, then return
- Eviction policy: LRU (Least Recently Used) works well — popular links stay cached, stale links get evicted
- TTL: Set cache TTL slightly shorter than link expiration to avoid serving expired links
With 20 GB of Redis memory and ~500 bytes per entry, you can cache roughly 40 million of the hottest URLs. Because click traffic is heavily skewed toward a small set of popular links, cache hit rates above 90% are achievable.
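The hit/miss flow above is the classic cache-aside pattern. A minimal sketch, using a plain dict as a stand-in for a Redis client (in production this would be redis-py with a SETEX on the miss path):

```python
def resolve(short_code, cache, db, ttl_seconds=3600):
    """Cache-aside lookup: try the cache first, fall back to the database.

    cache and db are dicts here for illustration; ttl_seconds is unused
    by the dict stand-in but would be passed to SETEX with real Redis.
    """
    long_url = cache.get(short_code)
    if long_url is not None:           # cache hit: ~1ms with Redis
        return long_url
    long_url = db.get(short_code)      # cache miss: query the database
    if long_url is not None:
        cache[short_code] = long_url   # populate the cache for next time
    return long_url
```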
Redirect Flow: 301 vs 302#
This decision matters more than most engineers realize.
- 301 (Permanent Redirect): The browser caches the redirect. Subsequent visits skip your server entirely. Good for bandwidth savings, bad for analytics.
- 302 (Temporary Redirect): The browser does not cache. Every click hits your server. Essential for accurate click tracking.
Use 302 if analytics matter (they almost always do). Use 301 only for static, permanent mappings where tracking is not needed.
Redirect Sequence#
1. Client requests GET /abc1234
2. App server looks up "abc1234" in Redis
3. Cache hit → return 302 with Location header
4. Cache miss → query database → populate cache → return 302
5. Log the click event asynchronously
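Stripped of the web framework, the sequence above reduces to a small handler. The function names and callback-style wiring are illustrative:

```python
def handle_redirect(short_code, resolve, enqueue_click):
    """Return (status, headers) for the redirect path.

    resolve(code) -> long_url or None  (cache + database lookup)
    enqueue_click(code) pushes the event onto the analytics queue
    and must not block the response.
    """
    long_url = resolve(short_code)
    if long_url is None:
        return 404, {}
    enqueue_click(short_code)              # step 5: async, fire-and-forget
    return 302, {"Location": long_url}     # 302, not 301, to keep analytics
```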
Analytics Tracking#
Never block the redirect path with analytics writes. Use an asynchronous pipeline.
Redirect Handler → Message Queue (Kafka) → Analytics Consumer → Analytics DB
Each click event includes:
- Short code
- Timestamp
- IP address (for geo lookup)
- User-Agent (for device/browser detection)
- Referer header
Store aggregated analytics in a time-series database (ClickHouse, TimescaleDB) or a columnar store for efficient querying.
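The event shape and the hand-off to the queue can be sketched with standard-library pieces; here `queue.Queue` stands in for a Kafka producer, and `drain_counts` is a toy consumer:

```python
from dataclasses import dataclass
import queue
import time

@dataclass
class ClickEvent:
    short_code: str
    timestamp: float
    ip: str          # for geo lookup
    user_agent: str  # for device/browser detection
    referer: str

click_queue = queue.Queue()  # stand-in for a Kafka producer

def record_click(short_code, ip, user_agent, referer):
    # The redirect handler calls this and returns immediately;
    # a separate consumer drains the queue into the analytics store.
    click_queue.put(ClickEvent(short_code, time.time(), ip, user_agent, referer))

def drain_counts(q):
    """Toy consumer: aggregate click counts per short code."""
    counts = {}
    while not q.empty():
        event = q.get()
        counts[event.short_code] = counts.get(event.short_code, 0) + 1
    return counts
```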
Link Expiration#
Two strategies for cleaning up expired links:
- Lazy deletion: Check expires_at on every read. If expired, return 404 and optionally delete the record. Simple but leaves dead rows in the database.
- Active cleanup: A background cron job periodically scans for expired rows and deletes them in batches. Keeps the database lean.
Use both together. Lazy deletion ensures correctness. Active cleanup prevents table bloat.
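The lazy half folds into the read path in a few lines. The record dict and the `delete` callback are illustrative stand-ins for the database row and delete query:

```python
import time

def get_if_live(record, delete, now=None):
    """Lazy expiration: treat an expired row as missing, queue its deletion."""
    now = time.time() if now is None else now
    if record is None:
        return None
    expires_at = record.get("expires_at")
    if expires_at is not None and expires_at <= now:
        delete(record["short_code"])   # optional eager cleanup on read
        return None                    # caller returns 404
    return record["long_url"]
```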
Rate Limiting#
Protect the creation endpoint from abuse.
- Per-user rate limit: Authenticated users get a higher quota (e.g., 100 URLs/hour)
- Per-IP rate limit: Anonymous users get a lower quota (e.g., 10 URLs/hour)
- Implementation: Token bucket or sliding window counter in Redis
```
Key:   rate_limit:{user_id}:{window}
Value: counter
TTL:   window duration
```
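A dict-backed sketch of that key scheme, using a fixed window for simplicity. In production the counter lives in Redis via INCR plus EXPIRE; this in-process version is for illustration only:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter mirroring the Redis INCR + EXPIRE pattern."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # key -> count; Redis TTL would evict old windows

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        window_id = int(now // self.window)
        key = f"rate_limit:{user_id}:{window_id}"  # same key shape as above
        count = self.counters.get(key, 0) + 1      # Redis: INCR key
        self.counters[key] = count                 # Redis: EXPIRE key window
        return count <= self.limit
```

A sliding-window or token-bucket variant smooths the burst at window boundaries; the key scheme stays the same.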
High-Level Architecture#
```
              ┌──────────────┐
              │  DNS / CDN   │
              └──────┬───────┘
                     │
              ┌──────▼───────┐
              │Load Balancer │
              └──────┬───────┘
                     │
       ┌─────────────┼─────────────┐
       │             │             │
 ┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐
 │App Server │ │App Server │ │App Server │
 └─────┬─────┘ └─────┬─────┘ └─────┬─────┘
       │             │             │
 ┌─────▼─────────────▼─────────────▼─────┐
 │              Redis Cache              │
 └───────────────────┬───────────────────┘
                     │
 ┌───────────────────▼───────────────────┐
 │          Database (Primary)           │
 │            + Read Replicas            │
 └───────────────────┬───────────────────┘
                     │
 ┌───────────────────▼───────────────────┐
 │      Kafka → Analytics Pipeline       │
 └───────────────────────────────────────┘
```
Key Design Decisions Summary#
| Decision | Recommendation | Reason |
|---|---|---|
| ID generation | Counter with range allocation | No collisions, horizontally distributable |
| Database | PostgreSQL (or DynamoDB at extreme scale) | Simple, reliable, sufficient for billions of rows |
| Cache | Redis with LRU eviction | Sub-millisecond reads, 90%+ hit rate |
| Redirect code | 302 | Enables accurate analytics tracking |
| Analytics | Async via Kafka | Never block the redirect path |
| Rate limiting | Token bucket in Redis | Protects write endpoints from abuse |
| Expiration | Lazy + active cleanup | Correctness and database hygiene |
What Interviewers Are Really Testing#
The URL shortener question is not about the URL shortener. It tests whether you can:
- Estimate scale and let numbers drive decisions
- Identify the read-write ratio and optimize for the dominant path
- Handle distributed ID generation without single points of failure
- Design a caching strategy that actually moves the needle
- Think about tradeoffs (301 vs 302, SQL vs NoSQL, consistency vs availability)
Nail these fundamentals, and the specific system almost does not matter.
This is article #179 in the Codelit engineering blog series.