Design a URL Shortener — Step by Step System Design
The deceptively simple problem#
"Design a URL shortener like TinyURL." It sounds simple — map long URLs to short ones. But the details reveal core system design concepts: hashing, caching, database design, and horizontal scaling.
Requirements#
Functional:
- Shorten a URL → return a short link
- Redirect short link → original URL
- Custom aliases (optional)
- Expiration (optional)
- Click analytics
Scale:
- 100M URLs created per month
- 10:1 read-to-write ratio → 1B redirects per month
- Store for 5 years → 6B URLs total
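The scale numbers above translate into per-second rates with a little arithmetic. The sketch below is a rough back-of-envelope estimate, assuming a 30-day month and an average record size of 500 bytes (the record size is an assumption borrowed from the cache estimate later in this article, not a measured value).

```python
# Back-of-envelope capacity estimate from the stated requirements.
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month

writes_per_month = 100_000_000
read_write_ratio = 10
retention_years = 5
bytes_per_record = 500  # assumed average size of one stored URL record

reads_per_month = writes_per_month * read_write_ratio
writes_per_sec = writes_per_month / SECONDS_PER_MONTH
reads_per_sec = reads_per_month / SECONDS_PER_MONTH
total_urls = writes_per_month * 12 * retention_years
total_storage_tb = total_urls * bytes_per_record / 1e12

print(f"writes/sec  ~ {writes_per_sec:.0f}")   # ~ 39
print(f"reads/sec   ~ {reads_per_sec:.0f}")    # ~ 386
print(f"total URLs  = {total_urls:,}")         # 6,000,000,000
print(f"storage     ~ {total_storage_tb} TB")  # ~ 3.0 TB
```

These are averages; peak traffic will be higher, so real capacity planning would add headroom on top.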
The short URL format#
https://short.ly/abc123
The path abc123 is the key. With base62 encoding (a-z, A-Z, 0-9):
- 6 characters → 62^6 = 56.8 billion unique URLs
- 7 characters → 62^7 = 3.5 trillion
Six characters are enough for our scale: 6B URLs fill only about 10% of the 56.8B keyspace.
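The keyspace figures can be checked directly. This is a small sketch; `min_code_length` is a hypothetical helper name introduced here for illustration.

```python
def min_code_length(total_ids: int, alphabet_size: int = 62) -> int:
    """Smallest code length whose keyspace holds at least total_ids codes."""
    length = 1
    while alphabet_size ** length < total_ids:
        length += 1
    return length

total_urls = 6_000_000_000  # 5 years at 100M URLs per month

for length in (6, 7):
    keyspace = 62 ** length
    fill = total_urls / keyspace
    print(f"{length} chars: {keyspace:,} codes, {fill:.1%} used")
# 6 chars: 56,800,235,584 codes, 10.6% used
# 7 chars: 3,521,614,606,208 codes, 0.2% used

print(min_code_length(total_urls))  # 6
```

The fill ratio matters mostly for randomly generated codes: once the keyspace is about 10% full, roughly one in ten freshly generated 6-character codes is already taken, while a 7th character makes that rare.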
ID generation strategies#
Auto-increment + base62#
Simple: database auto-increments an ID, encode it in base62.
ID: 12345 → base62: "dnh"
Problem: The codes are predictable, so users can guess neighboring URLs. A single database issuing every ID also becomes a bottleneck.
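A minimal base62 encoder and decoder might look like the following. It assumes the alphabet order given earlier (a-z, then A-Z, then 0-9); a different ordering would produce different strings for the same ID.

```python
import string

# Alphabet order is an assumption: a-z, A-Z, 0-9.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
BASE = len(ALPHABET)  # 62

def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

def decode_base62(code: str) -> int:
    """Decode a base62 string back to the integer it represents."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n

print(encode_base62(12345))  # dnh
print(decode_base62("dnh"))  # 12345
```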
Pre-generated IDs#
Generate random IDs in advance and store them in a pool. When a URL is shortened, pop an ID from the pool.
Pros: No collisions at request time. Fast (pre-computed). Distributed-friendly, since each server can take its own batch.
Cons: The pool has to be refilled and each ID handed out exactly once. Slightly more complex.
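One way to picture the pool is the in-memory sketch below. `IdPool` and its parameters are hypothetical names; the `_issued` set stands in for whatever durable store a real deployment would use to remember which codes have been generated.

```python
import secrets
import string
from collections import deque

ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits

class IdPool:
    """In-memory sketch of a pool of pre-generated short codes."""

    def __init__(self, code_length: int = 7, batch_size: int = 1000):
        self.code_length = code_length
        self.batch_size = batch_size
        self._issued = set()   # every code ever generated (stand-in for a durable table)
        self._pool = deque()   # codes generated but not yet handed out

    def _random_code(self) -> str:
        return "".join(secrets.choice(ALPHABET) for _ in range(self.code_length))

    def refill(self) -> None:
        """Top the pool up to batch_size, skipping codes generated before."""
        while len(self._pool) < self.batch_size:
            code = self._random_code()
            if code not in self._issued:
                self._issued.add(code)
                self._pool.append(code)

    def take(self) -> str:
        """Pop one unused code, refilling first if the pool is empty."""
        if not self._pool:
            self.refill()
        return self._pool.popleft()

pool = IdPool(code_length=7, batch_size=100)
print(pool.take())  # a random 7-character code
```

The collision check happens during refill, off the request path, which is where the "fast" in the pros comes from.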
Hash + truncate#
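Hash the URL (MD5/SHA-256), encode the digest in base62, and keep 6-7 characters. Truncating the hex digest instead would waste keyspace: 6 hex characters give only 16^6 ≈ 16.8 million codes.
MD5("https://example.com/long-url") → 128-bit digest → base62 → keep 6-7 characters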
Problem: Collisions. Two different URLs can hash to the same prefix. Need collision detection + retry.
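Collision detection plus retry can be sketched as below. The function name `short_code`, the dict used as the store, and the salting scheme (appending `#attempt` to the URL) are all assumptions for illustration. The sketch reduces the digest modulo 62^length rather than slicing a prefix, which keeps the codes close to uniformly distributed.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits

def to_base62(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

def short_code(url: str, store: dict, length: int = 7, max_attempts: int = 5) -> str:
    """Derive a code from a hash of the URL, re-hashing with a salt on collision.

    `store` maps code -> original URL; a plain dict stands in for the database.
    """
    for attempt in range(max_attempts):
        salted = url if attempt == 0 else f"{url}#{attempt}"  # hypothetical salt scheme
        digest = hashlib.sha256(salted.encode("utf-8")).digest()
        n = int.from_bytes(digest, "big") % (62 ** length)
        code = to_base62(n).rjust(length, ALPHABET[0])  # pad to a fixed length
        existing = store.get(code)
        if existing is None:
            store[code] = url      # free slot: claim it
            return code
        if existing == url:
            return code            # same URL shortened before: reuse its code
        # a different URL already owns this code: retry with the next salt
    raise RuntimeError("could not find a free code")

store = {}
code = short_code("https://example.com/long-url", store)
print(code, len(code))  # a 7-character code
```

In a real database the check-then-write would need to be atomic (for example a unique constraint on the code column), otherwise two concurrent requests could claim the same code.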
Snowflake-style#
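Distributed unique ID generator. Combines timestamp + machine ID + sequence number. Once each server has its own machine ID, no coordination is needed between servers at request time.
A strong choice for production: no collisions, no central bottleneck, naturally sortable by time. One caveat: a full 64-bit ID encodes to as many as 11 base62 characters, so fitting into 6-7 characters requires a narrower bit layout.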
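A Snowflake-style generator can be sketched in a few lines. The bit layout (41-bit millisecond timestamp, 10-bit machine ID, 12-bit sequence) follows the commonly described Snowflake scheme; the custom epoch and the class name are assumptions made for this example.

```python
import threading
import time

class SnowflakeGenerator:
    """Sketch of a Snowflake-style ID: timestamp | machine ID | sequence."""

    EPOCH_MS = 1_577_836_800_000  # 2020-01-01 UTC, an arbitrary custom epoch (assumption)
    MACHINE_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, machine_id: int):
        if not 0 <= machine_id < (1 << self.MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    @staticmethod
    def _now_ms() -> int:
        return time.time_ns() // 1_000_000

    def next_id(self) -> int:
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                now = self._last_ms  # clock moved backwards: stay on the last timestamp
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # sequence exhausted for this millisecond: wait for the next one
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - self.EPOCH_MS) << (self.MACHINE_BITS + self.SEQUENCE_BITS))
                | (self.machine_id << self.SEQUENCE_BITS)
                | self._sequence
            )

gen = SnowflakeGenerator(machine_id=1)
print(gen.next_id() < gen.next_id())  # True: IDs increase over time
```

Uniqueness across servers depends entirely on every server being given a distinct `machine_id`, which is the one piece of setup this scheme still needs.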
Database design#
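CREATE TABLE urls (
    id         VARCHAR(7) PRIMARY KEY,      -- the short code (VARCHAR avoids space-padding shorter codes)
    original   TEXT NOT NULL,               -- the long URL to redirect to
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,                   -- NULL means no expiration
    user_id    UUID,                        -- optional owner
    clicks     BIGINT DEFAULT 0             -- click counter
);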
Database choice:
- Read-heavy (10:1 ratio) → PostgreSQL with read replicas
- Massive scale → DynamoDB or Cassandra (key-value lookup is their sweet spot)
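The core access pattern, a primary-key lookup that respects expiration, can be tried locally. The sketch below uses SQLite from the Python standard library purely as a stand-in for the production database, with a reduced version of the schema; `open_store`, `save`, and `resolve` are hypothetical helper names.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def open_store() -> sqlite3.Connection:
    """In-memory SQLite stand-in with a reduced version of the urls table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE urls (
               id TEXT PRIMARY KEY,      -- the short code
               original TEXT NOT NULL,
               created_at TEXT NOT NULL, -- ISO 8601 timestamps stored as text
               expires_at TEXT
           )"""
    )
    return conn

def save(conn, code, original, ttl=None):
    """Insert a mapping; raises sqlite3.IntegrityError if the code is taken."""
    now = datetime.now(timezone.utc)
    expires = (now + ttl).isoformat() if ttl is not None else None
    conn.execute(
        "INSERT INTO urls (id, original, created_at, expires_at) VALUES (?, ?, ?, ?)",
        (code, original, now.isoformat(), expires),
    )

def resolve(conn, code):
    """Return the original URL, or None if the code is unknown or expired."""
    row = conn.execute(
        "SELECT original, expires_at FROM urls WHERE id = ?", (code,)
    ).fetchone()
    if row is None:
        return None
    original, expires_at = row
    if expires_at is not None:
        if datetime.fromisoformat(expires_at) <= datetime.now(timezone.utc):
            return None
    return original

conn = open_store()
save(conn, "abc123", "https://example.com/long-url")
print(resolve(conn, "abc123"))  # https://example.com/long-url
```

The primary-key constraint also gives custom aliases a natural conflict check: inserting an alias that already exists fails instead of overwriting.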
Caching for hot URLs#
Traffic is heavily skewed: as a rule of thumb (the Pareto principle), the top 20% of URLs get about 80% of traffic. Cache them.
Request → Check Redis cache
↓ hit → redirect (1ms)
↓ miss → query database → cache result → redirect
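Cache size: aim for the hot 20% of a day's traffic. 1B redirects per month is roughly 30M per day; treating each redirect as a distinct URL (an upper bound) with 20% hot → cache 6M entries. At ~500 bytes each = 3GB. Fits easily in Redis.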
Eviction: LRU (least recently used). Hot URLs stay cached, cold ones get evicted.
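The lookup flow and LRU eviction can be illustrated without Redis. In this sketch an `OrderedDict` plays the role of the cache and a plain dict plays the role of the database; both are stand-ins, and `LRUCache` and `resolve` are hypothetical names.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache: the least recently used entry is evicted first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry

def resolve(code, cache, database):
    """Cache-aside lookup: try the cache, fall back to the database, then cache."""
    url = cache.get(code)
    if url is not None:
        return url                 # cache hit
    url = database.get(code)       # cache miss: query the database
    if url is not None:
        cache.put(code, url)
    return url

database = {"a": "https://example.com/a", "b": "https://example.com/b", "c": "https://example.com/c"}
cache = LRUCache(capacity=2)
for code in ["a", "b", "a", "c"]:  # "b" is the least recently used when "c" arrives
    resolve(code, cache, database)
print(cache.get("b"))  # None: evicted
print(cache.get("a"))  # https://example.com/a
```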
The redirect: 301 vs 302#
301 (Permanent): Browser caches the redirect. Repeat visits from that browser don't hit your server. Good for performance, bad for analytics (you can't count those clicks).
302 (Temporary): Browser always hits your server first. You can count every click, at the cost of one extra round trip per click.
Use 302 if you need analytics. Use 301 if you want to reduce server load.
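In code, the choice comes down to the status code and the caching headers sent with it. The sketch below is framework-agnostic; `build_redirect` is a hypothetical helper, and the specific `Cache-Control` values are assumptions rather than recommendations.

```python
from http import HTTPStatus

def build_redirect(original_url: str, track_clicks: bool):
    """Return (status_code, headers) for a redirect response."""
    if track_clicks:
        # 302: ask clients not to cache, so every click reaches the server
        status = HTTPStatus.FOUND
        headers = {"Location": original_url, "Cache-Control": "no-store"}
    else:
        # 301: let clients cache the redirect (one day here, an assumed value)
        status = HTTPStatus.MOVED_PERMANENTLY
        headers = {"Location": original_url, "Cache-Control": "public, max-age=86400"}
    return int(status), headers

print(build_redirect("https://example.com/long-url", track_clicks=True)[0])   # 302
print(build_redirect("https://example.com/long-url", track_clicks=False)[0])  # 301
```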
Click analytics#
Don't block the redirect to record analytics. Fire and forget:
- User requests short.ly/abc123
- Server looks up URL, sends 302 redirect immediately
- Asynchronously: push click event to Kafka/SQS
- Analytics worker processes: geo, device, referrer, timestamp
The user gets their redirect in milliseconds. Analytics are processed in the background.
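The fire-and-forget pattern can be shown with an in-process queue and a worker thread standing in for Kafka/SQS and the analytics worker. All names here (`click_events`, `handle_redirect`, `analytics_worker`) are hypothetical, and dropping events when the queue is full is one possible policy, not the only one.

```python
import queue
import threading
import time

click_events = queue.Queue(maxsize=10_000)  # in-process stand-in for Kafka/SQS
processed = []                              # stand-in for the analytics store

def handle_redirect(code, urls, request_meta):
    """Look up the URL and return immediately; analytics never blocks the response."""
    original = urls.get(code)
    if original is None:
        return 404, None
    event = {"code": code, "ts": time.time(), **request_meta}
    try:
        click_events.put_nowait(event)  # enqueue without waiting
    except queue.Full:
        pass                            # drop the event rather than slow the redirect
    return 302, original

def analytics_worker():
    """Consume click events in the background until a None sentinel arrives."""
    while True:
        event = click_events.get()
        if event is None:
            click_events.task_done()
            break
        processed.append(event)         # a real worker would enrich and persist here
        click_events.task_done()
```

To try it, start `analytics_worker` in a `threading.Thread`, call `handle_redirect` a few times, then put `None` on the queue to stop the worker. The redirect path only ever pays for a non-blocking enqueue.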
Scaling#
| Component | Strategy |
|---|---|
| App servers | Horizontal behind load balancer |
| Database | Partition by short code, read replicas |
| Cache | Redis Cluster, consistent hashing |
| ID generation | Snowflake (no coordination) |
| Analytics | Kafka → ClickHouse (async) |
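Consistent hashing, listed for the cache tier above, can be sketched as a hash ring with virtual nodes. The class name `HashRing`, the node names, and the choice of 100 virtual nodes per server are assumptions for illustration; Redis Cluster itself uses a fixed set of hash slots rather than this exact scheme.

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring: each node owns many points (virtual nodes) on the ring."""

    def __init__(self, nodes, vnodes: int = 100):
        if not nodes:
            raise ValueError("at least one node is required")
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._hashes = [h for h, _ in ring]
        self._nodes = [n for _, n in ring]

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's hash to the next virtual node."""
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[idx]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.node_for("abc123"))  # one of cache-a, cache-b, cache-c
```

The property that matters for scaling is that removing or adding a server only remaps the keys that server owned; keys on the other servers stay where they are, so most of the cache stays warm.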
See the full architecture#
On Codelit, search "URL shortener" in ⌘K to load the complete architecture — ID generator, Redis cache, CDN edge redirects, analytics pipeline, all connected with data flow.
Practice this interview question: search "URL shortener" on Codelit.io and explore every component.