Design a Content Delivery Pipeline — From Upload to Global Distribution
Every piece of content is a pipeline#
When a creator uploads a video, image, or document, it doesn't just get stored. It goes through a multi-stage pipeline: validation, processing, enrichment, storage, and distribution.
YouTube ingests roughly 500 hours of video every minute; Instagram handles more than 100 million photos a day. Understanding this pipeline teaches you async processing, distributed storage, and CDN architecture.
The pipeline stages#
1. Ingestion#
Accept uploads reliably:
- Chunked upload — Split large files into segments, upload in parallel, resume on failure
- Presigned URLs — Client uploads directly to S3, bypassing your server (reduces load)
- Deduplication — Hash the file, check if identical content already exists
- Virus scanning — Scan uploads before processing
2. Validation#
Before spending compute on processing:
- Format check — Is this actually a video/image/document?
- Size limits — Reject files exceeding maximum size
- Content policy — AI-based content moderation (NSFW, violence, copyright)
- Metadata extraction — Duration, resolution, codec, GPS location, EXIF data
3. Processing#
Transform content into deliverable formats:
Video:
Source (4K H.265) → Transcoder →
- 1080p H.264 @ 5 Mbps
- 720p H.264 @ 2.5 Mbps
- 480p H.264 @ 1 Mbps
- Audio AAC @ 128 kbps
- Thumbnails (10 per video)
- Preview GIF
- Subtitles (auto-generated)
Images:
Source (RAW/PNG) → Processor →
- Large (1200px) WebP + JPEG
- Medium (600px) WebP + JPEG
- Thumbnail (200px) WebP + JPEG
- Blur hash (placeholder)
Documents:
Source (PDF/DOCX) → Processor →
- Text extraction
- Page thumbnails
- Searchable index
- Sanitized version
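The video ladder above is typically produced by a transcoder such as ffmpeg, run once per rendition. A sketch that only assembles the command line (the renditions mirror the ladder above, but the flags are an illustrative profile, not a tuned encoding setup):

```python
# Each rendition: (name, output height, target video bitrate).
RENDITIONS = [("1080p", 1080, "5M"), ("720p", 720, "2.5M"), ("480p", 480, "1M")]

def build_transcode_cmd(source: str, rendition: tuple[str, int, str]) -> list[str]:
    """Assemble an ffmpeg command for one rung of the rendition ladder."""
    name, height, bitrate = rendition
    return [
        "ffmpeg", "-i", source,
        "-vf", f"scale=-2:{height}",   # fix height, keep aspect ratio (width stays even)
        "-c:v", "libx264", "-b:v", bitrate,
        "-c:a", "aac", "-b:a", "128k",
        f"{name}.mp4",
    ]
```

Because each rendition is an independent command, the rungs can run in parallel on separate workers.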
4. Enrichment#
Add intelligence to raw content:
- Auto-tagging — ML classifies content (landscape, food, sports)
- Transcription — Speech-to-text for videos and podcasts
- Translation — Auto-translate captions to multiple languages
- Embedding generation — Vector embeddings for search and recommendations
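Each enrichment task above is independent, so a failed transcription shouldn't block tagging or embeddings. A minimal fan-out sketch with per-task error isolation (the task names and callables are illustrative stand-ins for real ML services):

```python
def run_enrichment(content_id: str, tasks: dict) -> dict:
    """Run each enrichment task independently; record failures instead of aborting."""
    results = {}
    for name, task in tasks.items():
        try:
            results[name] = task(content_id)
        except Exception as exc:
            results[name] = f"failed: {exc}"   # log and move on; retry later
    return results
```

In production each task would be its own queue message, giving the same isolation plus independent retries and scaling.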
5. Storage#
Multi-tier storage based on access patterns:
| Tier | Storage | Access | Cost |
|---|---|---|---|
| Hot | SSD/NVMe | Frequent (new content) | High |
| Warm | HDD/S3 Standard | Moderate (older content) | Medium |
| Cold | S3 Glacier | Rare (archived) | Low |
Lifecycle rules automatically move content between tiers based on age and access frequency.
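A lifecycle rule is ultimately a pure function of age and access frequency. A sketch of the tiering decision — the thresholds here are illustrative assumptions, not recommended values:

```python
def storage_tier(age_days: int, accesses_last_30d: int) -> str:
    """Pick a storage tier from content age and recent access count (illustrative thresholds)."""
    if age_days <= 30 or accesses_last_30d > 100:
        return "hot"    # new or still-popular content stays on SSD/NVMe
    if age_days <= 365 and accesses_last_30d > 0:
        return "warm"   # older but occasionally read → HDD / S3 Standard
    return "cold"       # untouched archives → S3 Glacier
```

Note the first branch: popularity overrides age, so a year-old video that goes viral moves back to hot storage.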
6. Distribution#
Get content to users fast:
- CDN edge caching — Popular content cached at 200+ edge locations
- Adaptive bitrate — Stream quality adjusts to user's bandwidth
- Geographic routing — Serve from the closest edge server
- Pre-warming — Push viral content to edges before demand spikes
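Adaptive bitrate boils down to the player repeatedly picking the highest rendition that fits within its measured bandwidth, with some headroom for fluctuation. A sketch using the ladder from the processing stage (the 80% headroom factor is an illustrative assumption):

```python
# Bitrates in bits per second, highest first, mirroring the rendition ladder.
LADDER = [("1080p", 5_000_000), ("720p", 2_500_000), ("480p", 1_000_000)]

def pick_rendition(bandwidth_bps: int, headroom: float = 0.8) -> str:
    """Choose the best rendition that fits within ~80% of measured bandwidth."""
    budget = bandwidth_bps * headroom
    for name, bitrate in LADDER:
        if bitrate <= budget:
            return name
    return LADDER[-1][0]   # below the lowest rung: serve 480p and buffer as needed
```

Real players (HLS/DASH) re-run this decision every few seconds per segment, which is why quality visibly shifts as bandwidth changes.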
Architecture#
Upload → Ingestion API → Validation Queue → Processing Workers → Enrichment Workers → Storage (multi-tier) → CDN Distribution → User Playback/View
Key principle: Everything after ingestion is asynchronous. The user gets an "upload complete" response immediately, then processing happens in the background.
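That principle can be sketched with a queue and a background worker: the upload handler enqueues a job and returns immediately, and processing drains the queue separately. A stdlib stand-in for what would be SQS, Kafka, or RabbitMQ in production:

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for a durable message queue (SQS/Kafka/RabbitMQ)
done = []

def handle_upload(content_id: str) -> str:
    """Ingestion API: enqueue and respond immediately — the user never waits."""
    jobs.put(content_id)
    return "upload complete"

def worker() -> None:
    """Background processing worker draining the queue."""
    while True:
        content_id = jobs.get()
        if content_id is None:              # sentinel: shut down cleanly
            break
        done.append(f"processed:{content_id}")
        jobs.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()
```

The key property is the decoupling: the API's response time is independent of how long transcoding takes, and workers can scale out without touching the upload path.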
Scaling the pipeline#
Horizontal workers: Scale processing workers independently. During upload spikes, auto-scale transcoding workers from 10 to 100.
Priority queues: Premium users get priority processing. Viral content gets re-prioritized for faster CDN distribution.
Idempotent processing: If a worker crashes mid-transcode, another worker can restart from the source. No corrupted outputs.
Progress tracking: Store pipeline state in a database. Users see: "Processing... 40% → Generating thumbnails → Publishing."
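Progress tracking reduces to an ordered list of stages plus a mapping to a user-facing percentage. A minimal sketch (the stage names and even split are illustrative):

```python
# Ordered pipeline stages; position in the list determines the progress percentage.
STAGES = ["validating", "transcoding", "generating thumbnails", "publishing", "done"]

def progress(stage: str) -> int:
    """Map the current pipeline stage to a rough completion percentage."""
    i = STAGES.index(stage)
    return round(100 * i / (len(STAGES) - 1))
```

Persisting just the current stage name per upload is enough: the percentage and the status message are both derived from it.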
Monitoring#
Track pipeline health:
- Queue depth — Processing backlog (alert when growing)
- Processing time — P95 transcode time per resolution
- Error rate — Failed conversions per hour
- Storage growth — Daily storage consumption trend
- CDN hit rate — Percentage served from edge vs origin
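The "alert when growing" condition on queue depth can be made precise by comparing successive samples: a deep-but-stable backlog is fine, a monotonically rising one is not. A sketch (the 3-sample window is an illustrative default):

```python
def backlog_growing(samples: list[int], window: int = 3) -> bool:
    """Alert if queue depth rose across each of the last `window` sampling intervals."""
    if len(samples) < window + 1:
        return False                       # not enough history to judge a trend
    recent = samples[-(window + 1):]
    return all(b > a for a, b in zip(recent, recent[1:]))
```

Trend-based alerts like this fire on sustained growth while ignoring momentary spikes that a single-threshold alert would catch.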
Visualize your content pipeline#
See how upload, processing, storage, and CDN connect — try Codelit to generate an interactive diagram of your content delivery pipeline.
Key takeaways#
- Async everything after upload — users don't wait for processing
- Chunked upload + presigned URLs for reliable large file handling
- Multi-format transcoding — same content in many resolutions/formats
- Content moderation before publishing — AI screening at upload time
- Multi-tier storage — hot/warm/cold based on access patterns
- CDN pre-warming for predictable spikes (launches, events)