Design a Content Delivery Pipeline — From Upload to Global Distribution
Every piece of content is a pipeline#
When a creator uploads a video, image, or document, it doesn't just get stored. It goes through a multi-stage pipeline: validation, processing, enrichment, storage, and distribution.
YouTube ingests roughly 500 hours of video every minute; Instagram handles more than 100 million photos a day. Understanding this pipeline teaches you async processing, distributed storage, and CDN architecture.
The pipeline stages#
1. Ingestion#
Accept uploads reliably:
- Chunked upload — Split large files into segments, upload in parallel, resume on failure
- Presigned URLs — Client uploads directly to S3, bypassing your server (reduces load)
- Deduplication — Hash the file, check if identical content already exists
- Virus scanning — Scan uploads before processing
2. Validation#
Before spending compute on processing:
- Format check — Is this actually a video/image/document?
- Size limits — Reject files exceeding maximum size
- Content policy — AI-based content moderation (NSFW, violence, copyright)
- Metadata extraction — Duration, resolution, codec, GPS location, EXIF data
3. Processing#
Transform content into deliverable formats:
Video:
Source (4K H.265) → Transcoder →
- 1080p H.264 @ 5 Mbps
- 720p H.264 @ 2.5 Mbps
- 480p H.264 @ 1 Mbps
- Audio AAC @ 128 kbps
- Thumbnails (10 per video)
- Preview GIF
- Subtitles (auto-generated)
Images:
Source (RAW/PNG) → Processor →
- Large (1200px) WebP + JPEG
- Medium (600px) WebP + JPEG
- Thumbnail (200px) WebP + JPEG
- Blur hash (placeholder)
Documents:
Source (PDF/DOCX) → Processor →
- Text extraction
- Page thumbnails
- Searchable index
- Sanitized version
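The video ladder above is typically produced by a transcoder such as ffmpeg, run once per rendition. A sketch that only assembles the command line (the renditions mirror the ladder above, but the flags are an illustrative profile, not a tuned encoding setup):

```python
# Each rendition: (name, output height, target video bitrate).
RENDITIONS = [("1080p", 1080, "5M"), ("720p", 720, "2.5M"), ("480p", 480, "1M")]

def build_transcode_cmd(source: str, rendition: tuple[str, int, str]) -> list[str]:
    """Assemble an ffmpeg command for one rung of the rendition ladder."""
    name, height, bitrate = rendition
    return [
        "ffmpeg", "-i", source,
        "-vf", f"scale=-2:{height}",   # fix height, keep aspect ratio (width stays even)
        "-c:v", "libx264", "-b:v", bitrate,
        "-c:a", "aac", "-b:a", "128k",
        f"{name}.mp4",
    ]
```

Because each rendition is an independent command, the rungs can run in parallel on separate workers.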
4. Enrichment#
Add intelligence to raw content:
- Auto-tagging — ML classifies content (landscape, food, sports)
- Transcription — Speech-to-text for videos and podcasts
- Translation — Auto-translate captions to multiple languages
- Embedding generation — Vector embeddings for search and recommendations
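Each enrichment task above is independent, so a failed transcription shouldn't block tagging or embeddings. A minimal fan-out sketch with per-task error isolation (the task names and callables are illustrative stand-ins for real ML services):

```python
def run_enrichment(content_id: str, tasks: dict) -> dict:
    """Run each enrichment task independently; record failures instead of aborting."""
    results = {}
    for name, task in tasks.items():
        try:
            results[name] = task(content_id)
        except Exception as exc:
            results[name] = f"failed: {exc}"   # log and move on; retry later
    return results
```

In production each task would be its own queue message, giving the same isolation plus independent retries and scaling.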
5. Storage#
Multi-tier storage based on access patterns:
| Tier | Storage | Access | Cost |
|---|---|---|---|
| Hot | SSD/NVMe | Frequent (new content) | High |
| Warm | HDD/S3 Standard | Moderate (older content) | Medium |
| Cold | S3 Glacier | Rare (archived) | Low |
Lifecycle rules automatically move content between tiers based on age and access frequency.
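A lifecycle rule is ultimately a pure function of age and access frequency. A sketch of the tiering decision — the thresholds here are illustrative assumptions, not recommended values:

```python
def storage_tier(age_days: int, accesses_last_30d: int) -> str:
    """Pick a storage tier from content age and recent access count (illustrative thresholds)."""
    if age_days <= 30 or accesses_last_30d > 100:
        return "hot"    # new or still-popular content stays on SSD/NVMe
    if age_days <= 365 and accesses_last_30d > 0:
        return "warm"   # older but occasionally read → HDD / S3 Standard
    return "cold"       # untouched archives → S3 Glacier
```

Note the first branch: popularity overrides age, so a year-old video that goes viral moves back to hot storage.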
6. Distribution#
Get content to users fast:
- CDN edge caching — Popular content cached at 200+ edge locations
- Adaptive bitrate — Stream quality adjusts to user's bandwidth
- Geographic routing — Serve from the closest edge server
- Pre-warming — Push viral content to edges before demand spikes
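Adaptive bitrate boils down to the player repeatedly picking the highest rendition that fits within its measured bandwidth, with some headroom for fluctuation. A sketch using the ladder from the processing stage (the 80% headroom factor is an illustrative assumption):

```python
# Bitrates in bits per second, highest first, mirroring the rendition ladder.
LADDER = [("1080p", 5_000_000), ("720p", 2_500_000), ("480p", 1_000_000)]

def pick_rendition(bandwidth_bps: int, headroom: float = 0.8) -> str:
    """Choose the best rendition that fits within ~80% of measured bandwidth."""
    budget = bandwidth_bps * headroom
    for name, bitrate in LADDER:
        if bitrate <= budget:
            return name
    return LADDER[-1][0]   # below the lowest rung: serve 480p and buffer as needed
```

Real players (HLS/DASH) re-run this decision every few seconds per segment, which is why quality visibly shifts as bandwidth changes.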
Architecture#
Upload → Ingestion API → Validation Queue → Processing Workers → Enrichment Workers → Storage (multi-tier) → CDN Distribution → User Playback/View
Key principle: Everything after ingestion is asynchronous. The user gets an "upload complete" response immediately, then processing happens in the background.
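That principle can be sketched with a queue and a background worker: the upload handler enqueues a job and returns immediately, and processing drains the queue separately. A stdlib stand-in for what would be SQS, Kafka, or RabbitMQ in production:

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for a durable message queue (SQS/Kafka/RabbitMQ)
done = []

def handle_upload(content_id: str) -> str:
    """Ingestion API: enqueue and respond immediately — the user never waits."""
    jobs.put(content_id)
    return "upload complete"

def worker() -> None:
    """Background processing worker draining the queue."""
    while True:
        content_id = jobs.get()
        if content_id is None:              # sentinel: shut down cleanly
            break
        done.append(f"processed:{content_id}")
        jobs.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()
```

The key property is the decoupling: the API's response time is independent of how long transcoding takes, and workers can scale out without touching the upload path.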
Scaling the pipeline#
Horizontal workers: Scale processing workers independently. During upload spikes, auto-scale transcoding workers from 10 to 100.
Priority queues: Premium users get priority processing. Viral content gets re-prioritized for faster CDN distribution.
Idempotent processing: If a worker crashes mid-transcode, another worker can restart from the source. No corrupted outputs.
Progress tracking: Store pipeline state in a database. Users see: "Processing... 40% → Generating thumbnails → Publishing."
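Progress tracking reduces to an ordered list of stages plus a mapping to a user-facing percentage. A minimal sketch (the stage names and even split are illustrative):

```python
# Ordered pipeline stages; position in the list determines the progress percentage.
STAGES = ["validating", "transcoding", "generating thumbnails", "publishing", "done"]

def progress(stage: str) -> int:
    """Map the current pipeline stage to a rough completion percentage."""
    i = STAGES.index(stage)
    return round(100 * i / (len(STAGES) - 1))
```

Persisting just the current stage name per upload is enough: the percentage and the status message are both derived from it.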
Monitoring#
Track pipeline health:
- Queue depth — Processing backlog (alert when growing)
- Processing time — P95 transcode time per resolution
- Error rate — Failed conversions per hour
- Storage growth — Daily storage consumption trend
- CDN hit rate — Percentage served from edge vs origin
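The "alert when growing" condition on queue depth can be made precise by comparing successive samples: a deep-but-stable backlog is fine, a monotonically rising one is not. A sketch (the 3-sample window is an illustrative default):

```python
def backlog_growing(samples: list[int], window: int = 3) -> bool:
    """Alert if queue depth rose across each of the last `window` sampling intervals."""
    if len(samples) < window + 1:
        return False                       # not enough history to judge a trend
    recent = samples[-(window + 1):]
    return all(b > a for a, b in zip(recent, recent[1:]))
```

Trend-based alerts like this fire on sustained growth while ignoring momentary spikes that a single-threshold alert would catch.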
Visualize your content pipeline#
See how upload, processing, storage, and CDN connect — try Codelit to generate an interactive diagram of your content delivery pipeline.
Key takeaways#
- Async everything after upload — users don't wait for processing
- Chunked upload + presigned URLs for reliable large file handling
- Multi-format transcoding — same content in many resolutions/formats
- Content moderation before publishing — AI screening at upload time
- Multi-tier storage — hot/warm/cold based on access patterns
- CDN pre-warming for predictable spikes (launches, events)